“Stop Acting Like a Baby”

Side note on the title: it is inspired by the various dad moments I’ve had in the last few weeks, when I literally caught myself saying those exact words to my daughter, who will be 10 months old soon. Shortly after saying them, I laughed, and my daughter kept innocently staring at me.

I originally found the dataset I used through Data School [1], which led me to the Advanced High School Statistics book on OpenIntro’s website [2]. I took the data from their site, but it originated in “Season of birth and onset of locomotion” by J. B. Benson [3].

Here is a brief summary of the study’s findings…

What did I find?

The Dataset…

This snapshot was taken from R. The dataset is the same as the csv file except for two columns I added: Ctemp (average temperature in Celsius) and avg_crawling_age_months (average crawling age in months). You can see the code for adding these columns, along with all of my other code, here: https://github.com/sterlingn/babycrawl_data/tree/292ddeef79dc76c3d1047dbe37ff3c57bea78f33.
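To make the two derived columns concrete, here is a minimal Python sketch of the conversions. It is not the R code from the repository. It assumes the source columns are called `temperature` (in Fahrenheit) and `avg_crawling_age` (in weeks), and that a month is an average-length month of about 4.35 weeks; the names, units, and conversion factor are all assumptions to check against the actual csv file.

```python
# Hypothetical sketch of deriving Ctemp and avg_crawling_age_months.
# Assumptions (not confirmed by the post): the input columns are named
# "temperature" (Fahrenheit) and "avg_crawling_age" (weeks), and one month
# is an average-length month.

WEEKS_PER_MONTH = 365.2425 / 7 / 12  # about 4.35 weeks per average month


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def weeks_to_months(age_weeks):
    """Convert an age in weeks to average-length months."""
    return age_weeks / WEEKS_PER_MONTH


def add_derived_columns(rows):
    """Return copies of the rows with the two derived columns added."""
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row["Ctemp"] = fahrenheit_to_celsius(row["temperature"])
        new_row["avg_crawling_age_months"] = weeks_to_months(row["avg_crawling_age"])
        out.append(new_row)
    return out


# Made-up rows for illustration only; these are not values from the study.
example_rows = [
    {"temperature": 50.0, "avg_crawling_age": 30.0},
    {"temperature": 68.0, "avg_crawling_age": 32.0},
]
for row in add_derived_columns(example_rows):
    print(round(row["Ctemp"], 1), round(row["avg_crawling_age_months"], 2))
```

If the months column was instead computed by dividing weeks by 4, the values would come out slightly larger, so the factor matters when comparing against published crawling ages.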

According to the summary of the study, found at https://www.sciencedirect.com/science/article/abs/pii/0163638393800298, there were 425 infants in the study; our dataset, however, contains only 414. The correlation between temperature and crawling age is r = -0.70 (see the graphic below).
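As a side illustration of what r measures, the Pearson correlation coefficient can be computed directly from two columns of numbers. The Python sketch below is a plain implementation of the textbook formula, not the R call used for this post, and the example values are invented purely to show a negative relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of squared deviations and of cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up values for illustration only; these are not the study data.
temps = [30.0, 40.0, 50.0, 60.0, 70.0]
crawl_ages = [34.0, 33.0, 33.0, 31.0, 30.0]
print(round(pearson_r(temps, crawl_ages), 2))  # negative: warmer goes with earlier
```

A value near -1 means the points fall close to a downward-sloping line, a value near 0 means no linear pattern, and the sign alone says nothing about which variable drives the other.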

So there is a negative correlation: babies tended to start crawling earlier when the average temperature was higher. That doesn’t imply causation, though. With only one variable to work with, it is hard to know what actually causes some babies to start crawling later than others. So how good a predictor of crawling age is temperature? Well, let’s take a look in R.

After running summary() on the simple linear model, the results show that the temperature coefficient has a p-value well below the 0.05 threshold. That lets us reject the null hypothesis that there is no linear relationship between the two variables, temperature and average crawling age. However, the adjusted R-squared of 0.4386 means the model explains only about 44% of the variation in average crawling age, so it is not a great model. There appear to be other variables influencing the average crawling age, although temperature does seem to be one of the influences.

Another drawback of this dataset is that both the temperature and the crawling ages are averages of the underlying data. Analyzing data that is already a summary can seriously distort the results, because the variation between individual babies has been averaged away.
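To show where numbers like R-squared and adjusted R-squared come from, here is a minimal Python sketch of a one-predictor least-squares fit. It is not the R model from this post; the function names are hypothetical and the example values are invented. The adjusted R-squared uses the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of rows and p the number of predictors.

```python
def fit_simple_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y that the fitted line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / syy
    return intercept, slope, r_squared


def adjusted_r_squared(r_squared, n_obs, n_predictors=1):
    """Penalize R-squared for the number of predictors relative to the rows."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Made-up values for illustration only; these are not the study data.
temps = [30.0, 40.0, 50.0, 60.0, 70.0]
crawl_ages = [34.0, 33.0, 33.0, 31.0, 30.0]
intercept, slope, r2 = fit_simple_linear(temps, crawl_ages)
print(round(slope, 3), round(r2, 3), round(adjusted_r_squared(r2, len(temps)), 3))
```

With a single predictor, R-squared is just r squared, so r = -0.70 gives an R-squared of about 0.49. If the dataset has one row per birth month (12 rows, which is an assumption here), `adjusted_r_squared(0.49, 12)` comes out at roughly 0.44, in line with the 0.4386 reported above. With so few rows, the adjustment is noticeable.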

References:

  1. https://www.dataschool.io/resources/
  2. https://www.openintro.org/stat/textbook.php
  3. J. B. Benson. Season of birth and onset of locomotion: Theoretical and methodological implications. Infant Behavior and Development 16.1 (1993), pp. 69-81. ISSN: 0163-6383.
  4. https://www.sciencedirect.com/science/article/abs/pii/0163638393800298