U.S. Natality 2003-2013

An Analysis of Births with Unemployment Rate By State & Age of Mother

By Daniel Dittenhafer


This project was undertaken as part of the Knowledge & Visual Analytics (IS608) course requirements for the Master of Science, Data Analytics program at City University of New York (CUNY).


The Centers for Disease Control and Prevention (CDC), under the United States Department of Health and Human Services (HHS), compiles and publishes data sets containing birth and related data points for the entire United States. This data is derived from records provided to the National Vital Statistics System by the National Center for Health Statistics (NCHS) and the states where the data originates.

Based on the CDC data, the United States birth rate reached a peak in 2007 of 14.33 births per 1,000 people, or 4,316,233 total births. Since that time, both total births and the birth rate have fallen. The purpose of this analysis is to examine that decline in more detail, by state and by age of mother, alongside state unemployment rates.

Please note that birth rate and fertility rate, although similar, are not the same. The CDC defines these terms as follows:

Birth Rate
    The number of births divided by total population in the given year(s).
Fertility Rate
    The number of births divided by the number of females aged 15-44 in the given year(s).
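
As a rough illustration of the two definitions, the sketch below expresses both rates per 1,000 members of the denominator population, which is how the rates in this report are quoted. The function name rate_per_1000 is an assumption made for this example, not a CDC or project name. The 2007 birth count and birth rate are the figures quoted in this report; the population is backed out from those two figures rather than taken from Census data, and the fertility rate inputs are purely synthetic placeholders.

```python
def rate_per_1000(births: int, denominator: float) -> float:
    """Return births per 1,000 members of the denominator population."""
    if denominator <= 0:
        raise ValueError("denominator population must be positive")
    return births * 1000 / denominator


# Figures quoted in this report for the 2007 peak.
PEAK_BIRTHS_2007 = 4_316_233
PEAK_BIRTH_RATE_2007 = 14.33  # births per 1,000 people

# Total population implied by the two figures above (an inference, not Census data).
implied_population = PEAK_BIRTHS_2007 * 1000 / PEAK_BIRTH_RATE_2007

# Birth rate: the denominator is the total population.
birth_rate = rate_per_1000(PEAK_BIRTHS_2007, implied_population)

# Fertility rate: same formula, but the denominator is females aged 15-44.
# Both inputs below are synthetic placeholders chosen for easy arithmetic.
SYNTHETIC_BIRTHS = 1_300
SYNTHETIC_FEMALES_15_44 = 20_000
fertility_rate = rate_per_1000(SYNTHETIC_BIRTHS, SYNTHETIC_FEMALES_15_44)

print(f"implied population: {implied_population:,.0f}")
print(f"birth rate: {birth_rate:.2f} per 1,000 people")
print(f"fertility rate (synthetic): {fertility_rate:.1f} per 1,000 females aged 15-44")
```

The only difference between the two rates is the denominator, which is why a fertility rate is always the larger number for the same set of births.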

For the purposes of the current analysis, official population estimates for each year were acquired from the United States Census Bureau and joined with the CDC birth data in order to compute birth rates for the State and Age of Mother categories [1, 2].
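
The join described above might look something like the following minimal sketch. It is not the project's actual preparation code (which, per the notes at the end of this report, lives in an R Markdown document); the state codes "AA" and "BB", every number, and the function name join_birth_rates are hypothetical placeholders chosen only to show the mechanics of matching birth counts to population estimates on a shared (state, year) key.

```python
# Hypothetical birth counts keyed by (state, year). Placeholder values only.
births = {
    ("AA", 2007): 12_000,
    ("AA", 2008): 11_500,
    ("BB", 2007): 30_000,
    ("BB", 2008): 29_000,
}

# Hypothetical population estimates keyed the same way.
# ("BB", 2008) is deliberately missing to show how unmatched rows behave.
population = {
    ("AA", 2007): 1_000_000,
    ("AA", 2008): 1_010_000,
    ("BB", 2007): 2_400_000,
}


def join_birth_rates(births, population):
    """Inner-join the two tables on (state, year); return births per 1,000 people."""
    rates = {}
    for key, count in births.items():
        pop = population.get(key)
        if pop is None or pop <= 0:
            continue  # no usable population estimate for this state and year
        rates[key] = count * 1000 / pop
    return rates


rates = join_birth_rates(births, population)
for (state, year), rate in sorted(rates.items()):
    print(f"{state} {year}: {rate:.2f} births per 1,000 people")
```

An inner join is used here so that a birth count with no matching population estimate is dropped instead of producing a misleading rate.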

Birth counts by Year, Month, State and Mother's Age were acquired from the CDC WONDER website [3, 4].

The Bureau of Labor Statistics (BLS), under the United States Department of Labor, compiles unemployment data for the nation as a whole, as well as for local areas. The state-specific unemployment rates for the period 2003-2013 were acquired from BLS Local Area Unemployment Statistics [6].

It should be noted that the CDC removes some values from the data sets for privacy reasons. The following quote is taken from their dataset documentation page.

Vital statistics data are suppressed due to confidentiality constraints, in order to protect personal privacy. The term "Suppressed" replaces sub-national births counts, birth rates and fertility rates, when the figure represents zero to nine (0-9) persons.

Assurance of Confidentiality Constraints, CDC WONDER

As a result of the confidentiality constraints, the sum of sub-category birth counts typically does not equal the national-level aggregate value. The CDC Data Use Restrictions with regard to these confidentiality constraints also generally prevent disclosure of the specific differences between sub-category births and the national aggregates.
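
To see why suppressed cells make the sub-category sum fall short of the aggregate, consider the small sketch below. It applies the 0-9 rule quoted above to a set of entirely hypothetical counts; the group names, the numbers, and the suppress helper are assumptions made for this example and do not correspond to any real CDC cells.

```python
SUPPRESSION_MAX = 9  # per the quoted rule, counts of 0-9 are replaced with "Suppressed"


def suppress(count: int):
    """Return the count, or the string "Suppressed" when it falls in the 0-9 range."""
    return "Suppressed" if 0 <= count <= SUPPRESSION_MAX else count


# Hypothetical true counts for four sub-categories (placeholder values only).
true_counts = {"Group A": 120, "Group B": 7, "Group C": 45, "Group D": 3}

# What a data user would actually see after suppression.
published = {group: suppress(count) for group, count in true_counts.items()}

# The aggregate is computed from the true counts before suppression.
aggregate_total = sum(true_counts.values())

# Summing only the visible cells undercounts by the suppressed amounts.
visible_sum = sum(value for value in published.values() if isinstance(value, int))

print(published)
print(f"aggregate: {aggregate_total}, sum of visible cells: {visible_sum}")
```

With these placeholder values, two of the four cells are hidden, so the visible cells sum to less than the aggregate even though no data is wrong.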

The motion bubble chart, shown above (requires Adobe Flash), initially shows a snapshot of birth rate and unemployment rate per State in early August 2007, at the peak of birth activity during the analyzed period. This is also just before the official start of the recession in December 2007 [7]. By pressing the play button in the lower left-hand corner of the chart, you can watch as the recession hits, unemployment surges and births fall from their peaks. You may notice that there is no clear relationship between births and unemployment. Indeed, while births fell in many states during the high unemployment period of 2009-2010, other states, such as Vermont, tended to maintain their birth rate during this time. While the increase in births may be related to the business cycle's expansion, it is less clear that the reduction in births is specifically related to the rise in the unemployment rate [8].

The line charts above show the births and birth rate per 1,000 people, split up by the age of the mother. Some interesting changes can be seen through this visualization.

  • For births to mothers age 15-19, a peak in August 2007 at 40,340 births (0.13 births/1000) begins a steady decline to 24,316 (~0.08 births/1000) in August 2013, a drop of roughly 40% for this age group. This is notable because it suggests that births to high-school-age mothers have been declining in recent years.
  • Prior to March 2010, mothers 20-24 were second only to the 25-29 group in birth rate, but by March 2011, the 30-34 group's birth rate had clearly surpassed that of the 20-24 year olds.
  • It appears mothers 20-29 contributed the most to the increase in births in the 2006-2007 time period.
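
The size of the decline for the 15-19 age group can be checked with a short calculation. The sketch below uses only the two monthly birth counts quoted in the first bullet; the function name percent_change is an assumption made for this example.

```python
def percent_change(start: float, end: float) -> float:
    """Return the change from start to end as a percentage of start."""
    if start == 0:
        raise ValueError("start value must be non-zero")
    return (end - start) / start * 100


# Monthly births to mothers age 15-19, as quoted in this report.
BIRTHS_AUG_2007 = 40_340
BIRTHS_AUG_2013 = 24_316

change = percent_change(BIRTHS_AUG_2007, BIRTHS_AUG_2013)
print(f"Change in monthly births, age 15-19: {change:.1f}%")  # prints -39.7%
```

The result is a decline of a little under 40% in monthly births between the two dates.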

When compared to a prior analysis of national birth and unemployment data, the State chart appears to be largely consistent with national trends during the same period. None of the states are obvious outliers deviating from the general trend of slowly declining births from their 2007 peaks.

On the other hand, the Age chart is a bit more interesting. The decrease in births to mothers in the 15-19 age group is substantial, but not obviously related to the recession per se. Still, it may have been instigated by a cultural shift in this age group resulting from the recession. Likewise, the reversal of the 20-24 and 30-34 age groups beginning in early 2011 may also signal a cultural shift of some kind. A future analysis of these groups and other factors related to them might reveal some insight, but that is beyond the scope of this project.

The raw data used to produce the visualizations can be downloaded via the following links:

More information about the data tidying, merging and conversion steps used to produce the JSON files linked above can be found in the Data Preparations R Markdown document and in the Project GitHub repository.

The following web and visualization technologies were used throughout this project:
