https://github.com/hadley/data-baby-names
Distribution of US baby names, 1880-2008
- Host: GitHub
- URL: https://github.com/hadley/data-baby-names
- Owner: hadley
- Created: 2009-05-15T15:24:57.000Z (over 16 years ago)
- Default Branch: master
- Last Pushed: 2018-07-04T18:01:39.000Z (over 7 years ago)
- Last Synced: 2025-03-24T09:07:17.396Z (10 months ago)
- Language: R
- Homepage:
- Size: 3.58 MB
- Stars: 221
- Watchers: 18
- Forks: 135
- Open Issues: 1
Metadata Files:
- Readme: readme.markdown
README
US Baby names 1880-2009
=======================
Data
----
[baby-names.csv](http://github.com/hadley/data-baby-names/raw/master/baby-names.csv) contains the top 1000 girl and boy baby names from 1880 to 2009. The data was aggregated from files made available by the [Social Security Administration](http://www.ssa.gov/OACT/babynames/). If you want to recreate it yourself, run the files `1-download.r`, `2-parse.rb` and `3-clean.r` in order. You will need both R and Ruby.
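To load the file in R, a minimal sketch might look like the following. `read_baby_names` is a hypothetical helper, and the column names it checks for (`year`, `name`, `percent`, `sex`) are an assumption about the CSV layout that should be verified against the file itself.

```r
# Hypothetical helper: read the baby names CSV from a local path or a URL.
# The expected column names are an assumption, not documented in this README.
read_baby_names <- function(path) {
  names_df <- read.csv(path, stringsAsFactors = FALSE)
  expected <- c("year", "name", "percent", "sex")
  missing_cols <- setdiff(expected, names(names_df))
  if (length(missing_cols) > 0) {
    # Warn rather than fail, so the caller can still inspect what was read.
    warning("Missing expected columns: ", paste(missing_cols, collapse = ", "))
  }
  names_df
}

# Usage (requires network access):
# bnames <- read_baby_names(
#   "http://github.com/hadley/data-baby-names/raw/master/baby-names.csv"
# )
```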
Percent of names in top 1000
----------------------------

Since the 1960s, the percentage of babies with names in the top 1000 has been shrinking, to its current level of 80% of boys and 67% of girls.
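A figure like this could be computed roughly as follows. This is a sketch, not necessarily how the numbers above were produced: it assumes a data frame with columns `year`, `sex` and `percent`, where `percent` holds each name's share (as a proportion) of babies of that sex born that year. `top1000_share` is a hypothetical helper name.

```r
# Sketch: total share of babies covered by the listed names, per year and sex.
# Assumes `percent` is each name's proportion of births for that year and sex.
top1000_share <- function(names_df) {
  # Sum the per-name shares within each year/sex group.
  aggregate(percent ~ year + sex, data = names_df, FUN = sum)
}

# Usage, with `bnames` holding the contents of baby-names.csv:
# shares <- top1000_share(bnames)
# subset(shares, year == max(year))
```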
Last letters
-------------
Stimulated by the discussion on [Andrew Gelman's blog](http://www.stat.columbia.edu/~cook/movabletype/archives/2009/05/where_all_boys.html) (prompted by an old post on the [baby name wizard blog](http://www.babynamewizard.com/archives/2007/7/where-all-boys-end-up-nowadays)), here are plots showing the distribution of the last letter of names, 1880-2008.
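The data behind such plots could be derived along these lines. This is an illustrative sketch under the same assumed columns (`year`, `name`, `percent`, `sex`), not the code that produced the plots; `last_letter_share` is a hypothetical helper name.

```r
# Sketch: share of babies by the last letter of their name, per year and sex.
# Assumes `percent` is each name's proportion of births for that year and sex.
last_letter_share <- function(names_df) {
  n <- nchar(names_df$name)
  # Take the final character of each name, lower-cased for consistency.
  names_df$last <- tolower(substr(names_df$name, n, n))
  aggregate(percent ~ year + sex + last, data = names_df, FUN = sum)
}

# Usage, with `bnames` holding the contents of baby-names.csv:
# letters_df <- last_letter_share(bnames)
# boys_n <- subset(letters_df, sex == "boy" & last == "n")
# plot(boys_n$year, boys_n$percent, type = "l")
```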
