Popularity Analysis Question
Anyone know of existing analyses of US name popularity that show "X% of people born in this year had names in the top 10/50/100/500/1000"? From a glance at the percentages, it seems that more people are being given unusual names now than in the 1950s, but I'm interested in seeing more details on that.
Replies
I once computed the entropy of US names from the SSA data, and there is an increase from 8.2 bits in 1880 to 11.5 bits in 2015. The increase wasn't monotonic: at the end of the 1930s the entropy temporarily decreased. Of course, the values aren't exact (because the SSA lists are cut off at 5 babies per gender and year), but the trend is clear: names are becoming more evenly distributed and more informative over time. While the total number of distinct names seems to have peaked in 2008, the entropy continues to rise after that year.
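For anyone who wants to try this, here is a minimal sketch of how such an entropy figure could be computed from per-name birth counts for one year. The function name and the toy counts are made up for illustration (they are not SSA data), and this is not necessarily how the numbers above were produced.

```python
import math

def shannon_entropy_bits(counts):
    """Shannon entropy, in bits, of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    # Zero counts are skipped: they contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Hypothetical toy counts for a single year, not real SSA data.
toy_year = {"Mary": 50, "Anna": 25, "Emma": 15, "Ida": 10}
print(shannon_entropy_bits(toy_year.values()))
```

Four equally common names would give exactly 2 bits, and a year in which everyone shared one name would give 0 bits.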
That's really cool!
What do you mean by entropy? Is that a measure of how many different names are used, or of how different they are from each other somehow?
After googling some more it seems it's a measure of the number of ways to arrange a system, so that should be similar to the number of different names. I think. Is that right?
Intuitively, entropy measures how close a distribution is to uniform: it is maximal for a perfectly uniform distribution and goes to zero for a strongly peaked one. It also grows with the number of available names, but this dependence is rather weak (logarithmic). To get rid of it completely, one can normalise the entropy by its maximum possible value, the logarithm of the number of names. The resulting measure is known as the Shannon equitability index and ranges from 0 (everything in one peak) to 1 (perfectly uniform distribution).
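The normalisation can be sketched in a few lines: divide the entropy by the logarithm of the number of names that actually occur. The function name is made up for illustration, and returning 0 when only one name occurs is a convention assumed here, since the ratio is undefined in that case.

```python
import math

def shannon_equitability(counts):
    """Entropy divided by its maximum, log2(number of names); range 0 to 1."""
    counts = [c for c in counts if c > 0]
    n_names = len(counts)
    if n_names <= 1:
        # Assumed convention: a single name counts as "all in one peak".
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return entropy / math.log2(n_names)

# Hypothetical toy counts, not real SSA data.
print(shannon_equitability([25, 25, 25, 25]))  # uniform distribution
print(shannon_equitability([97, 1, 1, 1]))     # strongly peaked distribution
```

The uniform case comes out at the top of the range and the peaked case close to the bottom, regardless of how many names are in the list.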