
Re: Popularity Analysis Question
I once computed the entropy of the US names from the SSA data, and there is an increase from 8.2 bits in 1880 to 11.5 bits in 2015. The increase wasn't monotonic: at the end of the 1930s the entropy temporarily decreased. Of course, the values aren't exact (because the SSA lists are cut off at 5 babies per gender and year), but the trend is clear: names become more evenly distributed and more informative over time. While the total number of name types seems to have peaked in 2008, the entropy continues to rise after that year.

Replies

That's really cool! What do you mean by entropy? Is that a measure of how many different names are used, or of how different they are from each other somehow?
After googling some more, it seems it's a measure of the number of ways to arrange a system, so that should be similar to the number of different names. I think. Is that right?
Intuitively, entropy measures how close a distribution is to uniform: it is maximal for a perfectly uniform distribution and goes to zero for a strongly peaked one. It also grows with the number of available names, but this dependence is rather weak (the maximum possible entropy is the logarithm of the number of names). If one wants to get rid of that dependence completely, one can normalise the entropy by this maximum. The resulting measure is known as the Shannon Equitability Index, with a range from 0 (all in one peak) to 1 (perfectly uniform distribution).
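To make that concrete, here is a minimal Python sketch of both measures. It is only an illustration: the function names shannon_entropy and equitability are made up for this example, and the counts are toy numbers, not SSA data.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy in bits of a list of name counts (zero counts are skipped)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def equitability(counts):
    """Entropy divided by its maximum, log2(k), where k = number of names used."""
    k = sum(1 for c in counts if c > 0)
    if k <= 1:
        return 0.0  # a single name: everything sits in one peak
    return shannon_entropy(counts) / math.log2(k)

# Toy example (hypothetical counts): four names, 100 babies in total.
uniform = [25, 25, 25, 25]  # every name equally popular
peaked = [97, 1, 1, 1]      # one name dominates

print(shannon_entropy(uniform), equitability(uniform))
print(shannon_entropy(peaked), equitability(peaked))
```

With four equally popular names the entropy is 2 bits (log2 of 4) and the equitability is 1. With one dominant name both values drop close to 0, even though the number of different names is the same.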