After reading the seven books in Isaac Asimov’s Foundation series, I did a little light follow-up reading on mathematical prediction and modeling of group actions. “Modeling Civil Violence: An Agent-Based Computational Approach” from The Brookings Institution was interesting.
Also interesting was “Seeing Around Corners,” by Jonathan Rauch, in the April 2002 issue of The Atlantic. His analysis ranges from racial housing trends and the demise of the Anasazi to battling corruption (public high-profile arrests work best) and Zipf’s Law.
[From the article:] Every so often scientists notice a rule or a regularity that makes no particular sense on its face but seems to hold true nonetheless. One such is a curiosity called Zipf’s Law. George Kingsley Zipf was a Harvard linguist who in the 1930s noticed that the distribution of words adhered to a regular statistical pattern. The most common word in English, “the,” appears roughly twice as often in ordinary usage as the second most common word, three times as often as the third most common, ten times as often as the tenth most common, and so on. As an afterthought, Zipf also observed that cities’ sizes followed the same sort of pattern, which became known as a Zipf distribution. Oversimplifying a bit, if you rank cities by population, you find that City No. 10 will have roughly a tenth as many residents as City No. 1, City No. 100 a hundredth as many, and so forth. (Actually the relationship isn’t quite that clean, but mathematically it is strong nonetheless.) Subsequent observers later noticed that this same Zipfian relationship between size and rank applies to many things: for instance, corporations and firms in a modern economy are Zipf-distributed.
Now, that sounded rather strange to me, so I decided to test it.
The data comes from the U.S. Census Bureau’s Year 2005 List of Cities With Over 100,000 Population.
Los Angeles, California is ranked 2nd in population. So, according to Zipf’s law, Los Angeles should have about 1/2 the population of the most populous city, New York, NY. A Zipf distribution predicts 50%, and it’s actually 47.2%, which is only 2.8 percentage points off.
Chicago, Illinois is ranked 3rd in population. So, according to Zipf’s law, Chicago should have about 1/3 the population of New York. A Zipf distribution predicts 33.3%, and it’s actually 34.9%, which is only 1.6 percentage points off.
The general trend continues, as you can see from the graph below.
St. Louis, Missouri is ranked 52nd in population. So, St. Louis should have about 1/52 of New York’s population, or roughly 156,600 people, though the actual population is 344,362. As a percentage of total U.S. population, that’s a very slight difference, but locally, the Zipf prediction is only 45.5% of our actual population.
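For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the comparison. It assumes the simplest form of Zipf’s law (the city at rank r has 1/r of the top city’s population). The function names are my own, the Los Angeles and Chicago shares are the percentages quoted above rather than raw Census counts, and the New York figure is back-computed from the St. Louis estimate (52 × 156,600), so treat it as an approximation.

```python
def zipf_share(rank):
    """Expected population as a fraction of the rank-1 city, under a pure 1/rank rule."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return 1.0 / rank


def zipf_population(top_city_population, rank):
    """Expected population of the city at `rank`, given the rank-1 city's population."""
    return top_city_population * zipf_share(rank)


def gap_in_points(observed_share, rank):
    """Observed share minus Zipf share, in percentage points."""
    return (observed_share - zipf_share(rank)) * 100


# Assumption: New York's population back-computed from the post's
# St. Louis estimate (52 * 156,600), not read from the Census table.
NEW_YORK_APPROX = 52 * 156_600

# Observed shares of New York's population, as quoted in the post.
OBSERVED_SHARES = {
    "Los Angeles": (2, 0.472),
    "Chicago": (3, 0.349),
}

# St. Louis: rank and actual population as quoted in the post.
ST_LOUIS_RANK = 52
ST_LOUIS_ACTUAL = 344_362

if __name__ == "__main__":
    for city, (rank, share) in OBSERVED_SHARES.items():
        print(f"{city}: Zipf {zipf_share(rank):.1%}, actual {share:.1%}, "
              f"gap {gap_in_points(share, rank):+.1f} points")

    expected = zipf_population(NEW_YORK_APPROX, ST_LOUIS_RANK)
    print(f"St. Louis: Zipf {expected:,.0f}, actual {ST_LOUIS_ACTUAL:,}, "
          f"prediction is {expected / ST_LOUIS_ACTUAL:.1%} of actual")
```

Dividing by the actual population, as in the last line, shows how a gap that looks small against the whole country is large for a single city.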
Table of U.S. Cities with over 100,000 Population, Compared to expected Zipf Distribution
Red line is Zipf Distribution; Blue line is actual.