Law of Large Numbers, Visualized

As a follow-up to the post “How Common Is Your Birthday – 360 degrees”, we are collecting our own data from OCCMundial to compare México with the US data from the NYTimes. We made several runs with different dataset sizes and, as a byproduct, got a visualization that shows how the probability distribution of the birthday rank becomes apparent as the dataset size increases.

Check the visualization and source code (in Processing.js) here.

Rank of each birthday throughout the year, by how common it is. Whiter means a higher rank, that is, a more common birthday. Data for random samples of 5,000, 100,000, 1.5 million and 4.5 million records. As the dataset size increases, the real distribution becomes apparent. Data from México (OCCMundial).
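
To get a feel for the effect without the real data, here is a minimal simulation sketch in Python. It is not the Processing.js source of the visualization, and the seasonal weights in true_weights() are a made-up assumption, not the OCCMundial or NYTimes figures. It draws random birthdays at several sample sizes, ranks the days by count, and measures how closely each sample ranking agrees with the ranking implied by the underlying weights.

```python
import math
import random
from collections import Counter

DAYS = 365  # leap day ignored to keep the sketch simple


def true_weights():
    """Hypothetical seasonal birth pattern (an assumption, not real data)."""
    return [1.0 + 0.2 * math.sin(2 * math.pi * d / DAYS) for d in range(DAYS)]


def ranks(values):
    """Rank 1 goes to the largest value; ties are broken by day index."""
    order = sorted(range(len(values)), key=lambda d: (-values[d], d))
    result = [0] * len(values)
    for position, day in enumerate(order, start=1):
        result[day] = position
    return result


def rank_agreement(a, b):
    """Spearman rank correlation for two tie-free rankings of equal length."""
    n = len(a)
    squared_diffs = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


def simulate(sizes, seed=0):
    """Return {sample_size: agreement between sample rank and true rank}."""
    rng = random.Random(seed)
    weights = true_weights()
    true_rank = ranks(weights)
    results = {}
    for n in sizes:
        # Draw n birthdays according to the assumed weights.
        sample = rng.choices(range(DAYS), weights=weights, k=n)
        counts = Counter(sample)
        sample_rank = ranks([counts.get(d, 0) for d in range(DAYS)])
        results[n] = rank_agreement(true_rank, sample_rank)
    return results
```

Calling simulate([5_000, 100_000, 1_500_000]) returns one agreement score per sample size. Under these assumptions the score should climb toward 1 as the sample grows, which is the same effect the images show: with few records the ranking is mostly noise, and with millions of records the underlying pattern dominates.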