Central Limit Theorem

When independent random variables are added together, expect their sum to tend toward a normal distribution.

Scenario 1: Billy doesn’t have many friends, so he opens a phonebook to find people willing to attend his birthday party. The first 99 people he calls quickly turn him down. But the last person he reaches says, “Sorry, I’m busy, maybe next week.” After that, Billy decides to cancel his party and start planning one for next week. If one person was available this time, he reasons, then after another hundred calls he will surely find at least one more guest.

Scenario 2: It’s Tom’s turn to choose a piece of candy from the mystery jar. Nobody knows what flavors are in the jar, but from observing the last few people who picked a piece, Tom has determined that most of the flavors are good, a few of them are bad, and one of them is his absolute favorite. While the one he chooses isn’t guaranteed to be his favorite flavor, there’s a 100% chance that he’ll still be eating candy. I’d say the odds are in his favor.


The central limit theorem effectively implies that, in a majority of situations, when you sum independent random variables, the result will tend toward a normal distribution. This often holds even when the original variables are not themselves normally distributed.

In other words, given a large enough sample from a population, the mean of the sample will approximate the mean of the population. The theorem also implies that the distribution of sample means will be approximately normal. A sample of roughly 30 to 40 units is commonly considered large enough to apply the central limit theorem.

As the number of variables grows, their sum will still tend toward a normal distribution, regardless of the shape of the underlying distributions.
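As a sketch of this claim, the short simulation below sums draws from a deliberately skewed, non-normal source distribution and checks that the sums cluster symmetrically around their expected value. The sample counts and the choice of an exponential source are arbitrary illustrative assumptions, not part of the theorem itself.

```python
import random
import statistics

random.seed(0)

# Each observation is the SUM of 50 independent draws from a skewed,
# non-normal source: an exponential distribution with mean 1.
def summed_observation(n_terms=50):
    return sum(random.expovariate(1.0) for _ in range(n_terms))

sums = [summed_observation() for _ in range(10_000)]

# A sum of 50 exponential(1) variables has mean 50 and variance 50,
# so the bell curve should be centred near 50 with spread of about 7.1.
print(round(statistics.mean(sums), 1))
print(round(statistics.stdev(sums), 1))

# Rough symmetry check: roughly half of the sums fall below the mean,
# even though the source distribution is heavily right-skewed.
below = sum(1 for s in sums if s < 50)
print(below / len(sums))
```

Plotting a histogram of `sums` would show the familiar bell shape, despite every individual draw coming from a distribution that looks nothing like a bell.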

Keep in mind that the central limit theorem does not apply in all situations. It will likely fail when the variables have infinite variance, or when the system is multiplicative rather than additive. Likewise, the theorem should not be relied on for sample sizes below about 30.


The central limit theorem has seen many iterations over the course of time, with the first version of the theorem dating back to 1810. The modern form of this theorem wasn’t precisely stated until around 1920. Once the central limit theorem was established, a bridge was erected between classical and modern probability theories.

The Dutch mathematician Henk Tijms wrote on the history of this important mental model:

The central limit theorem has an interesting history. The first version of this theorem was postulated by the French-born mathematician Abraham de Moivre who, in a remarkable article published in 1733, used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. This finding was far ahead of its time, and was nearly forgotten until the famous French mathematician Pierre-Simon Laplace rescued it from obscurity in his monumental work Théorie analytique des probabilités, which was published in 1812. Laplace expanded De Moivre's finding by approximating the binomial distribution with the normal distribution. But as with De Moivre, Laplace's finding received little attention in his own time. It was not until the nineteenth century was at an end that the importance of the central limit theorem was discerned, when, in 1901, Russian mathematician Aleksandr Lyapunov defined it in general terms and proved precisely how it worked mathematically. Nowadays, the central limit theorem is considered to be the unofficial sovereign of probability theory.

The term “central limit theorem” itself was first used publicly as the title of a paper written in 1920 by George Pólya. The term central was applied to the theorem because the theorem is centrally important to probability theory.

Another paper by Pólya, titled ‘On the central limit theorem of calculus of probability and the problem of moments’, makes the following point in its abstract:

The occurrence of the Gaussian probability density e^(−x²) in repeated experiments, in errors of measurements, which result in the combination of very many and very small elementary errors, in diffusion processes etc., can be explained, as is well-known, by the very same limit theorem, which plays a central role in the calculus of probability. The actual discoverer of this limit theorem is to be named Laplace; it is likely that its rigorous proof was first given by Tschebyscheff and its sharpest formulation can be found, as far as I am aware of, in an article by Liapounoff. …

The passage lays out the history of this theorem clearly, attaching its roots to several historic figures.

Deep Analysis

If the central limit theorem applies, calculating the mean of each sample and then building a distribution from those means should produce an approximately normal result.

Using this theorem, you should be able to generalize conclusions about an entire population based on results found through analyzing a sample of the population. It is a powerful theorem because it allows you to make reasonable assumptions about a population regardless of what the initial distribution looks like.

There are endless applications to this theorem, including hypothesis testing, confidence intervals, and estimation.  

To illustrate the central limit theorem with an example, consider rolling a die. A single roll is uniformly distributed across the six faces, but if you roll the die in batches and record the average of each batch, the distribution of those averages will tend toward a normal distribution. Generally, once each batch contains at least 30 to 40 rolls, the averages should form a relatively normal curve.
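The die-rolling example above can be sketched as a quick simulation. The batch size of 35 rolls and the number of batches are arbitrary illustrative choices (the batch size is simply picked from the 30-40 range the text mentions).

```python
import random
import statistics

random.seed(1)

# A single fair die is uniform on 1..6 with population mean 3.5,
# which is not a normal distribution.
def batch_mean(rolls=35):
    """Average of one batch of die rolls."""
    return statistics.mean(random.randint(1, 6) for _ in range(rolls))

means = [batch_mean() for _ in range(5_000)]

# The batch averages cluster tightly around the population mean of 3.5,
# even though individual rolls are spread evenly across 1..6.
print(round(statistics.mean(means), 2))
print(round(statistics.stdev(means), 2))
```

A histogram of `means` would show a bell curve centred at 3.5, while a histogram of the raw rolls would be flat. That contrast is the whole point of the theorem.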

The central limit theorem shows that the average of your sample means will approximate the population mean. Considering this, you can see how the central limit theorem can be used to predict the characteristics of a population rather accurately.


The central limit theorem is limited by the requirement of a sufficiently large sample, typically 30 to 40 units, before it can be applied.


Students can apply the central limit theorem to make observations about social circumstances, group activities, and their own academic success. Using the central limit theorem, you can determine what outcomes are attainable for you compared to your peers.

For example, if a majority of your classmates are failing Algebra, and you aren’t, there is a chance that a curve will be applied to make the grading system more balanced. The teacher may also add in additional variables, such as extra credit assignments and pop quizzes, to offer the other students more opportunities to pass the class.

Through the central limit theorem, you can assume that those additional assignments add new variables to the grading formula and that those variables will normalize across the distribution, allowing some students to improve their grades. However, if you perform poorly on those assignments, your high grade could be pulled back toward the mean.

Professional Sports

In the realm of professional sports, the central limit theorem can be used to predict certain characteristics of successful players, as well as the likelihood of specific sports-related outcomes.


Businesses can use the central limit theorem to make observations about the market, their business itself, and more. With this theorem, business leaders can determine what their target audience likes, what they don’t like, and how to reach them effectively. That’s only one way the central limit theorem can be applied to business.  


The central limit theorem can be applied to determine how an asset or set of assets will perform over a period of time. Markets are typically unpredictable, but if you are looking to group investments based on specific performance metrics, the central limit theorem could be useful.
