Monte Carlo

All of us are familiar with using random variables in experiments. Most of us have played with dice and coins. Many of us have written down the results of tossing a die several times to estimate the probability of getting a particular side, and noticed that as the number of trials increases, the result approaches the theoretical value (1/6 for a die, 1/2 for a coin). The curious thing is that we can use the same techniques to carry out complex calculations.

One of the most interesting (and useful) problems in mathematics is evaluating integrals. In physics, the majority of integrals are impossible to solve analytically, so numerical methods are required. Numerical methods are fascinating in general, but today I want to highlight a particularly powerful technique called the Monte Carlo method.

In the most general sense, the Monte Carlo method is a statistical (almost experimental) approach to computing integrals using random points, called samples, whose distribution is carefully chosen. Today, I want to show you how important the way you generate samples is. I will show two methods: Direct Sampling and Markov-chain Sampling. To illustrate the point, I will use a very common example: computing $\pi$.

Direct Sampling

Imagine a circle inscribed within a square of side length $2R$ (we’ll ignore units, and let the square range from $-R$ to $R$ for simplicity). The area of the square is $A_{\text{square}} = (2R)^2 = 4R^2$. The area of the circle is $A_{\text{circle}} = \pi R^2$.

In this setup, you can randomly throw points onto the square and count them. By construction, every random point will land inside the square, so the total number of points you throw is $N_{\text{total}}$. Next, count how many of these points land inside the circle, $N_{\text{circle}}$. That’s all you need to estimate $\pi$! Notice that the ratio of the areas should match the ratio of the numbers of points:

$$\dfrac{A_{\text{circle}}}{A_{\text{square}}} = \dfrac{N_{\text{circle}}}{N_{\text{total}}}$$

Substituting the areas:

$$\dfrac{\pi R^2}{4R^2} = \dfrac{N_{\text{circle}}}{N_{\text{total}}}$$

Canceling $R^2$ and multiplying both sides by 4, we get:

$$\pi = 4\,\dfrac{N_{\text{circle}}}{N_{\text{total}}}$$

That’s it! You can generate random numbers between $-R$ and $R$ and count the points. Keep in mind that every trial is independent, so you could generate all the samples at once with a random number generator (e.g., NumPy’s random library).
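The whole procedure fits in a few lines of NumPy. Here is a minimal sketch (the function name `estimate_pi_direct`, the default radius, and the fixed seed are my choices, not part of the text):

```python
import numpy as np

def estimate_pi_direct(n_total, r=1.0, seed=0):
    """Estimate pi by direct sampling: throw n_total uniform points
    into the square [-r, r] x [-r, r] and count hits in the circle."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-r, r, n_total)
    y = rng.uniform(-r, r, n_total)
    n_circle = np.count_nonzero(x**2 + y**2 <= r**2)
    return 4.0 * n_circle / n_total

print(estimate_pi_direct(1_000_000))
```

Because every sample is independent, the whole batch is generated in one vectorized call, with no loop at all.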

A good question here is: how many trials do we need to get a good approximation of $\pi$? The value becomes exact only in the limit of infinitely many trials; for a finite number of samples $N$, the statistical error shrinks like $1/\sqrt{N}$, so each additional digit of accuracy costs roughly a hundred times more points.
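You can check the $1/\sqrt{N}$ scaling numerically. The sketch below (helper name `pi_rel_error` is mine) averages the relative error over many independent runs; multiplying $N$ by 100 should shrink the typical error by about a factor of 10:

```python
import numpy as np

def pi_rel_error(n, seed):
    """Relative error of one direct-sampling estimate with n points."""
    rng = np.random.default_rng(seed)
    x, y = rng.uniform(-1.0, 1.0, (2, n))
    est = 4.0 * np.count_nonzero(x**2 + y**2 <= 1.0) / n
    return abs(est - np.pi) / np.pi

# Average over 50 independent runs to smooth out the randomness.
for n in (1_000, 100_000):
    mean_err = np.mean([pi_rel_error(n, seed) for seed in range(50)])
    print(n, mean_err)
```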

Here is a virtual experiment. Feel free to change the number of trials and play with this direct sampling simulation.

(Interactive widget: live counts of points inside and outside the circle, total points, the running estimate of $\pi$, and the relative error.)

Markov-chain Sampling

A Markov chain is defined as a process that transitions between states, where the probability of moving to the next state depends solely on the current state and not on any of the previous states.

For our experiment, this means that starting from a point (for example, $(-1, -1)$), you compute the next point by moving a small amount:

$$x_{i+1} = x_i + \Delta x$$

$$y_{i+1} = y_i + \Delta y$$

where:

$$\Delta x \leftarrow \text{ran}(-\delta, \delta)$$

$$\Delta y \leftarrow \text{ran}(-\delta, \delta)$$

Here, $\delta$ is a parameter that we need to tune.

What should we do if our new point $(x_{i+1}, y_{i+1})$ ends up outside the square? There are two simple choices:

  1. Ignore that move and try generating a new one instead.
  2. Stay at the current point and count it again, without making a move.

We should choose option 2 because it keeps the sampling fair. If you use option 1 instead, points near the edges get pushed back toward the center—since many out-of-bounds attempts are discarded—so you end up oversampling the interior and undersampling the borders. Staying put on rejections treats these failed attempts as real outcomes, which preserves the correct long-run distribution. In contrast, ignoring and redrawing changes the transition probabilities near the boundary and introduces bias.
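Here is a minimal sketch of such a Markov-chain sampler with the stay-put rule on rejection (the function name `estimate_pi_markov` and the default step size are my choices):

```python
import numpy as np

def estimate_pi_markov(n_total, delta=0.3, start=(-1.0, -1.0), seed=0):
    """Estimate pi with a Markov chain on the square [-1, 1]^2.

    Propose a small uniform step; if the proposal leaves the square,
    stay at the current point and count it again (option 2 above).
    """
    rng = np.random.default_rng(seed)
    x, y = start
    n_circle = 0
    for _ in range(n_total):
        dx, dy = rng.uniform(-delta, delta, 2)
        if abs(x + dx) <= 1.0 and abs(y + dy) <= 1.0:
            x, y = x + dx, y + dy  # accept the move
        # on rejection: keep (x, y) unchanged and count it again
        if x * x + y * y <= 1.0:
            n_circle += 1
    return 4.0 * n_circle / n_total

print(estimate_pi_markov(200_000))
```

Note that successive points are correlated, so for the same number of samples the estimate is noisier than with direct sampling.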

How should you choose $\delta$? The rule of thumb is to pick a value that is neither too small (nearly every move is accepted, but the steps are tiny, so the chain explores the square very slowly) nor too large (most proposed moves fall outside the square, so the acceptance rate is low and the chain barely moves at all). Ideally, you want the acceptance rate to be around $\frac{1}{2}$, that is, about half of the attempted moves are accepted.
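You can measure the acceptance rate directly and tune $\delta$ by hand. A quick sketch (helper name `acceptance_rate` is mine; for the square $[-1,1]^2$, a rate near $\frac{1}{2}$ occurs roughly around $\delta \approx 1$):

```python
import numpy as np

def acceptance_rate(delta, n_steps=100_000, seed=0):
    """Fraction of proposed moves that stay inside the square [-1, 1]^2."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    accepted = 0
    for _ in range(n_steps):
        dx, dy = rng.uniform(-delta, delta, 2)
        if abs(x + dx) <= 1.0 and abs(y + dy) <= 1.0:
            x, y = x + dx, y + dy
            accepted += 1
    return accepted / n_steps

# Larger steps are rejected more often.
for delta in (0.1, 0.5, 1.0, 2.0):
    print(delta, acceptance_rate(delta))
```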

Try this interactive experiment below. You can modify the number of trials and the initial point, and explore how the Markov-chain simulation behaves.

(Interactive widget: live counts of points inside and outside the circle, total points, the running estimate of $\pi$, and the relative error.)

Conclusion

The Monte Carlo experiments illustrate two fundamental ways of sampling a probability distribution $p(x, y)$, whether the space is discrete or continuous: direct sampling and Markov-chain sampling. In both approaches, we compute the value of an observable (a function) $O(x, y)$: in our example, 1 if the point is inside the circle and 0 otherwise. In both cases, we evaluate:

$$\dfrac{N_{\text{circle}}}{N_{\text{total}}} = \dfrac{1}{N}\sum_{i=1}^{N} O_i \approx \langle O \rangle = \dfrac{\int_{-1}^{1}\int_{-1}^{1} p(x,y)\,O(x,y)\,dx\,dy}{\int_{-1}^{1}\int_{-1}^{1} p(x,y)\,dx\,dy}$$

The right-hand side is an integral, while the left-hand side is the result of sampling. Notice that the distribution $p(x, y)$ no longer appears explicitly on the left: instead of being evaluated directly, it is sampled! This is the essence of the Monte Carlo method. Thanks to this approach, the complex integrals on the right are replaced by straightforward sampling. This means the Monte Carlo method enables us to evaluate high-dimensional integrals, making it especially powerful in statistical physics and related fields.
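Nothing in this estimator is specific to $\pi$. As an illustration (the observable $e^{-(x^2+y^2)}$ is my example, not from the text), the same uniform sampling estimates a different integral simply by swapping the observable:

```python
import numpy as np
from math import erf

# With p(x, y) uniform on [-1, 1]^2, <O> is approximated by the sample
# mean of O_i, so
#   I = integral of exp(-(x^2 + y^2)) over the square ~ 4 * mean(O_i)
# (the 4 is the area of the square, i.e. the normalization of p).
rng = np.random.default_rng(0)
x, y = rng.uniform(-1.0, 1.0, (2, 1_000_000))
o = np.exp(-(x**2 + y**2))
estimate = 4.0 * o.mean()
exact = np.pi * erf(1.0) ** 2  # (integral of e^{-x^2} over [-1, 1])^2
print(estimate, exact)
```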