Lebesgue's Theory of Real Analysis III: Applications in Stochastic Analysis & Monte Carlo Integration

"I have to pay a certain sum, which I have collected in my pocket. I take the bills and coins out of my pocket and give them to the creditor in the order I find them until I have reached the total sum. This is the Riemann integral. But I can proceed differently. After I have taken all the money out of my pocket I order the bills and coins according to identical values and then I pay the several heaps one after the other to the creditor. This is my integral." - Henri L. Lebesgue


Author's Commentary (Hong Kong China, 10/03/2026): I wish I could have published more to demonstrate productivity, but between the hackathon, three tests, one presentation (essentially a test), a leadership training camp, and more, things have been hectic.


3 Applications in Stochastic Analysis: Markov's Inequality, Chebyshev's Inequality, & Monte Carlo Integration

Finally, we want to illustrate the utilitarian aspects of the Lebesgue integral in a fashion that the Riemann integral cannot: through the example of the ubiquitous Monte Carlo integration. Beforehand, we provide a number of prerequisites, beginning with,

Theorem 3.1 (Markov's Inequality) Given a real random variable $X$ and any real $\epsilon > 0$, if $f$ is a real non-negative monotonically increasing function mapping $[0, \infty)$ to $[0, \infty)$, then the following Markov's inequality always holds,

$$P\left( \left| X \right| \geq \epsilon \right) \leq \frac{E \left[ f \left( \left| X \right| \right) \right]}{f(\epsilon)}$$

PROOF. First, for arbitrary real $\epsilon > 0$, consider the following inequality,

$$E[f(|X|)] \geq E \left[ f( \left| X \right| ) \mathbb{1}_{\{ f( \left| X \right| ) \geq f(\epsilon) \}} \right]$$

Where in the above, $\mathbb{1}_{\{ f( \left| X \right| ) \geq f(\epsilon) \}}$ denotes the indicator function of the event $\{ f( \left| X \right| ) \geq f(\epsilon) \}$. Since this event is a measurable subset of the underlying sample space, and since $f$ is a real non-negative function, the above inequality is clearly satisfied. Then, on the event $\{ f( \left| X \right| ) \geq f(\epsilon) \}$, the value $f(\epsilon)$ is by definition a lower bound for $f( \left| X \right| )$, and so,

$$E \left[ f( \left| X \right| ) \mathbb{1}_{\{ f( \left| X \right| ) \geq f(\epsilon) \}} \right] \geq E \left[ f(\epsilon) \mathbb{1}_{\{ f( \left| X \right| ) \geq f(\epsilon) \}} \right]$$

Then, finally, we have,

$$\begin{align} E \left[ f(\epsilon) \mathbb{1}_{\{ f( \left| X \right| ) \geq f(\epsilon) \}} \right] &= \int f(\epsilon) \mathbb{1}_{\{ f( \left| X \right| ) \geq f(\epsilon) \}} dP \\ &= f(\epsilon) P \left( f \left( \left| X \right| \right) \geq f(\epsilon) \right) \\ &\geq f(\epsilon) P \left( \left| X \right| \geq \epsilon \right) \end{align}$$

In the above, the last inequality is justified as $f$ is monotonically increasing, that is, $\left| X \right| \geq \epsilon$ implies $f( \left| X \right| ) \geq f(\epsilon)$, and so $\{ \left| X \right| \geq \epsilon \} \subseteq \{ f( \left| X \right| ) \geq f(\epsilon) \}$. Then, we obtain $E \left[ f \left( \left| X \right| \right) \right] \geq f(\epsilon) P \left( \left| X \right| \geq \epsilon \right)$, and a mere rearranging thus completes the proof. $\square$

We note again that in the above proof, we have identified measurable sets in the sigma-algebra consisting of exactly those outcomes that satisfy a common property, where, in the above case, the property was provided by $f( \left| X \right|) \geq f(\epsilon)$ (refer to the quote at the beginning of this post).
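As a quick numerical sanity check (not part of the original argument), Markov's inequality can be observed by sampling. Here the choice of $f(x) = x^2$, the $\text{Exp}(1)$ distribution, and the sample size are all arbitrary illustrative assumptions:

```python
import random

# Empirical check of Markov's inequality with f(x) = x^2:
# P(|X| >= eps) <= E[f(|X|)] / f(eps).
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]  # X ~ Exp(1), arbitrary choice

eps = 2.0
lhs = sum(1 for x in samples if abs(x) >= eps) / len(samples)  # P(|X| >= eps)
rhs = sum(x * x for x in samples) / len(samples) / eps ** 2    # E[X^2] / eps^2

assert lhs <= rhs
print(f"P(|X| >= {eps}) = {lhs:.4f} <= {rhs:.4f} = E[X^2]/eps^2")
```

For $\text{Exp}(1)$ the exact left-hand side is $e^{-2} \approx 0.135$ while the bound is $E[X^2]/4 = 0.5$, so the inequality holds with room to spare, as Markov bounds typically do.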

Theorem 3.2 (Chebyshev's Inequality) For $X$ and $\epsilon$ as in Theorem 3.1, if $X \in \mathcal{L}^2$ with respect to the underlying probability measure, then the following Chebyshev's inequality always holds,

$$P \left( \left| X - E[X] \right| \geq \epsilon \right) \leq \frac{\text{Var}[X]}{\epsilon^2}$$

PROOF. Chebyshev's inequality is almost an immediate consequence of Markov's inequality: we opt for $f(x) = x^2$ in Markov's inequality, and so we obtain,

$$P\left( \left| X \right| \geq \epsilon \right) \leq \frac{E \left[ \left| X \right|^2 \right]}{\epsilon^2}$$

Finally, it suffices to replace $X$ with $X - E[X]$, and to recall the identity $\text{Var}[X] = E \left[ \left( X - E[X] \right)^2 \right]$. $\square$
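The same kind of empirical sanity check applies to Chebyshev's inequality; the normal distribution, its parameters, and the sample size below are arbitrary illustrative assumptions:

```python
import random

# Empirical check of Chebyshev's inequality:
# P(|X - E[X]| >= eps) <= Var[X] / eps^2.
random.seed(1)
samples = [random.gauss(5.0, 2.0) for _ in range(100_000)]  # X ~ N(5, 2^2), arbitrary choice

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)

eps = 3.0
lhs = sum(1 for x in samples if abs(x - mean) >= eps) / len(samples)
rhs = var / eps ** 2

assert lhs <= rhs
print(f"P(|X - E[X]| >= {eps}) = {lhs:.4f} <= {rhs:.4f} = Var[X]/eps^2")
```

For $N(5, 2^2)$ the exact probability is about $0.134$, against the Chebyshev bound $4/9 \approx 0.444$; the bound is loose but, crucially for the proof below, summable when applied along a geometric subsequence.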

Theorem 3.3 (Strong Law of Large Numbers) Given a sequence of real random variables $\{ X_n \}$ that are pairwise independent, identically distributed, and satisfy $X_n \in \mathcal{L}^2$ for all $n \in \mathbb{N}$, we have that the following is satisfied,

$$P \left[ \limsup_{N \rightarrow \infty} \left| \frac{\sum_{n = 1}^N \left( X_n - E[X_n] \right)}{N} \right| = 0 \right] = 1$$

PROOF. Without loss of generality, we suppose that the random variables $\{ X_n \}$ are non-negative (otherwise, decompose each $X_n$ into its positive and negative parts and treat the two separately). For arbitrary real $\epsilon > 0$, we set $k_n = \lfloor (1 + \epsilon)^n \rfloor$ where $\lfloor \cdot \rfloor$ denotes the floor function, and consider,

$$\sum_{n = 1}^{\infty} P \left[ \left| \frac{X_1 + \dots + X_{k_n}}{k_n} - E \left[ X_1 \right] \right| \geq (1 + \epsilon)^{-\frac{n}{4}} \right]$$

An application of Chebyshev's inequality in Theorem 3.2 then gives an upper bound in,

$$\begin{align} \sum_{n = 1}^{\infty} \left( 1 + \epsilon \right)^{\frac{n}{2}} \text{Var} \left[ \frac{X_1 + \dots + X_{k_n}}{k_n} \right] &= \sum_{n = 1}^{\infty} \left( 1 + \epsilon \right)^{\frac{n}{2}} \frac{1}{k_n^2} \text{Var} \left[ X_1 + \dots + X_{k_n} \right] \\ &= \sum_{n = 1}^{\infty} \left( 1 + \epsilon \right)^{\frac{n}{2}} \frac{1}{k_n} \text{Var} \left[ X_1 \right] \\ &< 2 \sum_{n = 1}^{\infty} \left( 1 + \epsilon \right)^{-\frac{n}{2}} \text{Var} \left[ X_1 \right] \end{align}$$

Where the second equality above is justified as the random variables $\{ X_i \}$ are pairwise independent and therefore uncorrelated, so the variance of the sum is the sum of the variances. Furthermore, since $\sum_{n = 1}^{\infty} \left( 1 + \epsilon \right)^{-\frac{n}{2}}$ is a convergent geometric series, as $(1 + \epsilon)^{-\frac{1}{2}} < 1$, and since $\text{Var}[X_1]$ is finite as $X_1$ belongs to $\mathcal{L}^2$, we have boundedness. That is, we have that,

$$\sum_{n = 1}^{\infty} \left( 1 + \epsilon \right)^{\frac{n}{2}} \text{Var} \left[ \frac{X_1 + \dots + X_{k_n}}{k_n} \right] < \infty$$

Implying, more importantly,

$$\sum_{n = 1}^{\infty} P \left[ \left| \frac{X_1 + \dots + X_{k_n}}{k_n} - E \left[ X_1 \right] \right| \geq (1 + \epsilon)^{-\frac{n}{4}} \right] < \infty$$

By an application of the Borel-Cantelli lemma,$^{[19]}$ we then conclude that, almost surely, only finitely many of the events $\left\{ \left| \frac{X_1 + \dots + X_{k_n}}{k_n} - E \left[ X_1 \right] \right| \geq (1 + \epsilon)^{-\frac{n}{4}} \right\}$ occur. That is, almost surely, there exists some $N \in \mathbb{N}$ such that for all $n > N$, we have,

$$\left| \frac{X_1 + \dots + X_{k_n}}{k_n} - E \left[ X_1 \right] \right| < (1 + \epsilon)^{-\frac{n}{4}}$$

In other words, since $(1 + \epsilon)^{-\frac{n}{4}} \rightarrow 0$ as $n \rightarrow \infty$, we obtain, almost surely,

$$\limsup_{n \rightarrow \infty} \left| \frac{X_1 + \dots + X_{k_n}}{k_n} - E \left[ X_1 \right] \right| = 0$$

Now, given some positive integer $l$, we find some $n$ such that $k_n \leq l \leq k_{n + 1}$, where we also have that $k_{n + 1} = \lfloor (1 + \epsilon)(1 + \epsilon)^{n} \rfloor \leq (1 + \epsilon) \lfloor (1 + \epsilon)^{n} \rfloor + \epsilon \lfloor (1 + \epsilon)^n \rfloor = (1 + 2 \epsilon)k_n$ for sufficiently large $n$. Combining this with the non-negativity of the $\{ X_n \}$ and with $\limsup_{n \rightarrow \infty} \left| \frac{X_1 + \dots + X_{k_n}}{k_n} - E \left[ X_1 \right] \right| = 0$ finally gives,

$$\begin{align} \limsup_{l \rightarrow \infty} \left| \frac{X_1 + \dots + X_l}{l} - E[X_1] \right| & \leq \limsup_{n \rightarrow \infty} \left| \frac{X_1 + \dots + X_{k_{n + 1}}}{k_{n}} - E[X_1] \right| \\ & \leq \limsup_{n \rightarrow \infty} \left| \left( 1 + 2 \epsilon \right) \left( \frac{X_1 + \dots + X_{k_{n + 1}}}{k_{n + 1}} \right) - E[X_1] \right| \\ & \leq \limsup_{n \rightarrow \infty} \left| \frac{X_1 + \dots + X_{k_{n + 1}}}{k_{n + 1}} - E[X_1] \right| + 2 \epsilon \limsup_{n \rightarrow \infty} \left| \frac{X_1 + \dots + X_{k_{n + 1}}}{k_{n + 1}} \right| \\ & \leq 2 \epsilon E[X_1] \end{align}$$

And the above holds almost surely. Since $\epsilon > 0$ was arbitrary, letting $\epsilon \rightarrow 0$ along a countable sequence (say $\epsilon = \frac{1}{m}$ for $m \in \mathbb{N}$) yields the claim. $\square$
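The almost-sure convergence of Theorem 3.3 can also be watched numerically; the uniform distribution and the sample size below are arbitrary illustrative choices:

```python
import random

# Running averages of i.i.d. L^2 random variables converge almost surely to
# the common mean, as in Theorem 3.3 (here uniform on [0, 1], mean 1/2).
random.seed(2)
N = 200_000
total = 0.0
running = []
for n in range(1, N + 1):
    total += random.random()  # X_n ~ Uniform[0, 1], E[X_n] = 1/2
    running.append(total / n)

# The deviation from the mean shrinks as N grows.
assert abs(running[-1] - 0.5) < 0.01
print(f"running average after {N} samples: {running[-1]:.5f}")
```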

As a result, we realize the following as almost an immediate application of the strong law of large numbers stated above, in the form of,

Theorem 3.4 (Monte Carlo Integration) For i.i.d. real random variables $\{ X_n \}$ uniformly distributed on the unit interval $[0, 1]$, and for any measurable real function $f$ on $[0, 1]$ with $f(X_1) \in \mathcal{L}^2$, we have that, almost surely,

$$\lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n = 1}^{N} f(X_n) = \int_{0}^{1} f(x) dx$$

PROOF. This is an immediate consequence of the strong law of large numbers provided in Theorem 3.3, applied to the i.i.d. random variables $\{ f(X_n) \}$, since $E\left[ \frac{1}{N} \sum_{n = 1}^{N}f(X_n) \right] = \frac{1}{N} \sum_{n = 1}^{N}E[f(X_n)] = \frac{1}{N} \left( NE[f(X_1)] \right) = \int_{0}^{1} f(x) dx$, where the last equality holds as $X_1$ is uniformly distributed on $[0, 1]$. That is,

$$P \left[ \limsup_{N \rightarrow \infty} \left| \frac{1}{N} \sum_{n = 1}^N f \left( X_n \right) - \int_{0}^{1} f(x) dx \right| = 0 \right] = 1$$

$\square$
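Theorem 3.4 translates directly into a few lines of code. The following is a minimal sketch: the helper name `monte_carlo_integrate`, the integrand $f(x) = x^2$, and the sample size are illustrative assumptions, not from the original post.

```python
import random

def monte_carlo_integrate(f, n_samples, seed=0):
    """Estimate the integral of f over [0, 1] by averaging f at uniform samples."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n_samples)) / n_samples

# Example: the integral of x^2 over [0, 1] is exactly 1/3.
estimate = monte_carlo_integrate(lambda x: x * x, 100_000)
print(f"estimate = {estimate:.4f}, exact = {1 / 3:.4f}")
```

By the theorem, the estimate converges to the true integral almost surely as the number of samples grows; Chebyshev's inequality further suggests the typical error decays on the order of $1/\sqrt{N}$.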

Thus completes our final result of this blog post. Indeed, in the construction that led to the stochastic sampling technique of Monte Carlo integration, the definition of the Lebesgue integral was utilized in a fashion that cannot be reduced to the Riemann integral, contrary to the case of the previous section. One is reminded here yet again of the restrictions imposed by the Riemann integral: it requires a partitioning of the domain of the integrand into collections of non-intersecting, non-degenerate intervals, taken one after the other. That is, the Riemann integral is not given by linear combinations of function values at individual (degenerate) points of the domain. Even a technique as simple as uniform stochastic sampling of the unit interval rejects such a possibility, as the sets involved are single points of the domain, which are certainly not non-degenerate intervals.

For visual intuition, I have provided a GIF below,

So, with the theory of Lebesgue integration, integration can really become as simple as arithmetic averages.


Footnotes

[19] By an application of the Borel-Cantelli lemma, we formally conclude that,

$$P \left[ \limsup_{n \rightarrow \infty} \left( \left| \frac{X_1 + \dots + X_{k_n}}{k_n} - E \left[ X_1 \right] \right| \geq (1 + \epsilon)^{-\frac{n}{4}} \right) \right] = 0$$

And so, by taking a measure-theoretic interpretation of probability theory, the influences of Borel and Cantelli can be felt.


