Author's Commentary (Hong Kong, China, 04/04/2026): Topics in intelligent computing and artificial intelligence have seriously reshaped my STEM outlook recently, and I am even more excited than before. There are really so many topics, so many things I want to try, including topics related to the theory of optimal transport (i.e., the Wasserstein distance) and more. My shoe brand classification project in a machine learning course is going well, with an unoptimized model reaching 90% validation accuracy on merely augmented data; I am therefore confident about reaching 95%, since many, many methods have not yet been implemented, as I am only experimenting in the early phases with individual models and data processing.
0 Introduction
I still remember coming across a variational problem in a book on computational neurobiology$^{[1]}$, where the problem is to minimize the following expression,
$$E = \frac{1}{T} \int_{0}^{T} \left( r_0 + \int_0^{\infty} D(\tau)s(t - \tau) d\tau - r(t) \right)^2 dt$$
The equation appeared non-trivial, difficult even, and I thought I would need to comb through a number of advanced mathematical texts on variational analysis and the classical calculus of variations: not merely an integral, but the square of an expression containing an integral, itself inside another integral. Yet the solution provided by the authors was straightforward, almost laughably so: reinterpret each integral as a discrete sum. This is essentially a finite difference approach (Riemann sums, in the case of integrals), which allows the derivation of the following,
$$E \approx \frac{\Delta t}{T} \sum_{i = 0}^{\frac{T}{\Delta t}} \left( r_0 + \Delta t \sum_{k = 0}^{\infty} D_k s_{i - k} - r_i \right)^2$$
No longer concerned with integrals, or with an integral of the square of an expression containing another integral, we find that investigating the minimum of the above requires, at the very least, only the methods of multivariable calculus: take the partial derivatives and set them to 0.
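To make this concrete, here is a minimal numpy sketch of the idea (all sizes, the synthetic stimulus, and the "true" kernel below are assumptions for illustration): treat the discretized kernel values $D_k$ as ordinary unknowns, so the discretized $E$ is a quadratic function of them, and setting the partial derivatives $\partial E / \partial D_j = 0$ yields a linear system, with no calculus of variations required.

```python
import numpy as np

# Toy illustration - sizes and data are assumptions, not from the source text.
rng = np.random.default_rng(0)
dt, K, N = 0.1, 20, 200
r0 = 5.0
s = rng.standard_normal(N + K)             # padded stimulus: s[K + i] plays s_i
D_true = np.exp(-np.arange(K + 1) * dt)    # a known kernel, used only to check

# X[i, k] = dt * s_{i-k}, so the model prediction is r0 + X @ D.
X = dt * np.array([s[i:i + K + 1][::-1] for i in range(N)])
r = r0 + X @ D_true                        # synthetic firing rate samples

# dE/dD_j = 0  =>  (X^T X) D = X^T (r - r0): plain multivariable calculus.
D = np.linalg.solve(X.T @ X, X.T @ (r - r0))
```

With noiseless synthetic data the linear system recovers the generating kernel essentially exactly, which is the point of the reduction.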
Indeed, for variational analysis and the calculus of variations, arguably the most straightforward approach is the finite difference approach, possibly followed by the finite element approach (which generalizes to the Rayleigh-Ritz method, where, instead of merely polynomials, one might opt for linear combinations of other kinds of basis functions in appropriate function spaces), especially given the tremendous successes of the third industrial revolution and of scientific computing. Merely discussing finite difference methods, however, seems a little too simple. Thus, we provide more material, on combining the finite element method with finite differences, and on applying an evolutionary algorithm to FEM, to show that rephrasing variational problems as discrete sums may open many new and interesting solution possibilities.
1 The Finite Element Method
The finite element method was not actually used in the introduction; in fact, it was just finite differences. For the finite element method proper, the "candidate functions", those over which the variational problem ranges, must be expressed as a linear combination of functions selected from an appropriate basis of an appropriate function space. Conventionally, the finite element method insists on polynomials.
To avoid ambiguity, we continue to work with the expression from the introduction, and so we consider a trial solution in the form of a linear combination of polynomials; thus, consider (for $c_m$ real coefficients and $\phi_m$ monomials),
$$D(\tau) = \sum_{m = 1}^{M}c_m \phi_m(\tau)$$
For the finite difference approach, we substitute $\tau = k \Delta t$, and so,
$$D(k \Delta t) = \sum_{m = 1}^{M}c_m \phi_m(k \Delta t)$$
Thus,
\begin{align*} \Delta t \sum_{k = 0}^{K} D (k \Delta t) s_{i - k} &= \Delta t \sum_{k = 0}^{K} \sum_{m = 1}^{M}c_m \phi_m(k \Delta t) s_{i - k} \\ &= \sum_{m = 1}^{M} c_m \left( \sum_{k = 0}^K \phi_m(k \Delta t)s_{i - k} \Delta t \right) \\ &= \sum_{m = 1}^{M} c_m g_{i, m} \end{align*}
For simplicity, we have truncated the kernel sum at some finite $K$ and written $g_{i, m} = \sum_{k = 0}^K \phi_m(k \Delta t)s_{i - k} \Delta t$ in the above. We return to the variational problem,
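Note that $g_{i, m}$ is just a discrete convolution of the stimulus samples with the sampled basis functions, so it can be precomputed directly. A minimal numpy sketch, where the sizes and the synthetic stimulus are assumptions for illustration:

```python
import numpy as np

# Illustrative sizes and a synthetic stimulus - all values are assumptions.
dt, K, N, M = 0.01, 100, 500, 3
rng = np.random.default_rng(0)
s = rng.standard_normal(N + K)        # padded so that s_{i-k} always exists

# Monomial basis phi_m(tau) = tau^m, sampled at tau = k * dt.
tau = np.arange(K + 1) * dt
phi = np.stack([tau**m for m in range(1, M + 1)])   # shape (M, K + 1)

# g[i, m-1] = dt * sum_k phi_m(k dt) * s_{i-k}: a discrete convolution.
g = np.empty((N, M))
for i in range(N):
    window = s[i:i + K + 1][::-1]     # s_{i-k} for k = 0..K (padded indexing)
    g[i] = dt * phi @ window
```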
$$E = \frac{1}{T} \int_{0}^{T} \left( r_0 + \int_0^{\infty} D(\tau)s(t - \tau) d\tau - r(t) \right)^2 dt$$
which, instead of the earlier approximation,
$$E \approx \frac{\Delta t}{T} \sum_{i = 0}^{\frac{T}{\Delta t}} \left( r_0 + \Delta t \sum_{k = 0}^{\infty} D_k s_{i - k} - r_i \right)^2$$
now becomes (writing $N = T / \Delta t$ and truncating the kernel sum at $K$),
$$E \approx \frac{\Delta t}{T} \sum_{i = 0}^{N - 1} \left( r_0 + \sum_{m = 1}^{M} c_m g_{i, m} - r_i \right)^2$$
A necessary condition for an extremum is then $\frac{\partial E}{\partial c_n} = 0$ for all appropriate $n$, from which we derive,
\begin{align*} \sum_{i = 0}^{N - 1} \left( r_0 + \sum_{m = 1}^{M} c_m g_{i, m} - r_i \right) g_{i, n} &= 0 \\ \sum_{i = 0}^{N - 1} \sum_{m = 1}^{M}c_m g_{i, m} g_{i, n} &= \sum_{i = 0}^{N - 1} \left( r_i - r_0 \right) g_{i, n} \end{align*}
And so, for $A_{nm} = \sum_{i = 0}^{N - 1}g_{i, m} g_{i, n}$ and $y_n = \sum_{i = 0}^{N - 1}(r_i - r_0) g_{i, n}$, the variational problem has been reduced into the following linear system,
$$\sum_{m = 1}^{M}A_{nm}c_m = y_n$$
And so, the algorithmic solution is simply this: once the discretization is finalized, we precompute $g_{i, m}$, assemble $A_{nm}$ and $y_n$, then solve for $c_m$, giving the approximate solution $D(\tau) = \sum_{m = 1}^{M}c_m \phi_m(\tau)$.
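Putting the whole pipeline together, here is a minimal self-contained sketch; the stimulus, the sizes, and the "true" kernel used to generate the synthetic $r_i$ are all assumptions for illustration:

```python
import numpy as np

# Synthetic setup - every numerical choice here is an assumption.
dt, K, N, M = 0.01, 100, 400, 2
r0 = 5.0
rng = np.random.default_rng(1)
s = rng.standard_normal(N + K)                 # padded stimulus samples

tau = np.arange(K + 1) * dt
phi = np.stack([tau**m for m in range(1, M + 1)])   # monomials tau, tau^2

# Step 1: precompute g[i, m] = dt * sum_k phi_m(k dt) s_{i-k}.
g = np.array([dt * phi @ s[i:i + K + 1][::-1] for i in range(N)])

# Synthetic firing rate from a known kernel D(tau) = 2 tau - 3 tau^2.
c_true = np.array([2.0, -3.0])
r = r0 + g @ c_true

# Step 2: assemble A_{nm} = sum_i g_{i,m} g_{i,n}, y_n = sum_i (r_i - r0) g_{i,n}.
A = g.T @ g
y = g.T @ (r - r0)

# Step 3: solve the linear system for the coefficients c_m.
c = np.linalg.solve(A, y)
```

On noiseless synthetic data the solve recovers the generating coefficients, confirming that the variational problem has indeed been reduced to linear algebra.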
2 An Evolutionary Optimization Algorithm as Applied to FEM
The finite element method proves to be quite powerful, especially in conjunction with finite difference approaches, and we saw previously how such combinations of methods allow one to reduce a problem of variational analysis to one of linear algebra - more precisely, to a problem of the zeroes of sets of polynomials. It is here that many, many methods become available and potentially non-trivial: although the convention is linear algebra and matrix theory, other methods, including those from computational algebraic geometry, may be practiced.
Given my recent self-study of topics in artificial intelligence, an application of an evolutionary optimization algorithm will be demonstrated - thus, an example of intelligent optimization.
In the context of the previous problem as treated, instead of taking partial derivatives, we stop at the following juncture,
$$E \approx \frac{\Delta t}{T} \sum_{i = 0}^{N - 1} \left( r_0 + \sum_{m = 1}^{M} c_m g_{i, m} - r_i \right)^2$$
Of course, at the moment, we cannot really carry out an evolutionary algorithm explicitly, as the problem is still stated generally (for instance, $r_i$ is unspecified). So, simply to demonstrate the principle and potential of evolutionary optimization algorithms, we generate synthetic data, including for the firing rate $r_i$, noting that the firing rate is given approximately by (for $s(\cdot)$ denoting the stimulus),
$$r(t) \approx r_0 + \int_{0}^{\infty} D(\tau) s(t - \tau) d\tau$$
And, indeed, the evolutionary algorithm was completely successful in evolving, qualitatively, the correct approximate kernel, one that allows us to approximate $r$ reasonably (where, recalling the finite element method, the genotypes are given by the coefficients $c_m$). The steps can be quickly described as fitness evaluation, tournament selection, crossover, mutation, and "elitism", repeated over multiple generations. The results are,
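The steps just listed can be sketched as a small evolutionary loop over the coefficient vectors $c_m$ (the genotypes); the population size, rates, synthetic data, and basis choices below are all assumptions for illustration, not the exact implementation used for the figures:

```python
import numpy as np

# Synthetic problem setup, mirroring the FEM discretization - values assumed.
rng = np.random.default_rng(2)
dt, K, N, M = 0.01, 100, 300, 2
r0 = 5.0
s = rng.standard_normal(N + K)
tau = np.arange(K + 1) * dt
phi = np.stack([tau**m for m in range(1, M + 1)])
g = np.array([dt * phi @ s[i:i + K + 1][::-1] for i in range(N)])
r = r0 + g @ np.array([2.0, -3.0])        # "measured" rate from a known kernel

def fitness(c):                           # lower discretized E is fitter
    return np.mean((r0 + g @ c - r) ** 2)

pop = rng.uniform(-5.0, 5.0, size=(60, M))    # random initial genotypes
init_best = min(fitness(c) for c in pop)

for _ in range(200):                          # generations
    errs = np.array([fitness(c) for c in pop])
    elite = pop[np.argmin(errs)].copy()       # elitism: best survives unchanged
    children = []
    for _ in range(len(pop) - 1):
        # Tournament selection: best of 3 random individuals, done twice.
        parents = []
        for _ in range(2):
            idx = rng.integers(len(pop), size=3)
            parents.append(pop[idx[np.argmin(errs[idx])]])
        w = rng.random()                      # blend crossover
        child = w * parents[0] + (1.0 - w) * parents[1]
        child += rng.normal(0.0, 0.1, size=M) # Gaussian mutation
        children.append(child)
    pop = np.vstack([elite] + children)

best = min(pop, key=fitness)
```

Because of elitism, the best fitness is monotonically non-increasing across generations, which is what produces the rapid early convergence described below.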
Yet again, convergence was extremely rapid, and the appropriate qualitative behavior was produced almost immediately. In fact, I had not even provided initial conditions at the endpoints, as revealed by the following graphic of the ancestors,
And, in the graphic below, we see how the population converges (showing two components of the genotype, $c_1$ and $c_2$),
We notice the same phenomenon as in a previous blog post, Machine Learning Mini-Project Series VII: The Neuroevolution Model - that is, this type of evolutionary algorithm (with tournament selection and elitism), with polynomial coefficients as the genotype, is exceptionally good at producing the right qualitative behavior initially, but not particularly good at inching its way towards the solution more finely. Clearly, a combination of methods, such as evolutionary algorithms followed by gradient descent with finer step sizes, makes sense in many contexts.
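As a sketch of that combination (with synthetic stand-in data as assumptions): take a coarse coefficient vector, of the kind an evolutionary algorithm produces, and polish it with plain fixed-step gradient descent on the discretized objective $E$:

```python
import numpy as np

# Stand-ins for the FEM quantities - the data here are synthetic assumptions.
rng = np.random.default_rng(3)
g = 0.05 * rng.standard_normal((300, 2))   # plays the role of g[i, m]
r0 = 5.0
c_true = np.array([2.0, -3.0])
r = r0 + g @ c_true

def grad_E(c):                             # gradient of E = mean(residual^2)
    return 2.0 * g.T @ (r0 + g @ c - r) / len(g)

c = np.array([1.5, -2.0])                  # coarse solution, e.g. from an EA
for _ in range(5000):
    c -= 0.5 * grad_E(c)                   # fine, fixed-step refinement
```

The descent phase supplies exactly the fine-grained convergence the evolutionary phase lacks, since the quadratic objective makes gradient steps contract steadily toward the minimizer.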
Closing Remarks
I cannot help but recall the many mathematical problems I encountered previously, and how they can all be solved, essentially, via combinations of finite difference methods, the finite element method, methods of intelligent computing, and artificial intelligence - this is not ad hoc, not necessarily even "brute force", but a comprehensive system consisting of an entire suite of mathematical methods.
For instance, many methods of artificial intelligence can be interpreted from the perspective of mathematical statistics, or of differential geometry. And evolutionary algorithms can be interpreted, somewhat, from the perspective of schema theory, or of game theory.
I used to think that signals and systems was incredible in the way it essentially provides a systematic means of approaching feature engineering in artificial intelligence; how ironic that many programs in computer science, data science, artificial intelligence, and the like do not provide such education (signals and systems is typically taught to electronic engineering students, and I have enjoyed a significant proportion of my electronic and computer engineering program). How ironic, too, that many of the fundamental methods of artificial intelligence, of machine learning, of neural networks, and even of evolutionary algorithms and swarm intelligence, can be found in topics of signals and systems, and of control systems and control theory, especially in their humble origins.
The finite element method, on the other hand, also allows this thanks to its great generality (the Rayleigh-Ritz method is, of course, more general still) - its extensive generality allows the approximation of solutions to many problems of mathematics and, ultimately, of applied mathematics. In conjunction with such topics of intelligent computing as evolutionary computing, the potential is vast, and I will explore such related topics further.
I have not felt this excited since my appraisal of the intersection of topics related to stochastic analysis, variational analysis, differential geometry, and the theory of optimal transport some time ago, when I was relatively deep in mathematics, isolated from topics of computing and artificial intelligence. All of those topics, it turns out, have incredible potential in intelligent computing and artificial intelligence.
Footnotes
[1] Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems by Dayan and Abbott.
References
Courant, R., & Hilbert, D. (1962). Methods of Mathematical Physics (Vol. 1). Interscience Publishers.
Dayan, P., & Abbott, L. F. (2001). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press.
Gelfand, I. M., & Fomin, S. V. (1963). Calculus of Variations (Rev. English ed.). Prentice-Hall.
Simon, D. (2013). Evolutionary Optimization Algorithms: Biologically-Inspired and Population-Based Approaches to Computer Intelligence. John Wiley & Sons.


