Ensemble randomized maximum likelihood (EnRML) is an iterative (stochastic) ensemble smoother, used for large and nonlinear inverse problems, such as history matching and data assimilation. Its current formulation is overly complicated and has issues with computational costs, noise, and covariance localization, even causing some practitioners to omit crucial prior information. This paper resolves these difficulties and streamlines the algorithm without changing its output. These simplifications are achieved through the careful treatment of the linearizations and subspaces. For example, it is shown (a) how ensemble linearizations relate to average sensitivity and (b) that the ensemble does not lose rank during updates. The paper also draws significantly on the theory of the (deterministic) iterative ensemble Kalman smoother (IEnKS). Comparative benchmarks are obtained with the Lorenz 96 model with these two smoothers and the ensemble smoother with multiple data assimilation (ES-MDA).

Ensemble (Kalman) smoothers are approximate methods used for data assimilation (state estimation in geoscience), history matching (parameter estimation for petroleum reservoirs), and other inverse problems constrained by partial differential equations. Iterative forms of these smoothers, derived from optimization perspectives, have proven useful in improving the estimation accuracy when the forward operator is nonlinear. Ensemble randomized maximum likelihood (EnRML) is one such method.

This paper rectifies several conceptual and computational complications with EnRML,
detailed in Sect.

The Gauss–Newton variant of EnRML
was given by

A Levenberg–Marquardt variant was proposed in the landmark paper of

An approximate version was therefore also proposed in which the prior mismatch term is omitted from the update formula altogether. This is not principled, and it greatly increases the risk of overfitting and of poor prediction skill. Therefore, unless the prior mismatch term is relatively insignificant, overfitting must be prevented by limiting the number of steps or by clever stopping criteria. Nevertheless, this version has received significant attention in history matching.

This paper revises EnRML;
without any of the above tricks,
we formulate the algorithm such that
there is no explicit computation of

The contributions of this paper (listed in the previous paragraph) are original
but draw heavily on the theory of
the IEnKS of

It is informally known that EnRML can be seen as a stochastic flavour of the IEnKS

Another notable difference is that
the IEnKS was developed in the atmospheric literature,
while EnRML was developed in the literature on subsurface flow.
Thus, typically,
the IEnKS is applied to (sequential) state estimation problems
such as filtering for chaotic dynamical systems,
while EnRML is applied to (batch) parameter estimation problems,
such as nonlinear inversion for physical constants and boundary conditions.
For these problems,
EnRML is sometimes referred to as the iterative ensemble smoother (IES).
As shown by

The improvements to the EnRML algorithm herein render it very similar to the IEnKS in computational cost as well. This fully establishes EnRML as the stochastic “counterpart” of the IEnKS. Despite the similarities, the theoretical insights and comparative experiments of this paper should also interest readers already familiar with the IEnKS.

Randomized maximum likelihood

Consider the problem of
estimating the unknown, high-dimensional state (or parameter) vector

In the Bayesian paradigm,
prior information is quantified as a probability density function (PDF)
called the prior, denoted

The prior is assumed Gaussian (normal),
with mean

The observation error,

The Monte Carlo approach offers a
convenient representation of distributions as samples.
Here, the prior is represented by the “prior ensemble”,

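As a concrete illustration, sampling a prior ensemble from an (assumed) Gaussian prior might be sketched as follows; the dimensions, mean, and covariance here are placeholders, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(42)

nx, N = 10, 30           # state dimension and ensemble size (illustrative only)
b = np.zeros(nx)         # assumed prior mean
B = np.eye(nx)           # assumed prior covariance

# Draw the prior ensemble: each column is one member x_j ~ N(b, B).
E = rng.multivariate_normal(b, B, size=N).T   # shape (nx, N)

# The sample mean and anomaly-based sample covariance approximate b and B.
x_bar = E.mean(axis=1)
A = E - x_bar[:, None]            # anomaly matrix
B_hat = A @ A.T / (N - 1)         # ensemble covariance estimate
```

The anomaly matrix has at most N - 1 independent columns, a fact that recurs in the rank discussions later in the paper.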
Then, the

Finally, the (negative) log posteriors are minimized.
Using the Gauss–Newton iterative scheme (for example) requires (Eq.

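For illustration, the following sketch applies plain Gauss–Newton to a toy RML-type cost function (quadratic prior mismatch plus nonlinear data mismatch). The forward model, weights, and all numbers are invented for this example and are not those of the paper.

```python
import numpy as np

# Toy Gauss-Newton minimization of an RML-type cost,
#   J(x) = (x - xp)' Binv (x - xp) + (y - h(x))' Rinv (y - h(x)),
# where xp is a (perturbed) prior sample and y a (perturbed) observation.

def h(x):                         # assumed toy forward model
    return np.array([x[0]**2 + x[1], x[1]**3])

def jac(x):                       # its tangent linear (Jacobian)
    return np.array([[2*x[0], 1.0],
                     [0.0, 3*x[1]**2]])

xp = np.array([0.5, 0.8])         # prior sample
y = np.array([2.0, 1.0])          # observation
Binv = np.eye(2)                  # prior precision
Rinv = 100.0 * np.eye(2)          # observation precision (data weighted strongly)

x = xp.copy()
for _ in range(50):
    Hx = jac(x)
    grad = Binv @ (x - xp) - Hx.T @ Rinv @ (y - h(x))
    hess = Binv + Hx.T @ Rinv @ Hx   # Gauss-Newton Hessian approximation
    x = x - np.linalg.solve(hess, grad)
```

Note that the Gauss–Newton Hessian drops the second-derivative terms of h, which is what makes each step a linear solve.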
Alternatively, by corollaries of the well-known Woodbury matrix identity,
the increments can be written in the “Kalman gain” form:

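The equivalence of the "precision" and "Kalman gain" forms of the increment can be verified numerically. The following sketch uses random (but symmetric positive-definite) stand-ins for the covariances and a random linear observation operator.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, ny = 6, 3
H = rng.standard_normal((ny, nx))    # linear observation operator

# Random symmetric positive-definite covariances.
M = rng.standard_normal((nx, nx))
B = M @ M.T + nx * np.eye(nx)        # prior covariance
M = rng.standard_normal((ny, ny))
R = M @ M.T + ny * np.eye(ny)        # observation-error covariance

# "Precision" form of the gain: (B^-1 + H' R^-1 H)^-1 H' R^-1.
prec = np.linalg.inv(np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H)
K1 = prec @ H.T @ np.linalg.inv(R)

# "Kalman gain" form: B H' (H B H' + R)^-1, equal by the Woodbury identity.
K2 = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)

assert np.allclose(K1, K2)
```

The practical point is that the second form only inverts an ny-by-ny matrix, which is why it is preferred when the observation dimension is smaller than the state dimension.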
EnRML is an approximation of RML in which
the ensemble is used in its own update
by estimating

Computationally, compared to RML, EnRML offers the simultaneous benefits of working with low-rank representations of covariances and not requiring a tangent-linear (or adjoint) model. Both advantages will be further exploited in the new formulation of EnRML.

Concerning their sampling properties,
a few points can be made.
Firstly (due to the ensemble covariance),
EnRML is biased for finite

For convenience, define the concatenations:

Projections sometimes appear
through the use of linear regression.
We therefore recall

Now, denote

Similarly to Sect.

The ensemble estimates of

Note that the linearization (previously

The formula (Eq.

Suppose the ensemble is drawn from a Gaussian.
Then

Note that the computation (Eq.

The prior covariance estimate (previously

The ensemble estimates, Eq. (

In conclusion,
the likelihood increment (Eq.

In contrast to the likelihood increment (Eq.

Denote

Lemma 1 may be proven
by noting that

Moreover, using the ensemble coefficient vector (

Recall the definition of Eq. (

Inserting the regression

The symbol

Conversely,
Eq. (

The ensemble matrix of iteration

Appendix

A well-known result of

Consider applying the change of variables

The derivation summarized in the previous paragraph
is arguably simpler than that of the last few pages.
Notably,
(a) it does not require the Woodbury identity to derive the subspace formulae,
(b) there is never an explicit

While the case of a large ensemble (

The answers lie in understanding the linearization of
the map

Numerical experiments, as in Sect.

In square-root ensemble filters, the transform matrix should have

Numerically,
the use of

Irrespective of the inverse transform formula used,
it is important to retain all non-zero singular values.
This absence of a truncation threshold is a tuning simplification
compared with the old EnRML algorithm,
where
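The "retain all non-zero singular values" rule can be sketched with a plain SVD-based pseudoinverse, where the only cutoff is at machine precision; the anomaly matrix below is a random stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)
nx, N = 8, 5
A = rng.standard_normal((nx, N))
A -= A.mean(axis=1, keepdims=True)   # centred anomalies: rank N - 1

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep every singular value that is nonzero to machine precision,
# rather than truncating at a tunable (e.g. energy-based) threshold.
tol = s.max() * max(A.shape) * np.finfo(float).eps
keep = s > tol
A_pinv = (Vt[keep].T / s[keep]) @ U[:, keep].T   # Moore-Penrose pseudoinverse
```

With this rule there is no truncation parameter to tune; only the exact numerical zeros (here, the one arising from centring) are discarded.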

To summarize, Algorithm 1 provides pseudo-code
for the new EnRML formulation.
The increments

Line 6 is typically computed by solving

The Levenberg–Marquardt variant is obtained by adding the trust-region parameter

Localization may be implemented by local analysis

Inflation and model error parameterizations are not included in the algorithm
but may be applied outside of it.
We refer to

The new EnRML algorithm produces results that are

However, there do not appear to be any studies of EnRML with the
Lorenz 96 system

Comparison of the benchmark performance of EnRML
will be made to the IEnKS
and to the ensemble smoother with multiple data assimilation (ES-MDA)

Note that this is MDA in the sense of

Their Lorenz 96 experiment only concerns the initial conditions.

Their Lorenz 96 experiment seems to have failed completely, with most of the benchmark scores (their Fig. 5) indicating divergence, which makes comparing benchmarks pointless. Moreover, when reproducing their experiment, we obtain considerably lower (i.e. better) scores than they report for the EnKF. One possible explanation is that we include, and tune, inflation.

The performances of the iterative ensemble smoother methods
are benchmarked with “twin experiments”,
using the Lorenz 96 dynamical system,
which is configured with standard settings

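For reference, a minimal implementation of the Lorenz 96 tendencies, together with a fourth-order Runge–Kutta step, might look as follows; the conventional forcing F = 8 is used, while the integrator and step size are generic choices, not necessarily those of the experiments.

```python
import numpy as np

def lorenz96_dxdt(x, F=8.0):
    """Lorenz 96 tendencies: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F,
    with periodic (cyclic) indexing implemented via np.roll."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F=8.0):
    """One fourth-order Runge-Kutta step of the Lorenz 96 system."""
    k1 = lorenz96_dxdt(x, F)
    k2 = lorenz96_dxdt(x + dt/2 * k1, F)
    k3 = lorenz96_dxdt(x + dt/2 * k2, F)
    k4 = lorenz96_dxdt(x + dt * k3, F)
    return x + dt/6 * (k1 + 2*k2 + 2*k3 + k4)
```

A quick sanity check: the uniform state x_i = F is an (unstable) equilibrium, since the advection term vanishes and the damping cancels the forcing.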
The iterative smoothers are employed in the sequential problem of filtering,
aiming to estimate

The methods are assessed by their accuracy,
as measured by the root-mean-square error (RMSE):

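In code, the RMSE of the ensemble mean relative to the truth at a given time is simply:

```python
import numpy as np

def rmse(ens_mean, truth):
    """Root-mean-square error of the ensemble mean relative to the truth,
    averaged over the state components."""
    return np.sqrt(np.mean((ens_mean - truth)**2))
```

In the twin experiments, this instantaneous score is then averaged over the (post-spin-up) analysis times.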
Benchmarks of filtering

A table of RMSE averages is compiled for a range of

For experiments with

The deterministic (square-root) IEnKS and ES-MDA
score noticeably lower RMSE averages than
the stochastic IEnKS (i.e. EnRML) and ES-MDA,
which require

Among the stochastic smoothers,
the one based on Gauss–Newton (EnRML)
scores noticeably lower averages than the one based on annealing (ES-MDA) when the nonlinearity is strong (

Furthermore, the performance of EnRML and IEnKS
could possibly be improved by lowering the step lengths
to avoid causing “unphysical” states
and to avoid “bouncing around” near the optimum.
The tuning of the parameter that controls the step length (e.g. the trust-region parameter or the MDA-inflation parameter)
has been the subject of several studies

This paper has presented a new and simpler (on paper and computationally) formulation of
the iterative, stochastic ensemble smoother known as ensemble randomized maximum likelihood (EnRML).
Notably, there is no explicit computation of the sensitivity matrix

The new EnRML formulation was obtained
by improvements to the background theory and derivation.
Notably,
Theorem 1 established the relation of
the ensemble-estimated, least-squares linear-regression coefficients,

The other focus of the derivation was rank issues,
with

The paper has also drawn significantly on the theory of
the deterministic counterpart to EnRML:
the iterative ensemble Kalman smoother (IEnKS).
Comparative benchmarks using the Lorenz 96 model
with these two and the ensemble smoother with multiple data assimilation (ES-MDA)
were shown in Sect.

As mentioned in the paper, the experimental results may be reproduced with code hosted at

The posterior ensemble's covariance, obtained using the EnKF, has the same rank as the prior's, almost surely (a.s.).
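This rank-preservation property is easy to probe numerically. The following sketch applies a stochastic EnKF update (perturbed observations, linear observation operator) to a random ensemble whose size is smaller than the state dimension, and checks that the anomaly rank is unchanged; all quantities are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
nx, ny, N = 20, 5, 8                         # state dim > ensemble size
E = rng.standard_normal((nx, N))             # prior ensemble
Hm = rng.standard_normal((ny, nx))           # linear observation operator
R = np.eye(ny)                               # observation-error covariance
y = rng.standard_normal(ny)                  # observation

A = E - E.mean(axis=1, keepdims=True)        # prior anomalies (rank N - 1 a.s.)
B_hat = A @ A.T / (N - 1)                    # ensemble prior covariance
K = B_hat @ Hm.T @ np.linalg.inv(Hm @ B_hat @ Hm.T + R)

# Stochastic EnKF update with perturbed observations.
D = y[:, None] + rng.multivariate_normal(np.zeros(ny), R, size=N).T
Ea = E + K @ (D - Hm @ E)

Aa = Ea - Ea.mean(axis=1, keepdims=True)     # posterior anomalies
# Almost surely, the posterior anomalies have the same rank as the prior's.
assert np.linalg.matrix_rank(Aa) == np.linalg.matrix_rank(A)
```

Such trials do not prove the theorem, of course, but they illustrate the claim and can catch implementation bugs that would break it.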

For a deterministic EnKF,

For the stochastic EnKF, Eqs. (

We begin by writing

Now consider, for

A corollary of Theorem 2 and Lemma 1
is that the ensemble subspace is also unchanged by the EnKF update.
Note that both the prior ensemble and the model (involved through

We were not able to prove Conjecture 1,
but it seems a logical extension of Theorem 2
and is supported by numerical trials.
The following proofs utilize Conjecture 1,
without which some projections will not vanish.
Yet, even if Conjecture 1 should not hold
(due to bugs, truncation, or really bad luck),
Algorithm 1 is still valid and optimal,
as discussed in Sect.

The symmetry of

This proof was heavily inspired by appendix A of

Recall that Eq. (

In the case of

Let

Recall from Eq. (

The new and simpler EnRML algorithm was derived by PNR and further developed in consultation with GE (who also developed much of it independently) and ASS. Theorems 1 and 2 were derived by PNR, prompted by discussions with GE, and verified by ASS. The experiments and the rest of the writing were done by PNR and revised by GE and ASS.

The authors declare that they have no conflict of interest.

The authors thank Dean Oliver, Kristian Fossum, Marc Bocquet, and Pavel Sakov for their reading and comments and Elvar Bjarkason for his questions concerning the computation of the inverse transform matrix.

This work has been funded by DIGIRES, a project sponsored by industry partners and the PETROMAKS2 programme of the Research Council of Norway.

This paper was edited by Takemasa Miyoshi and reviewed by Marc Bocquet, Pavel Sakov, and one anonymous referee.