The analysis in nonlinear variational data assimilation is the solution of a
non-quadratic minimization problem. Thus, the quality of the analysis depends
on the ability of the minimization to locate a global minimum of the cost
function. If this minimization uses a Gauss–Newton (GN) method, it is critical
for the starting point to lie in the basin of attraction of a global minimum.
Otherwise, the method may converge to a

We generalize this approach to four-dimensional strong-constraint nonlinear ensemble variational (EnVar) methods, which are based on both a nonlinear variational analysis and the propagation of dynamical error statistics via an ensemble. This forces one to consider the cost function minimizations in the broader context of cycled data assimilation algorithms. We adapt this quasi-static (QS) approach to the iterative ensemble Kalman smoother (IEnKS), an exemplar of nonlinear deterministic four-dimensional EnVar methods. Using low-order models, we quantify the positive impact of the QS approach on the IEnKS, especially for long data assimilation windows. We also examine the computational cost of QS implementations and suggest cheaper algorithms.

Data assimilation (DA) aims at gathering knowledge about the state of a system from acquired observations. In the Bayesian framework, this knowledge is represented by the posterior probability density function (pdf) of the system state given the observations. A specificity of sequential DA is that the observations are not all available at once: they arrive as time goes by. The posterior pdf must therefore be updated regularly.

In order to do so, one usually proceeds in two steps: the analysis and the propagation (or forecast). During the analysis step, a background pdf is used as a prior together with the observation likelihood to construct the (often approximate) posterior pdf, following Bayes' theorem. During the propagation step, this posterior pdf is propagated in time with the model to yield the prior pdf of the next assimilation cycle.

In general, these posterior and prior pdfs are not easily computable. In the Kalman filter, the operators are notably assumed to be linear so that these pdfs remain Gaussian; they are then fully characterized by their mean and covariance matrix. Linear algebra is then sufficient to translate both Bayes' theorem and the propagation step into operations on means and covariances.
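As an illustration, one cycle of this machinery can be sketched as follows. This is a generic sketch: the matrices and the scalar example below are placeholders chosen for illustration, not tied to any system in the paper.

```python
import numpy as np

def kf_cycle(xb, B, y, H, R, M, Q):
    """One Kalman filter cycle: analysis (Bayes' theorem) then propagation.

    All operators are linear (matrices), so the pdfs stay Gaussian and
    only means and covariance matrices need to be updated.
    """
    # Analysis step: combine background and observation via the Kalman gain.
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    xa = xb + K @ (y - H @ xb)
    Pa = (np.eye(len(xb)) - K @ H) @ B
    # Propagation step: the posterior becomes the prior of the next cycle.
    xf = M @ xa
    Pf = M @ Pa @ M.T + Q
    return xa, Pa, xf, Pf

# Scalar example: background N(0, 1), observation y = 2 with unit variance.
xa, Pa, xf, Pf = kf_cycle(
    xb=np.array([0.0]), B=np.eye(1),
    y=np.array([2.0]), H=np.eye(1), R=np.eye(1),
    M=2.0 * np.eye(1), Q=0.1 * np.eye(1),
)
# The analysis mean lands halfway between background and observation,
# and the analysis variance is halved.
```

With equal background and observation variances, the analysis mean is the average of the two sources of information, which is the expected Bayesian compromise.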

However, with nonlinear models, the Kalman filter assumptions do not hold as
the posterior and prior pdfs are not Gaussian anymore. A possibility in this
case is to enforce Gaussianity through approximations. This requires
selecting the mean and covariance matrix of the Gaussian surrogate pdfs.
In the sense of the Kullback–Leibler divergence, the best Gaussian
approximation of a pdf is obtained by matching the mean and covariances

In the 4D-Var algorithm

Unfortunately, the ability of the Gauss–Newton method to locate the global minimum depends on the starting point of the minimization and on the properties of the cost function. Furthermore, missing the global minimum is likely to cause a quick divergence (from the truth) of the sequential DA method. Thus, it is critical for the assimilation algorithm to keep the minimization starting point within a basin of attraction of a global minimum.
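This sensitivity to the starting point can be illustrated on a scalar toy problem (the sinusoidal observation operator below is a hypothetical choice made for illustration; with a single residual and no background term, GN reduces to Newton iterations on the observation equation):

```python
import math

def gauss_newton(x, y, h, dh, iters=25):
    """Gauss-Newton on the scalar least-squares cost J(x) = (y - h(x))**2 / 2.

    With a single residual this reduces to Newton iterations on h(x) = y:
    x <- x + (y - h(x)) / h'(x).
    """
    for _ in range(iters):
        x = x + (y - h(x)) / dh(x)
    return x

# Observation of a sinusoidal operator: y = sin(1) is explained by x = 1,
# but also by x = pi - 1 (and any 2*pi shift of these).
y = math.sin(1.0)
x_good = gauss_newton(0.8, y, math.sin, math.cos)  # starts near x = 1
x_bad = gauss_newton(2.5, y, math.sin, math.cos)   # starts in another basin
# x_good converges to 1.0 while x_bad converges to pi - 1: the iterate ends
# in whichever minimum owns the basin of attraction of the starting point.
```

Both runs converge, but to different minima of the same cost function, which is precisely why the starting point matters in sequential DA.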

Keeping the minimization starting point in a global minimum basin of attraction
is constraining because, with a chaotic model, the number of
local minima may increase exponentially with the data assimilation
window (DAW) time extent

On the one hand, 4D-Var benefits from the QS approach to approximate the
posterior and prior means. On the other hand, with traditional 4D-Var, the
prior covariance matrix is taken as static. This is appropriate when only one
cycle of assimilation is considered. But this limits the dynamical transfer
of error statistics from one cycle to the next, for instance when

By contrast, four-dimensional ensemble variational (EnVar) schemes allow one
to perform both a nonlinear variational analysis and a propagation of
dynamical errors via the ensemble

The iterative ensemble Kalman smoother (IEnKS)

The IEnKS improves the DA cycling by keeping track of the pdfs' mean and covariance matrix. To do this, a Laplace approximation is used to replace the posterior mean and covariance matrix with the minimizer of the cost function and an approximation of the inverse Hessian at the minimizer, respectively. These moments are then used to update the ensemble statistics. The updated ensemble is next propagated to estimate the prior mean and covariance matrix. Hence, it is also critical for the IEnKS to locate the global minimum of the cost function.
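For a linear-Gaussian (hence quadratic) toy cost, the Laplace approximation is exact: the minimizer and the inverse Hessian coincide with the Kalman analysis mean and covariance. A sketch with generic random operators (all sizes and operators below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2
B = np.eye(n)                      # background covariance
H = rng.standard_normal((p, n))    # linear observation operator
R = 0.5 * np.eye(p)                # observation-error covariance
xb = rng.standard_normal(n)        # background mean
y = rng.standard_normal(p)         # observations

# Quadratic cost J(x) = (x-xb)' B^{-1} (x-xb)/2 + (y-Hx)' R^{-1} (y-Hx)/2.
# Laplace approximation: posterior mean = argmin J, covariance = inverse Hessian.
hess = np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H
Pa = np.linalg.inv(hess)                                # inverse Hessian
xa = xb + Pa @ H.T @ np.linalg.inv(R) @ (y - H @ xb)    # minimizer of J

# Cross-check against the Kalman-gain form of the analysis.
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
assert np.allclose(xa, xb + K @ (y - H @ xb))
assert np.allclose(Pa, (np.eye(n) - K @ H) @ B)
```

With a nonlinear observation operator or model, the cost is no longer quadratic and this identification only holds approximately at the minimizer, which is the setting of the IEnKS.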

Here, we are interested in the application of the QS minimization to the
IEnKS. One of the variants of the IEnKS, called the multiple data
assimilation (MDA) IEnKS, was shown

The rest of the paper is organized as follows. In Sect.

We emphasize that the algorithmic developments of this study are not meant to
improve either high-dimensional or imperfect model data assimilation
techniques. Even if

After reviewing the 4D-Var (in a constant background matrix version) and IEnKS algorithms, we investigate the dependency of the assimilation performance on the key parameters of the DAW. This illustrates the improvement in cycling brought by the IEnKS compared to 4D-Var. We will see that, with chaotic models, the longer the DAW, the better the accuracy of these algorithms, which highlights the relevance of QSVA for cycled data assimilation.

The evolution and observation equations of the system are assumed to be in
the following form:

Both 4D-Var and the IEnKS use a variational minimization in their analysis
step. The objective of this minimization is to locate the global maximum of
the posterior pdf

Schematic of a DAW. The state variable at

To specify this posterior pdf, we have to make further assumptions on

The initial state

Following these assumptions, an analytic expression can be obtained for the
posterior smoothing pdf

The propagation corresponds to a time shift of

The 4D-Var cost function at the

The IEnKS

At the

Chaining of the 4D-Var and IEnKS first two cycles
with

In order to evaluate the efficiency of 4D-Var and the IEnKS with DAW
parameters

At the

Strictly speaking, the RMSE depends on the realizations of the random
variables, and is thus itself a random variable. In our numerical experiments,
as is usually done, the RMSE is averaged over the cycles to mitigate this variability:
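A minimal sketch of this cycle-averaged RMSE (the spin-up discard is common practice; the array shapes and the function name are assumptions made for illustration):

```python
import numpy as np

def averaged_rmse(analyses, truths, spinup=0):
    """Average over cycles of the spatial RMSE between analysis and truth.

    analyses, truths: arrays of shape (n_cycles, n_state).
    The first `spinup` cycles are discarded so that transient errors do
    not pollute the average.
    """
    errors = np.asarray(analyses) - np.asarray(truths)
    rmse_per_cycle = np.sqrt(np.mean(errors**2, axis=1))
    return np.mean(rmse_per_cycle[spinup:])

# If the analysis is off by exactly 1 in every component at every cycle,
# each cycle's RMSE is 1 and so is the cycle average.
a = np.ones((100, 40))
t = np.zeros((100, 40))
print(averaged_rmse(a, t, spinup=10))  # -> 1.0
```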

In order to obtain analytical expressions of the eMSE for 4D-Var and the IEnKS, we make drastic simplifying assumptions.

First, the model is assumed to be the resolvent of a linear, diagonal,
autonomous ordinary differential equation. Thus, it can be expressed as

Concerning the IEnKS, the anomalies are assumed to be full rank to avoid any
complication due to singular covariance matrices. Moreover, the linearity of
the model is employed to express the background statistics:

The
fact that the errors lie in the unstable subspace is more general.

In the following, we study the eMSE dependency on the DAW parameters,

The asymptotic smoothing

Specifically, the eMSE expression for the IEnKS is of the form

Concerning 4D-Var, we assume

To characterize the long-term impact of the cycling on the errors, the
filtering eMSE is more instructive than the smoothing eMSE. Indeed, the
smoothing eMSE is improved with

Figure

The asymptotic eMSEs as a function of the

these curves are similar to those of

In this section, we studied the accuracy of both the

The results of Sect.

However, with a chaotic model,

This behavior will be illustrated with the Lorenz 95 (L95) model
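The L95 equations are standard; a minimal sketch of the model tendencies and a fourth-order Runge–Kutta step, with the usual forcing F = 8 and 40 variables, could look as follows:

```python
import numpy as np

def l95_tendency(x, forcing=8.0):
    """Lorenz 95 tendencies dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F,
    with cyclic boundary conditions."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt=0.05):
    """One fourth-order Runge-Kutta step (dt = 0.05 is the usual choice)."""
    k1 = l95_tendency(x)
    k2 = l95_tendency(x + 0.5 * dt * k1)
    k3 = l95_tendency(x + 0.5 * dt * k2)
    k4 = l95_tendency(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Sanity check: the uniform state x_i = F is an (unstable) fixed point,
# since (F - F) * F - F + F = 0, so the tendencies vanish there.
x = 8.0 * np.ones(40)
assert np.allclose(l95_tendency(x), 0.0)
```

Perturbing this uniform state by a small amount and integrating forward is the usual way to reach the chaotic attractor before starting experiments.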

Figure

Cost functions of the IEnKS projected in one
direction of the analyzed ensemble (hence centered and normalized) with
various DAW parameters

The curves have more and more local extrema as

This hilly shape causes minimization problems. A possible minimization
procedure for the IEnKS is the Gauss–Newton (GN) algorithm

First, the GN convergence properties are drastically simplified. We assume the method converges to the global minimum if, and only if, the minimization starting point is in a neighborhood of the global minimizer where the IEnKS cost function is almost quadratic.

Unfortunately, this minimizer is unknown because the cost function depends on
realizations of many random variables. In order to eliminate this
variability,

In the univariate case, if the model behavior is almost linear and unstable,
we can use Eq. (

To apply this inequality to the L95 model we choose the following:

Figure

Smoothing aRMSE

Figure

However, for small values of

We have seen in the previous section that the effective DAW length is constrained by the cost function non-quadraticity. In this section we review and propose algorithms able to overcome these minimization issues and reach longer DAWs.
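The QS idea, assimilating the observations of the window one at a time and restarting each local minimization from the previous minimizer, can be illustrated on a scalar toy problem. All numerical settings below (the sinusoidal observation operator, the background, the error variances, the plain gradient-descent inner minimizer) are hypothetical choices for illustration, not the paper's configuration.

```python
import math

# Truth x_t = 1, background x_b = 1.3, observations y_k = sin(k * x_t).
# The full cost has many local minima because sin(K * x) oscillates fast,
# mimicking the non-quadraticity brought by a long DAW.
x_t, x_b, r = 1.0, 1.3, 0.01
obs = [math.sin(k * x_t) for k in range(1, 7)]

def grad(x, n_obs):
    """Gradient of J(x) = (x - x_b)^2 / 2 + sum_k (y_k - sin(kx))^2 / (2r)."""
    g = x - x_b
    for k in range(1, n_obs + 1):
        g += (math.sin(k * x) - obs[k - 1]) * k * math.cos(k * x) / r
    return g

def local_min(x, n_obs, step=1e-4, iters=20000):
    """Plain gradient descent: a stand-in for any local minimizer."""
    for _ in range(iters):
        x -= step * grad(x, n_obs)
    return x

# Quasi-static minimization: add the observations gradually, reminimizing
# from the previous minimizer each time.
x = x_b
for n_obs in range(1, 7):
    x = local_min(x, n_obs)
# x stays in the global minimum's basin and ends very close to the truth.
```

Minimizing the full six-observation cost directly from x_b would risk landing in one of the spurious local minima; the gradual deformation of the cost function keeps each restart inside the right basin.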

Quasi-static algorithms were introduced by

Schematics of an IEnKS

This procedure is directly applicable to the IEnKS cost function
minimization. The left panel in Fig.

The first three lines initialize the minimization starting point, the
ensemble mean and anomaly matrix. The

To cycle the scheme, the DAW is then shifted with a small

To keep this cost function deformation statistically consistent,

An alternative is to keep

The success of the QS minimization lies in the fact that, when an observation
is successfully assimilated, the eMSE is reduced. The analysis probability
mass thus concentrates around the true state, and the analysis is then more
likely to lie in a neighborhood of the true state where the model is linear.
The non-quadraticity of the cost function can then be safely increased by
adding a new term to it. This is confirmed by the following argument: let

Unfortunately, these QS minimizations are very expensive. Indeed, they add a
third outer loop repeating

In the following, we perform numerical experiments with the Lorenz 1963 (L63)
and Lorenz 1995 (L95) models. L95 has already been presented in
Sect.

Both models are assumed perfect. The truth run is generated from a random
state space point. The initial ensemble is generated from the truth with

The IEnKS parameters are

Unlike

Figures

IEnKS (lower triangles) and
IEnKS

IEnKS (lower triangles) and
IEnKS

Figure

IEnKS-MDA (

IEnKS

Figure

In this paper, we have extended the study of

The long-term impact of cycling was first investigated theoretically in a
linear context for 4D-Var and the IEnKS, then numerically for the IEnKS in a
nonlinear context. The way information is propagated between data
assimilation cycles indeed explains the difference between
4D-Var and the IEnKS. Both reveal performance improvements with the DAW
parameter

However, it is observed that this improvement has a limit in the chaotic,
perfect model case. The cost function global minimum basin of attraction
appears to shrink with increasing

Quasi-static minimizations lead slowly but surely to the global minimum by
repeated cost function minimizations. As the DAW length

Unfortunately, this method (IEnKS

We did not focus on the applicability of the methods to high-dimensional and
imperfect models. In particular, we considered very long DAWs which, although
of mathematical interest and relevant for reliable low-order models, are less
relevant for significantly noisy models. However, we know from

No data sets were used in this article.

The objective of this Appendix is to establish a recurrence relation between the 4D-Var eMSEs of successive cycles. From this relation we will derive an expression for the asymptotic 4D-Var eMSE.

We assume

The objective of this appendix is to establish a recurrence relation between the IEnKS eMSEs of successive cycles. From this relation we will derive an expression for the asymptotic IEnKS eMSE.

First, it is proven by induction that for all

The conditional variance Eq. (

Let us now show that the IEnKS eMSE is optimal. Let

In the multivariate, diagonal case, the algebra can be carried out in each direction independently. Thus, the eMSE in this case is the sum of the univariate eMSEs of each direction.

The IEnKS averaged cost function

where

The authors declare that they have no conflict of interest.

This article is part of the special issue “Numerical modeling, predictability and data assimilation in weather, ocean and climate: A special issue honoring the legacy of Anna Trevisan (1946–2016)”. It is not associated with a conference.

The authors are grateful to Selime Gürol for her comments and suggestions on the manuscript. The authors are also grateful to two reviewers, Carlos Pires and an anonymous reviewer, for raising very interesting questions and for their suggestions, and to Alberto Carrassi, acting as editor, for his remarks and suggestions. CEREA is a member of Institut Pierre-Simon Laplace (IPSL).

Edited by: Alberto Carrassi
Reviewed by: Carlos Pires and one anonymous referee