The ensemble Kalman filter (EnKF) is a powerful data assimilation method
designed for high-dimensional nonlinear systems. However, its implementation
requires somewhat ad hoc procedures such as localization and inflation. The
recently developed

The ensemble Kalman filter (EnKF) has become a popular data
assimilation method for high-dimensional geophysical systems

However, to perform satisfactorily, the EnKF may require the use of inflation
and/or localization, depending on the data assimilation system setup.
Localization is required in the rank-deficient regime, in which the limited
size of the ensemble leads to an empirical error covariance matrix of overly
small rank, as is often the case in realistic high-dimensional systems

Inflation is a complementary technique meant to increase the variances
diagnosed by the EnKF
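As a toy illustration of both issues (our own sketch, not from the paper): a small ensemble yields a sample covariance of rank at most N-1, which localization must remedy, while multiplicative inflation merely rescales the diagnosed variances without restoring rank.

```python
import numpy as np

rng = np.random.default_rng(0)

n, N = 40, 10                         # state dimension, ensemble size (N << n)

# Draw a small ensemble (members as columns) and form the sample covariance.
E = rng.standard_normal((n, N))
X = E - E.mean(axis=1, keepdims=True)     # centered anomalies
B_hat = X @ X.T / (N - 1)                 # empirical error covariance

# The sample covariance is rank-deficient: rank at most N - 1, far below n.
print(np.linalg.matrix_rank(B_hat))       # 9, i.e. N - 1

# Multiplicative inflation rescales the anomalies by lam > 1; it increases
# the diagnosed variances but does not change the rank.
lam = 1.05
B_infl = (lam * X) @ (lam * X).T / (N - 1)
print(np.trace(B_infl) / np.trace(B_hat)) # lam**2 ≈ 1.1025
print(np.linalg.matrix_rank(B_infl))      # still 9
```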

A variant of the EnKF, called the finite-size ensemble Kalman filter
(EnKF-N), has been introduced in

The EnKF-N is derived by assuming that the ensemble members are drawn from
the same distribution as the truth, but makes no further assumptions about
the ensemble's accuracy. In particular, the EnKF-N, unlike the traditional
EnKFs, does not make the approximation that the sample first- and
second-order moments coincide with the actual moments of the prior (which
would be accessible if the ensemble size

Through its mathematical derivation, the scheme highlights the information
that is missing beyond the observations and the ensemble forecast, an issue
that is ignored by traditional EnKFs. This missing information is explicitly
compensated for in the EnKF-N using a so-called

The objective of this paper is to clarify several of those choices, to answer several questions raised in the above references, and to advocate the use of improved or new hyperpriors. This should add to the theoretical understanding of the EnKF, but also provide a useful algorithm. Specifically, the EnKF-N allows the development of data assimilation systems under perfect-model conditions without worrying about tuning the inflation. Throughout this paper, we restrict ourselves to perfect-model conditions.

In Sect.

The key ideas of the EnKF-N are presented and clarified in this section. Additional insights into the scheme and why it is successful are also given.

The EnKF-N prior accounts for the uncertainty in

Schematic of the traditional standpoint on the analysis of the EnKF (top row), what it actually does using a Gaussian prior sampled from three particles (middle row), and using a predictive prior accounting for the uncertainty due to sampling (bottom row). The solid green lines represent the Gaussian observation error pdfs, and the dashed blue lines represent the Gaussian/predictive priors, whether known, estimated from an ensemble, or obtained from a marginalization over multiple potential error statistics. The dotted red curves are the resulting analysis pdfs.

If one subscribes to this EnKF-N view of the EnKF, it follows that additional information is required in the EnKF beyond the observations and the prior ensemble, which are potentially insufficient to make an inference.

A simple choice was made in Boc11 for the hyperprior: the Jeffreys prior is
an analytically tractable and uninformative hyperprior of the form
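The referenced expression is elided above. For context, the joint Jeffreys prior for an unknown mean $\mathbf{x}_b$ and covariance matrix $\mathbf{B}$ in a state space of dimension $n$ is conventionally written as (a standard textbook form, assumed here rather than quoted from Boc11):

```latex
p(\mathbf{x}_b, \mathbf{B}) \propto |\mathbf{B}|^{-\frac{n+2}{2}} .
```

It is improper but invariant under reparameterization, which is what makes it both uninformative and analytically tractable.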

With a given hyperprior, the marginalization over

Using Jeffreys' hyperprior, Boc11 showed that the integral can be obtained
analytically and that the predictive prior is a multivariate T distribution:

This non-Gaussian prior distribution can be seen as an average over Gaussian
distributions weighted according to the hyperprior. It can be shown that
Eq. (
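This Gaussian scale-mixture construction can be checked numerically (our own sketch): averaging a zero-mean Gaussian over variances drawn from an inverse-gamma hyperprior with shape and scale $\nu/2$ reproduces a Student-t density with $\nu$ degrees of freedom.

```python
import numpy as np
from math import gamma, pi, sqrt

nu = 5.0                 # degrees of freedom of the resulting T distribution
a = b = nu / 2.0         # inverse-gamma hyperprior parameters yielding t_nu

v = np.linspace(1e-4, 80.0, 400000)   # grid of candidate variances

def inv_gamma_pdf(v):
    return b**a / gamma(a) * v**(-a - 1.0) * np.exp(-b / v)

def mixture_pdf(x):
    """Average the Gaussian N(x; 0, v) over variances v from the hyperprior."""
    f = np.exp(-x**2 / (2.0 * v)) / np.sqrt(2.0 * pi * v) * inv_gamma_pdf(v)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(v)))  # trapezoid rule

def student_t_pdf(x):
    return gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2)) \
        * (1.0 + x**2 / nu) ** (-(nu + 1) / 2)

# The mixture coincides with the heavy-tailed Student-t density.
for x in (0.0, 1.0, 3.0):
    assert abs(mixture_pdf(x) - student_t_pdf(x)) < 1e-3
```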

In comparison, the traditional EnKF implicitly assumes that the hyperprior is

Consider a given analysis step of the data assimilation cycle. The
observation vector is denoted

Let us recall and further discuss the analysis step of the EnKF-N for state
estimation. For the sake of simplicity, the observational error distribution
is assumed Gaussian, unbiased, with covariance matrix

For reference, with these notations, the cost function of the ensemble
transform Kalman filter
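The referenced cost function is elided above. In the usual ensemble-transform notation (with $\bar{\mathbf{x}}$ the forecast ensemble mean, $\mathbf{X}$ the matrix of anomalies, $\mathbf{w} \in \mathbb{R}^N$ the coordinates in ensemble space, $\mathbf{y}$ the observations, $\mathcal{H}$ the observation operator, and $\mathbf{R}$ the observation error covariance matrix), the ETKF cost function in ensemble space is commonly written as:

```latex
J(\mathbf{w}) = \frac{1}{2}
\left(\mathbf{y} - \mathcal{H}(\bar{\mathbf{x}} + \mathbf{X}\mathbf{w})\right)^\top
\mathbf{R}^{-1}
\left(\mathbf{y} - \mathcal{H}(\bar{\mathbf{x}} + \mathbf{X}\mathbf{w})\right)
+ \frac{N-1}{2}\,\mathbf{w}^\top \mathbf{w} .
```

This is the standard form found in the ensemble-transform literature; the notation used in the elided equation may differ.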

A cost function in state space could be rigorously derived from the prior
Eq. (

From a probabilistic standpoint, the logarithm of the determinant of the
Jacobian matrix should be added to the cost function since

Let us denote

Hence, we wish to fix the gauge while preserving the initial perturbations as
much as possible. To do so, the definition of the prior pdfs on

The use of these extended pdfs in the analysis is justified by the fact that
the Bayesian analysis pdf

As opposed to the Gaussian case, the form of pdf Eq. (

The form of the predictive prior also has important consequences for the
EnKF-N theory. First of all, the pdfs Eqs. (

Conditioned on

Hence, the analysis

One candidate Gaussian that does not involve integrating over the hyperprior
is the Laplace approximation of the posterior

Once a candidate Gaussian for the posterior has been obtained, the updated
ensemble of the EnKF-N is obtained from the Hessian, just as in the ETKF. The
updated ensemble is
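As a sketch of this step in the linear Gaussian case (our own illustration, not the EnKF-N update itself): the ETKF-style symmetric square-root transform built from the Hessian in ensemble space reproduces the Kalman analysis covariance whenever the sample covariance is full rank.

```python
import numpy as np

def etkf_update(X, Y, R_inv):
    """Symmetric square-root (ETKF-style) anomaly update.

    X : (n, N) forecast anomalies; Y : (d, N) observation-space anomalies H X.
    Returns X_a = X T with T = sqrt(N-1) * [(N-1) I + Y^T R^{-1} Y]^{-1/2},
    where the bracket is the ensemble-space Hessian.
    """
    N = X.shape[1]
    M = (N - 1) * np.eye(N) + Y.T @ R_inv @ Y
    vals, vecs = np.linalg.eigh(M)               # M is symmetric positive definite
    T = vecs @ np.diag(vals**-0.5) @ vecs.T * np.sqrt(N - 1)
    return X @ T

# Check against the Kalman analysis covariance (requires B invertible).
rng = np.random.default_rng(1)
n, N, d = 2, 4, 3
E = rng.standard_normal((n, N))
X = E - E.mean(axis=1, keepdims=True)
H = rng.standard_normal((d, n))
R_inv = np.eye(d)

Xa = etkf_update(X, H @ X, R_inv)
Pa_ens = Xa @ Xa.T / (N - 1)
B = X @ X.T / (N - 1)
Pa_kf = np.linalg.inv(np.linalg.inv(B) + H.T @ R_inv @ H)
assert np.allclose(Pa_ens, Pa_kf)
```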

Boc11 showed that the functional Eq. (

Equation (

Since the dual EnKF-N is meant to be equivalent to the primal EnKF-N when the
observation operator is linear, the updated ensemble should actually be based
on Eq. (

The co-dependence of the radial and angular degrees of freedom exposed by the
dual cost function is further explored in Appendix

We have discussed and amended the analysis step of the EnKF-N. To complete the data assimilation cycle, the ensemble must be forecasted between analyses. The cycling of the EnKF-N can be summarized by the following diagram:
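The elided diagram aside, the cycle alternates a forecast of every member with the (perfect) model and an EnKF-N analysis. A minimal sketch, with `model` and `analysis` as hypothetical stand-ins for the forecast model and the analysis step described above:

```python
def cycle(E, observations, model, analysis):
    """One pass of the forecast/analysis cycle over a stream of observation
    batches; E holds the current ensemble."""
    for y in observations:
        E = model(E)         # forecast every member with the (perfect) model
        E = analysis(E, y)   # analysis step (here, the EnKF-N)
    return E

# Smoke test with trivial identity stand-ins:
out = cycle([1.0, 2.0], [None, None], lambda E: E, lambda E, y: E)
assert out == [1.0, 2.0]
```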

In accounting for sampling error, the EnKF-N framework differs quite
significantly from that of

Assume that an ensemble square root Kalman filter is applied to linear
forecast and observation models, and further assume that the ensemble is large
enough to span the unstable and neutral subspace. In this case, it was shown
that neither inflation nor localization is necessary to regularize the error
covariance matrix

A recent study by

To picture the impact of inflation on the fully cycled EnKF, let us consider
the simplest possible, one-variable, perfect, linear model

In spite of its simplicity and its linearity, this model makes the link
between the EnKF-N, multiplicative inflation and the dynamics.
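The specific scalar model is elided above. As a sketch under the assumption that it has the form $x_{k+1} = \mu x_k$, observed at every step with error variance $r$, the perfect-model Riccati recursion and the effect of a multiplicative inflation factor $\lambda$ can be checked directly:

```python
def steady_analysis_variance(mu, r, lam=1.0, n_iter=1000):
    """Iterate the scalar Riccati recursion for a perfect linear model
    x_{k+1} = mu * x_k observed every step with error variance r:
    forecast p_f = (lam * mu)**2 * p_a, analysis 1/p_a = 1/p_f + 1/r."""
    p = 1.0
    for _ in range(n_iter):
        p_f = (lam * mu)**2 * p
        p = 1.0 / (1.0 / p_f + 1.0 / r)
    return p

# For an unstable model (mu > 1) the analysis variance settles at
# r * (1 - 1/mu**2); for mu <= 1 it collapses to zero, which is where
# sampling error makes inflation necessary in practice.
mu, r = 1.05, 1.0
p = steady_analysis_variance(mu, r)
assert abs(p - r * (1.0 - mu**-2)) < 1e-9

# Multiplicative inflation lam > 1 acts exactly like extra model growth,
# which is the link between inflation and the dynamics.
assert steady_analysis_variance(1.0, r, lam=1.02) == steady_analysis_variance(1.02, r)
```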

Twin experiments using a perfect model and the EnKF-N have been carried out
on several low-order models in previous studies. In many cases the EnKF-N, or
its variant with localization (using domain localization), was reported to
perform as well on the Lorenz-63 and Lorenz-95 models as the ETKF with
optimally tuned uniform inflation. With a two-dimensional forced turbulence
model, driven by the barotropic vorticity advection equation, it was found to
perform almost as well as the ETKF with optimally tuned uniform inflation

Analysis error variance when applying sequential data assimilation
to

The choice of

Average analysis RMSE for the primal EnKF-N, the dual EnKF-N, the approximate EnKF-N, and the EnKF with uniform optimally tuned inflation, applied to the Lorenz-95 model, as a function of the time step between updates. The finite-size EnKFs are based on Jeffreys' hyperprior.

Figure

The performance of the primal and dual EnKF-N is indistinguishable for the
full

Similar experiments have been conducted with the Lorenz-63 model

The additional numerical cost of using the finite-size formalism based on
Jeffreys' hyperprior is now compared to the analysis step of an ensemble
Kalman filter or of an ensemble Kalman smoother based on the
ensemble-transform formulation. The computational cost depends on the type of
method. Let us first discuss non-iterative methods, such as the ETKF or a
smoother based on the ETKF. If the singular value decomposition (SVD) of

Moreover, it is important to notice that the perturbations update as given by
Eq. (

Let us finally mention that no significant additional storage cost is required by the scheme.

The EnKF-N based on the Jeffreys hyperprior was found to fail in the limit
where the system is almost linear but remains nonlinear (BS12). This regime
is rarely explored with low-order models, but it is likely to be encountered
in less homogeneous, more realistic applications.
Figure

In this regime, the EnKF-N has great confidence in the prior, as any filter
would. Therefore, the innovation-driven term becomes less important than
the prior term

Average analysis RMSE for the EnKF-N with Jeffreys' hyperprior, with
the EnKF-N based on the Dirac–Jeffreys hyperprior, with the EnKF-N based on
the Jeffreys hyperprior but enforcing schemes R1 or R2, and the EnKF with
uniform optimally tuned inflation, applied to the Lorenz-95 model, as a
function of the time step between updates (top), and as a function of the
forcing

More generally, we believe this problem will be encountered whenever the prior largely dominates the analysis (prior-driven regime). This is bound to happen when the observations are too few and too sparsely distributed, which can occur when using domain localization, and whenever they are unreliable compared to the prior. Quasi-linear dynamics also fit this description, since the ratio of the observation precision to the prior precision becomes small after a few iterations.

This failure may not be due to the EnKF-N framework. It may be due to an
inappropriate choice of candidate Gaussian posterior as described in
Sec.

Here, deflation is avoided by capping

Even with the Dirac–Jeffreys hyperprior, it is still necessary to introduce
a tiny amount of inflation through

In the limit of

The performance of the Dirac–Jeffreys EnKF-N where we choose

Another way to make a data assimilation system based on the Lorenz-95 model
more linear, rather than decreasing

The spread of the ensemble for the Dirac–Jeffreys EnKF-N has also been
plotted in Fig.

So far, the EnKF-N has relied on a noninformative hyperprior. In this section
we examine, mostly at a formal level, the possibility of accounting for
additional, possibly independent, information on the error statistics, like a
hybrid EnKF–3D-Var is meant to

In a perfect model context, we observed that uncertainty in the variances
usually addressed by inflation could be taken care of by the EnKF-N based on
Jeffreys' hyperprior. However, it does not take care of the correlation (as
opposed to variance) and rank-deficiency issues, which are usually addressed
by localization. Localization has to be superimposed onto the finite-size
scheme to build a local EnKF-N without the intrinsic need for inflation

An informative hyperprior is the normal-inverse Wishart (NIW) pdf:
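The pdf itself is elided above. For context, the normal-inverse Wishart distribution for a mean $\mathbf{x}_b$ and covariance $\mathbf{B}$ in dimension $n$ is conventionally written (with hyperparameters $\mathbf{x}_0$, $\kappa$, scale matrix $\boldsymbol{\Psi}$, and degrees of freedom $\nu$; this is the standard form, and the paper's parameterization may differ):

```latex
p(\mathbf{x}_b, \mathbf{B})
\propto |\mathbf{B}|^{-\frac{\nu + n + 2}{2}}
\exp\left(
 -\frac{1}{2}\mathrm{tr}\!\left(\boldsymbol{\Psi}\mathbf{B}^{-1}\right)
 -\frac{\kappa}{2}(\mathbf{x}_b - \mathbf{x}_0)^\top \mathbf{B}^{-1} (\mathbf{x}_b - \mathbf{x}_0)
\right) .
```

It is the conjugate prior of the multivariate Gaussian with unknown mean and covariance, which is what keeps the marginalization tractable.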

The resulting predictive prior can be deduced from

A subclass of hyperpriors is obtained when the degree of freedom

Because the scale matrix

The primal analysis in state space is obtained from

For the dual analysis, we further assume that the observation operator

Recall that the square root ensemble update corresponding to
Eq. (

Assuming

Let

Here, however, we wish to obtain a similar left-transform but for the NIW
EnKF-N. The Hessian of the primal cost function Eq. (

The state space formulation of the analysis enables covariance localization,
which was not possible in ensemble space. To regularize

An alternative is to use the

Average analysis RMSE as a function of

Moreover, the above derivation suggests the following perturbation update
needed to complete the NIW EnKF-N scheme:

Here we wish to illustrate the use of the EnKF-N based on the IW hyperprior.
We consider again the same numerical setup as in Sect.

This is a preliminary experiment. In particular, we do not perform any
optimization of

In this article, we have
revisited the finite-size ensemble Kalman filter, or EnKF-N. The scheme
offers a Bayesian hierarchical framework to account for the uncertainty in
the forecast error covariance matrix of the EnKF that is inferred from a
limited-size ensemble. We have discussed, introduced additional arguments
for, and sometimes improved several of the key steps of the EnKF-N
derivation. Our main findings are the following.

A proper account of the gauge degrees of freedom in the redundant ensemble
of perturbations and the resulting analysis led to a small but important
modification of the ensemble transform-based EnKF-N analysis cost function
(

Consequently, the marginal posterior distribution of the system state is a Cauchy distribution, which is proper but does not have first- and second-order moments. Hence, only the maximum a posteriori estimator is unambiguously defined. Moreover, this suggests that the Laplace approximation should be used to estimate the full posterior.

The modification

The connection to dynamics has been clarified. It had already been assumed that the EnKF-N compensates for the nonlinear deformation of the ensemble in the forecast step. This conjecture was substantiated here by arguing that the effect of the nonlinearities is similar to sampling error, thus explaining why multiplicative inflation, and the EnKF-N in particular, can compensate for it.

The ensemble update of the dual EnKF-N was amended to offer a perfect equivalence with the primal EnKF-N. It was shown that the additional term in the posterior error covariance matrix accounts for the error co-dependence between the angular and the radial degrees of freedom. However, this correction barely affected the numerical experiments we tested it with.

The EnKF-N based on Jeffreys' hyperprior led to unsatisfying performance in the limit where the analysis is largely driven by the prior, especially in the regime where the model is almost (but not) linear. We proposed two new types of schemes that rectify the hyperprior. These schemes have been successfully tested on low-order models, meaning that the performance of the EnKF-N becomes as good as the ensemble square root Kalman filter with optimally tuned inflation in all the tested dynamical regimes.

As originally mentioned in Boc11, the EnKF-N offers a
broad framework to craft variants of the EnKF with alternative hyperpriors.
Inflation was shown to be addressed by a noninformative hyperprior, whereas a
localization seems to require an informative hyperprior. Here, we showed that
choosing the informative normal-inverse Wishart distribution as a hyperprior
for

With the corrections and new interpretations of the EnKF-N based on Jeffreys' hyperprior, we have obtained a practical and robust tool that can be used in perfect-model EnKF experiments under a wide range of conditions without the burden of tuning the multiplicative inflation. This has saved us considerable computational time in recently published methodological studies.

An EnKF-N based on an informative hyperprior, the normal-inverse Wishart
distribution, has been described and its equations derived. We plan to
evaluate it thoroughly in extensive numerical experiments. Several optional
uses of the method are contemplated. Hyperparameters

The EnKF-N is not designed to handle model error, which is critical for realistic applications. Other adaptive inflation techniques currently in operation would be more robust in such contexts. We are working on a consistent merging of the finite-size approach that accounts for sampling errors and of a multiplicative inflation scheme designed to account for model error.

Section

Here we wish to interpret the contributions in the Hessian
Eq. (

We are grateful to two anonymous reviewers and to the editor, Zoltan Toth, for their valuable and helpful suggestions to improve this paper. This study is a contribution to the INSU/LEFE project DAVE.