Hybrid Levenberg–Marquardt and weak-constraint ensemble Kalman smoother method

J. Mandel (University of Colorado Denver, Denver, CO, USA), E. Bergou (INRA, MaIAGE, Université Paris-Saclay, Jouy-en-Josas, France), S. Gürol (CERFACS, Toulouse, France), S. Gratton (INP-ENSEEIHT, Toulouse, France), and I. Kasanický (Institute of Computer Science, The Czech Academy of Sciences, Prague, Czech Republic)

Nonlin. Processes Geophys., 23, 59–73, 2016, doi:10.5194/npg-23-59-2016

Abstract
The ensemble Kalman smoother (EnKS) is used as a linear least-squares solver
in the Gauss–Newton method for the large nonlinear least-squares system in
incremental 4DVAR. The ensemble approach is naturally parallel over the
ensemble members and no tangent or adjoint operators are needed. Furthermore,
adding a regularization term results in replacing the Gauss–Newton method,
which may diverge, by the Levenberg–Marquardt method, which is known to be
convergent. The regularization is implemented efficiently as an additional
observation in the EnKS. The method is illustrated on the Lorenz 63 model and
a two-level quasi-geostrophic model.
1 Introduction
Four-dimensional variational data assimilation (4DVAR) is
a dominant data assimilation method used in weather forecasting centers
worldwide. 4DVAR attempts to reconcile model and data variationally, by
solving a large weighted nonlinear least-squares problem. The unknown is
a vector of system states over discrete points in time, when the data are
given. The objective function minimized is the sum of the squares of the
differences of the initial state from a known background state at the initial
time and the differences of the values of the observation operator and the
data at every given time point. In the weak-constraint 4DVAR,
considered here, the model error is accounted for
by allowing the ending and starting states of the model at every given time
point to be different, and also adding to the objective function the sums of
the squares of those differences. The sums of the squares are weighted by the
inverses of the appropriate error covariance matrices, and much of the work
in the applications of 4DVAR goes into modeling those covariance matrices.
In the incremental approach (Courtier et al., 1994), the nonlinear
least-squares problem is solved iteratively by solving a succession of
linearized least-squares problems. The major cost in 4DVAR iterations is
evaluating the model, the tangent and adjoint operators, and solving the
large linear least squares. A significant software development effort is
needed for the additional code to implement the tangent and adjoint operators
to the model and the observation operators. Straightforward linearization
leads to the Gauss–Newton method for nonlinear least squares
(Gill and Murray, 1978). Gauss–Newton iterations are not
guaranteed to converge, not even locally, though a careful design of an
application system may avoid divergence in practice. Finally, while the
evaluation of the model operator is typically parallelized on modern computer
architectures, there is a need to further parallelize the 4DVAR process
itself.
The Kalman filter is a sequential Bayesian estimation of the Gaussian state
of a linear system at a sequence of discrete time points. At each of the time
points, the use of the Bayes theorem results in an update of the state,
represented by its mean and covariance. The Kalman smoother considers all
states within an assimilation time window to be a large composite state.
Consequently, the Kalman smoother can be obtained from the Kalman filter by
simply applying the same update as in the filter to the past states as well.
However, historically, the focus was on efficient short recursions
(e.g., Rauch et al., 1965), similarly as in the Kalman filter.
It is well known that weak-constraint 4DVAR is equivalent to the Kalman
smoother in the linear case and when all observations are in the assimilation
window (Fisher et al., 2005). Use of the Kalman smoother to solve the linear
least squares in the Gauss–Newton method is known as the iterated Kalman
smoother, and considerable improvements can be obtained against running the
Kalman smoother only once (Bell, 1994).
The Kalman filter and smoother require maintenance of the covariance of the
state, which is not feasible for large systems, such as in numerical weather
prediction. Hence, the ensemble Kalman filter (EnKF) and ensemble Kalman
smoother (EnKS) use a Monte
Carlo approach for large systems, representing the state by an
ensemble of simulations and estimating the state covariance from the
ensemble. Some implementations of the EnKS use the
adjoint model explicitly, with the short recursions and a forward and
backward pass, as in the classical Kalman smoother. However, the EnKS can
also be implemented without the adjoint model, by simply applying EnKF
algorithms to the composite state over multiple time points (e.g., Evensen,
2009). Such composite variables are also called 4-D vectors. We use the
latter approach in the computations reported here.
In this paper, we use the EnKS as a linear least-squares solver in 4DVAR. The
EnKS is implemented in the physical space and with randomization. The
ensemble approach is naturally parallel over the ensemble members. The rest
of the computational work is relatively cheap compared to the ensemble of
simulations, and parallel dense linear algebra libraries can be used;
however, in high-dimensional systems or for a large lag, the storage
requirements can be prohibitive (e.g., Khare et al., 2008). The proposed
approach uses finite differences from the ensemble, and no tangent or adjoint
operators are needed. To stabilize the method and ensure convergence,
a Tikhonov regularization term is added to the linear least squares, and the
Gauss–Newton method becomes the Levenberg–Marquardt method
(Levenberg, 1944; Marquardt, 1963). The Tikhonov regularization is
implemented within the EnKS as an independent observation, following
Johns and Mandel (2008), in a computationally cheap additional analysis step,
which is statistically correct because the smoother operates only on the
linearized problem. A new probabilistic ensemble is generated in every
iteration, so the minimization is not restricted to the combinations of a
single ensemble. We use finite differences from the ensemble mean towards the
ensemble members to linearize the model and observation operators. The
iterations can be proven to converge to incremental 4DVAR iterations for
small finite difference steps and large ensemble sizes
(Bergou et al., 2014). Thus, in the limit, the method performs actual
minimization of the weak-constraint objective function and inherits the
advantages of 4DVAR in handling nonlinear problems. We call the resulting
method EnKS-4DVAR.
Combinations of ensemble and variational approaches have been of considerable
recent interest. Estimating the background covariance for 4DVAR from an
ensemble was one of the first connections (Hamill and Snyder, 2000). It is
now standard and has become operational.
Zhang et al. (2009) use a two-way connection between EnKF and 4DVAR to
obtain the covariance for 4DVAR, and 4DVAR to feed the mean analysis into
EnKF. EnKF is operational at the National Centers for Environmental
Prediction (NCEP) as part of its Global Forecast System Hybrid Variational
Ensemble Data Assimilation System (GDAS), together with the Gridpoint
Statistical Interpolation (GSI) variational data assimilation system
(Developmental Testbed Center, 2015; Wang, 2010).
The first methods that use ensembles for more than computing the covariance
minimized the 3DVAR objective function in the analysis step.
The maximum likelihood ensemble filter (MLEF) method of Zupanski (2005)
works in the ensemble space, i.e., minimizing in the
span of the ensemble members, with the control variables being the
coefficients of a linear combination of the ensemble members.
Gu and Oliver (2007) use an iterated ensemble Kalman filter (with
randomization) in the state space, with a linearization of the observation
operator obtained by a regression on the increments given by the ensemble.
This approach was extended by Chen and Oliver (2013) to a Levenberg–Marquardt
method, with the regularization done by a multiplicative inflation of the
covariance in the linearized problem rather than by adding a Tikhonov
regularization term. Liu et al. (2008, 2009) and Liu and Xiao (2013)
minimize the (strong-constraint) 4DVAR objective
function over linear combinations of the ensemble by computations in the
observation space.
The iterated ensemble Kalman filter of Sakov et al. (2012), called IEnKF,
minimizes the lag-one 4DVAR objective function in the
ensemble space, using the square root EnKF as a linear solver in the
Gauss–Newton method, and rescaling the ensemble to approximate the tangent
operators, which is similar to the use of finite differences and EnKS here.
Bocquet and Sakov (2012) combined the IEnKF method of Sakov et al. (2012)
with an inflation-free approach to obtain a 4-D ensemble variational method,
and with the Levenberg–Marquardt method by adding a diagonal regularization
to the Hessian. In these works, Levenberg–Marquardt was used for faster
convergence, as an adaptive method between
the steepest descent and the Gauss–Newton method, rather than to overcome
divergence. Bocquet and Sakov (2012) also considered scaling the ensemble to
approximate the tangent operators (the "bundle variant"). Bocquet and
Sakov (2014) extended IEnKF to a smoother
(IEnKS) with fixed lag and moving window and noted that Gauss–Newton can be
replaced by Levenberg–Marquardt. The method is formulated in terms of the
composite model operator, i.e., with strong constraints.
Bocquet and Sakov (2013) developed the method further, including cycling
and joint state and parameter estimation. It has been noted that various
optimizers could be used in IEnKF/IEnKS; the present method can be understood
as the EnKS used as such an optimizer.
It is well known that for good practical performance, ensemble methods need
to be modified by localization to reduce the sampling error. Ensemble
methods can be localized in multiple ways (e.g., Sakov and Bertino, 2011).
For methods operating in the physical space, localization can be achieved,
e.g., by tapering of the covariance matrix (Furrer and Bengtsson, 2007) or
by replacing the sample covariance by its diagonal in a spectral space
(Kasanický et al., 2015). This is not completely straightforward for the
EnKS, but implementations of the EnKS based on the Bryson–Frazier version of
the classical formulation of the Kalman smoother, with a forward and backward
pass, are more flexible (Butala, 2012). Methods in the ensemble
space can be modified to update only nodes in a neighborhood of the
observation (e.g., Ott et al., 2004). The 4DEnVar method of
Desroziers et al. (2014) uses ensemble-derived background covariance, and
the authors propose several methods to solve the linearized problem in each
iteration by combinations of ensemble members with the weights allowed to
vary spatially. Lorenc et al. (2014) compare hybrid 4DEnVar and hybrid
4DVAR for operational weather forecasts. "Hybrid" refers to a combination
of a fixed climatological model of the background error covariances and
localized covariances obtained from ensembles.
The paper is organized as follows. In Sect. 2, we review the
formulation of 4DVAR. The EnKF and the EnKS are reviewed in
Sect. 3. The proposed method is described in Sect. 4.
Section 5 contains the results of the computational experiments,
and Sect. 6 is the conclusion.
2 Incremental 4DVAR
For vectors $u_i$, $i = 0, \dots, L$, denote the composite (column) 4-D vector
$$u_{0:L} = \begin{bmatrix} u_0 \\ \vdots \\ u_L \end{bmatrix}, \tag{1}$$
where L is the number of cycles in the assimilation window. We want to
estimate $x_0, \dots, x_L$, where $x_i$ is the state
at time $i$, from the background state
$x_0 \approx x^b$, the model
$x_i \approx \mathcal{M}_i(x_{i-1})$, and the
observations $\mathcal{H}_i(x_i) \approx y_i$,
where $\mathcal{M}_i$ is the model operator and $\mathcal{H}_i$ is the
observation operator. Quantifying the uncertainty by covariances, with
$x_0 \approx x^b$ taken as $(x_0 - x^b)^T B^{-1} (x_0 - x^b) \approx 0$, etc., we get the
nonlinear least-squares problem
$$\|x_0 - x^b\|_{B^{-1}}^2 + \sum_{i=1}^{L} \|x_i - \mathcal{M}_i(x_{i-1})\|_{Q_i^{-1}}^2 + \sum_{i=1}^{L} \|y_i - \mathcal{H}_i(x_i)\|_{R_i^{-1}}^2 \to \min_{x_{0:L}}, \tag{2}$$
called weak-constraint 4DVAR (Trémolet, 2007). Originally, in 4DVAR,
$x_i = \mathcal{M}_i(x_{i-1})$; the weak-constraint form
$x_i \approx \mathcal{M}_i(x_{i-1})$ accounts for
model error.
The least-squares problem (2) is solved iteratively by linearization,
$$\mathcal{M}_i(x_{i-1} + \delta x_{i-1}) \approx \mathcal{M}_i(x_{i-1}) + \mathcal{M}_i'(x_{i-1})\,\delta x_{i-1}, \qquad \mathcal{H}_i(x_i + \delta x_i) \approx \mathcal{H}_i(x_i) + \mathcal{H}_i'(x_i)\,\delta x_i. \tag{3}$$
In each iteration $x_{0:L} \leftarrow x_{0:L} + \delta x_{0:L}$, one solves the auxiliary linear least-squares problem for the increments $\delta x_{0:L}$,
$$\|x_0 + \delta x_0 - x^b\|_{B^{-1}}^2 + \sum_{i=1}^{L} \|x_i + \delta x_i - \mathcal{M}_i(x_{i-1}) - \mathcal{M}_i'(x_{i-1})\,\delta x_{i-1}\|_{Q_i^{-1}}^2 + \sum_{i=1}^{L} \|y_i - \mathcal{H}_i(x_i) - \mathcal{H}_i'(x_i)\,\delta x_i\|_{R_i^{-1}}^2 \to \min_{\delta x_{0:L}}. \tag{4}$$
This is the Gauss–Newton method for
nonlinear least squares, known in 4DVAR as the incremental approach
(Courtier et al., 1994). Write the auxiliary linear least-squares problem
(4) for $\delta x_{0:L}$ as
$$\|\delta x_0 - \delta x^b\|_{B^{-1}}^2 + \sum_{i=1}^{L} \|\delta x_i - M_i\,\delta x_{i-1} - m_i\|_{Q_i^{-1}}^2 + \sum_{i=1}^{L} \|d_i - H_i\,\delta x_i\|_{R_i^{-1}}^2 \to \min_{\delta x_{0:L}}, \tag{5}$$
where
$$\delta x^b = x^b - x_0, \quad m_i = \mathcal{M}_i(x_{i-1}) - x_i, \quad d_i = y_i - \mathcal{H}_i(x_i), \quad M_i = \mathcal{M}_i'(x_{i-1}), \quad H_i = \mathcal{H}_i'(x_i). \tag{6}$$
The function minimized in Eq. (5) is the same as the one
minimized in the Kalman smoother (Fisher et al., 2005).
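To make the structure of Eq. (2) concrete, the following sketch evaluates the weak-constraint objective for given states; it is a minimal illustration, assuming small dense covariances and hypothetical argument conventions (lists of states, operators and covariances indexed by cycle), not the operational implementation.

```python
import numpy as np

def weighted_sq(v, C):
    """Squared weighted norm ||v||^2_{C^{-1}} = v^T C^{-1} v for a dense covariance C."""
    return float(v @ np.linalg.solve(C, v))

def weak_constraint_cost(x, xb, B, M, Q, H, R, y):
    """Evaluate the weak-constraint 4DVAR objective, Eq. (2).

    x : list of states x_0,...,x_L; M[i], H[i] : nonlinear model and observation
    operators for cycle i; Q[i], R[i] : model and observation error covariances;
    y[i] : observations. Cycle indices 1..L are used, matching the paper.
    """
    J = weighted_sq(x[0] - xb, B)                       # background term
    for i in range(1, len(x)):
        J += weighted_sq(x[i] - M[i](x[i - 1]), Q[i])   # model error term
        J += weighted_sq(y[i] - H[i](x[i]), R[i])       # observation term
    return J
```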
3 Ensemble Kalman filter and smoother
We present the EnKF and EnKS algorithms, essentially following
Evensen (2009), in a form suitable for our purposes. We start with
a formulation of the EnKF, in a notation useful for the extension to the EnKS.
The notation $v^\ell \sim N(m, A)$ means
that $v^\ell$ is sampled from the Gaussian distribution
$N(m, A)$ with mean $m$ and covariance
$A$, independently of anything else. The ensemble of states of the
linearized model at time $i$, conditioned on data up to time $j$ (that is,
with the data up to time $j$ already ingested), is denoted by
$X_{i|j}^N = [x_{i|j}^1, \dots, x_{i|j}^N] = [x_{i|j}^\ell]$, where the ensemble member index $\ell$ always
runs over $\ell = 1, \dots, N$, and similarly for other ensembles. Assume for
the moment that the observation operator $\mathcal{H}_i$ is linear; that
is, $\mathcal{H}_i(u) = H_i u$. The
EnKF algorithm consists of the following steps.
Initialize
$$x_{0|0}^\ell \sim N(x^b, B), \quad \ell = 1, \dots, N. \tag{7}$$
For $i = 1, 2, \dots, L$,
advance in time:
$$x_{i|i-1}^\ell = \mathcal{M}_i(x_{i-1|i-1}^\ell) + v_i^\ell, \quad v_i^\ell \sim N(0, Q_i). \tag{8}$$
The analysis step is
$$x_{i|i}^\ell = x_{i|i-1}^\ell - P_{i,i}^N H_i^T \left(H_i P_{i,i}^N H_i^T + R_i\right)^{-1} \left(H_i x_{i|i-1}^\ell - d_i - w_i^\ell\right), \quad w_i^\ell \sim N(0, R_i), \tag{9}$$
where $P_{i,i}^N$ is the sample covariance computed from the ensemble $X_{i|i-1}^N$.
Denote by $A_i^N$ the matrix of anomalies of the ensemble $X_{i|i-1}^N$,
$$A_i^N = \left[a_i^1, \dots, a_i^N\right] = \left[x_{i|i-1}^1 - \bar{x}_{i|i-1}, \dots, x_{i|i-1}^N - \bar{x}_{i|i-1}\right], \quad \bar{x}_{i|i-1} = \frac{1}{N} \sum_{j=1}^{N} x_{i|i-1}^j. \tag{10}$$
Then
$$P_{i,i}^N = \frac{1}{N-1} A_i^N \left(A_i^N\right)^T, \tag{11}$$
and we can write the matrices in Eq. (9) as
$$P_{i,i}^N H_i^T = \frac{1}{N-1} A_i^N \left(H_i A_i^N\right)^T \quad \text{and} \quad H_i P_{i,i}^N H_i^T = \frac{1}{N-1} H_i A_i^N \left(H_i A_i^N\right)^T. \tag{12}$$
In particular, the matrix $H_i$ is used here only in the matrix–vector multiplications
$$g_i^\ell = H_i a_i^\ell = H_i \left(x_{i|i-1}^\ell - \bar{x}_{i|i-1}\right) = H_i x_{i|i-1}^\ell - \frac{1}{N} \sum_{j=1}^{N} H_i x_{i|i-1}^j, \tag{13}$$
which allows the matrix–vector multiplication to be replaced by the use of
a possibly nonlinear observation operator $\mathcal{H}_i$ evaluated on the
ensemble members only (Eq. 23 below). This technique is
commonly used for nonlinear observation operators. With
$H_i A_i^N = G_i^N = [g_i^1, \dots, g_i^N]$, Eq. (12) becomes
$$P_{i,i}^N H_i^T = \frac{1}{N-1} A_i^N \left(G_i^N\right)^T, \quad H_i P_{i,i}^N H_i^T = \frac{1}{N-1} G_i^N \left(G_i^N\right)^T. \tag{14}$$
Also, from Eqs. (9) and (14), and writing the matrix of anomalies in the form
$$A_i^N = X_{i|i-1}^N \left(I - \frac{\mathbf{1}\mathbf{1}^T}{N}\right), \tag{15}$$
where $\mathbf{1}$ is the column vector of all ones of length $N$, it
follows that the analysis ensemble $X_{i|i}^N$ consists of linear
combinations of the forecast ensemble. Hence, it can be written as
multiplying the forecast ensemble by a suitable transformation matrix $T_i^N$,
$$X_{i|i}^N = X_{i|i-1}^N T_i^N, \quad T_i^N \in \mathbb{R}^{N \times N}, \tag{16}$$
where
$$T_i^N = I - \frac{1}{N-1} \left(I - \frac{\mathbf{1}\mathbf{1}^T}{N}\right) \left(G_i^N\right)^T \left(\frac{1}{N-1} G_i^N \left(G_i^N\right)^T + R_i\right)^{-1} \left[H_i x_{i|i-1}^\ell - d_i - w_i^\ell\right]_{\ell=1,\dots,N}. \tag{17}$$
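The development above, Eqs. (9)–(14), translates directly into a few lines of linear algebra. The following Python sketch of the perturbed-observation analysis step is an illustration only; the array shapes, the member-wise application of a possibly nonlinear observation operator, and a fresh NumPy random generator are our assumptions.

```python
import numpy as np

def enkf_analysis(X, H, d, R, rng):
    """Perturbed-observation EnKF analysis step, Eqs. (9)-(14).

    X : n-by-N forecast ensemble; H : observation operator acting on a single
    state vector (possibly nonlinear); d : data vector; R : data covariance.
    """
    n, N = X.shape
    HX = np.column_stack([H(X[:, l]) for l in range(N)])  # operator applied memberwise
    A = X - X.mean(axis=1, keepdims=True)                 # state anomalies, Eq. (10)
    G = HX - HX.mean(axis=1, keepdims=True)               # observation anomalies, Eq. (13)
    W = rng.multivariate_normal(np.zeros(len(d)), R, size=N).T  # data perturbations
    PHT = A @ G.T / (N - 1)                               # P H^T, Eq. (14)
    HPHT = G @ G.T / (N - 1)                              # H P H^T, Eq. (14)
    K = PHT @ np.linalg.inv(HPHT + R)
    return X - K @ (HX - d[:, None] - W)                  # analysis ensemble, Eq. (9)
```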
The EnKS is obtained by applying the same analysis step as in the EnKF
(Eq. 9) to the ensemble $X_{0:i|i-1}^N$ of 4-D
composite states from time 0 to $i$, conditioned on data up to time $i-1$,
$$X_{0:i|i-1}^N = \begin{bmatrix} X_{0|i-1}^N \\ \vdots \\ X_{i|i-1}^N \end{bmatrix}, \tag{18}$$
in the place of $X_{i|i-1}^N$, with the observation matrix
$\tilde{H}_{0:i} = [0, \dots, 0, H_i]$. Then,
Eq. (9) becomes
$$x_{0:i|i}^\ell = x_{0:i|i-1}^\ell - P_{0:i,0:i}^N \tilde{H}_{0:i}^T \left(\tilde{H}_{0:i} P_{0:i,0:i}^N \tilde{H}_{0:i}^T + R_i\right)^{-1} \left(\tilde{H}_{0:i} x_{0:i|i-1}^\ell - d_i - w_i^\ell\right), \tag{19}$$
where $P_{0:i,0:i}^N$ is the sample covariance matrix of
$X_{0:i|i-1}^N$. Fortunately, the matrix–vector and
matrix–matrix products can be simplified:
$$\tilde{H}_{0:i} x_{0:i|i-1}^\ell = H_i x_{i|i-1}^\ell, \quad P_{0:i,0:i}^N \tilde{H}_{0:i}^T = P_{0:i,i}^N H_i^T, \quad \tilde{H}_{0:i} P_{0:i,0:i}^N \tilde{H}_{0:i}^T = H_i P_{i,i}^N H_i^T, \tag{20}$$
where the last expression is the same as in Eq. (12). Also using
Eq. (13), we obtain the EnKS algorithm.
Initialize:
$$x_{0|0}^\ell \sim N(x^b, B), \quad \ell = 1, \dots, N. \tag{21}$$
For $i = 1, \dots, L$:
advance in time:
$$x_{i|i-1}^\ell = \mathcal{M}_i(x_{i-1|i-1}^\ell) + v_i^\ell, \quad v_i^\ell \sim N(0, Q_i), \quad \ell = 1, \dots, N. \tag{22}$$
Compute the anomalies of the ensemble in the state space and in the observation space:
$$A_{0:i}^N = \left[a_{0:i}^1, \dots, a_{0:i}^N\right], \quad a_{0:i}^\ell = x_{0:i|i-1}^\ell - \frac{1}{N} \sum_{j=1}^{N} x_{0:i|i-1}^j, \qquad G_i^N = \left[g_i^1, \dots, g_i^N\right], \quad g_i^\ell = \mathcal{H}_i(x_{i|i-1}^\ell) - \frac{1}{N} \sum_{j=1}^{N} \mathcal{H}_i(x_{i|i-1}^j). \tag{23}$$
The analysis step:
$$x_{0:i|i}^\ell = x_{0:i|i-1}^\ell - \frac{1}{N-1} A_{0:i}^N \left(G_i^N\right)^T \left(\frac{1}{N-1} G_i^N \left(G_i^N\right)^T + R_i\right)^{-1} \left(\mathcal{H}_i(x_{i|i-1}^\ell) - d_i - w_i^\ell\right), \quad w_i^\ell \sim N(0, R_i), \quad \ell = 1, \dots, N. \tag{24}$$
Comparing Eqs. (9) and (19), we see
that the EnKS can be implemented in a straightforward manner by applying the
same transformation as in the EnKF to the composite 4-D state vector from
times 0 to $i$, $X_{0:i|i}^N = X_{0:i|i-1}^N T_i^N$, where $T_i^N$ is the transformation matrix
in Eq. (17).
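A sketch of the EnKS analysis step (24), updating the whole composite ensemble with anomalies taken at the current time; again a minimal dense-matrix illustration under assumed shapes, not the paper's code.

```python
import numpy as np

def enks_analysis(X_comp, X_cur, H, d, R, rng):
    """EnKS analysis step, Eq. (24): the whole composite ensemble is updated
    using anomalies of the current-time ensemble in observation space.

    X_comp : ((i+1)*n)-by-N composite ensemble of states from time 0 to i;
    X_cur  : n-by-N block of X_comp at the current time i.
    """
    N = X_comp.shape[1]
    HX = np.column_stack([H(X_cur[:, l]) for l in range(N)])
    A = X_comp - X_comp.mean(axis=1, keepdims=True)       # 4-D anomalies A_{0:i}
    G = HX - HX.mean(axis=1, keepdims=True)               # observation anomalies G_i
    W = rng.multivariate_normal(np.zeros(len(d)), R, size=N).T
    K = A @ G.T @ np.linalg.inv(G @ G.T + (N - 1) * R)    # equals (1/(N-1)) A G^T (...)^{-1}
    return X_comp - K @ (HX - d[:, None] - W)
```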
4 EnKS-4DVAR
We apply the EnKS algorithm (Eqs. 21–24)
with the increments $\delta x$ in place of
$x$ to solve the linearized auxiliary least-squares problem
(Eq. 5). Approximating by finite differences based at
$x_{i-1}$ with step $\tau > 0$, we get the action of the linearized model operator,
$$M_i\,\delta x_{i-1}^\ell + m_i \approx \frac{\mathcal{M}_i(x_{i-1} + \tau\,\delta x_{i-1}^\ell) - \mathcal{M}_i(x_{i-1})}{\tau} + \mathcal{M}_i(x_{i-1}) - x_i, \tag{25}$$
and of the linearized observation operator,
$$H_i\,\delta x_i^\ell \approx \frac{\mathcal{H}_i(x_i + \tau\,\delta x_i^\ell) - \mathcal{H}_i(x_i)}{\tau}. \tag{26}$$
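In code, the finite differences (25) and (26) amount to two extra evaluations of the nonlinear operators per member; a short sketch (function names are hypothetical):

```python
def fd_model_increment(M_i, x_prev, x_i, dx_prev, tau):
    """Approximate M_i dx_{i-1} + m_i by the finite difference in Eq. (25)."""
    return (M_i(x_prev + tau * dx_prev) - M_i(x_prev)) / tau + M_i(x_prev) - x_i

def fd_obs_increment(H_i, x_i, dx_i, tau):
    """Approximate H_i dx_i by the finite difference in Eq. (26)."""
    return (H_i(x_i + tau * dx_i) - H_i(x_i)) / tau
```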
The Gauss–Newton method may diverge, but convergence to a stationary point
of Eq. (2) can be recovered by a control of the step
$\delta x$. Adding a constraint of the form $\|\delta x_i\| \le \varepsilon$ leads to globally convergent trust region
methods. Here, we add to Eq. (5)
a Tikhonov regularization term of the form $\gamma \|\delta x_i\|_{S_i^{-1}}^2$, which controls the step size as
well as rotates the step direction towards the steepest descent, and obtain
the Levenberg–Marquardt method $x_{0:L} \leftarrow x_{0:L} + \delta x_{0:L}$, where
$$\|\delta x_0 - \delta x^b\|_{B^{-1}}^2 + \sum_{i=1}^{L} \|\delta x_i - M_i\,\delta x_{i-1} - m_i\|_{Q_i^{-1}}^2 + \sum_{i=1}^{L} \|d_i - H_i\,\delta x_i\|_{R_i^{-1}}^2 + \gamma \sum_{i=0}^{L} \|\delta x_i\|_{S_i^{-1}}^2 \to \min_{\delta x_{0:L}}. \tag{27}$$
Under suitable technical assumptions, the Levenberg–Marquardt method is
guaranteed to converge globally if the regularization parameter $\gamma \ge 0$
is large enough (Osborne, 1976). Estimates for the
convergence of the Levenberg–Marquardt method in the case when the linear
system is solved only approximately also exist (Wright and Holt, 1985).
Similarly as in Johns and Mandel (2008), we interpret the regularization term
$\gamma \|\delta x_i\|_{S_i^{-1}}^2$
in Eq. (27) as arising from additional independent observations
$\delta x_i \approx 0$ with covariance
$\gamma^{-1} S_i$. The independent observations can be assimilated
separately, resulting in a mathematically equivalent but often more efficient
two-stage method: simply run the EnKF analysis twice. With the choice of
$S_i$ as an identity or, more generally, a diagonal matrix, the
implementation of these large observations can be made efficient.
We use the notation $\delta x_{0:i|i-1/2}^\ell$ for the increments after the first half-step,
conditioned on the original observations only, and $\delta x_{0:i|i}^\ell$ for the increments conditioned also on the regularization
observation $\delta x_i \approx 0$. Note that, unlike in
Johns and Mandel (2008), where the regularization was applied to a nonlinear
problem and thus the sequential data assimilation was only approximate, here
the EnKS is run on the auxiliary linearized problem, so all distributions are
Gaussian and the equivalence of assimilating the observations at the same
time and sequentially is statistically exact.
We obtain the following algorithm EnKS-4DVAR for Eq. (27).
Initialize
$$x_0 = x^b, \quad x_i = \mathcal{M}_i(x_{i-1}), \quad i = 1, \dots, L,$$
if not given already.
Incremental 4DVAR iteration (Eq. 27): given $x_0, \dots, x_L$, initialize the ensemble of increments
$$\delta x_{0|0}^\ell \sim N(0, B), \quad \ell = 1, \dots, N.$$
For $i = 1, \dots, L$:
Advance the ensemble of increments $\delta x^\ell$ in time
following Eq. (22), with the linearized operator
approximated from Eq. (25),
$$\delta x_{i|i-1}^\ell = \frac{\mathcal{M}_i(x_{i-1} + \tau\,\delta x_{i-1|i-1}^\ell) - \mathcal{M}_i(x_{i-1})}{\tau} + \mathcal{M}_i(x_{i-1}) - x_i + v_i^\ell, \quad v_i^\ell \sim N(0, Q_i), \quad \ell = 1, \dots, N. \tag{28}$$
Compute the anomalies of the ensemble in the 4-D state space and in the observation space:
$$A_{0:i}^N = \left[a_{0:i}^1, \dots, a_{0:i}^N\right], \quad a_{0:i}^\ell = \delta x_{0:i|i-1}^\ell - \frac{1}{N} \sum_{j=1}^{N} \delta x_{0:i|i-1}^j, \qquad G_i^N = \left[g_i^1, \dots, g_i^N\right], \quad g_i^\ell = \frac{1}{\tau} \left(\mathcal{H}_i(x_i + \tau\,\delta x_{i|i-1}^\ell) - \frac{1}{N} \sum_{j=1}^{N} \mathcal{H}_i(x_i + \tau\,\delta x_{i|i-1}^j)\right). \tag{29}$$
The first analysis step:
$$\delta x_{0:i|i-1/2}^\ell = \delta x_{0:i|i-1}^\ell - \frac{1}{N-1} A_{0:i}^N \left(G_i^N\right)^T \left(\frac{1}{N-1} G_i^N \left(G_i^N\right)^T + R_i\right)^{-1} \left(\mathcal{H}_i(x_i) + \frac{\mathcal{H}_i(x_i + \tau\,\delta x_{i|i-1}^\ell) - \mathcal{H}_i(x_i)}{\tau} - y_i - w_i^\ell\right), \quad w_i^\ell \sim N(0, R_i), \quad \ell = 1, \dots, N. \tag{30}$$
If $\gamma > 0$, compute the anomalies of the ensemble in the 4-D state space:
$$Z_{0:i}^N = \left[z_{0:i}^1, \dots, z_{0:i}^N\right], \quad z_{0:i}^\ell = \delta x_{0:i|i-1/2}^\ell - \frac{1}{N} \sum_{j=1}^{N} \delta x_{0:i|i-1/2}^j. \tag{31}$$
The observation operator for the regularization is the identity on the time-$i$ component, so the
anomalies in the observation space are simply $Z_i^N$, the time-$i$ block of $Z_{0:i}^N$.
If $\gamma > 0$, perform the regularization as the second analysis step with zero data
and data covariance $\gamma^{-1} S_i$:
$$\delta x_{0:i|i}^\ell = \delta x_{0:i|i-1/2}^\ell - \frac{1}{N-1} Z_{0:i}^N \left(Z_i^N\right)^T \left(\frac{1}{N-1} Z_i^N \left(Z_i^N\right)^T + \frac{1}{\gamma} S_i\right)^{-1} \left(\delta x_{i|i-1/2}^\ell - v_i^\ell\right), \quad v_i^\ell \sim N\!\left(0, \frac{1}{\gamma} S_i\right), \quad \ell = 1, \dots, N; \tag{32}$$
otherwise, $\delta x_{0:i|i}^\ell = \delta x_{0:i|i-1/2}^\ell$, $\ell = 1, \dots, N$.
Complete the approximate incremental 4DVAR iteration: update
$$x_{0:L} \leftarrow x_{0:L} + \frac{1}{N} \sum_{\ell=1}^{N} \delta x_{0:L|L}^\ell. \tag{33}$$
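For illustration, one iteration of the algorithm, Eqs. (28)–(33), can be sketched in dense linear algebra as follows; this is a compact toy-problem sketch under assumed argument conventions (the same as in the earlier sketches), with no claim to match the operational implementation.

```python
import numpy as np

def enks_4dvar_iteration(x, xb, B, M, Q, H, R, y, S, gamma, tau, N, rng):
    """One EnKS-4DVAR iteration, Eqs. (28)-(33), returning the updated states.

    x : list of current iterates x_0,...,x_L (NumPy vectors of length n);
    the remaining arguments follow the conventions of the earlier sketches.
    """
    n, L = len(x[0]), len(x) - 1
    dX = rng.multivariate_normal(np.zeros(n), B, size=N).T          # dx_{0|0} ~ N(0, B)
    for i in range(1, L + 1):
        # Advance the increments with the finite-difference model, Eq. (28).
        V = rng.multivariate_normal(np.zeros(n), Q[i], size=N).T
        dXi = np.column_stack(
            [(M[i](x[i-1] + tau * dX[-n:, l]) - M[i](x[i-1])) / tau
             for l in range(N)]) + (M[i](x[i-1]) - x[i])[:, None] + V
        dX = np.vstack([dX, dXi])                                   # composite 4-D increments
        # Linearized observations and anomalies, Eqs. (29)-(30).
        HdX = np.column_stack(
            [H[i](x[i]) + (H[i](x[i] + tau * dXi[:, l]) - H[i](x[i])) / tau
             for l in range(N)])
        A = dX - dX.mean(axis=1, keepdims=True)
        G = HdX - HdX.mean(axis=1, keepdims=True)
        W = rng.multivariate_normal(np.zeros(len(y[i])), R[i], size=N).T
        K = A @ G.T @ np.linalg.inv(G @ G.T + (N - 1) * R[i])
        dX = dX - K @ (HdX - y[i][:, None] - W)                     # first analysis, Eq. (30)
        if gamma > 0:
            # Regularization as a second analysis step, Eqs. (31)-(32).
            dXi_half = dX[-n:, :]
            Z = dX - dX.mean(axis=1, keepdims=True)
            Zi = dXi_half - dXi_half.mean(axis=1, keepdims=True)
            V = rng.multivariate_normal(np.zeros(n), S[i] / gamma, size=N).T
            K = Z @ Zi.T @ np.linalg.inv(Zi @ Zi.T + (N - 1) * S[i] / gamma)
            dX = dX - K @ (dXi_half - V)
    dbar = dX.mean(axis=1)                                          # mean increment
    return [x[i] + dbar[i*n:(i+1)*n] for i in range(L + 1)]         # update, Eq. (33)
```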
Note that for small $\gamma \to 0$, Eq. (32)
has asymptotically no effect: $\delta x_{0:i|i}^\ell \to \delta x_{0:i|i-1/2}^\ell$. The computational cost of EnKS-4DVAR is
one evaluation of the model $\mathcal{M}_i$ for the initialization, plus $N+1$
evaluations of the model $\mathcal{M}_i$ and $N$ evaluations of the
observation operator $\mathcal{H}_i$ in each incremental 4DVAR iteration,
in each of the $L$ observation periods. In comparison, the cost of the EnKF is
$N$ evaluations of the model $\mathcal{M}_i$ and of the observation operator
$\mathcal{H}_i$ in each observation period. Running the model and
evaluating the observation operator are the major costs in practical problems
such as weather models, rather than the linear algebra of the EnKS itself, in
a reasonably efficient EnKF/EnKS implementation.
It can be proven that for small $\tau$ and large $N$, the iterates
$x_{0:L}$ converge to those of incremental 4DVAR
(Bergou et al., 2014). Surprisingly, it turns out that in the case
$\tau = 1$, we recover the standard EnKS applied directly to the nonlinear
problem (Eq. 2), as shown by the following theorem. In
particular, EnKS-4DVAR does not converge when $\tau = 1$ for nonlinear
problems, because the result of each iteration is determined only by the
starting value $x_0$. It is interesting that the ensemble transform
approach in Sakov et al. (2012) and Bocquet and Sakov (2014) corresponds to our
$\tau = 1$, but it does not seem to reduce to the standard EnKS.
Theorem 1. If $\tau = 1$, then one step of EnKS-4DVAR
(Eqs. 28–33) becomes the
EnKS (Eqs. 21–24) (modified by
including the additional regularization observation if $\gamma > 0$). In
particular, in that case, the values of $x_{0:L} + \delta x_{0:L}^\ell$ do not depend on the previous values of $x_{1:L}$.
Proof: indeed, Eq. (28) becomes
$$\delta x_{i|i-1}^\ell = \frac{\mathcal{M}_i(x_{i-1} + \delta x_{i-1|i-1}^\ell) - \mathcal{M}_i(x_{i-1})}{1} + \mathcal{M}_i(x_{i-1}) - x_i + v_i^\ell = \mathcal{M}_i(x_{i-1} + \delta x_{i-1|i-1}^\ell) - x_i + v_i^\ell;$$
hence,
$$x_i + \delta x_{i|i-1}^\ell = \mathcal{M}_i(x_{i-1} + \delta x_{i-1|i-1}^\ell) + v_i^\ell,$$
which is the same as Eq. (22) with $x_{i-1} + \delta x_{i-1|i-1}^\ell$ in place of $x_{i-1|i-1}^\ell$. Similarly,
Eq. (29) becomes, with $\tau = 1$,
$$g_i^\ell = \mathcal{H}_i(x_i + \delta x_{i|i-1}^\ell) - \mathcal{H}_i(x_i) - \frac{1}{N} \sum_{j=1}^{N} \left(\mathcal{H}_i(x_i + \delta x_{i|i-1}^j) - \mathcal{H}_i(x_i)\right) = \mathcal{H}_i(x_i + \delta x_{i|i-1}^\ell) - \frac{1}{N} \sum_{j=1}^{N} \mathcal{H}_i(x_i + \delta x_{i|i-1}^j),$$
which is again the same as Eq. (23) with $x_i + \delta x_{i|i-1}^\ell$ in place of $x_{i|i-1}^\ell$. Finally, the
innovation term in Eq. (30) becomes, using Eq. (26),
$$\mathcal{H}_i(x_i) + \frac{\mathcal{H}_i(x_i + \delta x_{i|i-1}^\ell) - \mathcal{H}_i(x_i)}{1} - y_i = \mathcal{H}_i(x_i + \delta x_{i|i-1}^\ell) - y_i,$$
which is again the same as the innovation in Eq. (24) with data $d_i = y_i$ and $x_i + \delta x_{i|i-1}^\ell$ in place of $x_{i|i-1}^\ell$. □
5 Computational results
In this section, we investigate the performance of the EnKS-4DVAR method,
described in this paper, by solving the nonlinear least-squares problem
(Eq. 2), in which the dynamical model is either the
Lorenz 63 system (Lorenz, 1963) or the two-level quasi-geostrophic
model (Fandry and Leslie, 1984). Most of the experiments assess the convergence
of the incremental 4DVAR iterations, with the EnKS as the linear solver, in a
single assimilation cycle (Sects. 5.1
and 5.2). We also demonstrate the overall long-term performance
on a large number of assimilation cycles on the Lorenz 63 model in
Sect. 5.1.2.
We first consider experiments where the regularization is not necessary to
guarantee the convergence (i.e., $\gamma = 0$). The Lorenz 63 equations are used as
the forecast model for these experiments. Section 5.1
describes the Lorenz 63 model and presents numerical results on the
convergence. Using the same model, we then
investigate the impact of the finite differences parameter $\tau$, used to
approximate the derivatives of the model and observation operators, along the iterations.
Experiments where the regularization is necessary to guarantee the
convergence are shown in Sect. 5.2, where we
analyze the impact of the regularization
parameter $\gamma$ in the application to the two-level quasi-geostrophic model.
Note that for the experiments presented here, we do not use localization;
hence, we choose large ensemble sizes. In all experiments, the regularization
covariance is $S_i = I$.
5.1 Numerical tests using the Lorenz 63 model
The Lorenz 63 equations (Lorenz, 1963) are given by the nonlinear system
$$\frac{dx}{dt} = -\sigma(x - y), \quad \frac{dy}{dt} = \rho x - y - xz, \quad \frac{dz}{dt} = xy - \beta z, \tag{34}$$
where $x = x(t)$, $y = y(t)$, $z = z(t)$, and $\sigma$, $\rho$, $\beta$ are
parameters whose values are chosen as 10, 28, and
8/3, respectively, for the experiments described in this paper. These
values result in chaotic behavior with
two regimes, as illustrated in Fig. 1. This figure shows
the Lorenz attractor, which has two lobes connected near the origin, and the
trajectories of the system in this saddle region are particularly sensitive
to perturbations. Hence, slight perturbations can alter the subsequent path
from one lobe to the other.
The state at time $t$ is denoted by $X_t = [x(t), y(t), z(t)]^\top \in \mathbb{R}^3$.
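A sketch of the Lorenz 63 right-hand side (34) and of the model operator advancing the state over one cycle; scipy's adaptive Runge–Kutta integrator is used here as a stand-in for the MATLAB ode45 call of the experiments below, with tolerances of our own choosing.

```python
import numpy as np
from scipy.integrate import solve_ivp

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def lorenz63(t, X):
    """Right-hand side of the Lorenz 63 system, Eq. (34)."""
    x, y, z = X
    return [-SIGMA * (x - y), RHO * x - y - x * z, x * y - BETA * z]

def advance(X0, dt=0.1):
    """Model operator M_i: advance the state over one cycle of length dt."""
    sol = solve_ivp(lorenz63, (0.0, dt), np.asarray(X0, dtype=float),
                    method="RK45", rtol=1e-8, atol=1e-10)
    return sol.y[:, -1]
```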
Figure 1. The Lorenz attractor; initial values $x(0) = 1$, $y(0) = 1$, and $z(0) = 1$.
To evaluate the performance of the EnKS-4DVAR method, we test it using
the classical twin experiment technique, which consists of fixing an initial
true state, denoted by $\mathrm{truth}_0$, and then integrating the
initial truth in time using the model to obtain the true state
$\mathrm{truth}_i = \mathcal{M}_i(\mathrm{truth}_{i-1})$ at each cycle
$i$. We then build the data $y_i$ by applying the observation
operator $\mathcal{H}_i$ to the truth at time $i$ and adding a Gaussian
perturbation from $N(0, R_i)$. Similarly, the background $x^b$
is sampled from the Gaussian distribution with mean
$\mathrm{truth}_0$ and covariance matrix $B$. Then, we try
to recover the truth using the observations and the background.
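The twin experiment just described can be sketched as follows (a minimal illustration; the operator and covariance choices for each experiment are specified in the subsections below):

```python
import numpy as np

def twin_experiment_data(truth0, advance, Hs, Rs, B, L, rng):
    """Generate a truth trajectory, observations, and background.

    truth0 : initial true state; advance : model operator; Hs[i], Rs[i] :
    observation operator and error covariance at cycle i; L : number of cycles.
    """
    truth = [np.asarray(truth0, dtype=float)]
    obs = {}
    for i in range(1, L + 1):
        truth.append(advance(truth[-1]))                   # truth_i = M_i(truth_{i-1})
        noise = rng.multivariate_normal(np.zeros(Rs[i].shape[0]), Rs[i])
        obs[i] = Hs[i](truth[i]) + noise                   # y_i = H_i(truth_i) + noise
    xb = rng.multivariate_normal(truth[0], B)              # background ~ N(truth_0, B)
    return truth, obs, xb
```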
5.1.1 Convergence of the iterations
We perform numerical experiments without model
error. The initial truth is set to $\mathrm{truth}_0 = [1, 1, 1]^\top$,
and the background covariance is chosen as the identity matrix of order 3,
i.e., $B = I_3$. The model is advanced in cycles of 0.1
time units. Within each cycle, the differential equations are solved by the
adaptive Runge–Kutta method implemented as the MATLAB function ode45,
with default parameter values. The assimilation time window length is $L = 50$
cycles (5 time units total). The observation operator is defined as
$\mathcal{H}_i(x, y, z) = (x^2, y^2, z^2)$. At
each time $i$, the observations are constructed as $y_i = \mathcal{H}_i(\mathrm{truth}_i) + v_i$, where
$v_i$ is sampled from $N(0, R)$ with
$R = I_3$. Observations are taken at each cycle
($i = 1, \dots, 50$). The ensemble size is fixed to $N = 100$.
Figure 2. Root square error given by Eq. (35) for the first five Gauss–Newton
iterations for the Lorenz 63 problem. The initial conditions for the truth
are $x(0) = 1$, $y(0) = 1$, and $z(0) = 1$. The cycle length is $dt = 0.1$
time units. The observations are the full state at each time step. The
ensemble size is $N = 100$. The assimilation window length is $L = 50$ cycles.
The finite differences parameter is $\tau = 10^{-3}$.
Figure 3. Box plots of objective function
values for the Lorenz 63 problem. From left to right and from top
to bottom, the panels correspond to the results of the first, second,
third and fourth iterations, respectively. The whole state is observed.
The ensemble size is 50. The assimilation window is 50 cycles. In each box,
the central line is the median (red line), the edges are the 25th and
75th percentiles (blue lines), the whiskers extend to the most extreme data
points that the plot algorithm does not consider outliers (black lines), and the
outliers are plotted individually (red dots).
Figure 4. Same as
Fig. 3, but for the fifth, sixth, seventh and
eighth iterations, respectively.
Figure 2 shows the root square error (RSE) for
the first five iterations, defined as
$$\mathrm{RSE}_i(j) = \sqrt{\frac{1}{n} \left(\mathrm{truth}_i - x_i^{(j)}\right)^\top \left(\mathrm{truth}_i - x_i^{(j)}\right)}, \quad j = 1, \dots, 5, \tag{35}$$
where $\mathrm{truth}_i$ is the true state vector at time $i$,
$x_i^{(j)}$ is the $j$th iterate at time $i$, and $n$ is the length
of $x_i$. Table 1 shows the root mean
square error (RMSE) for each iterate, given by
$$\mathrm{RMSE}(j) = \frac{1}{L} \sum_{i=0}^{L} \mathrm{RSE}_i(j), \quad j = 1, \dots, 5. \tag{36}$$
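A short sketch of the error measures (35) and (36), as used to produce Fig. 2 and Table 1:

```python
import numpy as np

def rse(truth_i, x_i):
    """Root square error at one time, Eq. (35)."""
    e = np.asarray(truth_i) - np.asarray(x_i)
    return np.sqrt(e @ e / e.size)

def rmse(truth, xs):
    """Root mean square error over the assimilation window, Eq. (36)."""
    L = len(xs) - 1                      # xs holds iterates for times 0,...,L
    return sum(rse(t, x) for t, x in zip(truth, xs)) / L
```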
From Table 1 and
Fig. 2, it can be seen that the iterates converge
to the solution without using regularization. For these experiments, we
observe that the RMSE is reduced significantly in five iterations. Note that the
error does not converge to zero because of the approximation and variability
inherent in the ensemble approach.
Table 1. The root mean square error given by
Eq. (36) for the first six Gauss–Newton iterations for the
Lorenz 63 problem. The whole state is observed. The ensemble size is 100. The
assimilation window length is 50 cycles. The finite differences parameter
is $10^{-3}$.

Iteration    1      2      3     4     5     6
RMSE        20.16  15.37  3.73  2.53  0.09  0.09
Table 2. Mean of the objective
function from 30 runs of the EnKS-4DVAR algorithm for the Lorenz 63 problem
and for different values of $\tau$ (finite differences parameter). The whole
state is observed. The ensemble size is 50. The assimilation window length is
50 cycles.
Now we investigate the influence of the finite differences parameter $\tau$
used to approximate the derivatives of the model and observation operators.
We use the same experimental setup as described in the previous section. The
numerical results are based on 30 runs with eight iterations for the Lorenz
63 problem, with the following choices of the parameter $\tau$: 1,
$10^{-1}$, $10^{-2}$, $10^{-3}$, $10^{-4}$, $10^{-5}$, and $10^{-6}$.
Table 2 shows the mean of the objective
function value as a function of the finite difference step $\tau$ and the
number of iterations. When $\tau = 1$, the iterations after the first one do
not improve the objective function. However, when $\tau \le 10^{-1}$, the
objective function was overall decreasing along the iterations after a large
initial increase. Because of the stochastic nature of the algorithm, the
objective function does not necessarily decrease in every iteration, and its
values eventually fluctuate randomly around a limit value. This stage was
achieved after at most six iterations, so only eight iterations are shown;
further lines (not shown) exhibit the same fluctuating pattern in all
columns. This limit value of the objective function decreases with smaller
$\tau$ until it stabilizes for $\tau \le 10^{-3}$.
Figures 3 and 4
show more details of the statistics as box plots of the objective function
values. Each panel corresponds to one line of
Table 2.
We can conclude that, for this toy test case at least, the method was
insensitive to the choice of $\tau \le 10^{-3}$. This is a similar conclusion
as in the bundle IEnKF of Bocquet and Sakov (2012); the parameter $\tau$ here plays the same role
as their $\varepsilon$. It should be noted that a very small $\tau$, when the
problem solved by the smoother is essentially the tangent problem, results in
a large increase in the value of the objective function in the first
iteration. This is not uncommon in Newton-type methods for highly nonlinear
problems. Hence, an adaptive method, which decreases $\tau$ adaptively, may
be of interest. This issue will be studied elsewhere.
5.1.2 Cycling
So far, we have studied the impact of the use of the stochastic solver for a
single assimilation window only. Now we test the overall long-term
performance. Consider again the Lorenz 63 model (Eq. 34), with
the parameters $\sigma = 10$, $\rho = 28$, $\beta = 8/3$. This time, we use the
Runge–Kutta method of order 4 with a time step of 0.01 time units. This
parameter setup and the testing procedure follow previous cycling studies
with the Lorenz 63 model. We perform
the usual twin model experiment. The initial truth state $Y_0$ is generated
from the $N(0, I_3)$ distribution, and the initial forecast state is
then simulated by sampling from $N(Y_0, I_3)$. Both states are
advanced for a burn-in period of 50 000 model time steps. We use the nonlinear
observation operator $h(x, y, z) = (x^3, y^3, z^3)$ with observation error generated from
$N(0, \sigma^2 I_3)$ with $\sigma^2 = 8$ and
$\tau = 10^{-4}$. The cycle length $\Delta t$ between two available
observations varies from 0.05 time units, when the model is nearly linear,
to 0.55 time units, when the model is strongly nonlinear. We use ensemble
size 10. After running multiple simulations, we have found suitable values
of the parameters of the method to be 25 iterations and the
penalty coefficient $\gamma = 10^{-9}$ when $\Delta t = 0.05$, and $\gamma = 1000$
otherwise. The length of the assimilation window is $L = 6$, i.e., assimilating
six observation vectors at once. Each observation vector is assimilated only
once; i.e., the assimilation windows do not overlap. To create the initial
ensemble at the beginning of each iteration, we use a background covariance
created as a weighted average of the sample covariance from the last
iteration in the previous assimilation window and the identity matrix.
The weights are 0.99 for the sample
covariance and 0.01 for the identity. The model error covariance in each
cycle is $Q = 0.01\,I_3$. The experiment was run for 100 000
observation cycles.
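The blended background covariance used to start each window can be sketched as follows; a minimal illustration, where we assume the sample covariance is computed from the final ensemble of increments of the previous window.

```python
import numpy as np

def blended_background_cov(dX, w_sample=0.99, w_identity=0.01):
    """Weighted average of an ensemble sample covariance and the identity,
    used here to start the next assimilation window."""
    A = dX - dX.mean(axis=1, keepdims=True)        # ensemble anomalies
    P = A @ A.T / (dX.shape[1] - 1)                # sample covariance
    return w_sample * P + w_identity * np.eye(P.shape[0])
```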
We also compare the proposed method with the standard EnKF with ensemble size
10, where the initial ensemble is created after the burn-in period by
adding perturbations sampled from $N(0, I_3)$. For
stability reasons and to maintain the ensemble covariance, we
add noise sampled from $N(0, 0.01\,I_3)$ after advancing
the ensemble. The necessity of such covariance inflation has also been
pointed out in the literature. The EnKF algorithm is run every time new
observations are available.
Figure 5 shows that the proposed method has a
significantly smaller RMSE than the EnKF in the case when the time between
observations is larger and thus the behavior of the model is nonlinear. Only
in the case when the cycle length between observations is 0.05 time
units, i.e., the model behavior is nearly linear, does the EnKF give results
comparable to the proposed method.
5.2 Numerical tests using a two-layer quasi-geostrophic model (QG)
The EnKS-4DVAR algorithm has been implemented in the Object-Oriented
Prediction System (OOPS) (Trémolet, 2013), which is a data
assimilation framework developed by the European Centre for Medium-Range
Weather Forecasts (ECMWF). Numerical experiments are performed using the
simple two-layer quasi-geostrophic model in the OOPS platform to solve the
weak-constraint data assimilation
problem (Eq. 2) by EnKS-4DVAR with regularization.
Numerical results are presented in Sect. 5.2.3.
Figure 5. Comparison of the RMSE
between the EnKF and EnKS-4DVAR from the twin experiment for the Lorenz 63 model.
EnKS-4DVAR performs better for larger time intervals between the
observations, as the model becomes more nonlinear. See Sect. 5.1.2
for further details.
5.2.1 A two-layer quasi-geostrophic model
The two-layer quasi-geostrophic channel model is widely used in theoretical
atmospheric studies, since it is simple enough for numerical calculations and
it adequately captures an important aspect of large-scale dynamics in the
atmosphere.
The two-layer quasi-geostrophic model equations are based on the
non-dimensional quasi-geostrophic potential vorticity, whose evolution
represents large-scale circulations of the atmosphere. The quasi-geostrophic
potential vorticity on the first (upper) and second (lower) layers can be
written, respectively, as
$$q_1 = \nabla^2 \psi_1 - \frac{f_0^2 L^2}{g' H_1} (\psi_1 - \psi_2) + \beta y, \qquad q_2 = \nabla^2 \psi_2 - \frac{f_0^2 L^2}{g' H_2} (\psi_2 - \psi_1) + \beta y + R_s, \tag{37}$$
where $\psi_1$ and $\psi_2$ are the stream functions, $\nabla^2$ is the
2-D Laplacian,
$R_s$ represents orography or heating, $\beta$ is the (non-dimensionalized)
northward variation of the Coriolis parameter at the fixed latitude $y$, and
$f_0$ is the Coriolis parameter at the southern boundary of the domain. $L$
is the typical length scale of the motion we wish to describe, $H_1$ and
$H_2$ are the depths of the two layers, and $g' = g\,\Delta\theta / \bar{\theta}$ is the reduced gravity, where $\bar{\theta}$ is the mean potential temperature and $\Delta\theta$ is the
difference in potential temperature across the layer interface. The
non-dimensional quantities are defined as follows:
$$t = \tilde{t}\,\frac{\bar{U}}{L}, \quad x = \frac{\tilde{x}}{L}, \quad y = \frac{\tilde{y}}{L}, \quad u = \frac{\tilde{u}}{\bar{U}}, \quad v = \frac{\tilde{v}}{\bar{U}}, \quad \beta = \beta_0 \frac{L^2}{\bar{U}}, \tag{38}$$
where $t$ denotes time, $\bar{U}$ is a typical velocity scale, $x$ and $y$
are the eastward and northward coordinates, respectively, $u$ and $v$ are the
horizontal velocity components, $\beta_0$ is the northward derivative of the
Coriolis parameter, and the tilde notation refers to the dimensional quantities.
Potential vorticity in each layer is conserved, as described by
$$\frac{D_i q_i}{Dt} = 0, \quad i = 1, 2, \tag{39}$$
where $D_i / Dt$ is the total derivative, defined by
$$\frac{D_i}{Dt} = \frac{\partial}{\partial t} + u_i \frac{\partial}{\partial x} + v_i \frac{\partial}{\partial y}, \tag{40}$$
and
$$u_i = -\frac{\partial \psi_i}{\partial y} \quad \text{and} \quad v_i = \frac{\partial \psi_i}{\partial x} \tag{41}$$
are the horizontal velocity components in each layer. Therefore, the
potential vorticity at each time step is determined by using the conservation
of potential vorticity given by Eq. (39). In this process, time
stepping consists of a simple first-order semi-Lagrangian advection of
potential vorticity.
Given the potential vorticity at a fixed time, Eq. (37) can be solved
for the stream function at each grid point, and the velocity fields are then
obtained through Eq. (41). The equations are solved
using periodic boundary conditions in the west–east direction and
Dirichlet boundary conditions in the north–south direction. For the
experiments in this paper, we choose $L = 10^6$ m, $\bar{U} = 10$ m s$^{-1}$, $H_1 = 6000$ m, $H_2 = 4000$ m,
$f_0 = 10^{-4}$ s$^{-1}$, and $\beta_0 = 1.5 \times 10^{-11}$ m$^{-1}$ s$^{-1}$. For more details on the model
and its solution, we refer to Fandry and Leslie (1984) and Pedlosky (1979).
The domain for the experiments is 12 000 km by 6300 km for
both layers. The horizontal discretization consists of $40 \times 20$ points,
so that the east–west and north–south resolution is approximately
300 km. The dimension of the state vector of the model is then
1600. Note that the state vector is defined only in terms of the stream function.
5.2.2 Experimental setup
The performance of EnKS-4DVAR with regularization is analyzed by using twin
experiments (Sect. 5.1).
The truth is generated from a model with layer depths of
$D_1 = 6000$ m and $D_2 = 4000$ m and a time step of
300 s, whereas the assimilating model has layer depths of
$D_1 = 5500$ m and $D_2 = 4500$ m and a time step of
3600 s. These differences in the layer depths and the time step
provide a source of model error.
Table 3. RMSE values calculated by
Eq. (36) along the incremental 4DVAR and EnKS-4DVAR iterations for
different values of the regularization parameter $\gamma$, for the two-level
quasi-geostrophic model (Sect. 5.2.3).
For all the experiments presented here, observations of the non-dimensional
stream function, vector wind and wind speed were taken from the truth run of the
model at 100 points randomly distributed over both levels. Observations
were taken every 12 h. We note that the number of observations is much
smaller than the dimension of the state vector. Observation errors were
assumed to be mutually independent and uncorrelated in time. The
standard deviations (SD) were chosen to be 0.4 for the stream function
observation error, 0.6 for vector wind, and 1.2 for wind speed. The
observation operator is the bi-linear interpolation of the model fields to
the horizontal observation locations.
The background error covariance matrix $B$ and the model
error covariance matrices $Q_i$ used in these experiments
correspond to vertical and horizontal correlations. The vertical and
horizontal structures are assumed to be separable. In the horizontal plane,
the covariance matrices correspond to isotropic, homogeneous correlations of the
stream function with Gaussian spatial structure, obtained from a fast Fourier
transform approach (Dietrich and Newsam, 1997). For the
background covariance matrix $B$, the SD and the horizontal
correlation length scale in these experiments were set to 0.8 and
$10^6$ m, respectively. For the model error covariance matrices
$Q_i$, the SD and the horizontal correlation length scale were set
to 0.2 and $2 \times 10^6$ m, respectively. The vertical correlation is
assumed to be constant over the horizontal grid, and the correlation
coefficient between the two layers was taken as 0.5 for
$Q_i$ and 0.2 for $B$.
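For illustration, homogeneous Gaussian-shaped correlations on a periodic grid can be sampled with the FFT in the spirit of Dietrich and Newsam (1997); the following sketch is a simplified periodic analogue of our own (the actual experiments also handle the Dirichlet boundary and impose the two-layer vertical correlation).

```python
import numpy as np

def sample_gaussian_field(nx, ny, length_scale, sd, rng):
    """Sample a homogeneous Gaussian random field on a periodic nx-by-ny grid
    by scaling white noise in Fourier space; length_scale is in grid units."""
    ix = np.minimum(np.arange(nx), nx - np.arange(nx))[:, None]   # periodic distances
    iy = np.minimum(np.arange(ny), ny - np.arange(ny))[None, :]
    c = sd**2 * np.exp(-0.5 * (ix**2 + iy**2) / length_scale**2)  # Gaussian correlation row
    lam = np.fft.fft2(c).real                                     # circulant eigenvalues
    lam = np.maximum(lam, 0.0)                                    # guard tiny negatives
    w = rng.standard_normal((nx, ny))                             # white noise
    return np.real(np.fft.ifft2(np.sqrt(lam) * np.fft.fft2(w)))
```

Samples of this form can be used both to generate ensemble perturbations and, column by column, to apply the covariance implicitly without ever forming the full matrix.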
5.2.3 Numerical results
We perform one assimilation cycle for these experiments. The window length is set to 10
days, for which the nonlinearity is significant, with two
sub-windows of 5 days ($L = 2$). No localization is used in the experiments;
as a result, the ensemble size is chosen to be large,
$N = 30\,000$. Therefore, this test is only a partial assessment.
Localization and cycling in the QG model are beyond the scope of this paper.
For the finite difference approximation, the parameter $\tau$ is set to
$10^{-4}$ for all experiments. We have performed experiments for incremental
4DVAR and EnKS-4DVAR. The incremental 4DVAR method used conjugate gradients
to solve the linearized problem with exact tangent and adjoint models in each
iteration, with no ensembles involved. The numerical results are presented as follows.
Figure 6 shows the objective function values along the
iterations of the incremental 4DVAR method. The objective function oscillates
with the iteration number; therefore, the incremental 4DVAR method without
regularization diverges. This divergence is due to the highly nonlinear
behavior of the model over a long window (10 days). In such a case, as
explained in Sect. 4, convergence to a stationary point can be
recovered by controlling the step, which is done here by introducing an additional
regularization term. In order to see the effect of this
regularization, we ran EnKS-4DVAR with different values of the
regularization parameter $\gamma$. Figure 7 shows
the objective function values along the iterations for eight different choices of
$\gamma$. RMSE values along the iterations for the same experiments performed
with 4DVAR and EnKS-4DVAR are presented in Table 3.
Figure 6. Objective function values along incremental
4DVAR iterations for the two-level quasi-geostrophic problem from
Sect. 5.2.
Figure 7. Objective function values along
EnKS-4DVAR with regularization iterations for the two-level quasi-geostrophic
problem (Sect. 5.2).
It can be seen from Fig. 7 that when $\gamma = 0$,
the iterations diverge, as expected, since we do not use regularization and we
only approximate the linearized subproblem using ensembles. For small values
of $\gamma$ (e.g., $\gamma \le 10^{-1}$), the objective function is not
monotonically decreasing; hence, the iterations still diverge even though
we use the regularization. Therefore, small values of $\gamma$ cannot
guarantee the convergence. For large values of $\gamma$ (e.g.,
$\gamma \ge 10$), we observe a decrease in the objective function along the
iterations. Moreover, the fastest decrease in the objective function is
obtained for $\gamma = 10$.
Looking at the RMSE values in Table 3, we can
see that increasing $\gamma$ beyond an optimal value results in higher RMSE
values, and the reduction in the RMSE values becomes very slow. In all cases, the
RMSE values oscillate along the iterations. We note that all RMSE values are
lower than the initial RMSE value.
In conclusion, when the regularization is used, the choice of the
regularization parameter $\gamma$ is crucial to ensure the convergence. For
instance, for small values of $\gamma$, the method can still diverge, and for
large values of $\gamma$, the objective function decreases, but slowly (and
many iterations may be needed to attain some predefined decrease). On the
other hand, small $\gamma$ values result in small RMSE values with
oscillation along the iterations, and the RMSE values decrease slowly for
larger values of $\gamma$. Therefore, the regularization parameter should be
neither "very small" nor "very large". A $\gamma$ adapted over the
iterations can be a better compromise, which will be explored in future studies.
6 Conclusions
We have proposed a stochastic solver for the incremental weak-constraint 4DVAR
method. The regularization term added to the Gauss–Newton
method, resulting in a globally convergent Levenberg–Marquardt method,
maintains the structure of the linearized least-squares subproblem, enabling
us to use an ensemble Kalman smoother as a linear solver while simultaneously
controlling the convergence. We have formulated the EnKS-4DVAR method and
have shown that it is capable of handling strongly nonlinear problems. We
have demonstrated that the randomness of the EnKS version used (with
perturbed data) eventually limits the convergence to a minimum, but
a sufficiently large decrease in the objective function can be achieved for
successful data assimilation. On the other hand, we suspect that the
randomization may help to increase the supply of search directions over
the iterations, as opposed to deterministic methods locked into one
low-dimensional subspace, such as the span of one given ensemble.
We have numerically illustrated the new method on the Lorenz 63 model and the
two-level quasi-geostrophic model. We have analyzed the impact of the finite
differences parameter $\tau$ used to approximate the derivatives of the model
and observation operators. We have shown that for $\tau = 1$, the iterates
obtained from EnKS-4DVAR are equivalent to those obtained from the standard
EnKS. Based on the computational experiments, it may be better to start with the
EnKS (i.e., $\tau = 1$) and then to decrease $\tau$ in further iterations.
We have demonstrated long-term stability of the method on the Lorenz 63
model and shown that it achieves a lower RMSE than the standard EnKF for a highly
nonlinear problem. This, however, required some parameter tuning, in particular
of the data error variance.
For the second part of the experiments, we have shown the performance of the
EnKS-4DVAR method with regularization on the two-level quasi-geostrophic
problem, one of the widely used models in theoretical atmospheric studies,
since it is simple enough for numerical calculations and it adequately
captures an important aspect of large-scale dynamics in the atmosphere. We
have observed that the incremental 4DVAR method does not converge for a long
assimilation window, and that the regularization is necessary to
guarantee convergence. We have concluded that the choice of the
regularization parameter is crucial to ensure the convergence, and different
choices of this parameter change the rate of decrease in the objective
function. In summary, an adaptive regularization parameter can be a better
compromise to achieve an approximate solution in a reasonable number of iterations.
The choice of the parameters used in our approach is of crucial importance
for the computational cost of the algorithm, for instance for the number of
iterations needed to obtain some desired reduction. A more detailed exploration
of the best strategies to adapt these parameters over the course of the
iterations will be studied elsewhere.
The base method used in the computational experiments here relies on the sample
covariance. However, there is a priori nothing to prevent the use of more
sophisticated variants of the EnKS with localization and covariance
inflation, or square root filters instead of the EnKS with data perturbation, as
is done in related methods in the literature. These issues, as well as the
performance on larger and realistic problems, will be studied elsewhere.
Acknowledgements

This research was partially supported by Fondation STAE project ADTAO, the
Czech Science Foundation under grant GA13-34856S, and the US National Science
Foundation under grant DMS-1216481. A part of this work was done when Jan
Mandel was visiting INP-ENSEEIHT and CERFACS, and when Elhoucine Bergou,
Serge Gratton, and Ivan Kasanický were visiting the University of Colorado
Denver. The authors would like to thank the editor, Olivier Talagrand,
reviewer Emmanuel Cosme, and an anonymous reviewer for their comments, which
contributed to the improvement of this paper.

Edited by: O. Talagrand
Reviewed by: E. Cosme and one anonymous referee
References

Bell, B.: The Iterated Kalman Smoother as a Gauss–Newton Method, SIAM J. Optim., 4, 626–636, doi:10.1137/0804035, 1994.

Bergou, E., Gratton, S., and Mandel, J.: On the Convergence of a Non-linear Ensemble Kalman Smoother, arXiv:1411.4608, submitted, 2014.

Bocquet, M. and Sakov, P.: Combining inflation-free and iterative ensemble Kalman filters for strongly nonlinear systems, Nonlin. Processes Geophys., 19, 383–399, doi:10.5194/npg-19-383-2012, 2012.

Bocquet, M. and Sakov, P.: Joint state and parameter estimation with an iterative ensemble Kalman smoother, Nonlin. Processes Geophys., 20, 803–818, doi:10.5194/npg-20-803-2013, 2013.

Bocquet, M. and Sakov, P.: An iterative ensemble Kalman smoother, Q. J. Roy. Meteor. Soc., 140, 1521–1535, doi:10.1002/qj.2236, 2014.

Brusdal, K., Brankart, J. M., Halberstadt, G., Evensen, G., Brasseur, P., van Leeuwen, P. J., Dombrowsky, E., and Verron, J.: A demonstration of ensemble based assimilation methods with a layered OGCM from the perspective of operational ocean forecasting systems, J. Mar. Syst., 40–41, 253–289, doi:10.1016/S0924-7963(03)00021-6, 2003.

Butala, M. D.: A localized ensemble Kalman smoother, in: 2012 IEEE Statistical Signal Processing Workshop (SSP), 21–24, IEEE, doi:10.1109/SSP.2012.6319665, 2012.

Chen, Y. and Oliver, D.: Levenberg–Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification, Comput. Geosci., 17, 689–703, doi:10.1007/s10596-013-9351-5, 2013.

Cosme, E., Brankart, J.-M., Verron, J., Brasseur, P., and Krysta, M.: Implementation of a reduced rank square-root smoother for high resolution ocean data assimilation, Ocean Model., 33, 87–100, doi:10.1016/j.ocemod.2009.12.004, 2010.

Courtier, P., Thépaut, J.-N., and Hollingsworth, A.: A strategy for operational implementation of 4D-Var, using an incremental approach, Q. J. Roy. Meteor. Soc., 120, 1367–1387, doi:10.1002/qj.49712051912, 1994.

Desroziers, G., Camino, J.-T., and Berre, L.: 4DEnVar: link with 4D state formulation of variational assimilation and different possible implementations, Q. J. Roy. Meteor. Soc., 140, 2097–2110, doi:10.1002/qj.2325, 2014.

Developmental Testbed Center: NOAA Ensemble Kalman Filter Beta Release v1.0, http://www.dtcenter.org/com-GSI/users/docs (last access: March 2015), 2015.

Dietrich, C. R. and Newsam, G. N.: Fast and Exact Simulation of Stationary Gaussian Processes through Circulant Embedding of the Covariance Matrix, SIAM J. Sci. Comput., 18, 1088–1107, 1997.

Evensen, G.: Data Assimilation: The Ensemble Kalman Filter, 2nd edn., Springer, xxiv+307 pp., doi:10.1007/978-3-642-03711-5, 2009.

Fandry, C. and Leslie, L.: A Two-Layer Quasi-Geostrophic Model of Summer Trough Formation in the Australian Subtropical Easterlies, J. Atmos. Sci., 41, 807–817, 1984.

Fisher, M., Trémolet, Y., Auvinen, H., Tan, D., and Poli, P.: Weak-Constraint and Long-Window 4D-Var, Tech. rep., European Centre for Medium-Range Weather Forecasts, 2011.

Fisher, M., Leutbecher, M., and Kelly, G. A.: On the equivalence between Kalman smoothing and weak-constraint four-dimensional variational data assimilation, Q. J. Roy. Meteor. Soc., 131, 3235–3246, doi:10.1256/qj.04.142, 2005.

Furrer, R. and Bengtsson, T.: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants, J. Multivar. Anal., 98, 227–255, doi:10.1016/j.jmva.2006.08.003, 2007.

Gill, P. E. and Murray, W.: Algorithms for the solution of the nonlinear least-squares problem, SIAM J. Numer. Anal., 15, 977–992, doi:10.1137/0715063, 1978.

Gratton, S., Gürol, S., and Toint, P.: Preconditioning and globalizing conjugate gradients in dual space for quadratically penalized nonlinear-least squares problems, Comput. Optim. Appl., 54, 1–25, doi:10.1007/s10589-012-9478-7, 2013.

Gu, Y. and Oliver, D.: An iterative ensemble Kalman filter for multiphase fluid flow data assimilation, SPE J., 12, 438–446, doi:10.2118/108438-PA, 2007.

Hamill, T. M. and Snyder, C.: A Hybrid Ensemble Kalman Filter–3D Variational Analysis Scheme, Mon. Weather Rev., 128, 2905–2919, doi:10.1175/1520-0493(2000)128<2905:AHEKFV>2.0.CO;2, 2000.

Johns, C. J. and Mandel, J.: A Two-Stage Ensemble Kalman Filter for Smooth Data Assimilation, Environ. Ecol. Stat., 15, 101–110, doi:10.1007/s10651-007-0033-0, 2008.

Kasanický, I., Mandel, J., and Vejmelka, M.: Spectral diagonal ensemble Kalman filters, Nonlin. Processes Geophys., 22, 485–497, doi:10.5194/npg-22-485-2015, 2015.

Khare, S. P., Anderson, J. L., Hoar, T. J., and Nychka, D.: An investigation into the application of an ensemble Kalman smoother to high-dimensional geophysical systems, Tellus A, 60, 97–112, doi:10.1111/j.1600-0870.2007.00281.x, 2008.

Levenberg, K.: A method for the solution of certain non-linear problems in least squares, Q. Appl. Math., 2, 164–168, 1944.

Liu, C. and Xiao, Q.: An Ensemble-Based Four-Dimensional Variational Data Assimilation Scheme. Part III: Antarctic Applications with Advanced Research WRF Using Real Data, Mon. Weather Rev., 141, 2721–2739, doi:10.1175/MWR-D-12-00130.1, 2013.

Liu, C., Xiao, Q., and Wang, B.: An Ensemble-Based Four-Dimensional Variational Data Assimilation Scheme. Part I: Technical Formulation and Preliminary Test, Mon. Weather Rev., 136, 3363–3373, doi:10.1175/2008MWR2312.1, 2008.

Liu, C., Xiao, Q., and Wang, B.: An Ensemble-Based Four-Dimensional Variational Data Assimilation Scheme. Part II: Observing System Simulation Experiments with Advanced Research WRF (ARW), Mon. Weather Rev., 137, 1687–1704, doi:10.1175/2008MWR2699.1, 2009.

Lorenc, A. C., Bowler, N. E., Clayton, A. M., Pring, S. R., and Fairbairn, D.: Comparison of Hybrid-4DEnVar and Hybrid-4DVar Data Assimilation Methods for Global NWP, Mon. Weather Rev., 143, 212–229, doi:10.1175/MWR-D-14-00195.1, 2014.

Lorenz, E. N.: Deterministic Nonperiodic Flow, J. Atmos. Sci., 20, 130–141, doi:10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2, 1963.

Mandel, J., Beezley, J. D., Coen, J. L., and Kim, M.: Data Assimilation for Wildland Fires: Ensemble Kalman filters in coupled atmosphere-surface models, IEEE Contr. Syst. Mag., 29, 47–65, doi:10.1109/MCS.2009.932224, 2009.

Marquardt, D. W.: An Algorithm for Least-Squares Estimation of Nonlinear Parameters, J. Soc. Ind. Appl. Math., 11, 431–441, doi:10.1137/0111030, 1963.

Metref, S., Cosme, E., Snyder, C., and Brasseur, P.: A non-Gaussian analysis scheme using rank histograms for ensemble data assimilation, Nonlin. Processes Geophys., 21, 869–885, doi:10.5194/npg-21-869-2014, 2014.

Nowak, W., Tenkleve, S., and Cirpka, O.: Efficient Computation of Linearized Cross-Covariance and Auto-Covariance Matrices of Interdependent Quantities, Math. Geol., 35, 53–66, 2003.

Osborne, M. R.: Nonlinear least squares – the Levenberg algorithm revisited, J. Austral. Math. Soc. B, 19, 343–357, doi:10.1017/S033427000000120X, 1976.

Ott, E., Hunt, B. R., Szunyogh, I., Zimin, A. V., Kostelich, E. J., Corazza, M., Kalnay, E., Patil, D., and Yorke, J. A.: A Local Ensemble Kalman Filter for Atmospheric Data Assimilation, Tellus A, 56, 415–428, doi:10.1111/j.1600-0870.2004.00076.x, 2004.

Pedlosky, J.: Geophysical Fluid Dynamics, Springer, xii+625 pp., 1979.

Rauch, H. E., Tung, F., and Striebel, C. T.: Maximum likelihood estimates of linear dynamic systems, AIAA J., 3, 1445–1450, 1965.

Sakov, P. and Bertino, L.: Relation Between Two Common Localisation Methods for the EnKF, Comput. Geosci., 10, 225–237, doi:10.1007/s10596-010-9202-6, 2011.

Sakov, P., Oliver, D. S., and Bertino, L.: An Iterative EnKF for Strongly Nonlinear Systems, Mon. Weather Rev., 140, 1988–2004, doi:10.1175/MWR-D-11-00176.1, 2012.

Strang, G. and Borre, K.: Linear Algebra, Geodesy, and GPS, Wellesley-Cambridge Press, 624 pp., 1997.

Stroud, J. R., Stein, M. L., Lesht, B. M., Schwab, D. J., and Beletsky, D.: An Ensemble Kalman Filter and Smoother for Satellite Data Assimilation, J. Am. Stat. Assoc., 105, 978–990, doi:10.1198/jasa.2010.ap07636, 2010.

Trémolet, Y.: Model-error estimation in 4D-Var, Q. J. Roy. Meteor. Soc., 133, 1267–1280, doi:10.1002/qj.94, 2007.

Trémolet, Y.: Object-Oriented Prediction System, http://www.data-assimilation.net/Events/Year3/OOPS.pdf (last access: 19 February 2016), 2013.

Tshimanga, J., Gratton, S., Weaver, A. T., and Sartenaer, A.: Limited-memory preconditioners, with application to incremental four-dimensional variational data assimilation, Q. J. Roy. Meteor. Soc., 134, 751–769, doi:10.1002/qj.228, 2008.

Wang, X.: Incorporating Ensemble Covariance in the Gridpoint Statistical Interpolation Variational Minimization: A Mathematical Framework, Mon. Weather Rev., 138, 2990–2995, doi:10.1175/2010MWR3245.1, 2010.

Wright, S. J. and Holt, J. N.: An inexact Levenberg–Marquardt method for large sparse nonlinear least squares, J. Austral. Math. Soc. B, 26, 387–403, doi:10.1017/S0334270000004604, 1985.

Zhang, F., Zhang, M., and Hansen, J.: Coupling ensemble Kalman filter with four-dimensional variational data assimilation, Adv. Atmos. Sci., 26, 1–8, doi:10.1007/s00376-009-0001-8, 2009.

Zupanski, M.: Maximum Likelihood Ensemble Filter: Theoretical Aspects, Mon. Weather Rev., 133, 1710–1726, doi:10.1175/MWR2946.1, 2005.