Parametric covariance dynamics for the nonlinear diffusive Burgers equation

The parametric Kalman filter (PKF) is a computationally efficient alternative to the ensemble Kalman filter (EnKF). The PKF relies on an approximation of the error covariance matrix by a covariance model with a space-time evolving set of parameters. This study extends the PKF to nonlinear dynamics, using the diffusive Burgers equation as an application and focusing on the forecast step of the assimilation cycle. The covariance model considered is based on the diffusion equation, with the diffusion tensor and the error variance as evolving parameters. An analytical derivation of the parameter dynamics highlights a closure issue; a closure model is therefore proposed, based on the kurtosis of the local correlation functions. Numerical experiments compare the PKF forecast with the statistics obtained from a large ensemble of nonlinear forecasts. These experiments support the closure model and demonstrate the ability of the PKF to reproduce the tangent-linear covariance dynamics at a low numerical cost.


Introduction
Covariance functions in geophysical flows are known to evolve in both time and space (e.g. Bouttier, 1993; Snyder et al., 2003). Yet, an accurate solution of the covariance dynamics is one of the major challenges in data assimilation and probabilistic forecasting. The Monte Carlo method, which is the most common approach, addresses nonlinear dynamics and is computationally efficient on parallel computers. However, it suffers from non-uniform sampling noise, which is a function of the true signal covariance.
Another route can be investigated that relies on the analytical derivation of covariance tensor dynamics (Cohn, 1993), which has inspired applications in chemical transport models (Ménard et al., 2000). Despite the theoretical interest of the analytical derivation of covariance dynamics, it is still difficult to take advantage of this formulation in real applications. Moreover, the system presents a closure problem for the diffusive error dynamics. A hybrid approach that mixes the Monte Carlo method based on an ensemble with an approximate propagation of the correlations by a surrogate model has also been proposed (Bocquet, 2016).
An intermediate formulation, between the approximation by an ensemble and the theoretical formulation by analytic derivation, has recently been introduced by Pannekoucke et al. (2016), who proposed to approximate the forecast error covariance matrix by a parametric covariance matrix, in which the dynamics of the parameters stand for the dynamics of the full covariance matrix. This formulation, called the parametric Kalman filter (PKF), has been illustrated on a linear advection-diffusion equation similar to the equations encountered in chemical transport models. While defined for general parametric covariance models, the PKF has been illustrated for the particular case in which the covariance model is based on the diffusion equation (Weaver and Courtier, 2001). Hence, the error covariance matrix is reduced to the knowledge of its variance field and its local diffusion tensor field. The time evolution along the forecast and analysis steps of the data assimilation process is expressed in terms of the variance and local diffusion tensor evolution.
As mentioned earlier, the PKF formulation has been tested so far on linear dynamics. It is thus interesting, for more general applications, to consider its extension to a nonlinear setting. The goal of the present work is to formulate and illustrate the forecast step of the PKF for the nonlinear dynamics given by the Burgers equation.

Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.
The Burgers equation is a nonlinear advection-diffusion model that usually involves one variable in a one-dimensional space, u, the wind. It is one of the simplest equations that display important features of geophysical interest, such as advection, frontogenesis, and one-dimensional turbulence (Burgers, 1974; Hopf, 1950; el Malek and El-Mansi, 2000, and references therein). The Burgers equation has been used in several data assimilation studies to examine the effect of nonlinearity on error propagation, in Kalman filtering methods (Cohn, 1993; Ménard, 1994; Verlaan and Heemink, 2001), in maximum likelihood ensemble filtering (Zupanski et al., 2008), in adjoint methods (Apte et al., 2010), in model error estimation using 4D-Var (Lakshmivarahan et al., 2013), and in 4DEnVar and localization (Desroziers et al., 2014, 2016).
However, preliminary numerical tests have shown that the treatment of the physical diffusion, as proposed in Pannekoucke et al. (2016) and deduced from analytical solutions, was not able to reproduce the complexity of the Burgers dynamics. Hence, we need to develop a higher-order representation of the PKF equations for the physical diffusion process.
In Sect. 2 the parametric formulation based on covariance modelling with the diffusion equation is first recalled, and we specify the methodology for deriving the parametric dynamics in a systematic way. This method is then applied to the Burgers equation in Sect. 3, taking advantage of operator splitting. In Sect. 4, numerical simulations are conducted to illustrate the ability of the parametric dynamics to reproduce the main features of the true covariance dynamics emerging from a Monte Carlo forecast experiment. The conclusions are given in Sect. 5.

Background on the uncertainty propagation and covariance dynamics
Geophysical flow dynamics can be represented as a nonlinear system of the form

∂_t u = M(u), with u(t = 0) = u^0,   (1)

which describes the time evolution of a state function u, where a unique solution is assumed to exist for any initial condition u^0 within an appropriate set. Due to the lack of precise knowledge of the initial condition, u^0 is generally represented as a random state u^0 = ū^0 + ε^0, where ε^0 is a centred Gaussian random field characterized by its two-point covariance function P^0(x, y) = E[ε^0(x) ε^0(y)], where E[·] stands for the expectation operator. The covariance function can be described by its variance field V^0, where V^0(x) = P^0(x, x) denotes the variance at position x, and by its error correlation function ρ^0(x, y) = P^0(x, y)/(σ^0(x) σ^0(y)), where σ^0(x) = √V^0(x) and σ^0(y) = √V^0(y) are the standard deviations at points x and y, respectively. When a first-order Taylor expansion accurately approximates the error dynamics, the error about the mean state ū evolves according to the tangent-linear dynamics of the initial error ε^0,

∂_t ε = M ε,   (2)

where M = ∂_u M|_ū is the tangent-linear dynamics along the nonlinear trajectory ū, solution of Eq. (1) starting from the initial condition ū^0. The two-point covariance function E[ε(x, t) ε(y, t)] of the error field ε at a given time t defines the covariance function P(x, y, t). Thereafter, the covariance function is computed as a covariance matrix: when space is discretized, with the grid-point positions denoted by x_i, the restriction of the covariance function to the grid-point positions is the matrix P defined by P_ij(t) = P(x_i, x_j, t). With the discretized version of the tangent-linear model M now being the matrix M, the dynamics of the covariance matrix is then given by the forecast error covariance equation

d/dt P = M P + P M^T,   (3)

where M^T stands for the adjoint of the tangent-linear model M.
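For a discrete model advanced over one time step, the counterpart of Eq. (3) is the propagation P(t+δt) = M P(t) M^T. A minimal sketch of this propagation, assuming a purely illustrative tangent-linear model (a one-grid-point periodic shift, i.e. advection at one grid point per step; not a model from this study):

```python
import numpy as np

n = 50
x = np.arange(n)
d = np.abs(x[:, None] - x[None, :])
d = np.minimum(d, n - d)                   # periodic distance on the circle
P0 = np.exp(-d**2 / (2 * 5.0**2))          # Gaussian correlations, unit variance

M = np.roll(np.eye(n), 1, axis=0)          # hypothetical tangent-linear model:
                                           # advection by one grid point per step
P1 = M @ P0 @ M.T                          # one step of the forecast error covariance
```

For this shift model the covariance is simply translated: the variance field (the diagonal of P1) is preserved, as is the symmetry of the matrix.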
Thereafter, since the statistics depend on the time evolution, explicit reference to the time t is dropped, except at the initial time t = 0, identified by the superscript (·)^0. The numerical cost of solving Eq. (3) for high-dimensional dynamics is beyond supercomputer capacity. Different options have been considered in the literature to approximate the solution, among which one finds the Monte Carlo method employed in the ensemble Kalman filter (Evensen, 1994).
The ensemble Kalman filter is a robust algorithm that applies to low-order dynamical systems as well as to the large-dimension systems encountered in geophysical applications. The main difference for geophysical applications is that the covariance matrix is closely related to the continuous covariance function, which may not be the case for all discrete low-order models. Thereafter, it is assumed that the discrete model results from the discretization of a continuous model, making a clear connection between the discrete and the continuous covariance representations. This offers simplifications in the following derivations. To that end, in what follows, the covariance function P(x, y) and its grid-point matrix representation P are considered as equivalent and are denoted by the matrix notation.
We now give details about another approximation, which relies on the continuum, namely the parametric formulation.

Parametric formulation of the covariance forecast dynamics
Pannekoucke et al. (2016) have proposed to parameterize the covariance matrix by the covariance model, and they have illustrated this approach by considering the covariance model based on the pseudo-time diffusion equation (Weaver and Courtier, 2001).
The diffusion covariance model factorizes the covariance matrix as

P_diff = Σ L L^T Σ,   (4)

where Σ denotes the diagonal matrix of standard deviations σ = √E[ε²], L is the propagator of the diffusion equation integrated over the pseudo-time τ from τ = 0 to τ = 1/2, thus giving L = e^{(1/2)∇·(ν∇)}, and the propagator is normalized so that L L^T has a unit diagonal. The pseudo-time diffusion equation is a recipe to build Gaussian random fields with Gaussian-like correlation functions. Note that the pseudo-time τ has no link with the physical time t of Eq. (1). In this formulation the variance field V (deduced from Σ) and the local diffusion tensor field ν are the only parameters to be determined.
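The construction of Gaussian-like correlations by the diffusion propagator can be checked numerically. The following sketch (an illustration, not the study's code) assumes a homogeneous diffusion tensor ν on a periodic 1-D grid and represents the propagator by a matrix exponential; the normalized rows of L L^T are then close to a Gaussian correlation of length scale √(2ν), consistent with the relation ν = 1/(2g) = L_G²/2 used later in the text:

```python
import numpy as np
from scipy.linalg import expm

# Homogeneous diffusion on a periodic grid (assumed setting, for illustration).
n, dx = 200, 1.0
nu = 12.5                                   # chosen so that L_G = sqrt(2*nu) = 5
D2 = (np.roll(np.eye(n), 1, 0) - 2 * np.eye(n)
      + np.roll(np.eye(n), -1, 0)) / dx**2  # centred second-derivative operator
Lop = expm(0.5 * nu * D2)                   # propagator over pseudo-time [0, 1/2]
K = Lop @ Lop.T
corr = K / np.sqrt(np.outer(np.diag(K), np.diag(K)))  # normalize to unit diagonal

# reference: Gaussian correlation exp(-d^2 / (4*nu)) at distance d from a point
d = np.abs(np.arange(n) - n // 2)
d = np.minimum(d, n - d) * dx
gauss = np.exp(-d**2 / (4 * nu))
```

The row `corr[n//2]` matches `gauss` to within discretization error, which illustrates why the variance and diffusion tensor suffice to parameterize the covariance.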
Knowing the dynamics of the variance field V and the local diffusion tensor field ν provides a means to approximate the true covariance dynamics (Eq. 3), where P would be replaced by the covariance model P_diff, parameterized by using the diffusion equation (Eq. 4). This constitutes the parametric formulation of the dynamics. The challenge is now to determine the dynamics of the two parameter fields.
The dynamics of the variance field V = E[ε²] can straightforwardly be obtained from the trend ∂_t ε following

∂_t V = 2 E[ε ∂_t ε].   (5)

However, the dynamics of the diffusion tensor is not as obvious to derive. A possible way to describe its dynamics is to consider some approximations, which we describe in the next section.

Approximate dynamics for the diffusion covariance model
The dynamical equations of the local diffusion can be obtained by taking advantage of approximations used in data assimilation for the estimation of the local diffusion tensor from ensemble data. Following Pannekoucke and Massart (2008), Mirouze and Weaver (2010), and Weaver and Mirouze (2013), the local diffusion tensor field can be deduced from the correlation functions when the random error field is assumed to be smooth. For a given position x, the local Taylor expansion of the correlation function ρ(x, x + δx) takes the form

ρ(x, x + δx) = 1 − (1/2) ||δx||²_{g_x} + o(||δx||²),   (7)

where g_x denotes the local metric tensor at point x, with ||δx||²_E = δx^T E δx. In this expression, the little o means that, for two functions f_1 and f_2, f_1 = o(f_2) when the limit of f_1/f_2 is zero as δx goes to zero. From Eq. (7) we can define a diffusion tensor at x by

ν_x = (1/2) g_x^{-1}.   (8)

The importance of the metric tensor comes from its direct connection with the error field. In dimension one, the metric is the scalar g_x = E[(∂_x ε̃_x)²], where ε̃ denotes the normalized error field ε̃_x = ε_x/σ_x (see Appendix A). It is meaningful to relate the metric to a typical scale of correlation, the so-called error correlation length scale (Daley, 1991; Pannekoucke et al., 2008):

L_x = 1/√g_x.   (9)

In dimension two (three), the metric is a 2×2 (3×3) matrix g_x = [g_ij(x)] given by

g_ij(x) = E[∂_i ε̃(x) ∂_j ε̃(x)].   (10)

Consequently, in dimension one, an approximation for the dynamics of the parametric formulation based on the diffusion equation is given by

∂_t V = 2 E[ε ∂_t ε],   (11a)
∂_t g = 2 E[∂_x ε̃ ∂_x(∂_t ε̃)].   (11b)

Equations (11a) and (11b) have the advantage that the time evolution of the covariances can be computed for any error dynamics. This will be illustrated with the Burgers equation, which is a one-dimensional dynamical model with nonlinear advection and diffusion processes similar to those of geophysical flows.
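These diagnoses are easy to mimic numerically. The sketch below (an illustration, not the study's code) generates a homogeneous Gaussian-correlated ensemble on a periodic grid, then estimates the variance V, the metric g = E[(∂_x ε̃)²] by finite differences, and the length scale L = 1/√g:

```python
import numpy as np

rng = np.random.default_rng(0)
n, ne, dx = 200, 5000, 1.0
Lg = 10.0                                   # true homogeneous length scale

# homogeneous Gaussian-correlated samples via an eigen square root
d = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
d = np.minimum(d, n - d) * dx
C = np.exp(-d**2 / (2 * Lg**2))
w, E = np.linalg.eigh(C)
sqC = E @ np.diag(np.sqrt(np.maximum(w, 0))) @ E.T
eps = rng.standard_normal((ne, n)) @ sqC.T  # each row: one sample of the error

V = np.mean(eps**2, axis=0)                 # variance field
epst = eps / np.sqrt(V)                     # normalized errors
depst = (np.roll(epst, -1, 1) - np.roll(epst, 1, 1)) / (2 * dx)
g = np.mean(depst**2, axis=0)               # metric field, Eq.-(10)-type estimate
Lhat = 1.0 / np.sqrt(g)                     # diagnosed length-scale field
```

Up to sampling noise and the finite-difference approximation, the diagnosed length scale recovers the prescribed value Lg.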

Dynamics of forecast error for the Burgers model
Here, we consider the dynamics associated with the Burgers equation in 1-D:

∂_t u + u ∂_x u = κ ∂²_x u,   (12)

where κ > 0 is the diffusion coefficient. For any smooth function u^0(x), there exists a unique solution u(x, t) to Eq. (12) with the initial condition u(x, 0) = u^0(x). A particular initial condition is now considered, where u^0(x) is a sample of a smooth random field of mean field ū^0(x). Hence, each sample u^0 is decomposed as u^0(x) = ū^0(x) + ε^0(x), where ε^0 is a smooth random field. The dynamics of the mean ū(x, t) and of the higher-order statistical moments are obtained from the Reynolds equations. Similarly to Cohn (1993), the fluctuation-mean flow dynamics deduced from the Reynolds equations is considered, in place of the more classical tangent-linear dynamics. Compared with the tangent-linear dynamics, the fluctuation-mean flow dynamics makes the mean flow depend on the evolution of the fluctuation statistics, which enlarges the tangent-linear setting. Note that the fluctuation-mean flow interaction leads to the Gaussian second-order filter (Jazwinski, 1970, Sect. 9.3) and is important in nonlinear Kalman-like filters (Cohn, 1993). The next section presents the fluctuation-mean flow dynamics and how it is used to describe the time evolution (Eq. 11a, 11b) of the two-point error covariance parameters.

Derivation of the fluctuation-mean flow dynamics for small error magnitudes
The random field u can be decomposed into its ensemble-averaged and fluctuating parts u = ū + ε, where ū(x, t) = E[u(x, t)] denotes the expectation of the random field u, and ε = u − ū is a random field of zero mean. From this expansion, the mean flow dynamics is the ensemble average of the dynamics. Equation (12) then reads

∂_t ū + ū ∂_x ū + E[ε ∂_x ε] = κ ∂²_x ū.   (13a)

The dynamics of the fluctuation ε is deduced from the difference between the full dynamics (Eq. 12) and the mean flow dynamics (Eq. 13a), yielding

∂_t ε + ū ∂_x ε + ε ∂_x ū + ε ∂_x ε − E[ε ∂_x ε] = κ ∂²_x ε.   (13b)

Hence, the dynamics of the mean flow and of the fluctuations are described by the coupled system (Eq. 13a, 13b).
Note that the term −E[ε ∂_x ε] is the offset of the mean state due to the fluctuations. The offset term does not affect the statistical properties of the perturbations ε, while it is crucial to the dynamics of ū. From the commutativity of the ensemble mean with the spatial derivative,

E[ε ∂_x ε] = (1/2) ∂_x E[ε²] = (1/2) ∂_x V.

If the magnitude of the perturbation ε is small, Eq. (13a, 13b) can be simplified into the fluctuation-mean flow dynamics

∂_t ū + ū ∂_x ū + (1/2) ∂_x V = κ ∂²_x ū,   (14a)
∂_t ε + ū ∂_x ε + ε ∂_x ū = κ ∂²_x ε,   (14b)

where the product ε ∂_x ε − E[ε ∂_x ε] has been discarded in Eq. (14b), while the fluctuation-mean flow interaction term (1/2) ∂_x V = E[ε ∂_x ε] has been kept in Eq. (14a). Note that the tangent-linear dynamics corresponds to Eq. (14a, 14b) but with the offset term (1/2) ∂_x V discarded. Moreover, as pointed out in Ménard (1994), Eq. (14a) is the exact ensemble-mean equation for the Burgers dynamics, while Eq. (14b) is an approximation of the fluctuation dynamics. As a consequence, if the variance field is the true one, then the mean predicted by Eq. (14a) is the true ensemble mean (Ménard, 1994, Sect. 5.5.2).
The aim is now to determine the dynamics of the two-point error covariance function, E[ε_x ε_y], which corresponds, after spatial discretization, to the time evolution of the covariance matrix P in data assimilation. Following the splitting strategy developed in Pannekoucke et al. (2016), the evolution of the perturbation ε is decomposed by considering the effect of each process. The splitting strategy is a theoretical method to deduce the so-called infinitesimal generator of an evolution equation, taking advantage of the Lie-Trotter formula to separate each process (or appropriate arrangements of the processes). This strategy should not be confused with numerical time-splitting, which introduces numerical errors (Sportisse, 2007). Here, as seen in Eq. (14b), four processes influence the error statistics: (i) a production term due to the transport of the mean flow by the perturbation, −ε ∂_x ū; (ii) the transport of the perturbation by the mean flow, ū ∂_x ε; (iii) a diffusion term, κ ∂²_x ε; and (iv) an offset term, (1/2) ∂_x V, due to the averaged nonlinear self-interaction of the perturbation, E[ε ∂_x ε].
Since the offset (iv) modifies the mean but not the higher statistical moments of ε, and without loss of generality, only the first three elementary processes are needed for the description of the covariance dynamics:

∂_t ε = −ε ∂_x ū,   (15a)
∂_t ε = −ū ∂_x ε,   (15b)
∂_t ε = κ ∂²_x ε.   (15c)

The effect of each process in Eq. (15a, 15b, 15c) on the dynamics (Eq. 11a, 11b) of the variance and the local diffusion tensor is now described.
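The Lie-Trotter formula behind the splitting can be illustrated numerically: for two non-commuting generators A and B, the split propagator e^{δt A} e^{δt B} matches the full propagator e^{δt(A+B)} up to a local error of order δt². A small sketch with arbitrary random matrices (purely illustrative, unrelated to the Burgers operators):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))             # two arbitrary, non-commuting generators
B = rng.standard_normal((4, 4))

def split_error(dt):
    # difference between the Lie-Trotter split step and the exact propagator
    return np.linalg.norm(expm(dt * A) @ expm(dt * B) - expm(dt * (A + B)))

e1, e2 = split_error(1e-2), split_error(5e-3)
# halving dt divides the local error by about four: O(dt**2) behaviour
```

This second-order local error is what makes the splitting a valid tool for identifying the generator of each elementary process.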

Separate contribution of elementary processes
The contribution of the production term (Eq. 15a) is first examined, then the transport (Eq. 15b), and finally the diffusion (Eq. 15c).

Contribution of the production term
The production term describes the amplification of the error due to the gradient of the mean field ū. This process can be viewed as a diagonal operator in the function space where the random field ε lies. As a consequence, this error dynamics affects the variance but not the metric tensor. This leads to the following parameter dynamics:

∂_t V = −2 V ∂_x ū,   (16a)
∂_t ν = 0.   (16b)

Contribution of the transport term
The time evolution of the variance and diffusion fields due to the transport term (Eq. 15b) is now tackled. Since the derivation is archetypal of how to proceed, the calculation is detailed.
The dynamics of the error variance field, deduced from Eq. (11a), yields

∂_t V = 2 E[ε ∂_t ε] = −2 E[ε ū ∂_x ε].   (17)

From the commutation of the ensemble average and the partial derivative, this simplifies to

∂_t V = −ū ∂_x V.   (18)

Since V = σ², the dynamics of the standard deviation is given by

∂_t σ = −ū ∂_x σ.   (19)

The dynamics of the metric tensor is deduced from Eq. (11b). With the normalized error ε̃ = ε/σ and the dynamics of the standard deviation (Eq. 19), the normalized error is simply transported, ∂_t ε̃ = −ū ∂_x ε̃, and the dynamics of the metric reads

∂_t g = 2 E[∂_x ε̃ ∂_x(−ū ∂_x ε̃)] = −2 g ∂_x ū − 2 ū E[∂_x ε̃ ∂²_x ε̃].

From the identity ∂_x E[(∂_x ε̃)²] = 2 E[∂_x ε̃ ∂²_x ε̃] and from g = E[(∂_x ε̃)²], we obtain

∂_t g + ū ∂_x g = −2 g ∂_x ū.   (20)

Hence, the variance and the local diffusion ν = 1/(2g) evolve following

∂_t V + ū ∂_x V = 0,   (21a)
∂_t ν + ū ∂_x ν = 2 ν ∂_x ū.   (21b)

These equations represent the transport of the variance and of the diffusion by the mean flow: the variance is conserved along the flow, while the diffusion tensor is warped by the mean flow.

Contribution of the diffusion term
Following the same procedure, the dynamics of V and g (Eq. 11a, 11b) is given for the diffusion process (Eq. 15c) by

∂_t V = κ ∂²_x V − κ (∂_x V)²/(2V) − 2 κ V g,   (22a)
∂_t g = κ [ ∂²_x g + 2 (∂_x σ/σ) ∂_x g + 4 g ∂²_x σ/σ − 4 g (∂_x σ/σ)² + 2 g² ] − 2 κ E[(∂²_x ε̃)²].   (22b)

As expected when dealing with Reynolds equations, a closure problem appears, since the term E[(∂²_x ε̃)²] cannot be deduced from either V or g. Hence, a parameterization is needed to proceed.
To proceed further, we take advantage of the link between the unknown quantity E[(∂²_x ε̃)²] and the fourth-order term K_x of the Taylor expansion of the error correlation function (see Appendix A):

ρ(x, x + δx) = 1 − (1/2) g_x δx² + S_x δx³ + K_x δx⁴ + o(δx⁴),   (23)

where

g_x = E[(∂_x ε̃_x)²],   (24a)
S_x = −(1/4) ∂_x g_x,   (24b)
K_x = (1/24) ( E[(∂²_x ε̃_x)²] − 2 ∂²_x g_x ).   (24c)

The quantities S_x and K_x are hereafter called the skewness and the kurtosis of the correlation function ρ(x, ·). Note that, due to the symmetry of the two-point correlation function, ρ(x, y) = ρ(y, x), the skewness S_x is entirely determined by the metric field g. As a result, a choice of the kurtosis implies a choice of the closure.
Two particular cases are interesting to discuss: the case in which the random field is statistically homogeneous and, moreover, the case in which the correlation function is a Gaussian function. When the error random field is homogeneous, the error correlation function is homogeneous too: ρ(x, y) = ρ(x + δ, y + δ), ∀δ ∈ R. As a result, the fields of metric, skewness, and kurtosis are constant fields, denoted by g_h, S_h, and K_h. Due to the homogeneity of the metric field g_h, the skewness (Eq. 24b) is zero, and the kurtosis (Eq. 24c) reduces to K_h = (1/24) E[(∂²_x ε̃)²]. In the case where the homogeneous correlation is the Gaussian ρ_G(x, y) = exp(−(x − y)²/(2 L_G²)), where L_G stands for the homogeneous error correlation length scale, the Taylor expansion yields g_G = 1/L_G², S_G = 0, and

K_G = (1/8) g_G².   (25)

We propose to use these results to formulate a closure model: for a general smooth error random field of metric field g_x, the kurtosis K_x (Eq. 24c) is approximated by

K_x ≈ (1/8) g_x² − (1/12) ∂²_x g_x,   (26)

where the first term of the right-hand side is the kurtosis of the equivalent local homogeneous Gaussian correlation function; with Eq. (24c), this amounts to the approximation E[(∂²_x ε̃)²] ≈ 3 g_x². This closure is hereafter called the locally homogeneous Gaussian closure, or simply the Gaussian closure.
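The Gaussian reference values can be verified symbolically. The sketch below expands ρ(δx) = exp(−δx²/(2L²)) and reads off the Taylor coefficients, assuming the metric is minus twice the second-order coefficient and the kurtosis is the fourth-order coefficient:

```python
import sympy as sp

dx, L = sp.symbols('delta_x L', positive=True)
rho = sp.exp(-dx**2 / (2 * L**2))           # homogeneous Gaussian correlation
series = sp.series(rho, dx, 0, 5).removeO() # Taylor expansion up to order dx**4

c2 = series.coeff(dx, 2)                    # -> -1/(2*L**2), i.e. -g/2 with g = 1/L**2
c3 = series.coeff(dx, 3)                    # -> 0, the skewness vanishes
c4 = series.coeff(dx, 4)                    # -> 1/(8*L**4), i.e. g**2/8
```

The fourth-order coefficient g²/8 is the Gaussian kurtosis used as the leading term of the closure.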
With this Gaussian closure (Eq. 26), the dynamics of the diffusion process (Eq. 22a, 22b) becomes

∂_t V = κ ∂²_x V − κ (∂_x V)²/(2V) − 2 κ V g,   (27a)
∂_t g = κ [ ∂²_x g + 2 (∂_x σ/σ) ∂_x g + 4 g ∂²_x σ/σ − 4 g (∂_x σ/σ)² − 4 g² ].   (27b)

In the one-dimensional case, the dynamics of the local diffusion tensor is deduced from the dynamics of the metric from ∂_t ν = −2ν² ∂_t g, with g = 1/(2ν). Thus, the variance and local diffusion tensor evolutions are equivalently expressed as

∂_t V = κ ∂²_x V − κ (∂_x V)²/(2V) − κ V/ν,   (28a)
∂_t ν = κ [ 2 + ∂²_x ν − 2 (∂_x ν)²/ν + 2 (∂_x σ/σ) ∂_x ν − 4 ν ∂²_x σ/σ + 4 ν (∂_x σ/σ)² ].   (28b)

Contrary to the production (Eq. 16a, 16b) and the transport (Eq. 21a, 21b) processes, the diffusion process (Eq. 28a, 28b) creates a nonlinear coupling between the variance and the local diffusion field.
The parametric covariance dynamics for the Burgers equation is now expressed collecting all these results.

Parametric covariance dynamics of the Burgers equation
From Eq. (16a, 16b), Eq. (21a, 21b), and Eq. (28a, 28b), the complete parametric covariance dynamics for the Burgers equation under the Gaussian closure is given by the coupled system

∂_t ū + ū ∂_x ū = κ ∂²_x ū − (1/2) ∂_x V,   (29a)
∂_t V + ū ∂_x V = −2 V ∂_x ū + κ ∂²_x V − κ (∂_x V)²/(2V) − κ V/ν,   (29b)
∂_t ν + ū ∂_x ν = 2 ν ∂_x ū + κ [ 2 + ∂²_x ν − 2 (∂_x ν)²/ν + 2 (∂_x σ/σ) ∂_x ν − 4 ν ∂²_x σ/σ + 4 ν (∂_x σ/σ)² ].   (29c)

Equations (29a, 29b, 29c) exhibit a nonlinear coupling between the variance (Eq. 29b) and the local diffusion tensor (Eq. 29c), which illustrates the intricacy of the action of the diffusion process on the error dynamics. Moreover, Eq. (29a, 29b, 29c) differs from its tangent-linear equivalent by the term −(1/2) ∂_x V in Eq. (29a). A numerical experiment is now proposed to illustrate and assess these theoretical results.

Numerical experiment
A numerical experiment is proposed to illustrate the ability of the PKF forecast to reproduce the statistical evolution of the errors in the diffusive Burgers model.The numerical setting is first introduced, followed by an evaluation of the kurtosis closure.Then, the PKF is assessed using a large ensemble of nonlinear forecasts (6400 members).A sensitivity test on the different terms in the PKF concludes the section.

Numerical setting
For the numerical validation, a front-like situation is considered on a periodic domain of length D = 1000 km, discretized with N = 241 grid points. The initial reference state, shown in Fig. 1, is the velocity field u^0(x) = U_max [1 + cos(2π(x − D/4)/D)]/2, with U_max = 20 km h⁻¹. In the nonlinear forecast of Eq. (12) starting from u^0, the maximum, initially at 250 km, develops into a front structure at 750 km after T = 24 h of forecast. The simulations considered here are integrated from the initial time t = 0 to the final time t = T. The Burgers equation (Eq. 12) has been numerically integrated using a classical finite-difference spatial scheme and a fourth-order Runge-Kutta time scheme, with δt = 0.002 T and κ = 0.0025 D²/T.
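The reference integration is simple to reproduce. The following sketch implements the setting with the values quoted above (centred finite differences, RK4, δt = 0.002 T, κ = 0.0025 D²/T); it is an illustration under the stated assumptions, not the authors' code:

```python
import numpy as np

D, N, T = 1000.0, 241, 24.0                  # km, grid points, hours
Umax = 20.0                                  # km/h
kappa = 0.0025 * D**2 / T
dt = 0.002 * T
dx = D / N
x = np.arange(N) * dx

def trend(u):
    # centred finite differences on the periodic domain
    dudx = (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)
    d2udx2 = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    return -u * dudx + kappa * d2udx2        # Burgers trend (Eq. 12)

def rk4_step(u, dt):
    k1 = trend(u)
    k2 = trend(u + 0.5 * dt * k1)
    k3 = trend(u + 0.5 * dt * k2)
    k4 = trend(u + dt * k3)
    return u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

u = Umax * (1 + np.cos(2 * np.pi * (x - D / 4) / D)) / 2
for _ in range(int(T / dt)):                 # 500 steps up to t = T
    u = rk4_step(u, dt)
```

At t = T the steepest (negative) gradient of u sits near x = 750 km, reproducing the front location described in the text.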
The random perturbation at the initial time, ε^0, is set as a homogeneous random field with Gaussian distribution. Following Gaspari and Cohn (1999), the homogeneous correlation function is set, in accordance with the geometry of the circle, as the Gaussian correlation

ρ(x, y) = exp(−d_c(x, y)²/(2 L_G²)),   (30)

where d_c(x, y) denotes the chordal distance between the two geographical positions x and y, and L_G is the correlation length scale.
Since the length scale L_G is much smaller than the perimeter D, the Gaussian correlation (Eq. 30) computed with the arc-length distance d(x, y) = |x − y| is numerically very close to the one computed with the chordal distance (while, at the theoretical level, the arc-length Gaussian is not strictly a correlation function on the circle; see Gaspari and Cohn, 1999) and leads to the same numerical results. The covariance function is then defined as

P^0(x, y) = (σ^0)² ρ(x, y),   (31)

where σ^0 is the constant standard-deviation field. Thereafter, four magnitudes of standard deviation σ^0 are considered: σ^0_1% = 0.01 U_max, σ^0_10% = 0.1 U_max, σ^0_20% = 0.2 U_max, and σ^0_50% = 0.5 U_max. The time evolution of the true error covariance functions is computed from a large ensemble of N_e nonlinear forecasts of Eq. (12), with N_e = 6400. From the non-parametric convergence, the expected sampling error should thus represent about 1/√N_e = 1.25 % of the real statistics. In order to limit the differences due to sampling noise when comparing the results, a single large ensemble of normalized errors (ε̃_k), k = 1, ..., N_e, has been generated as ε̃_k = C^{1/2} ζ_k, where C^{1/2} is the square root of the correlation matrix deduced from the correlations (Eq. 30), and ζ_k is a sample of a Gaussian random noise of zero mean and identity covariance matrix. The ensemble of initial perturbations is then generated as ε_k = σ^0 ε̃_k, with σ^0 ∈ {σ^0_1%, σ^0_10%, σ^0_20%, σ^0_50%}. Since the parametric covariance dynamics (Eq. 29a, 29b, 29c) has been theoretically derived for small perturbations, it has to be compared with the statistics from the ensemble of small-magnitude noise. Hence, the validation is later conducted by considering the ensemble generated from the initial standard deviation σ^0_1%. The limits of predictability of the parametric covariance dynamics (Eq. 29a, 29b, 29c) are also addressed by considering the ensembles of larger initial uncertainty, from σ^0_10% to σ^0_50%.
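The quoted 1.25 % sampling error can be illustrated for the simplest statistic, the ensemble mean of a zero-mean, unit-variance quantity, whose estimation error has standard deviation 1/√N_e. A small sketch (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
Ne = 6400
trials = rng.standard_normal((1000, Ne))    # 1000 independent Ne-member ensembles
m_hat = trials.mean(axis=1)                 # ensemble-mean estimate in each trial
spread = m_hat.std()                        # empirical error, expected 1/sqrt(Ne)
```

With N_e = 6400, the empirical spread is close to 1/80 = 0.0125, i.e. 1.25 % of a unit-magnitude statistic.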
Figure 2 illustrates the time evolution of the four perturbed initial conditions whose perturbations are generated from the normalized perturbation ε̃_1, scaled with the initial standard deviations from σ^0_1% in panel (a) to σ^0_50% in panel (d). These ensembles are first used to tackle the closure of the kurtosis, as discussed now.

Evaluation of the kurtosis closure
The aim of this section is to compare the kurtosis diagnosed from the true error covariance (Eq. 24c) with the kurtosis resulting from the Gaussian closure (Eq. 26). This validation is a crucial step, since the quality of the closure affects the skill of the parametric covariance dynamics (Eq. 29a, 29b, 29c). Even though the closure is likely to be wrong for an arbitrary covariance matrix, it is expected to apply to most statistics encountered in applications. The large ensemble, whose initial perturbations are sampled using σ^0 = σ^0_10%, is considered for the validation. The results are equivalent for the other error magnitudes. For this experiment, the error covariance matrices at times t = 0 and t = T are considered representative of intermediate covariance matrices; the computation of the true kurtosis and its closure is achieved by considering the ensemble at both times. The computation of the closure (Eq. 26) relies on the local metric tensor, which has to be diagnosed from the ensemble.
The local metric and kurtosis can be computed from the ensemble by considering Eq. (24a) and (24c). It is also possible to compute these quantities from a direct estimation of the local correlation function expansion (Eq. 23), with the benefit of validating the theoretical derivations made in Eq. (24b) for the skewness and Eq. (24c) for the kurtosis. This motivates the estimation of these quantities from local polynomial expansions, which are computed as follows.

For each position x, the fourth-order polynomial approximation of the correlation function ρ(x, ·) is estimated as a Lagrange interpolating polynomial, from which the coefficients of the expansion (Eq. 23) are deduced. Figure 3 illustrates the results computed from the ensemble at t = 0 (panel a) and t = T (panel b): the length scale L_x = 1/√g_x (top panels) and the kurtosis (bottom panels) are shown by continuous lines, normalized by the initial homogeneous Gaussian values L_G and K_G. The kurtosis closure (Eq. 26), computed from the metric, is shown by the dashed line (bottom panels).
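A polynomial fit of the local correlation expansion can be sketched as follows; here an even quartic 1 − g δx²/2 + K δx⁴ is fitted by least squares over a few neighbouring points (a simple stand-in for the Lagrange interpolation used in the text), and checked on an exact homogeneous Gaussian correlation rather than on noisy ensemble estimates:

```python
import numpy as np

n, dx, Lg = 200, 1.0, 8.0
x0 = n // 2

d = np.arange(n) - x0
d = np.minimum(np.abs(d), n - np.abs(d)) * dx
rho = np.exp(-d**2 / (2 * Lg**2))            # exact correlation row rho(x0, .)

offs = np.arange(-3, 4)                      # 7 neighbouring grid points
A = np.stack([np.ones_like(offs, float),
              (offs * dx) ** 2,
              (offs * dx) ** 4], axis=1)     # even quartic design matrix
coef, *_ = np.linalg.lstsq(A, rho[x0 + offs], rcond=None)
g_hat = -2.0 * coef[1]                       # metric from the dx**2 coefficient
K_hat = coef[2]                              # kurtosis: the dx**4 coefficient
```

The fit recovers the metric g = 1/L_G² and a kurtosis close to the Gaussian value g²/8, the small residual coming from the truncation of the Taylor expansion over the fitting stencil.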
At t = 0, the length-scale (kurtosis) field is homogeneously equal to the initial value L_G (K_G). The small fluctuations visible at this time are due to the sampling noise. At t = T, the length scale is larger than at the initial time and presents an area of small values in the vicinity of the front position x = 0.75 D. The kurtosis is negligible, except near the front. Note that all the previous results are similar for the smaller initial uncertainty magnitude σ^0_1%, with a relative error of 5.2 % (14.5 %) at time 0 (T) (not shown here). For the larger error magnitudes σ^0_20% and σ^0_50%, the relative error at time T is 40 % and 63 %, respectively. Hence, for this numerical simulation, the Gaussian closure proposed for the kurtosis appears relevant to approximate the real features of the correlation shape. This is now used to explore the ability of the PKF to accurately predict the error statistics.

Parametric vs. ensemble statistics
The parametric setting is based on the time integration of the nonlinear coupled system (Eq. 29a, 29b, 29c), considering a numerical scheme equivalent to the one solving the Burgers equation (Eq. 12), i.e. finite differences and RK4, with the same time step as detailed in Sect. 4.1. The numerical cost is of the order of a single time integration of the nonlinear Burgers equation. In this one-dimensional case, only two scalar fields are propagated: the variance V and the local diffusion field ν.
The mean, the error variance, and the length-scale fields are shown in Figs. 4, 5, and 6, respectively, for a range of initial errors. These figures compare the diagnosis from the ensemble of nonlinear forecasts of Eq. (12) with the statistics predicted by the parametric model (Eq. 29a, 29b, 29c). The means diagnosed from the ensemble and predicted by the parametric model are considered first.

Comparison of the means
In order to appreciate the differences between the ensemble and the parametric means, the discussion is focused on the results at the final time T. When the initial error magnitude is small (Fig. 4a), corresponding to the tangent-linear regime, the ensemble mean (continuous line) and the mean state predicted by the parametric model (dashed line) coincide with the reference state u(x, T) (grey solid line, reproduced from Fig. 1). This is within the tangent-linear validity regime, in which the small magnitude of the fluctuation has no impact on the ensemble mean, which is then equal to the reference trajectory. For larger error magnitudes, the ensemble mean is expected to deviate from the reference trajectory due to the nonlinear interaction between the fluctuation and the mean. In the Burgers equation, the deviation is due to the term −(1/2) ∂_x V in Eq. (29a), which implies here that the ensemble mean decreases as if the diffusion increased with the error magnitude (panels b, c, d). The mean predicted by the parametric model is very close to the ensemble mean for the moderate error magnitudes σ^0_10% and σ^0_20% (panels b, c), but presents an anomalous distortion at the inflection point for the larger error magnitude (panel d). Hence, for the particular case of the Burgers dynamics, the parametric prediction of the mean is an accurate approximation of the ensemble mean, even for mild error magnitudes.

Comparison of the variance and length-scale statistics
The variance (Fig. 5) and length-scale (Fig. 6) statistics are now discussed. For the small error magnitude, as seen in panels (a), the uncertainty spreads at the initial time due to the physical diffusion, resulting in a strong dampening of the variance. This is accompanied by a global increase of the length scale, except in the vicinity of the inflection point located near x = 0.5 D (see Fig. 1). Then, as time goes on, the dynamical front generates a source of uncertainty, whereby a beam of variance appears and increases with time, reaching a maximum of 10.0 times the initial variance at t = T. The length scale remains short close to the front position, except for a peak emerging from time t = 0.6 T, evolving with the flow at the inflection point. Compared with the ensemble statistics, the PKF is able to capture all the details of the dynamics. This strongly supports Eq. (29a, 29b, 29c) as well as the underlying assumptions: the derivation of the tangent-linear dynamics for the error variance and length-scale fields, and the Gaussian closure for the kurtosis. In particular, the area of large length-scale values visible at the inflection point of the front is a real signal and not a numerical artefact of the diagnosis, since it is produced in both simulations.
The case of the error magnitude of σ10%, in which the tangent linear approximation should no longer be valid, is now considered (see panels b). Key features previously described are still present: the emergence of a beam of uncertainty and the increase of the length scale except in the vicinity of the front. However, two differences appear compared with the ensemble statistics reference. Firstly, the magnitude of the uncertainty is lower than in the tangent linear case; the maximum of the variance beam at t = T is now close to 6.0 times the initial variance. Secondly, the local large length-scale value depicted in the tangent linear setting is nearly flat at the bottom of the small length-scale basin associated with the front. The main features of the PKF predictions are recovered: the variance beam has a lower magnitude than in the tangent linear case, and there is still a low length-scale area near the front.
Beyond the variance attenuation, however, the PKF maximum at t = T of 7.8 times the initial variance is much greater than the ensemble statistics result, with a relative error of 29 %. Moreover, the length-scale field displays a peak at the front, similar to the one described for the tangent linear case.
In order to assess the role of the nonlinear term, ε∂xε, in the error dynamics (Eq. 13b), an evaluation with an ensemble has been performed. The results are displayed in Fig. 7, which shows, at t = T, the error variance and length-scale fields estimated with Eq. (29a, 29b, 29c) (dashed line), compared with the fields diagnosed from a large ensemble (continuous line, also shown in Figs. 5b and 6b). It also shows the statistics computed from an ensemble of forecasts of the tangent linear dynamics (Eq. 14a, 14b) (small dashed line) (finite differences and RK4), and from the Reynolds equations (Eq. 13a, 13b) (dashed-dotted line) (also finite differences and RK4). It appears that the statistics computed from the tangent linear dynamics are equivalent to the error variance and length-scale fields predicted by the parametric model, while the statistics from the Reynolds equations equal those deduced from the ensemble of nonlinear forecasts of Eq. (12). Hence, the difference is well explained by the contribution of the nonlinear term ε∂xε.
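The tangent-linear versus Reynolds comparison can be sketched as follows, with finite differences in space and an RK4 time scheme as in the text. The mean state is frozen here, and all settings (grid, diffusion coefficient, amplitudes) are illustrative simplifications of the actual experiment; the only difference between the two integrations is the ε∂xε term of Eq. (13b).

```python
import numpy as np

nx = 256
x = np.linspace(0.0, 1.0, nx, endpoint=False)
dx = x[1] - x[0]
kappa = 0.005                                  # hypothetical diffusion coefficient

Dx = lambda v: (np.roll(v, -1) - np.roll(v, 1)) / (2 * dx)
Dxx = lambda v: (np.roll(v, -1) - 2 * v + np.roll(v, 1)) / dx ** 2

ubar = np.sin(2 * np.pi * x)                   # frozen mean state (simplification)

def tendency(eps, nonlinear):
    t = -ubar * Dx(eps) - eps * Dx(ubar) + kappa * Dxx(eps)
    if nonlinear:                              # the eps * d(eps)/dx term
        t = t - eps * Dx(eps)
    return t

def rk4(eps, dt, nsteps, nonlinear):
    """Classical RK4 integration of the perturbation dynamics."""
    for _ in range(nsteps):
        f = lambda e: tendency(e, nonlinear)
        k1 = f(eps); k2 = f(eps + 0.5 * dt * k1)
        k3 = f(eps + 0.5 * dt * k2); k4 = f(eps + dt * k3)
        eps = eps + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return eps
```

For a small initial perturbation the two integrations are indistinguishable, while for a large one the nonlinear term produces a visible departure, which is the mechanism diagnosed in Fig. 7.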
The cases of the larger initial error magnitudes of σ20% and σ50% show results similar to the σ10% case: the small length-scale area is captured by the PKF but with a spurious oscillation not present in the ensemble estimation (Fig. 6c, d), and the position of the beam of uncertainty is well predicted by the PKF but with a larger magnitude than the ensemble estimation (Fig. 5c, d). Since the magnitude of the variance predicted by the PKF seems to increase faster than the ensemble estimation, it is interesting to investigate what happens over a longer time window.

Long-term behaviour
The increase of the PKF variance prediction might be a side effect of the tangent-linear-like derivation of the PKF, which could fail to predict the saturation of the error magnitude. In order to examine the long-term behaviour, a comparison is conducted over a longer time window of [0, 5T]. Since the location of the uncertainty beam is well predicted by the PKF, the comparison focuses on the magnitude of the maximum of the variance field. The time series of the variance maximum predicted by the PKF and estimated from the ensemble is shown in Fig. 8. The time evolution is equivalent for all the initial error magnitudes: after a short transition (where the variance decreases), two phases are seen, where the variance increases (phase 1), then saturates and decreases in the long term (phase 2). The time at which the maximum of variance is reached shifts with the magnitude of the initial error: it occurs after (before) the time T for σ1% (σ50%). We associate the increase with the advection contribution, which includes the source term −2(∂xu)V in the variance dynamics (Eq. 29b), while the damping is related to the diffusion. Through the competition between the advection and the diffusion, the increase of the variance first saturates and then reverses. The PKF reproduces the two phases with a magnitude prediction close to, but different from, the ensemble estimation. For the small initial error magnitude σ1%, the PKF underestimates the variance in the long term, while the variance is overestimated for the larger initial error magnitudes. From numerical investigations with smaller error magnitudes at the initial time, the PKF prediction of phase 2 appears more difficult than that of phase 1. This could be related to the choice of closure made for the kurtosis: while the locally homogeneous Gaussian closure is in accordance with the one diagnosed from the ensemble, a heterogeneous closure might improve the results. Beyond these deficiencies, it is worth emphasizing that the theoretical derivation of the parametric model, which is partly based on the tangent linear assumption, is able to capture the main part of the uncertainty dynamics in the Burgers equation.
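The growth mechanism of phase 1 can be isolated in a reduced model that keeps only the advection and production terms quoted above, dV/dt = −u∂xV − 2(∂xu)V; the diffusive coupling terms of Eq. (29b) are deliberately omitted (they are what produces the saturation of phase 2), and the frozen velocity profile is illustrative. Where ∂xu < 0 (compression) the variance grows, and where ∂xu > 0 it is damped.

```python
import numpy as np

nx = 256
x = np.linspace(0.0, 1.0, nx, endpoint=False)
dx = x[1] - x[0]
u = np.sin(2 * np.pi * x)         # illustrative frozen velocity field
Dx = lambda v: (np.roll(v, -1) - np.roll(v, 1)) / (2 * dx)
dudx = Dx(u)

def rhs(V):
    # advection of the variance plus the production term -2 (du/dx) V
    return -u * Dx(V) - 2.0 * dudx * V

V = np.ones(nx)                    # homogeneous initial variance
dt, nsteps = 5e-4, 400             # integrate up to t = 0.2
for _ in range(nsteps):            # classical RK4 scheme
    k1 = rhs(V); k2 = rhs(V + 0.5 * dt * k1)
    k3 = rhs(V + 0.5 * dt * k2); k4 = rhs(V + dt * k3)
    V = V + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```

At the stagnation point x = 0.5, where u = 0 and ∂xu = −2π, the reduced model gives the exponential growth V(t) = e^{4πt}, a caricature of the variance beam; without the omitted diffusion terms this growth never saturates.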

Discussion
From these results, we can conclude that the PKF forecast, as implemented by Eq. (29a, 29b, 29c), reproduces the tangent linear evolution of the statistics as given by the covariance forecast (Eq. 3). Put differently, the PKF forecast reproduces the tangent linear covariance dynamics occurring in the extended Kalman filter. Since the PKF forecast model is deduced from a small-error expansion, it is not meant to recover strongly nonlinear effects, which has been verified in the numerical experiments. However, even when the nonlinearity is stronger and the tangent linear assumption is invalid, the solution of the PKF still shares some features with the empirical ensemble statistics. This may no longer be the case for long-time integrations of more complex geophysical dynamics. Nevertheless, this suggests that some of the statistics could be predicted, at least for medium-range forecasts.

Conclusions
This study focused on the forecast step of the parametric Kalman filter (PKF) applied to the nonlinear dynamics of the diffusive Burgers equation. The parametric approach consists in approximating the error covariance matrix by a covariance model with evolving parameter fields. Here, the covariance model considered is based on the diffusion equation, parameterized by the error variance and local diffusion fields. Hence, the forecast of the error covariance matrix, which is computationally very demanding in real applications with high-dimensional systems, amounts to the forecast of the error variance and local diffusion fields, whose numerical cost is of the order of a single nonlinear forecast. In comparison, ensemble methods need dozens of members for the covariance forecast (which could be parallelized, though), as well as localization to address the rank deficiency. The derivation of the PKF dynamics was first rigorously deduced from the dynamics of the perturbation under a small error magnitude assumption. However, a closure problem appears due to the physical diffusion process. This closure issue has been related to the fourth-order term in the Taylor expansion of the correlation function, the kurtosis, and a closure has been proposed based on a homogeneous Gaussian approximation for the kurtosis.
Numerical experiments in which the true covariance evolution was diagnosed from an ensemble forecast were performed. First, a comparison with the PKF prediction showed the relevance of the closure, even for large error magnitudes. Moreover, these experiments demonstrated the ability of the parametric formulation to reproduce the main features of the error dynamics when the tangent linear approximation is valid. When the tangent linear dynamics is no longer valid, the PKF can only reproduce a part of the error statistics evolution, at least for mid-term forecasts.
This contribution is a step toward the PKF formulation of more complex dynamics in geophysics. From the present study, we learned the difficulty of handling the higher-order derivatives, since the coupling between the error variance and diffusion fields arises from the physical diffusion. A Gaussian closure, similar to the one introduced in the treatment of the kurtosis, will be useful in providing prognostic dynamics. But we expect that the main difficulties will be encountered in the forecast of the multivariate statistics that govern the balance between geophysical fields. Theoretically, the PKF formulation enables the forecast of covariance matrices in high dimensions. Hence, it might offer new theoretical tools to approximate and to investigate important aspects of the dynamics of errors, such as the unstable subspace of chaotic dynamics. These points will be investigated in further developments.

Figure 7. Verification of the parametric variance (a) and correlation length scale (b) at t = T and for an initial perturbation standard deviation σ0 of 10 %. The statistics from the ensemble of nonlinear forecasts of Eq. (12) are shown by the black line, the statistics predicted by the parametric model are shown by the blue dashed line, and the statistics from the Reynolds equations (Eq. 13a, 13b) (from the fluctuation-mean flow dynamics shown in Eq. 14a and 14b, respectively) with (without) the second-order term ε∂xε are shown by the small red dashed-dotted line (yellow dashed-dotted line).
vicinity of the front position, where the field is oscillating. For both times, it appears that the kurtosis closure is able to reproduce the main behaviour of the true kurtosis, with a low relative error ||K − K_GC||/||K|| of 5.2 % (21.5 %) at time 0 (T), where ||·|| stands for the L2 norm. At time T, the maximum difference between the two normalized kurtoses is 0.05.
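The verification metric, and one reading of the homogeneous Gaussian closure, can be sketched as follows. The closure value used here, K_GC = 3g² with the metric g = 1/L², is the fourth derivative at zero separation of a Gaussian correlation of length scale L; this identification is our illustrative assumption, and the numbers below are arbitrary.

```python
import numpy as np

def relative_l2_error(K, K_gc):
    """||K - K_gc|| / ||K||, the verification metric quoted in the text."""
    return np.linalg.norm(K - K_gc) / np.linalg.norm(K)

# For a Gaussian correlation rho(d) = exp(-d^2 / (2 L^2)), the fourth
# derivative at zero separation is rho''''(0) = 3 / L^4, i.e. 3 g^2
# with g = 1 / L^2 (checked here with a 5-point finite-difference stencil).
L, h = 0.1, 0.01
rho = lambda d: np.exp(-d ** 2 / (2 * L ** 2))
d4 = (rho(2 * h) - 4 * rho(h) + 6 * rho(0.0) - 4 * rho(-h) + rho(-2 * h)) / h ** 4
```

The finite-difference estimate d4 reproduces 3/L⁴ to within a fraction of a percent, which is the kind of agreement the closure assumes locally at each grid point.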