A variational formulation for translation and assimilation of coherent structures

The assimilation of observations from remotely sensed images in geophysical models requires algorithms that account for the existence of coherent structures. In the context of variational data assimilation, a method is proposed to allow the background to be translated so as to fit structure positions deduced from images. Translation occurs as a first step before assimilating all the observations using a classical assimilation procedure with specific covariances for the translated background. A simple validation is proposed using a dynamical system based on the one-dimensional complex Ginzburg–Landau equation in a regime prone to phase and amplitude errors. Assimilation of observations after background translation leads to better scores and a better representation of extrema than the method without translation.


Introduction
Numerical prediction of geophysical flows (meteorology, oceanography, hydrology, etc.) requires an analysis procedure. Its purpose is to obtain an optimal initial state at a given instant from which the forecast is computed. Such an analysis is generally provided by correcting a previous forecast (the background) with observations: this is the assimilation step. Most data assimilation algorithms rely on the best linear unbiased estimation (BLUE, Talagrand, 1997), which is a statistical estimator that requires prior knowledge of the bias and variance of the errors affecting the input data. The BLUE achieves Bayesian estimation if the distribution of errors, taken as a whole, is Gaussian. The BLUE is the basis for powerful algorithms, such as sequential Kalman (1960) filtering and variational assimilation (Le Dimet and Talagrand, 1986; Talagrand and Courtier, 1987).
Conventional observations and satellite radiances defined as point values are the essential input data for such analysis. Observation errors are usually presumed to be uncorrelated. As a consequence, the assimilated pixels from a satellite image are undersampled so as to guarantee that their errors are sufficiently uncorrelated. Still, geophysical flows often contain coherent structures that remotely sensed images may exhibit, for example tropical cyclones or midlatitude storms in the atmosphere (Plu et al., 2008). Consistent with this statement, Hoffman et al. (1995) proposed to separate meteorological forecast errors into displacement error, amplitude error and residual error. However, the position, size and shape of structures cannot be assimilated correctly by algorithms derived from the BLUE (Titaud et al., 2010; Michel, 2011). A reason, among others, for this deficiency is that error distributions in flows with finite-amplitude coherent structures diverge from Gaussianity (Beezley and Mandel, 2008).
Some studies have investigated possible means of assimilating features from satellite images. Bogussing is such a technique: pseudo-observations deduced from an observed coherent structure are assimilated as conventional observations (wind, humidity, etc.). The assumptions that link the image to the pseudo-observation are often simplified (Heming et al., 1995; Michel, 2011; Montroty et al., 2008), and such methods are not fully objective. Michel (2011) pointed out the severe limitations of bogussing.
More refined methods have been proposed to assimilate features from coherent structures, mostly in the context of ensemble Kalman filters. Chen and Snyder (2007) proposed to assimilate directly the position of a tropical cyclone. It gives good results as long as the position error in the background remains small. Ravela et al. (2007) developed a two-step method in which the background is first aligned to an observed image, before treating amplitude errors using a conventional ensemble Kalman filter. Beezley and Mandel (2008) inserted a morphing analysis step between sequential Kalman filter steps, in order to fit a model state to observed images.
Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.
These promising studies emphasize the need to develop assimilation methods that properly treat the errors associated with coherent structures. In theory, the most satisfactory approach would be to relax the Gaussian assumption and to develop fully Bayesian estimators. However, assimilation techniques must remain simple in order to deal effectively with numerical models that have a very large number of degrees of freedom. A reasonable approach is to try to adapt the existing assimilation procedures (Kalman filter and variational) to take displacement errors into account. The present article proposes and tests a method to translate the background in the context of variational assimilation.
In the first section, the concept of background translation is formulated and a method of resolution is proposed. In the second section, a validation is provided for a one-dimensional numerical system prone to phase errors. Possible extension to a realistic context is then discussed before the conclusion.

Notations and general formulas for data assimilation
The model state vector is denoted $x = (x_i)_{i=1\ldots N}$, and a set of $p$ observations $y^o = (y^o_j)_{j=1\ldots p}$ is known. If background errors and observation errors are supposed to be uncorrelated, and if they are known up to second-order statistical moments, the optimal analysis $x^a$ may be obtained from the background $x^b$ as
$$x^a = x^b + B H^T (H B H^T + R)^{-1} (y^o - H x^b), \qquad (1)$$
where $H$ is the observation operator and $B$ and $R$ are, respectively, the background error and observation error covariance matrices. $B H^T (H B H^T + R)^{-1} = K = (k_{i,j})$ is the gain matrix. This is the equation used in the analysis step of sequential Kalman filtering. The variational form of Eq. (1) consists in minimizing the quadratic cost function
$$J(x) = J_b(x) + J_o(x) = \tfrac{1}{2} (x - x^b)^T B^{-1} (x - x^b) + \tfrac{1}{2} (y^o - H x)^T R^{-1} (y^o - H x). \qquad (2)$$
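The equivalence between the gain-matrix form (Eq. 1) and the minimization of the quadratic cost function (Eq. 2) can be checked numerically. The following is a minimal sketch, not the code used in this study; the function names and the small test matrices in the usage note are illustrative assumptions.

```python
import numpy as np

def blue_analysis(xb, yo, H, B, R):
    """Gain-matrix form: xa = xb + K (yo - H xb), K = B H^T (H B H^T + R)^-1."""
    S = H @ B @ H.T + R
    # S and B are symmetric, so K^T = S^-1 (H B); solve instead of inverting S.
    K = np.linalg.solve(S, H @ B).T
    return xb + K @ (yo - H @ xb)

def variational_analysis(xb, yo, H, B, R):
    """Minimizer of the quadratic cost J(x) = Jb(x) + Jo(x).

    Setting the gradient of J to zero gives the linear system
    (B^-1 + H^T R^-1 H) x = B^-1 xb + H^T R^-1 yo.
    """
    Binv = np.linalg.inv(B)
    Rinv = np.linalg.inv(R)
    A = Binv + H.T @ Rinv @ H
    rhs = Binv @ xb + H.T @ Rinv @ yo
    return np.linalg.solve(A, rhs)
```

For a small problem (for instance five grid points, three of them observed), both functions return the same analysis up to round-off, which is the property that allows step 2 of the method to be written either as a minimization or as a direct formula.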

A variational approach for translating structures in the background
The preceding equations will be adapted here to allow coherent structures in the background field to be translated onto the corresponding observed structures. Let us define a local translation as the translation of a finite-length segment. What is sought is the sum of local translations that would make coherent structures in the background fit the observed structures. A first hypothesis is to build a transformed background $x^{bt}$ by applying a surjective function to indices (Eq. 3), where $t$ is an integer function. The function $t$ is not given a priori. It is a supplementary degree of freedom that must be estimated by the method. A local translation corresponds to a constant value for $t$ over a segment and 0 elsewhere. In the general context of geophysical background fields, such a translation is not satisfactory since it may generate a discontinuous transformed background field $x^{bt}$. The function $t$ should therefore be a compromise between the following constraints:
- the position of every coherent structure in the transformed background $x^{bt}$ should match the position of a coherent structure in the observed image,
- $t$ should be the sum of local translations,
- the transformed background $x^{bt}$ should be smooth.
It is assumed that these constraints may be accounted for by minimizing a cost function $J_t$. The extension of variational assimilation (Eq. 2) to allow background translation may thus be resolved by minimizing the cost function
$$J(x, t) = \tfrac{1}{2} (x - x^{bt})^T B_t^{-1} (x - x^{bt}) + \tfrac{1}{2} (y^o - H x)^T R^{-1} (y^o - H x) + J_t(t), \qquad (4)$$
where $B_t$ would depend on the translation (Ravela et al., 2007). The background is allowed to be translated in the $J_b$ term, which is expected to make sense wherever there is inconsistency between a structure position in the observations and in the background. Although several definitions could be possible, the $J_t(t)$ term is proposed as in Eq. (5), where $w_1$ and $w_2$ are positive parameters. The $w_1$ term of Eq. (5) has desirable properties, including:
- the cost of a local translation increases as the translation value increases, thus giving a higher probability to the lower values of translations and avoiding unlimited translation values,
- it is differentiable.
The $w_2$ term reaches low values for local translations. Moreover, it gives a high cost to an irregular function $t$, thus smoothing the transformed background $x^{bt}$. As a consequence, Eq. (5) provides a cost function that accounts for the desirable constraints on the function $t$.
Although Eq. (4) seems to be a reasonable formula to account for structure position errors in the background, its numerical resolution is far from obvious. $J$ is not a quadratic function of $(x, t)$: the existence of a unique minimum is not guaranteed. However, if the transformation $t$ is fixed, $J$ is a quadratic function of $x$. A two-step method is thus proposed to get an optimal $(x^a, t^a)$: $J$ is minimized first along $t$ so as to get a transformation $t^a$, and then $x^a$ is obtained by minimizing $J$ for this given field $t^a$.
Step 1. An optimal $t^a$ is sought as a minimizer of
$$F(t) = J_o(x^{bt}) + J_t(t). \qquad (6)$$
The purpose of this step is to obtain a satisfactory function $t$ as a compromise between fitting the transformed background $x^{bt}$ to the observed image $y^o$, limiting the amplitude of translations and leading to a smooth transformed background $x^{bt}$. The observations taken into account in step 1 may be restricted to the images towards which the coherent structures of the background are to be corrected.
M. Plu: A variational formulation for translation and assimilation of coherent structures
Since $F$ is a discrete function with positive values, it admits a global minimum value for one or several transformation vectors $t$. Minimization could simply be achieved by spanning all the possible values of $t$, which would be a perfect but inefficient method. Another approach is to find a local minimum using a minimization solver, like the one from Gilbert and Lemaréchal (1989). For this purpose, $t$ is allowed to have real values. It is also necessary that $F$ (Eq. 6) is continuous and differentiable, which is guaranteed if the term $J_o(x^{bt})$ is computed using cubic interpolations. If $F$ admits several local minima, the minimization solver may find different solutions, depending on the first-guess vector $t$ that initiates the solver. To reach a reasonable solution, and in accordance with the purpose of background translation, the first guess $t$ used for minimization is defined automatically. It is a zero vector except for the points where $H x^b$ reaches the position of the centre of a coherent structure: at these points the transformation connects to the centre of the nearest coherent structure in the observation. After minimization, the values $t(i)$ are rounded to integer values. This method is general enough to be applied to many geophysical contexts, provided that coherent structures can be identified in modelled and observed fields. Figure 1 shows an idealized result of step 1, defining the centre of the coherent structure as a local maximum. Translation applies, so that the transformed background fits the observation well. The method seems to be satisfactory and, in particular, the transformed background is reasonably smooth. In such an idealized context, the solution should be a uniform translation for which the cost function given in Eq. (6) is null. The non-perfect result in Fig. 1 reveals some minor defects in the method of resolution, due to the non-convexity of Eq. (6) and the fact that the result of minimization depends on the input vector.
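The automatic first guess used to initiate the solver in step 1 can be sketched as follows. This is an illustrative sketch only: it assumes that $H$ is the identity, that structure centres are strict local maxima, and that $t(i)$ is counted as the observed position minus the background position. The sign convention and the helper names `local_maxima` and `first_guess_translation` are assumptions, not taken from the article.

```python
import numpy as np

def local_maxima(field):
    """Indices of strict local maxima (interior points only)."""
    f = np.asarray(field, dtype=float)
    is_max = (f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])
    return np.flatnonzero(is_max) + 1

def first_guess_translation(background, observation):
    """Zero vector, except at background structure centres, where the value
    points to the nearest structure centre found in the observation."""
    t0 = np.zeros(len(background))
    obs_centres = local_maxima(observation)
    if obs_centres.size == 0:
        return t0  # nothing to match: no translation is suggested
    for i in local_maxima(background):
        nearest = obs_centres[np.argmin(np.abs(obs_centres - i))]
        t0[i] = nearest - i  # assumed sign convention (see lead-in)
    return t0
```

With a single bump centred at grid point 20 in the background and at grid point 25 in the observation, the first guess is zero everywhere except at index 20, where it equals 5. The minimization solver would then be responsible for spreading this value into a smooth translation field.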
The results should depend on the parameters $w_1$ and $w_2$, and it is important to define and calibrate them properly. The parameter $\sigma_t$ is introduced, which is a typical value for acceptable translation of the background. This value may be defined by the user or derived from background statistics (see the implementation below). Since the first term of Eq. (5) should constrain the amplitude of local translations, the parameter $w_1$ should be linked to $\sigma_t$. Consider a single local translation whose value is $\sigma_t$ over a segment of a given size, and such that, at the boundaries of this segment, $t$ decreases one by one towards 0. The cost function of such a local translation is $w_1$ times twice the sum of the squared integers from 0 to $\sigma_t$, which equals $w_1 \sigma_t (\sigma_t + 1)(2\sigma_t + 1)/3$. For practical purposes this expression is approximated as $2 w_1 \sigma_t^3 / 3$. To calibrate properly the different terms of $F$ (Eq. 6), this approximate cost is set to 1, which gives $w_1 = \tfrac{3}{2} \sigma_t^{-3}$. Then $w_2$ is adjusted so as to obtain reasonably smooth solutions.
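The calibration of $w_1$ can be verified with a few lines of arithmetic. The sketch below evaluates the exact cost of the two boundary ramps and the calibrated weight; the function names are illustrative assumptions.

```python
def ramp_cost_exact(sigma_t, w1=1.0):
    """w1 times twice the sum of the squared integers from 0 to sigma_t."""
    return w1 * 2 * sum(k * k for k in range(sigma_t + 1))

def calibrate_w1(sigma_t):
    """Weight for which the approximate cost 2 * w1 * sigma_t**3 / 3 equals 1."""
    return 1.5 / sigma_t ** 3
```

For $\sigma_t = 10$, the exact sum is $10 \times 11 \times 21 / 3 = 770$, the cubic approximation is about 667, and the calibrated weight is $w_1 = 1.5 \times 10^{-3}$, the value used in the validation experiments.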
www.nonlin-processes-geophys.net/20/793/2013/ Nonlin. Processes Geophys., 20, 793-801, 2013
Step 2. Given the vector $t^a$ from step 1, the analysis $x^a$ is then obtained by minimizing $J(x, t^a)$ along $x$. According to Eq. (4), $B_t$ is the error covariance matrix of the transformed background. $B_t$ may be static or it may depend on the transformation computed in step 1. How to compute a $t$-dependent $B_t$ is not obvious (Ravela et al., 2007) and poses numerical issues. It is thus assumed that $B_t = B'$ is static, like $B$. If the error is supposed to be the sum of displacement and amplitude errors (Hoffman et al., 1995), $B'$ should represent the covariance of amplitude errors. In other words, $B'$ may be obtained after eliminating the part of the error due to displacement errors of coherent structures. A possible method to compute the $B'$ matrix would be similar to the one for $B$. The most common method uses an ensemble of coupled forecasts valid at the same time (Parrish and Derber, 1992; Pereira and Berre, 2006), starting from different initial conditions. $B$ is built from the covariance of the differences between the coupled forecasts. To compute $B'$, one of the coupled forecasts is translated towards the other one using the algorithm of step 1, thus attempting to remove displacement errors of coherent structures. The same algorithm as in step 1 should be used to identify the position of coherent structures in forecasts. $B'$ is built from the covariance of the differences between the translated forecast and the other one.
Assuming that $B_t = B'$ is static and denoting by $K'$ the associated gain matrix, step 2 is also equivalent to the direct formula
$$x^a = x^{bt} + K' (y^o - H x^{bt}), \quad \text{with } t = t^a. \qquad (7)$$
Like the one proposed by Ravela et al. (2007), this two-step method attempts to first fit the background to observations and then to apply a classical assimilation procedure. The equations here apply to a variational context and the method of resolution seems to be simpler than that of Ravela et al. (2007). The cost function $J$ does not necessarily admit a unique minimum and the two-step procedure is not proven to lead to a local minimum of $J$. However, this two-step procedure leads to a unique solution, and arguments have been provided that it should not be far from an optimal one. In order to be more confident about the method, a validation is now proposed.

Validation on a one-dimensional system
The method will be applied to a one-dimensional dynamical system, in order to prove the concept of background translation and to reveal some possible limitations. Such a one-dimensional system does not provide images, which are two-dimensional by definition. However, the extreme values of the wave packets that evolve in the one-dimensional system may play a role similar to that of the coherent structures seen in satellite images. The main reason for restricting to one dimension is that the assimilation procedure (spatial correlations) is highly simplified and cost-effective.

Dynamical system
The one-dimensional system is the complex Ginzburg-Landau equation. For some relevant parameters, this weakly nonlinear system simulates coherent structures and is sensitive to initial conditions. The evolution of the complex function $u(z, t)$ on a periodic segment is given by
$$\frac{\partial u}{\partial t} = u + (1 + i\alpha) \frac{\partial^2 u}{\partial z^2} - (1 + i\beta) |u|^2 u. \qquad (8)$$
The horizontal coordinate $z$ and time $t$ are expressed in m and s, respectively. The stability and the chaotic properties of the system depend on the parameters $\alpha$ and $\beta$, which are chosen as $\alpha = 2$ and $\beta = -1.5$ in all the following experiments. Such a regime is absolutely unstable (Weber et al., 1992), with a Lyapunov exponent of $1.6 \times 10^{-2}$ s$^{-1}$, equivalent to a doubling time of small perturbations of around 45 s. Numerical integrations confirm the sensitivity to initial conditions of the phase and amplitude of the travelling wave packets. Such a model provides errors of position and amplitude of coherent structures that will be suitable for testing various assimilation algorithms. In such model fields, the centres of coherent structures are simply identified as the local maxima and the local minima.
The equation is integrated over an $N = 512$-point periodic segment of length $L = 100$ m. Numerical integration relies on exponential time differencing of second order applied to the Fourier coefficients of $u$. Let us define the nonlinear term of Eq. (8) as the function $G(z, t) = (1 + i\beta) |u(z, t)|^2 u(z, t)$, and $\tilde{u}_k(t)$ (resp. $\tilde{G}_k(t)$) the Fourier coefficients of $u(z, t)$ (resp. $G(z, t)$) at instant $t$. The time-integration scheme for each index $k$ is given by Eq. (9), where $q_k = 1 - (1 + i\alpha)(2\pi k / L)^2$, and the time step is $\Delta t = 0.05$ s.
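A spectral integration of this kind can be sketched in a few lines. The code below is a hedged illustration and not the author's implementation: it uses the standard second-order exponential time differencing update (the form commonly attributed to Cox and Matthews), which may differ in detail from Eq. (9), and it assumes that the nonlinear term enters the right-hand side with a minus sign. The function name `integrate_cgl` and its defaults are assumptions.

```python
import numpy as np

def integrate_cgl(u0, alpha=2.0, beta=-1.5, L=100.0, dt=0.05, n_steps=2000):
    """Integrate du/dt = u + (1 + i alpha) u_zz - (1 + i beta) |u|^2 u
    on a periodic segment of length L with an ETD2 scheme in Fourier space."""
    N = u0.size
    k = np.fft.fftfreq(N, d=1.0 / N)              # integer wavenumbers
    q = 1.0 - (1.0 + 1j * alpha) * (2.0 * np.pi * k / L) ** 2
    E = np.exp(q * dt)                            # exact linear propagator
    a = (E - 1.0) / q                             # first-order ETD weight
    b = (E - 1.0 - q * dt) / (q ** 2 * dt)        # second-order correction weight

    def forcing_hat(u):
        # Fourier coefficients of -G, with G = (1 + i beta) |u|^2 u
        return np.fft.fft(-(1.0 + 1j * beta) * np.abs(u) ** 2 * u)

    u_hat = np.fft.fft(u0)
    f_prev = forcing_hat(u0)   # makes the first step a first-order ETD step
    for _ in range(n_steps):
        f_now = forcing_hat(np.fft.ifft(u_hat))
        u_hat = E * u_hat + a * f_now + b * (f_now - f_prev)
        f_prev = f_now
    return np.fft.ifft(u_hat)
```

Since $q_k$ never vanishes for $\alpha \neq 0$, the divisions by `q` are safe. A quick sanity check is to start from a spatially uniform state of small amplitude: only the $k = 0$ mode is excited and the modulus saturates close to 1, as expected from the amplitude equation $d|u|/dt = |u| - |u|^3$.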

Error statistics
Although experiments rely on a highly idealized model, the general context and assumptions for data assimilation resemble those of realistic prediction systems. Thus the time between two assimilation instants is fixed. For the chosen dynamical system, it is taken as 100 s, which corresponds to a sufficient time for error to grow.
The background error covariances are obtained from a sample of 10 000 independent forecasts at lead time 100 s. For each initial state, a coupled initial state is obtained after addition of small-amplitude white noise, and a coupled forecast after simulation of this initial state. For each forecast, a translated coupled forecast is obtained after translation of the coupled forecast towards the unperturbed forecast, using the translation approach described in step 1. The background error covariances are supposed to be homogeneous over the domain. The $B$ matrix is computed from the covariance of the difference between the 10 000 coupled forecasts and the original ones. The $B'$ matrix is computed from the covariance of the difference between the 10 000 translated coupled forecasts and the original ones. The local variance changes significantly after translation: while it is 0.070 for $B$, it decreases to 0.043 for $B'$. The variance of the translated background error is much lower than the original variance of background error, reflecting the reduction, if not removal, of displacement error.
Depending on the assimilation experiments, background error covariance matrices are supposed to be diagonal or not. It has been verified that statistics do not depend much on the amplitude of the initial white noise. In addition, the typical value for translation $\sigma_t$ is deduced from the distribution of the difference in position between the closest local maxima in the coupled forecasts (Fig. 2). Its standard deviation yields the uniform parameter $\sigma_t = 10$ (in grid points) used in the following assimilation experiments.
The observation error covariance matrix $R$ is supposed to be diagonal (no spatial correlation) and uniform.
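The estimation of a homogeneous covariance from pairs of forecasts can be sketched as below. This is an illustration under explicit assumptions: the fields are real-valued arrays of shape (number of pairs, N) on a periodic domain, the covariance is taken directly from the differences without any rescaling factor, and the function names are hypothetical. Passing translated coupled forecasts instead of the coupled forecasts would give the analogue of $B'$.

```python
import numpy as np

def homogeneous_covariance(forecasts, coupled):
    """Covariance of forecast differences as a function of lag.

    Homogeneity and periodicity are assumed, so the covariance depends only
    on the separation between two grid points; it is averaged over all pairs
    of forecasts and all positions.
    """
    diff = np.asarray(forecasts, dtype=float) - np.asarray(coupled, dtype=float)
    diff = diff - diff.mean()                       # remove the overall mean
    N = diff.shape[1]
    power = np.abs(np.fft.fft(diff, axis=1)) ** 2   # periodogram of each sample
    # The inverse transform of the mean periodogram is the circular autocovariance.
    return np.fft.ifft(power.mean(axis=0)).real / N

def covariance_matrix(cov_lag):
    """Full (circulant) covariance matrix built from the lag covariance."""
    N = cov_lag.size
    lag = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
    return cov_lag[lag]
```

The zero-lag value is the local variance quoted in the text, and a diagonal covariance matrix is obtained by keeping only that value on the diagonal.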

Validation
A nature run is computed using the model configuration described previously. 1001 nature state vectors are thus obtained, one every 100 s. At each instant,
- an observation state vector is computed as the nature state perturbed by some random small-amplitude noise; this observation is assumed to be both the image towards which the background may be translated (step 1) and the observation vector (with or without undersampling) that will be assimilated (step 2),
- a model background is obtained as the 100 s model integration from the previous instant, starting from the nature state vector perturbed by some random small-amplitude noise.
The nature run serves as a reference from which error values are computed. It follows from this very simple system that the observation operator $H$ is an identity matrix. Assimilation of observations in the background for the 1000 instants is done using different methods. The experiments with background translation use the step-1 algorithm described in Sect. 2.2, with $w_1 = 1.5 \times 10^{-3}$ (corresponding to $\sigma_t = 10$) and $w_2 = 4$ (for sufficient smoothness, deduced from tests like the one in Fig. 1). Eight experiments (Table 1) are compared by measuring a score, defined as the r.m.s. of analysis error (difference between the assimilated and the nature state vectors over the 1000 cases). Since translations do not apply at the boundaries (Eq. 3), only the points at 32 gridpoints inside the segment are used for score computation. The ability of the experiments to reproduce the coherent structures (extrema) of the nature run (Fig. 4) is also compared. The $B$ matrix may be diagonal, or spatial correlations may be taken into account (non-diagonal). Another option is whether the whole observation state vector is assimilated or whether it is undersampled (1 point out of 5) in step 2, in order to mimic the classical undersampling of satellite images. Figure 3 shows the distribution of background error before and after translation, deduced from the 1000-instant sample. The translation step significantly reduces the standard deviation (0.17 instead of 0.26), which is consistent with the reduction in displacement error discussed in Sect. 3.2. The error distribution is not initially Gaussian, with skewness −0.045 and kurtosis 4.21 (a Gaussian distribution has skewness 0 and kurtosis 3). After translation, skewness is −0.16 and kurtosis is 6.3, which means that the Gaussianity of the background error distribution after translation is slightly degraded. Although it was hoped that translation would improve the Gaussianity of background error in order to better apply the BLUE, this result does not prevent one from testing assimilation.
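The diagnostics used in this validation (r.m.s. score restricted to interior points, skewness and kurtosis of an error sample) can be sketched as follows. This is an illustrative sketch: the reading of the boundary rule as "discard 32 gridpoints at each end of the segment" is an assumption, as are the function names.

```python
import numpy as np

def rms_score(analysis, nature, margin=32):
    """r.m.s. of analysis error, ignoring `margin` points at each boundary.

    Both inputs may be single state vectors or arrays of shape (cases, N).
    """
    err = np.asarray(analysis)[..., margin:-margin] - np.asarray(nature)[..., margin:-margin]
    return float(np.sqrt(np.mean(err ** 2)))

def skewness_kurtosis(errors):
    """Sample skewness and (non-excess) kurtosis; a Gaussian gives 0 and 3."""
    e = np.asarray(errors, dtype=float).ravel()
    c = e - e.mean()
    s = c.std()
    return float(np.mean(c ** 3) / s ** 3), float(np.mean(c ** 4) / s ** 4)
```

The kurtosis returned here is the non-excess one, matching the convention of the text, in which a Gaussian distribution has kurtosis 3.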
An attempt has also been made to iterate several times the two-step method, using the ouput (x a , t a ) as the input of a second processing of steps 1 and 2. The results over the 1000 test cases was that translation was rarely modified after another step 1 and, if it changed, it was only by a few grid points. No significant improvement may thus be expected from further iterations of the two-step method. Table 1   tion has better scores than the one without translation, and it better represents the extremas. The example of translated background (Fig. 4) confirms the results shown in Fig. 1: the extremas in the translated background fit well the ones from 400 the observation, while preserving smoothness. The analysis after background translation seems to be less smooth than without, but the amplitude of the irregularities remain small. The experiments with and without background translation will now be compared in detail. The experiments exp1 and 405 exp3 are designed with the most complete assumptions: horizontal correlations are full (B and B ′ are non-diagonal) and every observed points are assimilated. Still, the score (Tab. 1) is improved in exp3 (with background translation) compared to exp1 (without background translation) and most of the ex-410 tremas and slopes (upper panels of Fig. 4) are better represented using background translation. The experiments exp2, exp4, exp5 and exp7 are closer to classical geophysical models, either the observations are undersampled before assimilation or the background error covariance matrices are di-415 agonal. Tab. 1 shows that each experiment with background translation (exp4 and exp7 resp.) perform better than the corresponding one without (exp2 and exp5 resp.). In particular, the extremas in exp7 (Fig. 4, bottom-right panel) are far better reproduced than in exp5 (Fig. 4, bottom-left panel), and 420 nearly as good as in exp3 (Fig. 4, upper-right panel). 
The scores (r.m.s. of analysis error) reported in Table 1 remain between the r.m.s. of observation error (0.034) and the r.m.s. of background error (0.26), which means that the assimilation procedure is suboptimal, probably due to nonlinearities of the Ginzburg–Landau system. If two equivalent experiments are compared, one with and one without translation, the one with translation has better scores than the one without, and it better represents the extrema. The example of translated background (Fig. 4) confirms the results shown in Fig. 1: the extrema in the translated background fit well with those of the observation, while preserving smoothness. The analysis after background translation seems to be less smooth than without, but the amplitude of the irregularities remains small.
The experiments with and without background translation will now be compared in detail. The experiments exp1 and exp3 are designed with the most complete assumptions: horizontal correlations are full (B and B′ are non-diagonal) and all observed points are assimilated. Still, the score (Table 1) is improved in exp3 (with background translation) compared to exp1 (without background translation), and most of the extrema and slopes (upper panels of Fig. 4) are better represented using background translation. The experiments exp2, exp4, exp5 and exp7 are closer to classical geophysical models: either the observations are undersampled before assimilation, or the background error covariance matrices are diagonal. Table 1 shows that each experiment with background translation (exp4 and exp7, respectively) performs better than the corresponding one without (exp2 and exp5, respectively). In particular, the extrema in exp7 (Fig. 4, bottom-right panel) are far better reproduced than in exp5 (Fig. 4, bottom-left panel), and nearly as well as in exp3 (Fig. 4, upper-right panel).
Background translation without horizontal correlations performs nearly as well as a classical assimilation with background translation. To some extent, this result suggests that background translation corrects position errors, and that amplitude errors may then be corrected without spatial correlations.
Figure 5 shows another example, for which the local translations to be sought are positive at some points (peak around abscissa 160) and negative at some other points (peak around abscissa 250). In such a case, the transformed background (left panels of Fig. 5, dashed lines) correctly fits the peaks of the background to those of the observations. The resulting assimilation procedure leads to improved results compared to assimilation without background translation (Fig. 5).
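As a rough illustration of why translation matters most when B is diagonal and observations are undersampled, the following Python sketch applies a pointwise BLUE update to a toy Gaussian peak affected by a pure position error. It is not the Ginzburg–Landau experiment of Table 1: the peak shape, the uniform shift, the grid size, the one-in-four observation sampling and the helper names `blue_diagonal` and `rms` are all assumptions made for the example, and the translation is taken as known rather than obtained by minimizing Eq. (6). Only the error standard deviations (0.26 and 0.034) are borrowed from the text.

```python
import numpy as np

def blue_diagonal(xb, y, obs_idx, sigma_b, sigma_o):
    """Pointwise BLUE update with diagonal B and R (hypothetical helper).

    Only the observed grid points are modified; the others keep the background.
    """
    gain = sigma_b**2 / (sigma_b**2 + sigma_o**2)
    xa = xb.copy()
    xa[obs_idx] += gain * (y - xb[obs_idx])
    return xa

def rms(v):
    """Root mean square of a vector."""
    return float(np.sqrt(np.mean(v**2)))

rng = np.random.default_rng(0)
grid = np.arange(200)
truth = np.exp(-0.5 * ((grid - 100) / 8.0) ** 2)   # toy "nature" state (assumed shape)
shift = 6                                          # assumed uniform position error
background = np.roll(truth, shift)                 # background with a pure phase error

obs_idx = grid[::4]                                # undersampled observations (assumption)
sigma_b, sigma_o = 0.26, 0.034                     # values quoted in the text
y = truth[obs_idx] + sigma_o * rng.standard_normal(obs_idx.size)

# Analysis without translation, then with the (here perfectly known) translation.
xa_plain = blue_diagonal(background, y, obs_idx, sigma_b, sigma_o)
xa_trans = blue_diagonal(np.roll(background, -shift), y, obs_idx, sigma_b, sigma_o)

rms_plain = rms(xa_plain - truth)
rms_trans = rms(xa_trans - truth)
print(rms_plain, rms_trans)
```

With a diagonal B, grid points without observations keep their background value, so a misplaced peak survives the analysis unless the background is translated first.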

Discussion on the possible implementation in a meteorological model
The method that has been described is sufficiently general to be applied in many geophysical models. The purpose of this section is to present and discuss how the method could be applied to a specific variational meteorological assimilation scheme. It is assumed that an algorithm is available to detect coherent structures in images and in numerical model outputs (Plu et al., 2008; Michel, 2011). Coherent structures in meteorology may be observed in two-dimensional fields such as satellite or radar images. Meteorological models, however, are three-dimensional. It is thus sufficient to let translation be two-dimensional: translated background points are not allowed to cross vertical levels. The translation fields are computed by minimizing Eq. (6).

Fig. 4. Examples of assimilation for six relevant experiments described in Table 1. The only difference between the panels in the same row is that translation is applied on the right and not on the left. The nature state (solid black curve) and the background (solid light grey curve) are the same in each panel. The translated background is the dashed curve when relevant (right panels). The results of assimilation are plotted in solid dark grey.

Nonlin. Processes Geophys., 20, 793-801, 2013 www.nonlin-processes-geophys.net/20/793/2013/
M. Plu: A variational formulation for translation and assimilation of coherent structures
The main issue is how to compute the typical position error σ_t of background coherent structures (which leads to the parameter w_1) and the background covariance matrix B′. A common formulation of background error covariance is that of Derber and Bouttier (1999), in which cross-parameter correlations are expressed from vorticity. Covariances are computed from an ensemble of coupled forecasts valid at the same instant. For every forecast, two-dimensional coherent structures at every model level are identified in the relative vorticity using the above-mentioned algorithm. The typical position error σ_t of coherent structures at every level would be obtained from the distribution of the differences in position of the detected structures in the coupled forecasts. For each coupled forecast, its coherent structures in the vorticity field would be translated towards the corresponding coherent structures in the other forecast. At each level, the resulting translations would be applied to every field in the same forecast, and the resulting covariances between the translated forecast and the other one would lead to the B′ matrix. It is expected that this method would change horizontal and vertical correlations, but would preserve cross-parameter correlations.
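The estimation procedure outlined above can be sketched in Python for a one-dimensional field with a single structure per forecast. This is a simplified sketch under assumptions that are not in the text: `estimate_sigma_t` and `estimate_b_prime` are hypothetical names, the translation is a uniform integer shift applied with `np.roll` (periodic domain), σ_t is taken as the standard deviation of the position differences, and no scaling factor is applied to the sample covariance. The input data are synthetic.

```python
import numpy as np

def estimate_sigma_t(positions_a, positions_b):
    """Spread of the position differences between matched structures (grid points)."""
    diff = np.asarray(positions_a, dtype=float) - np.asarray(positions_b, dtype=float)
    return float(np.std(diff))

def estimate_b_prime(forecasts_a, forecasts_b, shifts):
    """Sample covariance of (translated forecast a - forecast b).

    Each row is one pair of coupled forecasts; shifts[i] is the displacement
    (in grid points) that moves the structure of forecast a onto that of b.
    """
    diffs = np.stack([np.roll(fa, s) - fb
                      for fa, fb, s in zip(forecasts_a, forecasts_b, shifts)])
    return np.cov(diffs, rowvar=False)   # rows are samples, columns are grid points

# Synthetic structure positions in two coupled forecasts (illustrative values).
positions_a = [10, 22, 35]
positions_b = [12, 20, 36]
sigma_t = estimate_sigma_t(positions_a, positions_b)

# Synthetic coupled forecasts: a is a shifted, slightly perturbed copy of b.
rng = np.random.default_rng(1)
forecasts_b = rng.standard_normal((20, 16))
shifts = rng.integers(-2, 3, size=20)
forecasts_a = np.stack([np.roll(fb, -s) for fb, s in zip(forecasts_b, shifts)])
forecasts_a = forecasts_a + 0.1 * rng.standard_normal(forecasts_a.shape)

b_prime = estimate_b_prime(forecasts_a, forecasts_b, shifts)
print(sigma_t, b_prime.shape)
```

In a three-dimensional model, the same shift would be estimated at each level from the vorticity field and then applied to every other field of the same forecast, as described in the text.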
Using these statistics and the algorithm for identification of coherent structures, step 1 and step 2 would apply to any background. One of the advantages of background translation is that the selection of observations to be assimilated would be improved. Since such a procedure relies on keeping the observations that do not depart too much from the background, good observations would have a better chance if they are compared to the translated background.
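The effect on observation selection can be illustrated with a simple background check, in which an observation is kept when its departure from the background stays below a multiple of the expected departure standard deviation. The sketch below is a toy example: the function name `background_check`, the threshold factor `k = 3`, the peak shape and the known uniform shift are assumptions made for illustration, not the selection procedure of any specific operational system.

```python
import numpy as np

def background_check(y, xb_at_obs, sigma_b, sigma_o, k=3.0):
    """Return a boolean mask of observations kept by a background check.

    An observation is kept if |y - xb| <= k * sqrt(sigma_b^2 + sigma_o^2).
    The factor k is an assumed tuning parameter.
    """
    threshold = k * np.sqrt(sigma_b**2 + sigma_o**2)
    return np.abs(y - xb_at_obs) <= threshold

rng = np.random.default_rng(0)
grid = np.arange(200)
truth = 2.0 * np.exp(-0.5 * ((grid - 100) / 3.0) ** 2)   # sharp toy structure
shift = 8                                                # assumed position error
background = np.roll(truth, shift)

sigma_b, sigma_o = 0.26, 0.034                           # values quoted in the text
y = truth + sigma_o * rng.standard_normal(grid.size)     # good observations everywhere

kept_plain = background_check(y, background, sigma_b, sigma_o)
kept_trans = background_check(y, np.roll(background, -shift), sigma_b, sigma_o)
print(kept_plain.sum(), kept_trans.sum())
```

In this toy case the observations near the displaced peak fail the check against the untranslated background, although they are accurate, and pass it once the background has been translated.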

Conclusions
A method to translate and assimilate coherent structures in the context of variational data assimilation has been introduced. Application to the more general problem of fusion of geophysical data is also promising in the case of a data source that is prone to phase error. A simple and robust two-step algorithm is provided to compute the translation field and the analysis. Validation of the method is provided in a one-dimensional system, but extension of the method to three-dimensional geophysical fields and even to a four-dimensional context (4D-Var) is possible. Application to a model state vector in the wavelet domain may be highly valuable, since wavelet space is compatible with the representation of coherent structures (Plu et al., 2008). Moreover, some studies have shown the advantage of formulating data assimilation in a wavelet space (Deckmyn and Berre, 2005).
Adapting the algorithm to realistic operational variational systems would require further work, but the benefit is expected to be high in flows where coherent structures (vortices, convective cells, etc.) exist and may be observed. A strength of the method is that its additional cost depends on the number of translations. If translation is applied only to a small part of the domain (for instance, where a phasing error is obvious and could generate rapid error growth), its application could be operationally acceptable. Testing the method on a well-identified coherent structure in an operational meteorological model would be the next stage.