The ability of a data assimilation system to deal effectively with nonlinearities arising from the prognostic model or the relationship between the control variables and the available observations has received a lot of attention in theoretical studies based on very simplified test models. Less work has been done to quantify the importance of nonlinearities in operational, state-of-the-art global data assimilation systems. In this paper we analyse the nonlinear effects present in ECMWF 4D-Var and evaluate the ability of the incremental formulation to solve the nonlinear assimilation problem in a realistic NWP environment. We find that nonlinearities have increased over the years due to a combination of increased model resolution and the ever-growing importance of observations that are nonlinearly related to the state. Incremental 4D-Var is well suited for dealing with these nonlinear effects, but at the cost of increasing the number of outer loop relinearisations. We then discuss strategies for accommodating the increasing number of sequential outer loops in the tight schedules of operational global NWP.

The importance of nonlinear effects has been recognised since the early days of the development of 4D-Var (e.g. Gauthier, 1992; Rabier and Courtier, 1992; Miller et al., 1994; Pires et al., 1996). The presence of nonlinearities in either the model or the observations can potentially cause significant deviations from the usual Gaussian distribution assumed to describe observation and background errors in the definition of the 4D-Var cost function. This in turn translates into a more complex topology of the cost function and the potential for multiple minima (e.g. Pires et al., 1996; Hoteit, 2008). In these conditions, finding the global minimum of the 4D-Var cost function for realistic numerical weather prediction (NWP) applications becomes computationally unaffordable and, even if it were possible, the interpretation and usefulness of the result in the case of multi-modal error distributions become unclear in a deterministic analysis context (Lorenc and Payne, 2007).

In order to make the variational problem computationally tractable and mathematically well posed, simplifications are required. One idea would be to reduce the dimensionality of the control vector used in the minimisation, for example limiting it to the subspace where dynamical instabilities develop during the data assimilation cycle (Trevisan and Uboldi, 2004; Carrassi et al., 2008; Trevisan et al., 2010). Another approach starts from recognising that the use of a linear model and linear observation operators leads to strictly quadratic cost functions, which brings two major benefits: (a) it guarantees the convergence of the minimisation algorithm to the global minimum and (b) it allows the use of efficient, gradient-based iterative minimisation algorithms (Fisher, 1998). This consideration has spurred research in NWP applications of variational methods towards perturbative solution algorithms, where the full nonlinear minimisation problem is approximated as a series of quadratic cost functions obtained by repeated linearisations around progressively more accurate guess values of the solution. This idea, based on the general Gauss–Newton method for the solution of nonlinear least squares problems (Björck, 1996), was first introduced in the meteorological literature by Courtier, Thépaut and Hollingsworth (1994) (CTH in the following) as “Incremental 4D-Var”. In that paper the main stated objective of incremental 4D-Var was the reduction of the computational costs of full 4D-Var in order to make it feasible for operational application. Its ability to deal with weak nonlinearities was also noted and subsequently investigated in simplified models, particularly in relation to the length of the assimilation window and the global convergence properties of the algorithm (e.g. Tanguay et al., 1995; Laroche and Gauthier, 1998).

After the operational implementation of incremental 4D-Var at ECMWF (Rabier et al., 2000) and, later, at other major global NWP centres (Kadowaki, 2005; Rosmond and Xu, 2006; Gauthier et al., 2007; Rawlins et al., 2007), it became possible to address, in realistic NWP settings, open questions about the limits of applicability of 4D-Var in nonlinear situations. A series of studies (Andersson et al., 2005; Radnóti et al., 2005; Trémolet, 2004, 2007) conducted with the ECMWF Integrated Forecasting System (IFS) provided answers to some of these questions in the context of the ECMWF operational system of the time. These studies emphasised the importance of the consistency between the nonlinear and linearised evolution of the analysis increments during the assimilation window for the global convergence of incremental 4D-Var. This, in turn, was shown to require accurate linearised models, and inner and outer loops whose spatial resolutions and time steps do not differ too much (a ratio of three between the outer and inner loop resolutions was found to give satisfactory results). As there is no guarantee of global convergence of the incremental 4D-Var algorithm, the aforementioned studies also stressed the importance of regularly re-evaluating the nonlinearity issues in future operational systems.

Since the time of these investigations, the operational ECMWF IFS has changed considerably. From the perspective of the validity of the linearity assumptions in the incremental formulation, two changes are particularly relevant: (a) the increase in resolution at both outer loop and inner loop level and (b) the introduction of a very large number of humidity, cloud and precipitation-sensitive satellite observations in the analysis system (Geer et al., 2017). In terms of spatial resolution, the effective grid spacing has gone from approx. 40 km (TL511, i.e. spectral triangular truncation 511 with a linear grid) to approx. 9 km (TCo1279, spectral triangular truncation 1279 with a cubic grid; see Malardel et al., 2016, for more details) for the 4D-Var outer loops, and from approx. 130 km (TL159) to approx. 50 km (TL399) for the inner loops. Thus, nonlinearities are expected to play a larger role in the current IFS, also in view of the fact that the ratio between the resolutions of the outer and inner loops of the minimisation has increased from approx. 3.2 to 5.5. In terms of observation usage, the increase in the number and influence of humidity, cloud and precipitation-sensitive observations can also be expected to expose nonlinear effects connected to the way their observation operators respond to forecast humidity and precipitation structures. Some of these issues were already described at the time of the introduction of the “all-sky” framework for the assimilation of microwave imagers sensitive to humidity and precipitation (Bauer et al., 2010), but at that time the number and influence of these observation types on the 4D-Var analysis was relatively small. Currently, however, all-sky observations are one of the most important components of the observing system used operationally at ECMWF (Geer et al., 2017) and it is thus important to understand the capabilities and limitations of 4D-Var in dealing with these types of nonlinearities.
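The resolution ratios quoted above can be checked with simple arithmetic. The short sketch below uses only the approximate grid spacings given in the text; the variable names are illustrative and not part of the IFS.

```python
# Approximate grid spacings (km) quoted in the text.
old_outer_km, old_inner_km = 40.0, 130.0   # TL511 outer loop, TL159 inner loop
new_outer_km, new_inner_km = 9.0, 50.0     # TCo1279 outer loop, TL399 inner loop

# Ratio of inner loop to outer loop grid spacing in each configuration.
old_ratio = old_inner_km / old_outer_km
new_ratio = new_inner_km / new_outer_km

print(f"earlier configuration: inner/outer spacing ratio = {old_ratio:.2f}")
print(f"current configuration: inner/outer spacing ratio = {new_ratio:.2f}")
```

Because the grid spacings are themselves approximate, the computed values (3.25 and 5.56) are consistent with the ratios of approx. 3.2 and 5.5 quoted in the text.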

Given the motivation above, the remainder of this paper is organised as follows. In Sect. 2, we briefly review the incremental 4D-Var algorithm in order to highlight the hypotheses underlying the tangent linear approximation and the mathematical basis of the outer loop iterations. In Sect. 3 evidence of nonlinear effects in current ECMWF 4D-Var is presented, from both an observational and a model perspective. In Sect. 4, we evaluate how effective incremental 4D-Var is in dealing with both observation and model nonlinearities. Section 5 addresses the question of how important the ability to run outer loops is in the current ECMWF data assimilation system, in terms of both analysis and forecast skill. These results and their implication for data assimilation strategy at ECMWF and elsewhere are discussed in Sect. 6.

The aim of variational data assimilation is to determine the model
trajectory that best fits in a least square sense the observations available
during a given time window. This concept naturally leads to the formulation
of the standard strong constraint 4D-Var cost function:

In the observation part of the cost function, the so-called “tangent linear
(TL) approximation” has been made in going from Eq. (1) to Eq. (3):

In the Taylor expansion in Eq. (4), terms of

Globally averaged profiles of historical ECMWF 4D-Var differences

Other, possibly less well-known, sources of nonlinearities in the ECMWF
incremental 4D-Var formulation stem from the variational quality control
(VarQC) of the observations and the nonlinear change of variable used for the
humidity analysis. The VarQC algorithm is based on the Huber norm (Tavolato
and Isaksen, 2015) and has the effect of making the observation error matrix

Model nonlinearities affect the 4D-Var solution in two main ways. First, the
more nonlinear the high-resolution trajectory solution is, the spatially
noisier the low-resolution interpolated linearisation state for the 4D-Var
inner loops becomes. This roughness of the interpolated trajectory increases
when differences between the time steps and resolutions of the inner loops
and the trajectory become larger. Second, the tangent linear evolution
differs more from the nonlinear solution as nonlinearities increase. One
measure of the degree of nonlinearity (Rabier and Courtier, 1992) is to take
the difference between the nonlinearly and linearly evolved increments in the
last minimisation,

Ensemble mean

The significance of nonlinearities in the observation operators can be
estimated using statistics from the Ensemble of Data Assimilations (EDA,
Isaksen et al., 2010) system which is run operationally at ECMWF. Each
ensemble member is initialised using a perturbed model state with
perturbations drawn from a distribution with zero mean. For
linear observation operators and Gaussian perturbations, the ensemble mean
of the model equivalents provided by the observation operators is expected
to be close to the unperturbed control member (in fact, it should match it
exactly in the limit of infinite ensemble size):
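This expectation can be illustrated with a toy example. The sketch below is not the EDA: it assumes a scalar state, a hypothetical ensemble size and perturbation standard deviation, and two made-up observation operators (one linear, one quadratic). The perturbations are explicitly centred so that their sample mean is exactly zero, which makes the linear case exact even for a finite ensemble.

```python
import random

def ensemble_mean_bias(h, x_control, perturbations):
    """Ensemble mean of h(x_control + e) minus h(x_control)."""
    members = [h(x_control + e) for e in perturbations]
    return sum(members) / len(members) - h(x_control)

def linear_operator(x):
    return 2.0 * x + 1.0      # hypothetical linear observation operator

def nonlinear_operator(x):
    return x * x              # hypothetical nonlinear observation operator

random.seed(0)
n_members = 50                # hypothetical ensemble size
sigma = 0.5                   # hypothetical perturbation standard deviation
raw = [random.gauss(0.0, sigma) for _ in range(n_members)]
sample_mean = sum(raw) / n_members
# Centre the perturbations so that their sample mean is exactly zero.
perturbations = [e - sample_mean for e in raw]

x_control = 3.0               # unperturbed control state
linear_bias = ensemble_mean_bias(linear_operator, x_control, perturbations)
nonlinear_bias = ensemble_mean_bias(nonlinear_operator, x_control, perturbations)
sample_variance = sum(e * e for e in perturbations) / n_members

print(f"linear operator bias:    {linear_bias:.2e}")
print(f"nonlinear operator bias: {nonlinear_bias:.4f}")
print(f"perturbation variance:   {sample_variance:.4f}")
```

For the linear operator the ensemble mean matches the control to rounding error, whereas for the quadratic operator it is offset by the variance of the perturbations. A systematic difference between the ensemble mean and the control is therefore a signature of operator nonlinearity in this toy setting.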

The incremental approach to 4D-Var (CTH, 1994) reduces the resolution of the inner loops to make the solution more affordable. Observation departures are calculated at high resolution and then the high-resolution trajectory is truncated and interpolated to the resolution of the inner loop for each time step of the low-resolution minimisation (Trémolet, 2004). At the end of the minimisation, the increments are projected back to the high resolution and added to the previous trajectory at the start of the assimilation window. This process is repeated for all minimisations, which can be at different resolutions, starting with the lowest resolution to capture the larger scales and increasing the resolution in later minimisations to extract more detailed information from the observations (Veerse and Thépaut, 1998).
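The outer loop and inner loop structure described above can be sketched for a scalar toy problem. This is a minimal illustration under strong assumptions, not the ECMWF implementation: there is a single state variable and a single observation, the observation operator `h` is a made-up quadratic, the inner loop is solved exactly rather than iteratively, and the change of resolution between outer and inner loops is ignored. All names and numerical values are hypothetical.

```python
def incremental_analysis(xb, y, var_b, var_o, h, h_prime, n_outer):
    """Return the analysis and the increment produced by each outer loop."""
    x = xb                      # first guess: the background
    increments = []
    for _ in range(n_outer):
        d = y - h(x)            # departure computed with the nonlinear operator
        H = h_prime(x)          # tangent linear operator at the current guess
        # Exact minimiser of the quadratic cost in the increment dx:
        #   (x + dx - xb)^2 / (2 var_b) + (d - H dx)^2 / (2 var_o)
        dx = ((xb - x) / var_b + H * d / var_o) / (1.0 / var_b + H * H / var_o)
        x += dx                 # relinearisation point for the next outer loop
        increments.append(dx)
    return x, increments

def cost_gradient(x, xb, y, var_b, var_o, h, h_prime):
    """Gradient of the full nonlinear cost function at x."""
    return (x - xb) / var_b - h_prime(x) * (y - h(x)) / var_o

def h(x):
    return x * x                # hypothetical nonlinear observation operator

def h_prime(x):
    return 2.0 * x              # its derivative

xb, y, var_b, var_o = 1.0, 4.0, 1.0, 1.0   # hypothetical toy values
analysis, increments = incremental_analysis(xb, y, var_b, var_o, h, h_prime, n_outer=5)
for k, dx in enumerate(increments, start=1):
    print(f"outer loop {k}: increment = {dx:+.5f}")
print(f"gradient at analysis: {cost_gradient(analysis, xb, y, var_b, var_o, h, h_prime):.2e}")
```

In this toy setting the increments shrink from one outer loop to the next and the gradient of the nonlinear cost function approaches zero, which is the behaviour expected of a well-behaved incremental scheme. Convergence is not guaranteed for stronger nonlinearities or poorer first guesses.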

Bauer et al. (2010) discussed how the difference in departures at the end of
each minimisation step, and those in the subsequent nonlinear trajectory step
(i.e.

Standard deviation of departures for AMSR-2 channel 10 in the nonlinear trajectories (circles) and at the end of the minimisation of the linearised cost function (triangles) for each outer loop of 4D-Var. Note that the nonlinear and linearised departure standard deviations would coincide in the linear case. The average background error standard deviation in observation space for this type of observation is 3.4 K. Results are from a single cycle of the ECMWF operational assimilation system.

Figure 4 plots the correlation coefficient and standard deviation of these differences at each outer loop, demonstrating that nonlinearities become smaller at each successive outer loop. For “linear” observation types such as radiosonde temperature and AMSU-A channel 6, the nonlinearities are less significant than for ATMS channel 20 (which is sensitive to humidity).
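One reason why the mismatch between linearised and nonlinear departures shrinks at later outer loops is that the increments themselves become smaller. The sketch below illustrates this for a hypothetical scalar quadratic observation operator; the operator, the guess state and the increment sizes are all assumptions chosen for illustration.

```python
def linearisation_mismatch(h, h_prime, x, dx):
    """Linearised departure minus nonlinear departure for an increment dx.

    Nonlinear departure:  y - h(x + dx)
    Linearised departure: y - h(x) - h_prime(x) * dx
    The observation y cancels in the difference.
    """
    return (h(x + dx) - h(x)) - h_prime(x) * dx

def h(x):
    return x * x                # hypothetical nonlinear observation operator

def h_prime(x):
    return 2.0 * x              # its derivative

x_guess = 2.0                            # hypothetical linearisation state
increment_sizes = [1.0, 0.1, 0.01]       # hypothetical, shrinking with each outer loop
mismatches = [linearisation_mismatch(h, h_prime, x_guess, dx) for dx in increment_sizes]

for dx, m in zip(increment_sizes, mismatches):
    print(f"increment {dx:5.2f}: departure mismatch = {m:.6f}")
```

For this operator the mismatch equals the square of the increment, so a tenfold reduction in the increment reduces the mismatch a hundredfold. Real observation operators are not quadratic, but the leading neglected term in the tangent linear approximation generally scales in the same way.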

As expected, the departures for observations sensitive to cloud and humidity show increased nonlinear impacts. Figure 5 shows results from AMSR-2 channel 11 categorised using estimates of cloudiness from both the observations and the model fields (Geer and Bauer, 2011). It can be seen that the linear assumption holds less well for observations in cloudy regions compared to those in areas of clear sky.

Taylor diagram showing the correlation (azimuthal angle) and standard deviation (distance from the origin) of the differences in the departures (K) between the nonlinear trajectory and linear minimisation steps for each outer loop. Results are shown for satellite brightness temperature observations (AMSU-A channel 6, ATMS channel 20) and radiosonde temperature observations.

Resolution and number of outer loop iterations for the sensitivity experiments discussed in Sect. 5. TCo399 means IFS model integrations with spectral triangular truncation 399 and a cubic octahedral reduced Gaussian grid. TLXXX means IFS model integrations carried out at spectral triangular truncation XXX on a linear reduced Gaussian grid. All minimisations are performed using the full physics tangent linear and adjoint models.

A clear indicator of the success or otherwise of the incremental strategy is
the size of the analysis increments produced by the linearised cost
function (3) during successive outer loop iterations. For a well-behaved
incremental 4D-Var converging towards the solution of the nonlinear cost
function (1), successive analysis increments are expected to become smaller,
reflecting the hypothesis that successive first guess trajectories provide
increasingly accurate descriptions of the flow. This hypothesis is supported
by the experimental results shown in Fig. 6, where we present the vertical
profiles of the standard deviations of the analysis increments of vorticity
(left panel) and temperature (right panel) from a multi-incremental 4D-Var
experiment with five outer loops (in this experiment the outer loop
resolution is TCo399, approx. 30 km, and the inner loop resolutions are
TL95/TL159/TL255/TL255/TL255, approx. 210, 125, and 80 km; more details in
Table 1). The magnitude of the analysis increments is seen to gradually
decrease for successive outer loop iterations, more rapidly in the
stratosphere for vorticity. After five outer loop iterations, the magnitude
of the analysis increments appears to asymptote to a relatively small value
for temperature throughout the atmospheric column (

As Fig. 4 but showing results from AMSR-2 channel 11 categorising observations by those in clear-sky regions and those impacted by cloud.

Vertical profiles of the globally averaged standard deviation of the
analysis increments produced by successive outer loop iterations for
vorticity

Vertical profiles of the longitudinally averaged standard
deviation of the analysis increments produced at the end of the first outer
loop minimisation

An informative example of the effectiveness of incremental 4D-Var in dealing with nonlinear error evolution in active weather systems is described in the following test case of organised convection in the southern United States. These high-impact weather phenomena are particularly interesting from a data assimilation perspective because: (1) they have been shown to be potential precursors of significant forecast “busts” in downstream regions, Europe in particular (Rodwell et al., 2013); and (2) they occur in probably the most densely observed region of the world, thus allowing a more in-depth look into the ability of the assimilation system to make effective use of the observations.

In the case described here, large-scale organised convection with the satellite signature of a mesoscale convective complex (Fig. 8, left panel) was forming in the southern US coastal plains in the local evening hours of 3 May 2017, continuing for most of the night. The synoptic situation, as depicted by the ECMWF operational analysis (Fig. 8, right panel), is characteristic of this type of event (Maddox, 1980): a strong warm, moist southerly flow from the Gulf of Mexico is taking place in the lower troposphere, in the region ahead of an upper level trough. The combination of strong warm and moist air advection in the lower levels with vorticity advection aloft leads to a situation conducive to intense organised convection in a region along the Texas–Louisiana coast, starting at around 13:00 UTC on 3 May 2017 and lasting until approx. 06:00 UTC on 4 May 2017. Forecasting the intensity and location of convection is notoriously difficult and the ECMWF analysis increments (Fig. 9) show that the operational 4D-Var makes significant changes to the first guess fields throughout the atmospheric column. In particular, the analysis appears to adjust the strength of the convective system through a significant cooling at the top of the troposphere and an associated enhancement of the divergent wind field (Fig. 9, left panel). In the boundary layer (Fig. 9, right panel), the analysis increments show more spatial variability, but the main signals of localised warming and convergence of the wind field in the direction of movement of the convective system are apparent.

The magnitude of the analysis increments in the case studied here (up to

Standard deviation of wind vector observation minus model
departures over the 09:00 to 21:00 UTC assimilation window of 3 May 2017
in the
100–400 hPa layer for a pre-operational version of the IFS 43R3 cycle: first
guess departures

The diagnostics presented in Sect. 4 showed that increasing the number of
outer loop iterations in the ECMWF 4D-Var helps to reduce the magnitude of
nonlinearities in the analysis and suggested that it can lead to a better use
of available observations, in particular those that are nonlinearly related
to the model state. The next step is then to verify that these findings are
confirmed in a cycled data assimilation environment as close as is
computationally affordable to the operational ECMWF assimilation system. To
this end a series of data assimilation experiments has been run with a recent
ECMWF IFS cycle (cycle 43R3, operational from July 2017), in which only the
horizontal spatial resolution has been changed for both outer loops and inner
loop minimisations. The operational 4D-Var runs three outer loops at TCo1279
resolution (approx. 9 km) and performs three inner loop minimisations at
TL255/TL319/TL399 resolution (approx.

A standard way to evaluate the skill of the analyses produced by a cycling
data assimilation system is to look at the statistics of observation minus
analysis (

A representative sample of

Normalised standard deviations of analysis

The only degradation in

Normalised root mean square forecast errors for geopotential at
500 hPa

The forecast skill scores show a high level of consistency with the analysis skill diagnostics. In Fig. 12 we present a selection of tropospheric forecast skill scores relevant for evaluating standard synoptic performance (500 hPa geopotential rms forecast error, top row), the water cycle (total column water vapour rms error, second row) and the wind field (200 and 850 hPa wind vector rms errors, third and bottom rows). All the diagnostics confirm the significant degradation in performance for the one outer loop experiment and the small but statistically significant improvement of the four and five outer loop experiments with respect to the baseline three outer loop experiment. In the stratosphere (not shown) forecast skill scores again show degraded performance for the one outer loop experiment, while results are mostly neutral or slightly positive for the four and five outer loop experiments. One notable exception is the tropical stratospheric layer from 5 to 1 hPa, where the five outer loop experiment shows a statistically significant degradation, again confirming the analysis skill diagnostic results.

In Sect. 2 of this paper we have shown how nonlinear effects in 4D-Var arise
from two different sources: nonlinearities in the model evolution during the
assimilation window and nonlinearities in the observation operators. It is
difficult to cleanly disentangle the two effects, as they are linked inside
the generalised observation operator

Normalised root mean square forecast errors for geopotential at
500 hPa

In modern atmospheric data assimilation (and, arguably, in most of the other Earth system components as well) nonlinearities play an ever more important role. This is due to the ever-increasing resolution and complexity of the prognostic models, which exhibit instabilities at smaller scales and thus present faster nonlinear error growth during the assimilation window, and to the emergence of an array of observations that are nonlinearly related to the control vector variables used in the variational analyses. Both these trends are expected to continue in the near future, which makes the capacity of the assimilation algorithms to deal effectively with nonlinear effects an increasingly important benchmark.

The ECMWF implementation of 4D-Var relies on a perturbative approach to nonlinearity. Incremental 4D-Var is based on the concept of a purely linear analysis update iterated on ever more accurate first guess trajectories. Diagnostics in both observation space and model space support this interpretation and show that the capacity to run more than one outer loop is a significant driver of the overall ECMWF analysis and forecast skill. Results from long data assimilation cycling experiments show that running the current ECMWF 4D-Var with only one outer loop, which is equivalent to making a purely linear analysis update, would result in a very significant deterioration in all analysis and forecast accuracy metrics. Conversely, adding one, or possibly two, additional outer loops to the current operational set-up of three outer loop updates appears beneficial in terms of both analysis quality and general forecast skill. Results from limited additional experimentation (not shown) also indicate that more than five outer loops do not appear to bring further benefits, at least in the experimental configuration we have used.

One interesting question is about the limits of applicability of the multi-incremental approach in the ECMWF data assimilation system. As noted in Sect. 5, while the tropospheric analyses and forecasts were consistently improved in the four and five outer loop assimilation experiments, signs of degradation started to appear in the analysis and first guess fit of some types of stratospheric peaking radiance observations. Interestingly, these degradations were not seen in the experiments using only observations which are linearly related to the state. This suggests that changes to the analysis introduced by the assimilation of nonlinear observations (mainly humidity and cloud and precipitation sensitive radiances) affect the stratospheric analysis either through the shape of the background error spatial correlations or by the generation of gravity wave structures in the initial conditions. Remarkably, the stratospheric degradation of the multi-outer loop experiments also disappeared in tests run with the full observing system with matching time steps for the outer and inner loop integrations. This result gives further support to the hypothesis that the representation in the 4D-Var analysis of stratospheric gravity waves excited by the assimilation of all-sky observations could be one of the main drivers of these effects. These interactions are currently being investigated.

Another obvious factor potentially limiting the applicability of the incremental algorithm is the range of validity of the tangent linear (TL) hypothesis (Sect. 2). As reported in Bonavita et al. (2017b), problems in 4D-Var convergence connected with the TL hypothesis usually arise in situations where the first guess departures are at least one order of magnitude larger than the assumed observation errors. In most cases, the use of more realistic values of the observation errors, which better take into account the representativity and observation operator components, is sufficient to regularise the minimisation.

While the advantages of being able to run an increased number of outer loop linearisations are clear, the question remains of how to fit them inside the typically tight schedules of operational weather centres. Taking the ECMWF data assimilation system as an example, the 4D-Var analysis with three outer loops has about 45 min to complete. Given the sequential nature of the 4D-Var minimisation, each additional outer loop would increase this time by approx. 15 min. This implies that, in the current set-up, the observation cut-off time would have to be pushed back by a similar time interval, quickly negating any advantage that the increased number of outer loops might bring. One possible way to overcome this problem would be to allow late-arriving observations to enter the assimilation at successive outer loop updates. This would effectively push the observation cut-off time forward to the beginning of the last minimisation, thus allowing the 4D-Var analysis to start earlier and consequently to accommodate additional outer loop updates. This assimilation framework, which we call “continuous DA”, is currently being tested at ECMWF and results will be documented in a forthcoming paper. Note that in the continuous DA the problem being solved is conceptually different from that of incremental 4D-Var. In incremental 4D-Var we solve a nonlinear problem through repeated linearisations. In the continuous DA we solve a sequence of slightly different nonlinear minimisation problems, taking advantage of increasingly accurate first guess trajectories.
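The scheduling cost of additional outer loops can be made concrete with the figures quoted above. The sketch below simply extrapolates linearly from the stated 45 min for three outer loops and approx. 15 min per additional loop; the function and constant names are illustrative, and the linear scaling is an assumption that is only meant to apply for three or more outer loops.

```python
BASELINE_OUTER_LOOPS = 3       # current operational configuration
BASELINE_MINUTES = 45          # approximate time available for the analysis
MINUTES_PER_EXTRA_LOOP = 15    # approximate cost of each additional outer loop

def analysis_minutes(n_outer_loops):
    """Approximate wall-clock time of a sequential 4D-Var analysis (min)."""
    extra_loops = n_outer_loops - BASELINE_OUTER_LOOPS
    return BASELINE_MINUTES + MINUTES_PER_EXTRA_LOOP * extra_loops

for n in (3, 4, 5):
    delay = analysis_minutes(n) - BASELINE_MINUTES
    print(f"{n} outer loops: approx. {analysis_minutes(n)} min "
          f"(observation cut-off moved by approx. {delay} min)")
```

Under these assumptions, four and five outer loops would take approx. 60 and 75 min, respectively, which is the delay that the continuous DA framework aims to absorb.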

Another possible approach to increasing the number of outer loops within the operational time constraints is to adopt an “overlapping” assimilation window framework, for example along the lines discussed in Bonavita et al. (2017a). In this configuration, observations that have been assimilated in both successive overlapping windows will effectively have been seen by twice the number of guess trajectories as in a standard non-overlapping configuration. This idea, similar to the quasi-static variational DA approach of Pires et al. (1996) and Järvinen et al. (1996), is also being actively investigated.

The datasets used in this work are available from ECMWF
under the terms and conditions specified at

All the authors have equally contributed to all parts of the paper and the work described in it.

The authors declare that they have no conflict of interest.

This article is part of the special issue “Numerical modeling, predictability and data assimilation in weather, ocean and climate: A special issue honoring the legacy of Anna Trevisan (1946–2016)”. It is a result of a Symposium Honoring the Legacy of Anna Trevisan – Bologna, Italy, 17–20 October 2017.

One of the authors (Massimo Bonavita) would like to express his gratitude to the organisers of the Anna Trevisan Symposium, and to Alberto Carrassi in particular, for setting up a very successful and interesting meeting and for being such attentive and considerate hosts.

The authors would also like to thank the reviewers and the editor for their careful examination of our manuscript and the many constructive proposals for its improvement.

Edited by: Michael Ghil

Reviewed by: four anonymous referees