The stochastic multiplicative cascade structure of deterministic numerical models of the atmosphere

By direct statistical analysis of model analyses and outputs, we show that over almost all of their range of scales, and to within better than ±1.5%, atmospheric fields obtained from analyses and numerical integrations of atmospheric models have the multifractal structure predicted by multiplicative cascade models. We quantify this for the horizontal wind, temperature, and humidity fields at 5 different pressure levels for the ERA40 reanalysis, the Canadian Meteorological Centre (CMC) Global Environmental Multiscale (GEM) model, and the National Oceanic and Atmospheric Administration (NOAA) Global Forecast System (GFS). We investigate the additional prediction that the cascade belongs to a universal multifractal basin of attraction. By demonstrating a "Lévy collapse" of the statistical moments to within ±2 to ±5% over most of the range of scales, we conclude that there is good evidence for this. Finally, we discuss how this multiplicative cascade structure can be exploited to improve ensemble forecasts.


Cascades in direct numerical simulations
Before turning our attention to the analysis of atmospheric models, we should also mention the related field of pure hydrodynamic turbulence, which is often considered more fundamental than atmospheric turbulence, but where the same issues have arisen. Turbulent flows too can involve structures spanning huge ranges of scale, and despite intense efforts over more than 50 years, analytic approaches have been surprisingly ineffective.
The limitations of statistical closure, renormalization, and other kindred analytical techniques have led to the development of two main alternatives: brute force numerics and phenomenological models, especially cascades.
Although Direct Numerical Simulations (DNS; i.e. without subgrid "parametrizations") of the Navier-Stokes (NS) equations have been made since Orszag and Patterson (1972), it wasn't until Vincent and Meneguzzi (1991) that computers were powerful enough to allow simulations large enough to display hints of the Kolmogorov k^(-5/3) spectrum, the traditional signature of the inertial range. The DNS inertial range is limited primarily because the dissipation is usually modelled with a Laplacian operator that is typically significant over a range spanning a large factor (≈50) in scale.
Recent Earth Simulator integrations on a 4096³ grid are now able to display an intermediate (inertial) range spanning a factor of ≈100 in scale (Yokokawa et al., 2002; Kaneda, 2003), but require massive computational efforts.

Goals and structure
The aim of this paper is thus to test the predictions of multiplicative cascades on the spatial structure of various numerical models of the atmosphere. Since we find that the velocity field is scaling, we expect that there is also a temporal cascade structure. We confirm this in a forthcoming publication, and discuss the relationship between the spatial and temporal structures.
This paper is structured as follows. In section 2 we review the basic theory, including the predictions of the multiplicative models and different methods of estimating fluxes. In section 3 we describe the data sets and present the basic results, and in section 4 we conclude.

Basic statistics
In multiplicative cascades large structures are broken up into smaller daughter structures which multiplicatively modulate the flux; this process is repeated to smaller and smaller scales.
Normalized cascade processes generally lead to multifractal fields with the statistics:

<ϕ_λ^q> = λ^K(q)    (1a)

where "<.>" indicates ensemble (statistical) averaging, ϕ is the turbulent flux (e.g. the energy flux ε) normalized such that <ϕ_λ> = 1, K(q) is a convex function describing the scaling behaviour of the q-th moment, and λ = L/l is the ratio of the (large) scale L where the cascade starts to the scale of observation l. Since the cascade is multiplicative, its logarithm, the "generator", is additive. It is therefore not surprising that, due to the central limit theorem for sums of independent, identically distributed random variables, there are specific (stable, attractive) "universal" forms for the exponent K(q):

K(q) = (C_1/(α − 1)) (q^α − q)    (1b)

where 0 ≤ C_1 ≤ d is the "codimension of the mean", which characterizes the sparseness of the set that gives the dominant contribution to the first order statistical moment (the mean), and d is the dimension of the space over which the cascade is observed (Schertzer and Lovejoy, 1987). The multifractal index 0 ≤ α ≤ 2 characterizes the degree of multifractality, i.e. the shape of the K(q) function; it is also the Lévy index of the generator. If the cascade is uni/mono-fractal then α = 0, whereas α = 2 corresponds to the "lognormal" multifractal. A "universal multifractal" is the basin of attraction for a wide variety of different multiplicative processes. In our analyses we will see that the universal form (Eq. (1b)) fits the empirical K(q) quite well, so that irrespective of whether the numerical models are indeed universal multifractals, the parameters C_1, α give very convenient parametrisations of their forms.
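The multiscaling prediction (1a) with the universal form (1b) can be checked numerically on a synthetic cascade. The following sketch (numpy assumed; the lognormal α = 2 case for simplicity, with illustrative parameter values not taken from the paper) builds a discrete 1-D cascade and compares the empirical K(q) with Eq. (1b):

```python
import numpy as np

# Discrete 1-D multiplicative cascade, lognormal (alpha = 2) universal case:
# each of n steps halves the scale and multiplies by i.i.d. weights W with
# <W^q> = 2^{K(q)}, K(q) = C1 (q^2 - q), so that after n steps the total
# scale ratio is lambda = 2^n and <phi_lambda^q> = lambda^{K(q)} (Eq. 1a).
def lognormal_cascade(n_levels, C1, n_ens, rng):
    sigma = np.sqrt(2.0 * C1 * np.log(2.0))   # gives <W^q> = 2^{C1 (q^2 - q)}
    phi = np.ones((n_ens, 1))
    for _ in range(n_levels):
        phi = np.repeat(phi, 2, axis=1)       # halve the scale
        phi = phi * np.exp(sigma * rng.standard_normal(phi.shape) - sigma**2 / 2)
    return phi

rng = np.random.default_rng(0)
C1, n = 0.1, 10                               # lambda = 2^10 = 1024
phi = lognormal_cascade(n, C1, 5000, rng)     # ensemble approximates <.>
for q in (0.5, 1.0, 2.0):
    K_emp = np.log2(np.mean(phi**q)) / n      # empirical K(q)
    K_uni = C1 * (q**2 - q)                   # universal form, Eq. (1b)
    print(f"q={q}: K_emp={K_emp:+.3f}  K_universal={K_uni:+.3f}")
```

The ensemble average over many independent cascades stands in for the ensemble average in Eq. (1a); single realizations are far noisier, especially for the higher moments.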

Estimating the turbulent fluxes
In order to test Eq. (1a), we must therefore use an approach that does not require a priori assumptions about the physical nature of the relevant fluxes, nor about their scale symmetries (isotropic or otherwise); the latter will in fact turn out to be anisotropic. Note that if the fluxes ε are realizations of pure multiplicative cascades, then the normalized η powers ε^η/<ε^η> are also pure multiplicative cascades, so that ϕ = ε^η/<ε^η> is a normalized cascade quantity. In the scaling range, fluctuations Δf in an observable f are related to a turbulent flux by relations of the form Δf = ϕ Δx^H; for the wind in the classical Kolmogorov picture, Δv = ε^1/3 Δx^1/3, so that the normalized flux at the finest resolution Δx can be estimated as:

ϕ = ε^1/3/<ε^1/3> = |Δv|/<|Δv|>.    (2)

The fluctuation Δf(Δx) can be estimated in various ways; in 1-D a convenient method is to use absolute differences at the resolution scale:

Δf(Δx) = |f(x + Δx) − f(x)|

where x is a horizontal coordinate. This "poor man's wavelet" is usually adequate when, as is typically the case, 0 ≤ H ≤ 1, but alternatively other definitions of fluctuations (other wavelets) could be used. In 2-D, a convenient definition of the fluctuation is the (finite difference) Laplacian, estimated as the difference between the value at a grid point and the average of its neighbours. Since empirical data are nearly always sampled at scales much larger than the dissipation scales, the above scaling range based technique has general applicability. In numerical models, however, we have data down to the dissipation range; we find that the approach still works, but that the interpretation is a little different. To see this, consider the example of the energy flux ε, recalling that at the dissipation scale:

ε = ν (∇v)²    (3)

where ν is the viscosity and v the velocity. Standard manipulations give ε^1/2 = ν^1/2 |∇v|, so that if Δx is in the dissipation range (e.g. the finest resolution of the model) the absolute gradients estimate ε^1/2 rather than ε^1/3. The models actually use hyper-viscosities, which have the advantage of confining the dissipation to a small range of scales (about a factor of 3). This means that their dissipation is due to a Laplacian raised to the power h (typically h is either 3 or 4):

ε = ν* (∇^h v)²

where ν* is the hyperviscous coefficient.
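As an illustration of the flux estimation technique (a minimal sketch: the Fourier-filtered noise generator and all parameter values below are illustrative assumptions, not from the paper), the normalized flux can be obtained from absolute differences in 1-D, or from the absolute finite-difference Laplacian in 2-D:

```python
import numpy as np

def normalized_flux_1d(f):
    # "poor man's wavelet": phi proportional to |f(x + dx) - f(x)|,
    # normalized so that the spatial mean (standing in for <.>) is 1
    df = np.abs(np.diff(f))
    return df / df.mean()

def normalized_flux_2d(f):
    # fluctuation = absolute finite-difference Laplacian: the value at a grid
    # point minus the average of its four neighbours, normalized to unit mean
    lap = np.abs(f[1:-1, 1:-1] - 0.25 * (f[:-2, 1:-1] + f[2:, 1:-1]
                                         + f[1:-1, :-2] + f[1:-1, 2:]))
    return lap / lap.mean()

# toy 1-D signal with scaling fluctuations (Fourier-filtered noise, H = 1/3,
# i.e. Kolmogorov-like wind fluctuation exponent)
rng = np.random.default_rng(1)
N, H = 4096, 1.0 / 3.0
k = np.fft.rfftfreq(N)
k[0] = 1.0                                   # avoid division by zero at k = 0
noise = rng.standard_normal(k.size) + 1j * rng.standard_normal(k.size)
v = np.fft.irfft(noise * k ** (-(H + 0.5)))
phi = normalized_flux_1d(v)
print(phi.shape, phi.mean())                 # unit mean by construction
```

In an actual analysis, `v` would be a model wind field at its native resolution and the ensemble normalization would be done over all available realizations rather than over a single transect.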
If Δx is in the (smooth) dissipation regime, this leads to an estimate of ε^1/2 proportional to the absolute (finite difference) fluctuation of v at the model resolution. In all cases, irrespective of h, we therefore estimate the normalized flux ε^1/2/<ε^1/2>. We see that this is the same as Eq. (2); the only difference is that for the wind field the dissipation range exponent is η = 1/2 rather than the η = 1/3 which holds in the scaling regime. If we introduce K_η(q), the scaling exponent of the normalized η flux ε^η/<ε^η>, then K_η(q) = K(qη) − qK(η); for universal multifractals this leaves α unchanged but rescales the codimension of the mean: C_1η = η^α C_1. Since for the wind we find α ≈ 1.8, we obtain C_1diss/C_1scaling = ((1/2)/(1/3))^α = 1.5^1.8 ≈ 2.07.
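The factor of 2.07 follows from the universal form of K(q) alone; writing out the algebra (a worked derivation added for clarity, not in the original):

```latex
K_\eta(q) = K(q\eta) - qK(\eta)
          = \frac{C_1}{\alpha-1}\Big[(q\eta)^\alpha - q\eta - q(\eta^\alpha - \eta)\Big]
          = \frac{C_1\,\eta^\alpha}{\alpha-1}\,(q^\alpha - q)
\quad\Longrightarrow\quad C_{1,\eta} = \eta^\alpha C_1 ,

\frac{C_{1,\mathrm{diss}}}{C_{1,\mathrm{scaling}}}
  = \left(\frac{\eta_\mathrm{diss}}{\eta_\mathrm{scaling}}\right)^{\alpha}
  = \left(\frac{1/2}{1/3}\right)^{1.8} = 1.5^{1.8} \approx 2.07 .
```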
The extension of this discussion to passive scalars is also relevant and shows that the interpretation of the empirically/numerically estimated fluxes in terms of classical theoretical fluxes can be nontrivial. Denoting by ρ the density of the passive scalar and by χ its variance flux, the dissipation range formula analogous to Eq. (3) is:

χ = κ (∇ρ)²

(κ is the molecular diffusivity), leading to the estimate χ^1/2 ∝ |Δρ|, whereas the corresponding formula in the scaling range is the Corrsin-Obukhov law:

Δρ = χ^1/2 ε^−1/6 Δx^1/3

which has the χ^1/2 dependency on χ, but which also involves the energy flux; the combined effective flux measured by the scaling method, ϕ ∝ χ^1/2 ε^−1/6, thus involves two (presumably statistically dependent) cascade quantities. In summary, although both dissipation and scaling ranges can be used to test for multiplicative cascades and to quantify their variability, the relation between the two is not necessarily trivial.
A final practical consideration is that in the analyses the outer scale is not known a priori, but is an empirically estimated parameter. It is therefore convenient to define a reference scale with which to nondimensionalize the fluxes, so that the basic prediction of multiplicative cascades is that the normalized moments M_q = <ϕ_λ^q>/<ϕ_1>^q obey the generic multiscaling relation:

M_q = (λ/λ_eff)^K(q);    λ_eff = L_earth/L_eff    (9)

where "<.>" indicates statistical (ensemble) averaging and L_eff is the effective outer scale of the cascade; <ϕ_1> is the ensemble mean large scale flux (i.e. the climatological value). λ = L_earth/L is a convenient scale ratio based on the largest great circle distance on the earth, L_earth = 20,000 km, and λ/λ_eff is the overall ratio from the scale where the cascade starts down to the resolution scale L.
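In practice this prediction is tested by spatially degrading the finest-resolution flux and regressing log M_q against log λ: the slope gives K(q) and the intercept the effective outer scale. A self-contained sketch (numpy; a synthetic lognormal cascade stands in for a model flux field, and all parameter values are illustrative assumptions):

```python
import numpy as np

def lognormal_cascade(n_levels, C1, n_ens, rng):
    # synthetic stand-in for a model flux field: discrete lognormal (alpha = 2)
    # cascade ensemble of shape (n_ens, 2**n_levels) with unit ensemble mean
    sigma = np.sqrt(2.0 * C1 * np.log(2.0))
    phi = np.ones((n_ens, 1))
    for _ in range(n_levels):
        phi = np.repeat(phi, 2, axis=1)
        phi = phi * np.exp(sigma * rng.standard_normal(phi.shape) - sigma**2 / 2)
    return phi

def trace_moments(phi, qs):
    # degrade the flux by averaging over factors of 2; return the scale ratios
    # lambda = (domain size)/(averaging scale) and the moments M_q = <phi_lambda^q>
    n_ens, N = phi.shape
    lams, moments, scale = [], [], 1
    while N // scale >= 2:
        coarse = phi.reshape(n_ens, N // scale, scale).mean(axis=2)
        lams.append(N / scale)
        moments.append([np.mean(coarse**q) for q in qs])
        scale *= 2
    return np.array(lams), np.array(moments)

rng = np.random.default_rng(2)
C1, qs = 0.1, [0.5, 2.0]
phi = lognormal_cascade(10, C1, 1000, rng)
lams, M = trace_moments(phi, qs)
for j, q in enumerate(qs):
    # log10 M_q = K(q) [log10 lambda - log10 lambda_eff]: the slope gives K(q)
    # and the crossing of M_q = 1 gives the effective outer scale ratio
    K_q, b = np.polyfit(np.log10(lams), np.log10(M[:, j]), 1)
    lam_eff = 10.0 ** (-b / K_q)
    print(f"q={q}: K(q)={K_q:+.3f}, lambda_eff~{lam_eff:.2f}")
```

Note that averaging the finest-scale flux yields "dressed" rather than "bare" moments, so the fitted slopes and outer scales carry small systematic corrections; this is one reason the empirical λ_eff need not coincide exactly with the scale at which the cascade was constructed.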

Discussion
To our knowledge there have been no attempts to directly check Eq. (9) on either DNS or geophysical numerical simulations. The closest are perhaps the multifractal characterizations of time signals in turbulent shell models (Biferale, 2003) and in "scaling cascade of gyroscopes" models (Chigirinskaya and Schertzer, 1996; Chigirinskaya et al., 1998). While the former discretizes the NS equations in Fourier space keeping a small and fixed number of degrees of freedom per octave in scale, the latter more realistically discretizes the equations on a dyadic tree structure such that the number of degrees of freedom increases with wavenumber. Both approaches lead to multifractal behaviours in time (and hence presumably temporal cascades), and the latter to space-time cascades; they are hence particularly relevant here. Other relevant connections between the cascade prediction (Eq. (9)) and dynamical equations are the studies of temporal scaling (Syroka and Toumi, 2001; Blender and Fraedrich, 2003; Fraedrich and Blender, 2003) and of the temporal multifractality of climate models (Royer et al., 2008).
Our goal is to check the basic prediction of the multiplicative cascade models (Eq. (9)) directly on simulations of the atmosphere (both forecasts and reanalyses). This choice was made both because of the ready availability of large numbers of realizations and because of the scientific (weather, climate) significance of the results. Furthermore, as mentioned previously, we expected to have a significant range of scales exhibiting cascade behaviour because hyper-viscosity restricts most of the effects of dissipation to a narrow range of scales (see however Frisch et al. (2008) for possible "side effects").
Over the scaling/inertial range, there are three main differences between the cascade structure of 3D DNS and that of atmospheric models. The first is that in the former there is a single (energy flux) cascade, while in the latter we expect several coupled cascades. The second is that due to gravity the atmosphere is stratified, so that the cascades are anisotropic (Schertzer and Lovejoy, 1987) (indeed, as mentioned earlier, due to the 10 km scale height of the mean pressure field, isotropic models require at least two cascades). The third is that the atmospheric boundary conditions are quite different from those typically used in DNS applications to fully developed turbulence, in which the forcing is deliberately confined to the largest scales and the dissipation to the smallest. In particular, both the topography (Gagnon et al., 2006) and the critical energy-containing short and long wavelength radiative forcings (Lovejoy et al., 2001, 2009a) have been found to have wide scale range cascade structures, so that the boundary conditions do not introduce characteristic scales and need not destroy the cascades.

The data sets
We chose two forecast models and one reanalysis, all recognized as being state-of-the-art: the Canadian Meteorological Centre (CMC) Global Environmental Multiscale (GEM) model, the NOAA Global Forecast System (GFS) model, and the European Centre for Medium-Range Weather Forecasts' (ECMWF) reanalysis (ERA40). For all three products, we analyzed the three most dynamically significant fields: the temperature (T), the east-west (u) wind, and the humidity (specific humidity h_s for GEM and ERA40; relative humidity h_r for GFS). In order to check for possible latitude dependencies, analyses were made both in the region between ±30° (tropics) and in that between ±45° latitude.
We did not consider the spatial cascade structure of climate models since their resolutions are relatively low.

The CMC Global Environmental Multiscale (GEM) model
This model is on a 0.25°x0.3° horizontal grid with 28 levels; our analysis used a 0.6°x0.6° resolution product (about 66 km resolution; the high-resolution CMC GRIB dataset). We used 505 time steps taken from September 20th, 2007 to June 2nd, 2008, initialized at either 00Z or 12Z, and analysed the initial objective analysis, the 48 hour forecast, and the 144 hour forecast.
The model configuration can be adapted; see Côté et al. (1998a, 1998b) for more details.

The NOAA Global Forecast System (GFS) model
Like the CMC GEM model, the GFS is a global NWP model, which we also analyzed at its analysis and 48 h forecast times. It uses a T254 spectral truncation on a 768x384 Gaussian grid with 64 vertical levels. The data were obtained at 1°x1° resolution every 6 hours; each initialization starts at 00Z, 06Z, 12Z, or 18Z. The data were taken from August 1, 2007 to June 30, 2008 (with the exception of the 700 mbar u field, for which the first 61 days were corrupted). A total of 1340 time steps were analyzed.

The European Centre for Medium-Range Weather Forecasts' (ECMWF) reanalysis (ERA40) product
A reanalysis is the result of assimilating atmospheric measurements with numerical forecasts in an attempt to obtain realistic fields; it is a very model dependent "product". For ERA40, the dynamic variables are on a 2D triangular spherical harmonic truncation T159 with 60 levels (Uppala et al., 2005).
Fig. 1 shows plots of M_q for T, u, and h_s for both the GEM analysis (t = 0; Fig. 1a, c, e) and the 144 hour forecast (Fig. 1b, d, f) at 1000 mbar, while Fig. 2 shows the corresponding plots (h_r instead of h_s for GFS) for the ERA40 (Fig. 2a, c, e) and GFS analysis (Fig. 2b, d, f) datasets at 1000 mb, and Fig. 3 shows the trace moments of the u field at 700 mb for all datasets taken between ±30° (Fig. 3a, c, e) and ±45° latitude (Fig. 3b, d, f). To convert the (log_10) regression residual Δ to a percent deviation, use δ = 100(10^Δ − 1); we found δ < ±2% for all analysed fields and δ < ±1% for all initial analyses and short-range forecasts, very close to the values for the same moments of the visible and IR radiances over the range 10-5000 km (≈ ±0.5%; Lovejoy et al., 2009a).
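The residual-to-percent conversion in the last sentence can be stated as a one-liner (the numerical input below is illustrative, not a value from the paper):

```python
def residual_to_percent(delta_log10):
    """Convert a mean log10 residual Delta of the moments about their
    regression lines into a percent deviation: delta = 100 (10**Delta - 1)."""
    return 100.0 * (10.0 ** delta_log10 - 1.0)

# example: a log10 residual of about 0.0086 corresponds to roughly 2%
print(round(residual_to_percent(0.0086), 2))
```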

Testing cascade statistics
For the fields, the scaling extends from the grid size up to ~5000 -20000 km. It is worth noting that the outer scales for these fields (~8000-27000 km) are approximately the same as the outer scales of the radiances (~5000 -32000 km in Lovejoy et al. (2009a), although L eff is somewhat larger for north-south directions when compared to the east-west direction statistics here) and that the C 1 's are not too different from the C 1 for a passive scalar (~0.1) (Lilley et al., 2004).
Similar results were found for the moments of the 2D wind estimate of the energy flux.
From the table, we see that very similar results were found for the GEM forecasts (Fig. 1b, d, f) and for the GFS model; for example, the deviations are of the order ±0.3% for GEM and ±0.5% for GFS, for both the analysis and the 48 h forecast (Tables 1a, b, c). These small deviations allow us to conclude that the analyses and models do indeed accurately have a cascade structure.
Overall, from the table, we can also see that the K(q) "shape parameter", the difficult-to-estimate multifractal index α, is roughly constant at α = 1.8 ± 0.1. From Table 1a, we see that the scale-by-scale characterization of the intermittency near the mean (C_1) has a tendency to decrease with altitude, this being somewhat amplified by a decrease in the external scale (which decreases all the moments by the same factor). Interestingly, C_1 is very similar for the different fields (it is slightly larger for the humidity), although, significantly, the C_1 values are quite a bit larger than those measured by aircraft (Sect. 2.5 in Lovejoy et al. (2009b)), also shown in the table.
Also in Table 1 is a comparison of aircraft estimates of the parameters (Lovejoy et al., 2009b).
Since the aircraft fluxes were estimated in the scaling regime, we don't expect the parameters to be identical to the dissipation range fluxes estimated here (see the discussion in Sect. 2).
However, when the theoretical correction factor (assuming α = 1.8) is used, the agreement is seen to be excellent.
We also check that α for the same field but at different altitudes is roughly constant. This is done by calculating the "reduced moments" M_q^(1/C_1) so as to separate the changes due to the mean intermittency C_1 from the changes in shape due to different L_eff's and α's (cf. Eq. (1b): dividing log M_q by C_1 removes the C_1 dependence). If the reduced moments are equal, then the only difference is the mean intermittency (C_1) parameter. Fig. 4 shows the reduced moments for u, T, and h_s for GEM and ERA40 (the 48 hour GEM forecast was very similar to the analysis) for q = 0.5 and q = 2. We see that the curves, including the small deviations from linearity at large scales (small λ), are very close, so that variations in C_1 do indeed capture much of the field-to-field variability.
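The reduced-moment collapse can be illustrated on synthetic cascades (a sketch: lognormal α = 2 cascades with two hypothetical C_1 values stand in for fields with different mean intermittencies; all parameters are illustrative). Dividing log10 M_q by C_1 should superpose curves that share the same α and outer scale:

```python
import numpy as np

def cascade(n_levels, C1, n_ens, rng):
    # discrete lognormal (alpha = 2) multiplicative cascade ensemble
    s = np.sqrt(2.0 * C1 * np.log(2.0))
    phi = np.ones((n_ens, 1))
    for _ in range(n_levels):
        phi = np.repeat(phi, 2, axis=1)
        phi = phi * np.exp(s * rng.standard_normal(phi.shape) - s * s / 2)
    return phi

def reduced_log_moments(phi, q, C1):
    # log10(M_q)/C1 at each dyadic degradation scale; for universal multifractals
    # this removes the C1 dependence, leaving only the alpha / outer-scale shape
    n_ens, N = phi.shape
    lams, red, scale = [], [], 1
    while N // scale >= 2:
        coarse = phi.reshape(n_ens, N // scale, scale).mean(axis=2)
        lams.append(N / scale)
        red.append(np.log10(np.mean(coarse**q)) / C1)
        scale *= 2
    return np.array(lams), np.array(red)

rng = np.random.default_rng(3)
q = 0.5                                   # low-order moment: statistically stable
lams, r1 = reduced_log_moments(cascade(10, 0.08, 4000, rng), q, 0.08)
_, r2 = reduced_log_moments(cascade(10, 0.12, 4000, rng), q, 0.12)
# same alpha and outer scale, different C1: the reduced curves should roughly
# superpose, with slope close to q**2 - q = -0.25 against log10(lambda)
s1, _ = np.polyfit(np.log10(lams), r1, 1)
s2, _ = np.polyfit(np.log10(lams), r2, 1)
print(f"reduced slopes: {s1:+.3f} vs {s2:+.3f}")
```

The collapse is only approximate at coarse resolutions because spatial averaging produces "dressed" rather than "bare" moments, mirroring the small large-scale deviations from linearity noted above.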

Conclusions
Ever since Richardson speculated that cascades would be helpful in understanding atmospheric dynamics, they have been regularly invoked, especially in statistical theories (i.e. turbulence).
For many, cascades are treated as conceptual tools rather than as concrete models; indeed most applications of cascades do little more than identify a candidate cascade quantity and then, on the basis of dimensional arguments, determine (nonintermittent) spectral exponents. Statistical mechanical-type arguments are sometimes further invoked to determine the direction of the cascade. However in this approach, turbulence theory plays a key role in the identification of the cascade quantity, and these theories have invariably been isotropic (in either two or three spatial dimensions). In the last 12 years, when this classical approach has been used to test cascades on large scale numerical weather models and related products (especially the Atmospheric GCM for the Earth Simulator (Hamilton et al., 2008) and ERA40 (Straus and Ditlevsen, 1997)), it has been found to fare rather poorly. The consequence is that there is currently no accepted turbulence interpretation of the model statistics.
However, starting in the 1960's, in order to help understand intermittency, precise, explicit multiplicative cascade models were developed. These are phenomenological models designed to reproduce some of the symmetries of the governing dynamical equations, specifically the scale-by-scale conservation of turbulent fluxes, the scale invariance symmetries of the dynamical mechanism, and localness in Fourier space (so that interactions are primarily between structures of similar size). It is now understood that the implications are quite generic: they are insensitive to many details, so that the statistics obey Eq. (1) (hence also the existence of universality classes). On the other hand, no specific a priori assumptions need to be made about either isotropy or the physical nature of the cascade quantity; the (generally anisotropic) scale symmetries play a crucial role.
Using (now standard) data analysis techniques (here, "trace moments") we convincingly demonstrate that three leading numerical models of the atmosphere accurately follow the predictions of multiplicative cascade models, including one (ERA40) where the classical (isotropic turbulence) approach failed to follow the predictions of the isotropic theories (Straus and Ditlevsen, 1997). In retrospect, it is fortunate that the models nearly perfectly follow the cascade predictions, because an increasing number of analyses of empirical atmospheric data find that the atmosphere also has a cascade structure, so that the data and numerical models are at least in structural agreement. The cascade structure of the initialisation fields raises the possibility that the results are at least partially artifacts of the constraint that the fields be near the (cascade-like) data at t = 0. However, the forecasts, both short (48 h) and medium range (144 h), show that the model statistics are little changed except for the increase in δ. This suggests, but does not prove, that if the same models were run in "climate mode", i.e. for very long integrations, they would maintain their cascade structure.
Over the period in which numerical and multiplicative cascade models were developed in parallel, they have sometimes seemed irreconcilable, if only because the former are deterministic with strong scale truncations, whereas the latter are stochastic and defined over wide ranges of scale.
However in the last 16 years, with the development of "ensemble" forecasting (e.g., Toth and Kalnay, 1993), there has been a revolution in attitudes towards forecasting; it is now increasingly accepted that the goal is no longer a deterministic forecast of the weather (or climate) but rather the production of a distribution of possible future atmospheric states together with their relative probabilities, i.e. a probabilistic forecast. At the moment, this goal can only be achieved by a nontrivial marriage between the deterministic models and stochasticity, which is currently introduced artificially via relatively ad hoc methods of generating random initial conditions (e.g. "ensemble breeding", see Corazza et al., 2003). Other attempts to introduce the required stochasticity, such as Buizza et al. (1999) and Palmer (2001), involve subgrid "stochastic parametrisations", which already recover the Kolmogorov k^(-5/3) spectral slope down to the grid scale. With the finding of this paper that cascades accurately describe the stochastic structure of the numerical models, several new avenues for modelling appear. A stochastic parameterization of this type could be used to properly implement an ensemble forecast, the differences between the members of the ensemble arising from the stochastic parameterization itself. In the short term, using cascades as theoretically "clean" subgrid parameterisations is promising, while in the medium to long term, quite new purely stochastic forecasting techniques will be possible by exploiting the long range memory implicit in the cascade structures, as mentioned in Schertzer and Lovejoy.