Fractional Brownian motion, the Matern process, and stochastic modeling of turbulent dispersion

Stochastic process exhibiting power-law slopes in the frequency domain are frequently well modeled by fractional Brownian motion (fBm). In particular, the spectral slope at high frequencies is associated with the degree of small-scale roughness or fractal dimension. However, a broad class of real-world signals have a high-frequency slope, like fBm, but a plateau in the vicinity of zero frequency. This low-frequency plateau, it is shown, implies that the temporal integral of the process exhibits diffusive behavior, dispersing from its initial location at a constant rate. Such processes are not well modeled by fBm, which has a singularity at zero frequency corresponding to an unbounded rate of dispersion. A more appropriate stochastic model is a much lesser-known random process called the Matern process, which is shown herein to be a damped version of fractional Brownian motion. This article first provides a thorough introduction to fractional Brownian motion, then examines the details of the Matern process and its relationship to fBm. An algorithm for the simulation of the Matern process in O(N log N) operations is given. Unlike fBm, the Matern process is found to provide an excellent match to modeling velocities from particle trajectories in an application to two-dimensional fluid turbulence.


Introduction
Fractional Brownian motion (fBm), introduced by Mandelbrot and Van Ness (1968), is a canonical stochastic process finding wide-ranging applications in fields as diverse as oceanography (Osborne et al., 1989;Sanderson et al., 1990;Sanderson and Booth, 1991;Summers, 2002), geophysics (Molz et al., 1997), finance (Rogers, 1997), and many others. The essential features of this process are its self-similar behavior-meaning that magnified and rescaled versions of the process appear statistically identical to the originaltogether with its nonstationarity, implying a never-ending growth of variance with time. Two other properties of fBm are its degree of small-scale roughness or fractal dimension (Mandelbrot, 1985;Falconer, 1990, Chapters 2 & 3), and the nature of its long-term memory or long-range dependence (Beran, 1992(Beran, , 1994. As pointed out by Gneiting and Schlather (2004), the self-similarity of fractional Brownian motion links the very small and the very large temporal scales behavior together, such that its memory, fractal dimension, and self-similarity aspect ratio are all controlled by the same parameter. These, in turn, are all connected to the slope of the spectrum in the Fourier domain, in which fBm is found to exhibit a simple power-law behavior.
One important property that cannot be captured by fractional Brownian motion is the tendency for a process to diffuse, or disperse from an initial location at a uniform rate. In the fluid dynamics literature (e.g. Davis, 1983;LaCasce, 2008), it is known that the zero-frequency value of the spectrum of a process quantifies the dispersive tendency of the temporal integral of that process. This recognition leads to a classification of processes, proposed here, based on their spectral value at zero frequency. We refer to random processes as diffusive, subdiffusive, or superdiffusive, depending on whether the spectral value is finite and nonzero, zero, or unbounded, respectively. This quality of "diffusiveness" will be shown to be related to, but distinct from, the more familiar classification of processes as short-memory or long-memory depending on the long-time behaviors of their autocovariance functions (Beran, 1992(Beran, , 1994Gneiting and Schlather, 2004). Fractional Brownian motion is found to be superdiffusive, and is associated with a diffusivity that tends to increase without bound.
A particular application is the stochastic modeling of velocities obtained from particle trajectories in fluid flows. In the field of oceanography, one of the main windows into studying the physics of the ocean circulation consists of position data from instruments that drift freely with the currents (Rupolo et al., 1996;Rossby, 2007;Lumpkin and Pazos, 2007). Similarly, numerical models of fluid systems are frequently analyzed by examining the motion of particles carried with the flow (Pasquero et al., 2002;Veneziani et al., 2005a;Lilly et al., 2011). Such position records are known as Lagrangian trajectories, on account of the moving frame of reference associated with the particles or instruments.
One thread of research attempts to predict Lagrangian statistics based on dynamical assumptions (e.g. Griffa, 1996;Majda and Kramer, 1999;Berloff and McWilliams, 2002;Veneziani et al., 2005a;Majda and Gershgorin, 2013). Here, we instead try to identify the simplest stochastic model that can explain the major observed features, leaving the connection to the equations of motion to the future. Velocities from Lagrangian trajectories are found (e.g. Rupolo et al., 1996) to exhibit power-law behaviors at high frequencies, and indeed fractional Brownian motion has been suggested as a stochastic model (Osborne et al., 1989;Sanderson et al., 1990;Sanderson and Booth, 1991;Summers, 2002). Yet a primary characteristic of these trajectories is their tendency to diffuse at a uniform rate at long times (Taylor, 1921;Davis, 1983;LaCasce, 2008;Koszalka and LaCasce, 2010), a feature that fBm cannot capture.
A type of random process having a sloped spectrum that matches fBm at high frequencies, but that takes on a constant value in the vicinity of zero frequency, exists and is known as the Matérn process (Matérn, 1960;Guttorp and Gneiting, 2006). The same process has been referred to occasionally as the fractional Ornstein-Uhlenbeck process (Wolpert and Taqqu, 2005;Lim and Eab, 2006), because it also generalizes the well-known Ornstein-Uhlenbeck process (Uhlenbeck and Ornstein, 1930) to fractional orders. A multivariate version of the Matérn process is broadly used for spatial statistics in various fields (Goff and Jordan, 1988;Handcock and Stein, 1993;Gneiting et al., 2010;Lindgren et al., 2011;Schlather, 2012). Yet despite the appeal of its generality, the Matérn process appears in only a handful of papers in the time series literature (Wolpert and Taqqu, 2005;Lim and Eab, 2006;Li et al., 2010;Hartikainen and Särkkä, 2010;Sykulski et al., 2016aSykulski et al., , 2017. In fluid dynamics, the only instances we are aware of is an application to wind tunnel data by Von Karman (1948), pointed out by Guttorp and Gneiting (2006), together with a more recent study by Hedevang and Schmiegel (2014).
The purpose of this paper is to investigate the theoretical properties of the Matérn process, in particular its relationship to fractional Brownian motion, and to establish the practical importance of this under-appreciated process for modeling time series that exhibit the fundamental phenomenon of diffusion. On the theoretical side, the Matérn process is seen to be a damped version of fractional Brownian motion, in the same way that the Ornstein-Uhlenbeck process is a damped version of standard Brownian motion. A simple generalization of the Matérn process that incorporates a uniform rotation rate is shown to describe a forced/damped fractional oscillator. By "damped version", we mean that the process is modified as would be expected if a physical damping were introduced into its stochastic differential or stochastic integral equation. This terminology, which draws upon intuition for damped and undamped oscillators from elementary physics, will be made more clear in Section 4.4.
On the practical side, we find the Matérn process to be an excellent match for Lagrangian velocity spectra from a numerical simulation of two-dimensional turbulence, a classical system in fluid dynamics that has been the subject of a large number of studies, (e.g. Lin, 1972;McWilliams, 1990a;Dritschel et al., 2008;Bracco and McWilliams, 2010;Kadoch et al., 2011;Scott and Dritschel, 2013). The Matérn process allows one to simultaneously vary the values of the three most important properties of Lagrangian trajectories: the kinetic energy, the degree of small-scale roughness or fractal dimension, and the long-time diffusive behavior. Thus, it is arguably the simplest stochastic model that can capture the essential features of such data.
A transition of the spectrum to constant values at sufficiently low frequencies is expected to be a common feature of many physical systems. Systems are often characterized by a pressure to grow-represented by a forcing-together with some drag or resistance on that growth, represented by a damping. After a sufficiently long time, the forcing and the damping equilibrate and one reaches a bounded state. This leads to the speculation that many time series that are well described as fBm over relatively short timescales may be better matched by the Matérn process over longer timescales. More generally, the Matérn process adds a third parameter (damping) to the two parameters (amplitude together with spectral slope or the Hurst parameter) of fBm, thus permitting a wider range of spectral forms to be accommodated. It is therefore reasonable to think that the Matérn process could be of broad interest in many areas in which fBm has already proven itself useful.
Many of the results herein may be found somewhere in the literature; the novelty and significance of this paper arise from placing these results in context. The relevant literature is vast, and the results that form this narrative are widely distributed within disparate communities. The concept of diffusivity discussed in Section 2 is well known within physics and fluid dynamics, but is largely unheard of in the time series literature. The Matérn process investigated in Section 4 is well known in spatial statistics, but not in time series or in fluid dynamics. That the Matérn process is essentially damped fractional Brownian motion, one of our main points, has already been recognized by Lim and Eab (2006), who, however, appear to have come upon the Matérn form independently, without using this name and without referencing the existing literature. Thus, the various results brought together here currently exist in such a dispersed state that the significance of combining them is not at all apparent.
The main contributions of this work are: (i) to place the Matérn process in context by understanding its relationship to fractional Brownian motion; (ii) to establish why the Matérn process is important for stochastic modeling of time series, geophysical time series in particular, which is its ability to simultaneously capture the effects of long-timescale diffusivity and small-scale fractal dimensionality; (iii) to demonstrate its performance with an application to a classical physical system; and (iv) to accomplish these goals in a way that is accessible to a general audience. This paper was inspired by the need to develop a stochastic model for a particular physical application. As such, we are cognizant of the need to make stochastic modeling tools accessible to a broad audience. We have therefore endeavored to present material in a manner that is grounded in concepts from signal analysis, as this is a common language shared by many fields. A priority is placed on being self-contained, in order to avoid referring the reader repeatedly to the literature. The use of stochastic differential equations, or other more mathematical tools, is avoided unless absolutely necessary. At the same time, we are aware of the need to maintain rigor, and have therefore sought to carefully qualify any approximate or informal statements. New results are denoted as such.
The structure of the paper is as follows. Section 2 introduces background material regarding the concept of diffusivity and its relationship to the spectrum, and presents a preview of the application to turbulence as a motivation. An introduction to fractional Brownian motion is presented in Section 3. The properties of the Matérn process are then investigated in Section 4. Section 5 presents a new algorithm for fast approximate numerical generation of the Matérn process, and Section 6 returns to the application with additional details. The paper concludes with a discussion.
All numerical software associated with this paper, including a script for figure generation, is distributed as a part of a freely available Matlab toolbox, as described in Appendix A. The paper includes two supplemental animations, http: //www.jmlilly.net/videos/dispersionmovie.mp4 and http:// www.jmlilly.net/videos/turbulencemovie.mp4.

Background and motivation
This section introduces background material on stochastic processes, and identifies the diffusivity as a fundamental second-order stochastic quantity. This importance of diffusivity is illustrated by briefly discussing an application to modeling particle velocities in fluid turbulence.

Complex notation, continuous time
In this paper, we will work with continuous-time, complexvalued processes, a choice that deserves comment. The decision to use complex-valued processes stems from the fact that the main application, to fluid dynamics, consists of analyzing trajectories that may be regarded as positions on the complex plane. For the most part, the results all apply equally well to real-valued processes. The choice to work in continuous time reflects more than convenience, as physical phenomena are generally regarded as existing continuously in time. A discrete time series arises when a process, such as a fluid flow, happens to be sampled at discrete intervals, owing to the constraints of measurements with real-world instruments. For these reasons, we will work in continuous time, and discrete sampling effects will be addressed when relevant.

Autocovariance and spectrum
Let z(t) = u(t) + iv(t) be a potentially nonstationary, complex-valued, zero-mean random process, where i ≡ √ −1. For concreteness herein, z(t) will be regarded as having units of velocity, with u(t) and v(t) giving eastward and northward velocity components, respectively. The autocovariance function of z(t) is defined as where the asterisk denotes the complex conjugate; note this satisfies the symmetry R zz (t, τ ) = R * zz (t + τ, −τ ). If it is the case that z(t) is second-order stationary, meaning that its second-order statistics are independent of global time t, the autocovariance function is written as R zz (τ ). In this case one finds R * zz (−τ ) = R zz (τ ), and thus the autocovariance function of a stationary complex-valued stochastic process has Hermitian symmetry. Another useful property of R zz (τ ) is that it is rotationally invariant in the x-y plane: if one rotates the process counterclockwise through some some constant angle Θ by definingz(t) ≡ e iΘ z(t), we have Rzz(τ ) = R zz (τ ), and the autocovariance function remains unchanged.
It is well known that the autocovariance function of a complex-valued process does not completely characterize its second-order statistics (Mooers, 1973;Picinbono and Bondon, 1997;Schreier and Scharf, 2003). Additional information is contained within a second covariance function which is the covariance between z(t) and its own complex conjugate. 1 This quantity is variously known as the relation function (Picinbono and Bondon, 1997) or complementary autocovariance function (Schreier and Scharf, 2003) or pseudo-covariance (Neeser and Massey, 1993) in the time series literature, and as the outer autocovariance in oceanography and atmospheric science (Mooers, 1973). Unlike the autocovariance function, the relation function changes with a coordinate rotation. Withz(t) ≡ e iΘ z(t) again being a rotated version the process, one finds Czz(τ ) = e i2Θ C zz (τ ). This shows that information regarding the directionality of variability must reside in C zz (t, τ ) and not in R zz (t, τ ). If the process is isotropic, meaning that its statistics are independent of the rotation angle Θ, then clearly C zz (t, τ ) must vanish; the process is then said to be proper or circular or circularly symmetric. In the present paper we are concerned with isotropic processes, and we will therefore limit our attention to R zz (t, τ ). The statistical information contained in the autocovariance function of a second-order stationary process, R zz (τ ), can be equivalently expressed in terms of its Fourier transform, the spectrum S zz (ω), through the inverse Fourier relationship Rather than needing to deal separately with an eastward or u-velocity spectrum and a northward or v-velocity spectrum, the spectrum of the complex-valued velocity z(t) = u(t) + iv(t) compactly includes contributions due to positivelyrotating circular motions e i|ω|τ for ω > 0, and those due to negatively-rotating circular motions e −i|ω|τ for ω < 0. For this reason S zz (ω) is referred to as the rotary spectrum in the oceanographic and atmospheric science literature (Fofonoff, 1969;Gonella, 1972;Mooers, 1973;Emery and Thomson, 2014, Chapter 5.4.4.2). Unlike the spectrum of a real-valued signal, the rotary spectrum is in general not a symmetric function of ω. Because physical processes are generally better separated in the frequency domain than in the time domain, and because the spectrum is a more straightforward quantity to estimate than is the autocovariance, we will work with the spectrum rather than the autocovariance for stochastic modeling.

Diffusive processes
The time integral of the velocity process z(t) defines a complex-valued displacement or trajectory on the complex plane, denoted by where the integral is interpreted as − 0 t z(τ ) dτ for t < 0. This definition of r(t) sets the initial condition r(0) = 0. Drawing on a key concept from physics we introduce the total or isotropic diffusivity as which quantifies the expected rate at which the particles disperse, or spread out, over time from an initial location.
Here E{·} is the expectation operator. Note that κ(t) is the defined as the average of the rates of dispersion in the xand y-directions, κ If an ensemble of particles exhibits a power-law dispersion near some time t with then the local behavior is said to be diffusive if β = 1, subdiffusive if β < 1, and superdiffusive if β > 1. The same process may exhibit different diffusive regimes at different times, but if (6) holds in an asymptotic sense for large t, then the longtime limit of κ(t) is given by where the time-independent, asymptotic quantity κ is conventionally known simply as the diffusivity. In the case that κ is a nonzero constant, one has E |r(t)| 2 = 4κt, and the expected area enclosed by the particle ensemble grows linearly with time. Thus κ quantifies a tendency for random fluctuations to yield systematic outward or radial motion. The seminal work of Taylor (1921) applied the concept of diffusivity to study the random motions of macroscopic fluid particles, a usage that is now widespread in fluid dynamics (LaCasce, 2008). Here we employ the physical concept of diffusiveness to describe the long-term dispersive behavior of random processes in general, regardless of the system being represented.
While the diffusivity is not a recognized quantity in time series analysis, we will show that is an essential second-order descriptor, on par with the variance. If z(t) is a zero-mean second-order stationary process with autocovariance function R zz (τ ) and Fourier spectral density S zz (ω), and having variance σ 2 ≡ E |z(t)| 2 , one finds which shows that the variance σ 2 and diffusivity κ may be seen as time-and frequency-domain analogues of one another. The first of these relations is the well-known Parseval's theorem, while the second is shown in Appendix B. Just as ment. Recall that the diffusion equation with constant diffusivity, but differing diffusivities in the xand y-directions, is ∂ ∂t φ = κx ∂ ∂x 2 φ + κy ∂ ∂y 2 φ for some field φ(x, y, t). Under the assumption of isotropy, the definition κ = 1 2 [κx + κy] leads to the usual form of the diffusion equation ∂ ∂t φ = κ∇ 2 φ where ∇ 2 is the horizontal Laplacian. Defining κ instead as the sum of the component diffusivities would lead to a 1 2 κ appearing in this equation, which is not standard. This accounts for the factor of 1/4 in (5), rather than the more familiar 1/2 that is found in the definition of the component diffusivities κx and κy. the variance σ 2 is given by the integral of the velocity spectrum, or the value of the autocovariance at zero, the diffusivity κ is the integral of the autocovariance, or the value of the spectrum at zero. As each is the zeroth-order moment in one of the two domains, they share a common footing as the two lowest-order and potentially most important secondorder statistical properties of a stationary random process.
The result that the diffusivity is the zero-frequency value of velocity spectrum is not entirely new. It is implicit in a result of Kampé de Fériet (1939), see p. 527-528 of Monin and Yaglom (2007). It is also pointed out in Davis (1983, p. 175) and is mentioned in LaCasce (2008). However, this result does not appear widely appreciated in the ocean/atmosphere literature. Within the time series literature, there does not appear to be a recognition of the potential importance of the zero-frequency value of the spectrum on account of its connection to dispersive behavior.
Because the diffusivity appears as a second-order descriptor of the velocity process z(t), it is useful to categorize z(t) according to the associated diffusivity value. For a given z(t) we may define κ as in (9) through the value of the spectrum at zero frequency, or equivalently, through the integral of the autocovariance. We will refer to z(t) as a diffusive process if it is associated in this way with a non-zero and finite value of κ. Processes associated with zero values of κ will be said to be subdiffusive, while those associated with unbounded values of κ will be referred to as superdiffusive. Note that the diffusivity is a property that can be associated both with the velocity process z(t), in the zero-frequency value of its spectrum, and the trajectory r(t), in its rate of dispersion. To avoid ambiguity, we will say that z(t) is a diffusive process whereas r(t) is a diffusive trajectory, and so forth for suband superdiffusive processes. 3 The classification of a stochastic process as diffusive, subdiffusive, or superdiffusive is related to a well-known property, the process memory. If the autocovariance of a finitevariance stationary process exhibits the long-term decay then the process is said to be a long-memory process or to have long-range dependence (Beran, 1992(Beran, , 1994Gneiting and Schlather, 2004). A short-memory process is one for which the autocovariance falls off more rapidly than |1/τ |, in which case the autocovariance function will be absolutely integrable; note that the statement R zz (τ ) ∼ |τ | −µ means that the magnitude of the autocovariance decays as |τ | −µ . Thus, short-memory stationary processes are those for which the autocovariance function is absolutely integrable, and longmemory stationary processes are those for which it is not.
The process memory is therefore a classification based on the absolute integrability of the autocovariance, whereas the diffusiveness is based on its integrability, as seen in (9). From this one may establish that both short-and long-memory processes can be diffusive or subdiffusive, but only longmemory processes can be superdiffusive. A long-memory process has an autocovariance that is not absolutely integrable, whereas a diffusive process has an autocovariance that is integrable and that integrates to a nonzero value. A function can be integrable but not absolutely integrable, thus a diffusive process can be long-memory. Similarly, both short-memory and long-memory processes could have autocovariances that integrate to zero, giving a subdiffusive process. However, if a function is absolutely integrable then it is also integrable, thus a short-memory process cannot be superdiffusive. For concreteness, examples of spectra of processes with different combinations of diffusiveness and memory are presented in Appendix C based on modifications to the Matérn process.

Application to 2D turbulence
In this paper, we will be concerned with an application to the stochastic modeling of particle trajectories, and the associated velocity time series, from a numerical simulation of fluid turbulence. The system we will use, known as forceddissipative two-dimensional turbulence, see e.g. Chapter 8.3 of Vallis (2006), generates temporally and spatially varying flows that exist purely in the horizontal plane. This system is considered an idealized representation of turbulence in planetary fluid dynamics. Details of the numerical model, including the model equations and parameter choices, are described in Section 6.1. The simulation is carried out in a doubly periodic domain 4 having physical dimension of 2500 × 2500 km, and is integrated for three years. The time series analyzed here are 512 particle trajectories taken from a total of 1024 that are tracked throughout this experiment, and that are initially uniformly distributed throughout the model grid at regular intervals.
A snapshot of the velocity field at the initial time, together with the particle trajectories from the entire simulation, is shown in Fig. 1. The quantity plotted in the left-hand panel is the current speed |U +iV | = √ U 2 + V 2 at time t = 0, where U = U (x, y, t) and V = V (x, y, t) are the velocities at each point in the domain. The roughly circular regions of highspeed currents correspond to long-lived swirling structures termed vortices or eddies. The emergence of vortices is one of the defining features of two-dimensional turbulence (e.g. McWilliams, 1990a). A method for their study based on trajectory data has been developed elsewhere (Lilly and Gascard, 2006;Lilly and Olhede, 2009;Lilly et al., 2011). The focus here, however, is on trajectories not directly influenced by such structures. For this reason, one-half of the trajectories are discarded in order to exclude those directly effected by vortices, using a criterion described in Section 6.1, leaving 512 trajectories that will be analyzed herein. The supplementary animation turbulencemovie.mp4 presents the evolution of these 512 trajectories superimposed on the speed as in Fig. 1a.
These 512 "eddy-free" trajectories are also displayed in Fig. 2a. Here, the position coordinates in the periodic domain have been unwrapped, and the resulting trajectories r(t) offset in the horizontal so as to begin at the origin at time t = 0. Dispersion is then visualized by the circles, which have been drawn with radiĩ r n ≡ E |r(n∆)| 2 (11) at uniformly spaced time intervals n∆t, with ∆ equal to six months and n = 1, 2, . . . 6. In this expression, the expectation operator is interpreted as the average over all 512 trajectories. For constant diffusivity, one expects thatr 2 n = 2κn∆ from (5), such that the total enclosed area increases linearly, and the radius increases as the square root of time. That the trajectories shown here are exhibiting diffusive behavior is thus indicated by the appearance of the circles in Fig. 2a, which become more closely spaced together as time increases.
The average estimated spectrum of the velocity signals z(t) corresponding to these trajectories is shown as the heavy black curve in Fig. 3. Non-parametric estimates of the velocity spectra have been formed for each trajectory by tapering with a lowest-order Discrete Prolate Spheroidal Sequence or "Slepian" taper (Slepian, 1978;Thomson, 1982;Park et al., 1987;Percival and Walden, 1993, Chapter 3.9) having a time-bandwidth product set to a value of 10, see p. 12,677 of Park et al. (1987) for a definition of this parameter. The spectra for all 512 velocity signals are averaged together, and because there is no expected difference between clockwise and anti-clockwise velocities, only spectra for positive frequencies are shown.
The velocity spectrum is observed to have three main features: an overall energy level, a high-frequency slope, and a low-frequency plateau. As shown in the preceding section, the low-frequency plateau of the velocity signals is a reflection of the diffusive behavior of the trajectories. The goal of this paper is to identify a stochastic model capable of reproducing these three features, and to thoroughly understand its properties.

Overview of stochastic models
Consider one-, two-, and three-parameter frequency spectra having the forms Figure 1. A snapshot of current speed from the turbulence simulation (left) together with 1024 particle trajectories (right). In the left panel, shading is the speed U 2 (x, y, t) + V 2 (x, y, t) at each point, with white corresponding to zero velocity and black to 18 cm s − 1. In the right-hand panel, different trajectories are represented by different shadings of gray. The physical domain size is 2500 × 2500 km, with the xand y-axes in this figure given in units of 1000 km. See turbulencemovie.mp4 for an animation of this figure, in which only the 512 particles to be analyzed are shown.
which are taken as models for the complex velocity time series z(t) from the turbulence simulation. The first type of spectrum corresponds to white noise. 5 The second is a power-law spectrum that arises for fractional Brownian motion (Mandelbrot and Van Ness, 1968) for α, termed the slope parameter, in the range 1/2 < α < 3/2. For the slope parameter α > 1/2, the third spectrum is that of a type of random process known as a Matérn process (Matérn, 1960;Guttorp and Gneiting, 2006), which we will show to be a damped version of fractional Brownian motion, with λ > 0 playing the role of an inverse damping timescale. Note that these three models are formally nested within one another: choosing λ = 0, the third becomes the second; and choose α = 0, the second becomes the first.
The form of the Matérn spectrum is fit to the velocity spectra of the turbulence trajectories, in a way that will be described in Section 6, to generate best-fit values of the three Matérn parameters (A, α, and λ) for each of the 512 trajectories. The low-frequency values from these fits are then used to match the white noise spectrum, while the parameters for the high-frequency slopes are used to match the power-law spectrum. For each set of parameters, realizations of these three types of random processes are constructed from the best-fit parameters using the methods described in Section 5. The spectra of the simulated trajectories are then estimated in the same manner as for the original trajectories, and shown in Fig. 3. As expected due to their construction, the white noise and power-law process match only the low-frequency plateau or high-frequency spectral slope, respectively, of the original spectra.
The Matérn spectral form is seen to provide an excellent match to the observed Lagrangian velocity spectra over roughly eight decades of structure. The high-frequency slope is seen to be roughly |ω| −8 , a very steep slope. We are not aware of any physical theory to account for this, nor for the value of the damping parameter λ. Despite the fundamental role that the Eulerian wavenumber spectrum of velocity plays in turbulence theory, the Lagrangian frequency spectrum has received relatively little attention. Attempting to connect the observed form of this spectrum to physical principles is, however, outside the scope of the present paper.
These three different sets of random processes for the velocity time series are then cumulatively summed to form trajectories, and are compared with the original trajectories in Fig. 2; note the axes limits in Fig. 2d are a factor of one million times larger than in the other panels, a consequence of the growth of the variance to enormous values. The tur-  [See figure on previous page] Dispersion curves for the three-year turbulence trajectories and the three different stochastic models discussed in Section 2.5. Panel (a) shows 512 "eddy-free" trajectories, chosen from a larger set of 1024 as described later in Section 6.1. All curves have been offset such that the initial points are located at the origin. Panel (b) shows realizations of a Matérn random process using parameters fit to the velocity spectra of each trajectory, and then cumulatively summed to produce a displacement, also with the initial condition at the origin. Similarly, the lower two panels show trajectories corresponding to white noise velocities (c) and velocities for a power-law process (d), the latter approximated using a Matérn process with very low damping. The stochastic velocities in (c) are chosen to match the low-frequency spectral levels of the turbulence trajectories, while in (d) they are chosen to match the high-frequency spectral slope. All trajectories in the doubly-periodic domain have been unwrapped for presentational clarity, with the gray square in each panel showing the domain size. Note that the x-and y-axes in panel (d) are a factor of one million times larger than those of the other panels, which is why the gray box is not visible. In each panel, black circles show the root-mean-square distance from the originrn defined as in (11). Circles are drawn every six months, beginning at six months and ending at three years. The circles in (d) do not become closer together with increasing radius, indicating superdiffusive behavior. See dispersionmovie.mp4 for an animation of the first two panels of this figure.  Fig. 2. Estimated rotary spectra Szz(ω) are shown for positive frequencies only, since the negative frequency side is statistically identical. The first four curves show the mean values of the estimated spectra, formed as described in the text, of each of the four sets of trajectories shown in Fig. 2. The fifth curve, indicated by a solid line, shows a slope of −8, corresponding to a slope parameter α = 4. The dotted horizontal line marks the approximately limit of double numerical precision for the power law process, 15 orders of magnitude below its maximum; this numerical precision limit accounts for the flattening of gray dashed curve. bulence trajectories and the synthetic trajectories generated from the Matérn model are observed to be virtually indistinguishable in character. See the supplementary file dispersionmovie.mp4 for an animation of the upper two panels of Fig. 2, showing the good agreement between the Matérn trajectories and the turbulence trajectories.
By contrast, the one-parameter and two-parameter spectral models provide poor fits to the observed trajectories, see Fig. 2c,d. The trajectories associated with the white noise velocities match the dispersion curves closely, but the trajectories are far too rough in appearance. When set to match the high-frequency spectral slope and thus the trajectory behavior at small scales, the power-law model for velocity spectra yields trajectories with a vastly incorrect range, too high a degree of smoothness at the large scale, and dispersion characteristic of a continually increasing diffusivity.
Thus, the white noise model is able to correctly match the large scale, low-frequency component of the velocity spectra that accounts for the diffusive behavior of the trajectories. The power-law model is able to correctly match the highfrequency component of the spectrum that sets the smallscale roughness. The Matérn spectrum allows one to match both. This provides a compelling example that motivates examining the Matérn process in more detail.

Fractional Brownian motion
This section reviews the properties of fractional Brownian motion, focusing on the central importance of the spectrum. With a few noted exceptions, this section presents material that is already known in the literature. Readers already very familiar with this process may wish to skip to the description of the Matérn process in the next section.

Spectrum
As described in the Introduction, many real-world processes are found to exhibit power-law behavior over a broad range of frequencies. For a range of spectral slopes, the powerlaw spectrum corresponds to that of a Gaussian random process 6 called fractional Brownian motion (fBm), introduced by Mandelbrot and Van Ness (1968). While the spectrum of fBm is not defined in the usual sense due to its nonstationarity, an expanded version of the notion of a spectrum, dis-cussed in Section 3.3 and denoted as S zz (ω), is found to yield for fBm the form (Flandrin, 1989;Solo, 1992) where α will be called the slope parameter, and with A setting the spectral level. Fractional Brownian motion is a generalization of classical Brownian motion-corresponding to the case α = 1 and therefore to an ω −2 spectrum-for which the slope parameter α can take a range of non-integral values. It is clear that a process having a spectrum proportional to |ω| −2α for α > 1/2 will be singular at zero, and will integrate to an infinite value, thus possessing neither a finite diffusivity nor a finite variance. Both the variance and the diffusivity of fBm will be found to increase without bound. Examples of complex-valued fractional Brownian motion are shown in Fig. 4. Here nine curves are shown for nine different values of α, varying from just greater than 1/2 to just less than 3/2. The decrease in the degree of roughness as α increases, and the spectral slope becomes more steep, is readily apparent in the figure. This occurs due to the fact that larger values of α correspond to stronger degrees of 'filtering', with steep spectral slopes removing highfrequency contributions to variance. Because we are considering that z(t) represents a velocity z(t) = u(t) + iv(t), this figure shows plots of u(t) versus v(t), as opposed to the trajectories that would arise from temporally integrating these quantities.
The main goal of this section is to utilize fBm to understand the implications of the slope parameter α. It will be found that for fractional Brownian motion, α has several intuitively distinct but partly corresponding interpretations: it is directly linked to the temporal decay of the autocovariance function; it controls the aspect ratio of rescaling for self-similar behavior; it sets the fractal dimension or degree of roughness; and it determines the degree of persistence or anti-persistence of a differenced version of the process, see Appendix F. Note that in the fBm literature, the slope parameter α is conventionally replaced with H = α − 1/2, referred to as the Hurst parameter, with 0 < H < 1, in terms of which the fBm spectrum is given by S fBm zz (ω) = A 2 /|ω| 2H+1 . There are compelling reasons to work with the slope parameter α rather than the Hurst parameter H. While spectral slope could be characterized in the vicinity of any frequency, the Hurst parameter is, strictly speaking, a measure of the long-time process range or memory. That is, H is a limiting quantity pertaining to the behavior of the process at very large timescales. As pointed out by Gneiting and Schlather (2004), the self-similarity of fBm implies that the large-scale behavior (memory) and small-scale behavior (fractal dimension) must be linked. However, for stochastic processes more generally, no such link between large and small scales is required. The spectral slope is therefore more appropriate when showing the connection of fBm to its damped version, the Matérn process, which is a short-memory process. Fur-thermore, because the appearance of the Matérn process as damped fractional Brownian motion is most clear in the frequency domain, it is sensible to work with a parameter that makes the spectral form simple.

The fBm autocovariance function
Fractional Brownian motion is defined in terms of a stochastic integral equation, which will be presented later in this section. This stochastic integral equation leads to a nonstationary autocovariance function given by (Mandelbrot and Van Ness, 1968) where V α is a normalizing constant defined shortly. The exponents take on values in the range 0 < 2α−1 < 2 due to the fact that 1/2 < α < 3/2. Thus the dependence of R fBm zz (t, τ ) on t and τ varies from being relatively flat, near α = 1/2, to relatively steep, near α = 3/2.
Observe that fractional Brownian motion is nonstationary-its autocovariance is a function of "global" time t as well as the time offset τ . Most significantly, the variance of fBm is which increases without bound; the longer one waits, the larger the expected amplitude of variability becomes. The time-varying fBm diffusivity is found to be as we readily find by integrating the autocovariance as in (B4). Like the variance, the diffusivity tends to increase without bound, rather than taking on a constant value. Note that the ratio of the diffusivity to the variance increases linearly with time, κ(t)/σ 2 (t) = 1 4 |t|(α + 1)/α. The normalizing constant in fBm, conventionally denoted V α , is defined as the variance at time t = 1 of an fBm process having the amplitude parameter A set to unity, Its value is found to be ( Barton and Poor, 1988) where Γ(x) is the gamma function. We find in Appendix E that this constant can be cast in the more symmetric form Figure 4. Plan view of realizations of complex-valued fractional Brownian motion z(t), for nine different values of the slope parameter α, as indicated in the legend. All nine time series have been set to have unit sample variance, and the real part of each curve is offset by a value of −3 from that for the next lower value of α. The smallest value of α, corresponding to the least smooth process, is at the right. The data aspect ratio is equal between the real and imaginary parts.
which allows one to see behavior of this coefficient more clearly. Recall that Γ(x), while positive for positive x, is negative in the interval (−1, 0), as follows from the reflec- Thus V α is positive over the whole permitted range of α, 1/2 < α < 3/2, but becomes unphysically negative as one passes outside of this range. Because the gamma function has a singularity at zero, with Γ(x) tending to positive infinity as x approaches zero from above, V α also tends to positive infinity as one approaches the two endpoints α = 1/2 and α = 3/2. Finally, from Γ(1/2) = √ π and Γ(2) = 1, the value of the coefficient for the Brownian case of α = 1 is found to be V 1 = 1.
In addition to the autocovariance function, it is informative to also examine a related second-order statistical quantity, where {·} denotes the real part. This quantity is commonly known as the variogram in time series analysis and geostatistics, following Cressie (1988) and Matheron (1963); in the turbulence literature, the same quantity is widely used and is known as the second-order structure function, a term which dates back at least to the 1950's (Monin, 1958). For a stationary random process, the variogram becomes simply γ zz (t, τ ) = γ zz (τ ) = σ 2 − {R zz (τ )}. Thus in the stationary case, the variogram merely repeats information already present in the autocovariance function. For fractional Brownian motion, cancellations in the variogram occur and one obtains which is independent of global time t. Thus unlike its autocovariance function, the variogram of fBm is stationary. This equation states that the expected squared difference between fBm values at any two times is proportional to a power of the time difference, implying that the expected rate of growth of the fBm from its current value is independent of t. One might therefore say that fBm is nonstationary, but in a timeindependent or stationary manner. A process having a stationary variogram is said to be intrinsically stationary (Ma, 2004).

Linking the spectrum and autocovariance
Owing to its nonstationarity, the fBm autocovariance cannot be Fourier transformed in the usual way to yield a spectrum that is independent of global time t. Evidently the notion of what it means to be a Fourier transform pair must be generalized to accommodate the time-dependent autocovariance. That the spectrum of fractional Brownian motion should be a power law of the form |ω| −2α was already conjectured by Mandelbrot and Van Ness (1968), based on earlier work by Hunt (1951) on the spectrum of its increments. Proving that this should be the case was accomplished by Solo (1992) using one approach, and by Flandrin (1989) and Øigård et al. (2006) using two variants of a different approach. Here, we essentially follow the latter paper, incorporating some additional details.
In general, the Fourier transform with respect to τ of a nonstationary autocovariance function R zz (t, τ ) defines a time-varying relative of the spectrum which, provided the integral on the right-hand side is well defined, is known as the Rihaczek (Rihaczek, 1968;Flandrin, 1999, p. 60-62) or Kirkwood-Rihaczek (Kirkwood, 1933;Hindberg and Hanssen, 2007;Øigård et al., 2006) distri-bution, or alternatively as the time-frequency spectral density (Hanssen and Scharf, 2003). If one averages the Rihaczek distribution across global time in moving windows of length T , then takes the limit as T approach infinity, one obtains where S zz (ω) is a time-averaged spectrum of a potentially nonstationary process. Observe that for stationary processes, R zz (t, τ ) and therefore S zz (t, ω) are independent of the global time t. In this case, S zz (t, ω) reduces to the usual Fourier spectrum S zz (ω), the time average in (22) has no effect, and S zz (ω) is therefore also identical to the usual Fourier spectrum S zz (ω). Thus S zz (ω) is a generalization of the usual Fourier spectrum, to which it reduces in the stationary case, that may be used to describe nonstationary processes.
For fractional Brownian motion, the Rihaczek distribution was stated by Øigård et al. (2006) with δ(t) being the Dirac delta function; see Appendix D for details of the derivation. The time-averaged version of the fBm Rihaczek distribution, defined as in (22), is given by and in the limit as the averaging time T approaches infinity, we have the time-averaged nonstationary spectrum for fBm, where all terms dependent on global time t are found to vanish. This determines a sense in which the power-law form is the correct spectrum to associate with nonstationary fractional Brownian motion. In the approach of Flandrin (1989), a different, but related, time-varying generalization of the spectrum is used instead of the Rihaczek distribution, but leading to the same power-law form for the time-averaged spectrum. This approach to proving that the power-law form is the correct spectrum to associate with fBm may be critiqued on the grounds that taking the limit of an average of the timefrequency spectral density, while mathematically sensible, does not correspond well with a limiting action that occurs in actual practice. Solo (1992) took a different approach, and found that if the expected autocovariance and spectrum are estimated from a sample over a finite time interval, the power-law form again emerges in the limit as that the time interval tends to infinity. That proof therefore has a strong intuitive appeal, but is more involved than the argument presented here.

Self-similarity
The most striking feature of fBm is that it is statistically identical to rescaled versions of itself. To show this, we define a time-and amplitude-rescaled version of z(t) as where the amplitude rescaling has been chosen to depend upon β as well as the slope parameter α. From (13), one finds and the autocovariance function of the rescaled process is determined to be the same as that of the original process. Because the original, unrescaled fBm process is Gaussian as well as zero mean, its statistical behavior is completely characterized by its autocovariance function. Thus fBm is statistically identical to itself when we "zoom in" in time, provided we also magnify the amplitude appropriately. This property was referred to as self-similarity in the original work of Mandelbrot and Van Ness (1968); although later the term self-affinity was suggested as a substitute (Mandelbrot, 1985), the original term appears to be in more widespread use.
The positive constant β can be seen as a temporal zoom factor, while the coefficient β α−1/2 describes how the amplitude is to be rescaled. Choosing β > 1 corresponds to zooming in in time, since then the interval from zero to β in the new processz(t) is drawn from the smaller interval zero to one in z(t/β). Similarly, β α−1/2 with α > 1/2 is greater than one, implying the amplitude must also be magnified. The required degree of amplitude magnification increases with α from a minimum value of unity at α = 1/2 to a value of β at α = 3/2. The slope parameter α therefore governs the aspect ratio of rescaling for this self-similar behavior.
An illustration of self-similarity is presented in Fig. 5, using the real parts of the nine realizations shown in Fig. 4. The two panels show the effects of the self-similar rescaling (28) on each time series with a zoom factor β = 4, with the zooming represented by the gray boxes. The boxes on the left, of different aspect ratios, are rescaled according to the law (28) to have the same aspect ratios, as shown on the right. It is clear that each of the nine curves presents the same degree of roughness, and same amplitude of variability, on the left as on the right. This demonstrates what is meant by statistical self-similarity, and shows how α controls the aspect ratio. A distinguishing feature of fractional Brownian motion is that this zooming may be continued indefinitely in either direction.
For stationary processes, self-similarity may also be seen in the frequency domain. Apply the rescaling (28) to some process z(t), which is now assumed to be stationary. From the Fourier representation of the autocovariance, one finds (30) after employing the change of variables ω/β → ω. Thus, in order for the process to be self-similar, one must have in the spectral domain. This would clearly be the case for the power-law spectrum S zz (ω) = A 2 |ω| −2α , if a stationary process with such a spectrum were to exist. More generally, if a process has an approximately power-law spectrum over a range of frequencies, then the self-similarity condition (31) is expected to be approximately satisfied over that range. In this sense a power-law spectrum implies self-similarity. Fractional Brownian motion is peculiar in that it has neither a well-defined derivative nor a well-defined integral. Loosely speaking, one may say that a derivative does not exist because the limiting action of taking a derivative conflicts with the self-similarity. Because z(t) exhibits variability at infinitesimally small scales, [z(t + ∆) − z(t)]/∆ does not have a well-defined limit as ∆ tends to zero. The integral t −∞ z(u) du does not exist either, because z(t) has unbounded variance as t progresses toward to infinitely large negative times and is therefore not integrable. Nevertheless, a differenced version of fBm does exist. This process, termed fractional Gaussian noise, is discussed for completeness in Appendix F.

Fractal dimension
The property of self-similarity, which is global in nature, was shown in the previous section to be related to the spectral slope. The slope is also related to two local properties, one associated with the slope at small frequencies, or the behavior of the autocovariance at large time offsets, and one associated with the slope at high frequencies, or the autocovariance at small time offsets. The former property is the process memory or long-range dependence discussed in Section 2.3, while the latter is the fractal dimension. While we view spectral slope as the more physically meaningful quantity, its relationship to fractal dimension is here discussed for completeness.
Fractal dimension is a measure of the dimensionality of a curve (or some higher-order surface) that accounts the effect of roughness (Mandelbrot, 1985;Falconer, 1990). There are several different measures of fractal dimension in use, giving sometimes different values of dimensional measure for a particular curve (see e.g. Mandelbrot, 1985;Taylor and Taylor, 1991;Dunbar et al., 1992). The most well-known measure, the Hausdorff dimension, is related to the behavior of the autocovariance function or variogram at very short timescales. One must also distinguish between the dimension of a curve as a function of the time variable, as in u(t) = {z(t)} versus t, and the dimension of a curve such as z(t) = u(t)+iv(t) in space or u(t) versus v(t), see e.g. Qian (2003). In the literature, the former is known as a graph, and the latter as a sample path.
The dimension of the graph is closely related to the shorttime behavior of the autocovariance. As described by Gneiting and Schlather (2004), for a univariate (or real-valued) stationary process u(t) that has an autocovariance function behaving as |τ | ρ for some 0 < ρ ≤ 2 as τ → 0, the Hausdorff dimension of the graph of the process is given by D = 2−ρ/2. The comparable result for intrinsically stationary Gaussian processes such as fBm is provided by Adler (1977). For fBm, ρ = 2α − 1, hence the dimension of the graph of (realvalued) fBm is D = 5/2 − α. This varies from D = 1 for the smoothest processes having α = 3/2, to D = 2 for the roughest processes with α = 1/2, corresponding to the bottom-totop progression seen in Fig. 5.
As pointed out by Gneiting and Schlather (2004), the selfsimilarity of fBm links the behavior at very large scales and very small scales together. Because for fBm the spectral slope is constant, the fractal behavior at small scales implies a singularity in the spectrum at the origin. This is associated with unbounded diffusivity, and since this singularity is not integrable, with unbounded variance as well. The Matérn process examined in the next section has an additional degree of freedom compared to fBm, such that the spectrum transitions to flat values for sufficiently low frequencies. This decouples the fractal dimension from the low-frequency behavior and permits the phenomenon of diffusivity to arise. Figure 5. A demonstration of self-similarity for fractional Brownian motion, using the realizations presented in Fig. 4. The real part of each process is shown, with the y-axes in this figure corresponding exactly to the x-axis in Fig. 4. The gray boxes in panel (a) illustrate the different scaling behaviors, as described by (28). When each process is rescaled such that the boxes in (a) are transformed to the boxes in (b), the resulting time series are statistically identical to the originals. Thus the rescaled curves in (b) present the same degree of roughness as the corresponding curves in (a). The temporal magnification factor is β = 4, while the amplitude magnification factor β α−1/2 varies from 1 at α = 1/2 to 4 at α = 3/2. In order to avoid the appearance of additional roughness in (a) due only to numerical resolution, only every fourth point in (a) is shown; thus the curves in (a) and (b) consist of the same number of points.

Stochastic integral equation
Fractional Brownian motion is defined via the stochastic integral equation (Mandelbrot and Van Ness, 1968) where dW (t) here are increments of the complex-valued Wiener process, the covariance of which between itself at two different times is The integration with respect to dW (s) indicates in (32) that these integrals are of the Riemann-Stieltjes form, see Percival and Walden (1993). The process dW (s) can be said to represent continuous-time white noise, thus this equation defines fBm as a weighted integral of white noise. Because in (32) one may exchange the order of the integral and the expectation operator, and dW (s) is zero mean, z(t) is also zero mean. As dW (s) is Gaussian and z(t) is a linear combination of Gaussian random variables, z(t) is also Gaussian. Thus z(t) inherits both zero-meanness and Gaussianity from the increments of the Wiener process. Further intuitive content of (32) is not initially apparent, so we will take some time to examine it in detail. Note that standard Brownian motion, corresponding to α = 1, is defined for all t as in which the integral is interpreted as z(t) = −A 0 t dW (s) for t < 0. This is simply the temporal integral of white noise. The fBm definition (32) reduces to the Brownian form with α = 1, with the first term in (32) vanishing.
The stochastic integral equation (32) can be written in the somewhat more transparent form where I(t) is the indicator, or unit step, function defined as The purpose of the second term in (35) is now clearly seen to set the initial condition. It is not a function of time; it is simply a random number, chosen to set z(0) = 0 identically. Note that the two components of (35) cannot be written as separate integrals, because writing them as two separate integrals would mean that two different realizations of dW (s) are involved. The two terms in (35) must be based on the same realization of dW (s) in order to achieve the initial condition z(0) = 0; this is not true for the two terms in (32), which correspond to two different intervals of integration.
The weighting factors such as (t − s) α−1 in (32) may be seen as creating a fractional integral of the Wiener process, as will now be shown. There is a simple expression for a function f (t) that is integrated n times from some initial point a to time t, an action that is the reverse of the repeated derivative (d n /dt n )f (t). This formula, known as Cauchy's formula for repeated integration, states t a τ1 a · · · τn−2 a   τn−1 a f (τ n ) dτ n   dτ n−1 · · · dτ 2 dτ 1 (37) meaning that one may collapse an integral that is repeated n times into a single integral, with a weighting to the (n − 1)th power. Note that applying (d n /dt n ) to both sides, one obtains f (t) = f (t)-the left-hand side by repeated applications of the fundamental theorem of calculus, and the righthand side by repeated applications of the Leibniz integral rule.
While the left-hand side of the Cauchy integral formula is not interpretable for non-integer α, the right-hand side remains valid. This allows us to define a fractional integral of f (t) by letting n take on non-integer values in the right-handside of (37). According to this reasoning, the quantity is known as the Riemann-Liouville fractional integral, and may be said to integrate the function f (t) a fractional number of times α. For further details on fractional calculus, see e.g. Gorenflo and Mainardi (1997).
Returning to the definition of fBm in (32), we now see that it is simply a fractional integral of continuous-time white noise, modified to have the initial condition z(0) = 0. Unlike standard Brownian motion (34), which is integrated only from time t = 0, for fractional Brownian motion one integrates from the infinite past in order to obtain the desired statistical behavior, and then one offsets this process by the correct amount in order to set the desired initial condition.

The Matérn process
The previous section reviewed the properties of fractional Brownian motion, including its self-similarity and fractal dimension, and showed how these are related to the spectral slope. This section examines the Matérn process in detail, with a focus on its relationship to fBm. A simple extension, the inclusion of a 'spin parameter', generalizes the Matérn process to encompass a larger family of oscillatory processes that are shown to represent forced/damped fractional oscillators.

The Matérn process and its spectrum
In Section 2 we showed that fractional Brownian motion is unable to capture long-time diffusive behavior, and demonstrated that this was a deficiency for the particular application to modeling particle velocities in two-dimensional turbulence. Regarding the spectra in Fig. 3, one sees a highfrequency power law slope but a low-frequency plateau. This leads us to consider a spectrum of the form which is the spectrum of a type of stationary random process known as the Matérn process (Matérn, 1960;Guttorp and Gneiting, 2006). Unlike fBm, the Matérn process is defined for all α > 1/2 and not just in the range 1/2 < α < 3/2. Compared with fBm, the Matérn spectrum incorporates an additional (non-negative) parameter λ having units of frequency, which will be shown to have the physical interpretation of a damping. Note that the form of the Matérn spectrum also generalizes that of the Ornstein-Uhlenbeck process, corresponding to the α = 1 case, to fractional orders (Wolpert and Taqqu, 2005;Lim and Eab, 2006). Examples of simulated Matérn processes are shown in Fig. 6, for twelve different values of α and three different values of λ. The box indicates a very low-damping regime with 1/2 < α < 3/2, roughly corresponding to the fractional Brownian motion realizations seen in Fig. 4. There are two important differences when compared to fBm. The first is that there is no upper bound on α, so the spectral decay can become even steeper than for the α = 3/2 case that defines the upper limit of the slope parameter for fBm. The second is the role of the additional parameter λ. As this parameter is increased, the curves for any α value appear more and more like white noise.
The damping parameter λ thus emerges as controlling the transition between two distinct spectral regimes. The Matérn spectrum is observed to have two limits so that, for high frequencies, an fBm-like power-law decay is recovered, while for low frequencies the spectrum approaches a constant. The spectrum may therefore may be said to be locally white (or constant) for small |ω|/λ; that is, the spectrum is not constant over all frequencies, or globally white, but it is approximately white for sufficiently low frequencies. The Matérn process thus provides a continuum between the two regimes of white noise and a power-law spectrum, with a transition dictated by the value of λ. Equivalently, λ gives the approximate timescale at which the process begins to exhibit self-similar behavior, as one zooms out from very large timescales. It follows that in real-world applications, the sampling interval must be sufficiently small compared to λ in order to resolve the self-similar behavior. The theoretical spectra corresponding to the realizations in Fig. 6 are shown in Fig. 7a. When frequency is normalized by the damping parameter, the theoretical (as opposed to the sampled) spectra for the different λ values become identical. A transition in the vicinity of ω/λ = 1 is readily apparent. The different spectral levels reflect the choice of normalization, which is that σ 2 has been set to unity. Smaller values of α, corresponding to slower decay, therefore appear with lower spectral levels in order to integrate to unit variance.
To examine the role of λ as a transition frequency, we take the derivative of the logarithm of the spectrum, and obtain and note that d 2 dω 2 ln S M zz (ω) vanishes at |ω| = λ, so the rate of change (42) obtain an extremum at that frequency. The third derivative d 3 dω 3 ln S M zz (ω) is positive at |ω| = λ, indicating that this extremum of d dω ln S M zz (ω) is a minimum. Thus the parameter λ gives the frequency at which ln S M zz (ω) is decreasing most rapidly with increasing |ω|, a natural choice to designate the transition between the energetic "white" regime at low frequencies and the decaying regime at high frequencies. Since d dω ln S M zz (ω) = d dω S M zz (ω) /S M zz (ω), |ω| = λ is the frequency at which the fractional decrease in S M zz (ω) is largest.
The variance and diffusivity of the Matérn process are both finite, and are found to be given by in which we have introduced the normalizing constant where B(x, y) ≡ Γ(x)Γ(y)/Γ(x + y) is the beta function. The value of the diffusivity is found from κ = S zz (0)/4, see (9), together with the Matérn spectrum form in (39), while the variance is (45) after the change of variables ω 2 → xλ 2 . Applying one of the defining forms of the beta function, e.g. Gradshteyn and Ryzhik (2000, 3.194.3), then leads to the variance expression given in (43). The Matérn spectrum can be rewritten in terms of the variance σ 2 as so that the diffusivity becomes κ = 1 4 σ 2 /(λc α ). In this form, the Matérn spectrum becomes a function of σ 2 , α, and λ rather than A 2 , α, and λ. This will prove to be more convenient for numerical optimization during parameter fitting, because reasonable ranges for σ are more readily determined than are ranges of A. This re-parameterization also simplifies somewhat the form of the autocovariance function, presented next.

The autocovariance function
The autocovariance function corresponding to the spectrum (39) is found to be (Matérn, 1960;Guttorp and Gneiting, 2006) where for notational convenience we have introduced the Matérn function as a modified version of the K α−1/2 (x), the decaying modified Bessel function of the second kind of order α − 1/2. Integral relation 17.34.9 given on p. 1126 of Gradshteyn and Ryzhik (2000) may be rearranged to give for α > 0 and λ > 0, verifying that (48) is the inverse Fourier transform of (47). The cosine integral version of this result is sometimes known as Basset's formula, see Watson (1922, p. 172), who states the case of integer α is originally due to Basset (1888, p. 19), and who also discusses some history of the integral on the right-hand-side. Examples of theoretical Matérn autocovariance functions are presented in Fig. 7b, again corresponding to the realizations in Fig. 6. As is usual with Fourier pairs, the most localized spectra correspond to the most distributed autocovariance functions, and vice-versa. As α decreases, the autocovariance falls off more and more quickly from the origin, with a singularity developing at the origin as α approaches one-half.
The asymptotic behavior of the Matérn covariance for large and small times is as follows. For |τ | 1/λ, one has the behavior as follows from the asymptotic behavior of the modified Bessel function for large argument (Abramowitz and Stegun, 1972, 9.7.2). Thus the Matérn process exhibits exponential decay of its covariance function, and is therefore categorized as a short-memory process. For time offsets that are small compared to the damping timescale, |τ | 1/λ, and for the slope parameter in the range 1/2 < α < 3/2, one finds  (61), respectively. As in Fig. 6, the various α values are shown by alternating black and gray lines, with α = 2 shown as a heavy black line. Time and frequency have been nondimensionalized as τ λ and ω/λ, respectively; thus the transition between a flat and a sloped regime occurs in the vicinity of ω/λ = 1 in (a), while the e-folding time in (c) is τ λ = 1. The autocovariance function (b) develops a strong singularity as α approach 1/2, which is linked to the flattening of the spectrum in (a). The Green's function in (c) is infinite at τ λ = 0 for α < 1, and vanishes at τ λ = 0 for α > 1.
as the short-time behavior of the Matérn autocovariance function. This is derived in Appendix G following Goff and Jordan (1988, their Appendix A), who were apparently the first to establish it, see Guttorp and Gneiting (2006). It is also shown in Appendix G that for α > 3/2, the lowest-order dependence of the Matérn autocovariance function no longer contains a power of α, but instead remains proportional to τ 2 as α increases.
The expression (52) for the short-time behavior of the Matérn autocovariance may be simplified by noting which relates V α , the coefficient of fractional Brownian motion defined in (18), to c α , the normalizing constant for the Matérn process defined in (44). These two definitions together with the duplication formula for the gamma function (E5) presented in Appendix E lead to the above result. Substituting this into the asymptotic expansion (52) for small |τ |, we obtain for 1/2 < α < 3/2 after making use of the expression for the Matérn variance given by (43). This matches exactly the τ -dependence inferred for a power-law spectrum inferred in Appendix D using a limiting argument. Note that the only dependence on λ of the autocovariance for |τ | 1/λ is through the variance σ 2 .
From this small-τ expansion, we can immediately determine the fractal dimension, as discussed in Section 3.5. One finds so that the fractal dimension decays from D = 2, for very rough processes with α = 1/2, to D = 1, for smooth processes with α = 3/2, just as with fractional Brownian motion. For slopes steeper than ω −3 , the fractal dimension remains at unity. This is a consequence of the fact that for α > 3/2, the highest power of τ appearing in the small-τ expansion (52) is τ 2 .

Inclusion of spin
A very simple modification can expand the range of possibilities of the Matérn process, and also aid in the development of physical intuition. We add a deterministic tendency for the process to spin on the complex plane at rate Ω, and refer to this new process as the oscillatory Matérn process or oMp. Modulating the Matérn autocovariance R M zz (τ ) by e iΩτ gives for the new autocovariance function / spectrum pair. Note that with α = 1, these reduce to where we have made use of 10.2.17 on p. 444 of Abramowitz and Stegun (1972) for the former equality. These are observed to be the autocovariance and spectrum of the complex-valued oscillator known as the complex Ornstein-Uhlenbeck process (Jeffreys, 1942;Arató et al., 1999). Thus the oscillatory Matérn process subsumes the Matérn process and the complex Ornstein-Uhlenbeck process into a larger family. In this next section we will determine the stochastic integral equation of this oscillatory Matérn process.

Stochastic integral equation
Unlike fractional Brownian motion, the Matérn process is not generally defined in terms of a stochastic integral equation or a stochastic differential equation. A stochastic integral equation that will generate an oscillatory Matérn process is where the Green's function, or impulse response function, is Note that the Green's function has been set to vanish before time t = 0, thus corresponding to a causal filter. The Fourier transform of a Green's function g(t) is an important quantity known as the transfer function, and we find for the Matérn transfer function, using 3.2.3 on p. 118 of Bateman (1954). In terms of the Green's function, the autocovariance function is given by (63) with the last expression following from the orthogonality property of the Wiener increments (33), together with a change in the variable of integration. From the familiar crosscorrelation theorem it then follows that spectrum of the process generated using the Green's function (61) matches that for the oscillatory Matérn process (57). 7 Examples of the Green's functions for Ω = 0 are shown in Fig. 7c. Note a change in behavior across α = 1. For higher values of α, the Green's function vanishes at τ = 0, thus developing a maximum that is seen to shift away from the origin as one increases α. For α < 1, however, a singularity develops at the origin, and the Green's function monotonically decays with increasing time.
Identifying this stochastic integral equation sheds light on the nature of the Matérn process itself. The Green's function g(t) defined in (61) is also the solution to an impulse forcing of the damped fractional oscillator equation as shown in Appendix H. This establishes the physical interpretation of the oscillatory Matérn process as a damped fractional oscillator forced by continuous-time white noise. The standard Matérn process is then seen as a forced/damped fractional oscillator in which the oscillation frequency is set to zero. Note that here we have avoided attempting to write the Matérn process as a stochastic differential equation, as there are mathematical difficulties in ensuring that the fractionalorder derivatives exist. 8 The approach we have taken, comparing the impulse response function (61) for the Matérn stochastic integral equation (60) with that for the deterministic fractional differential equation (65), is intended to determine the physical nature of the system while sidestepping such mathematical difficulties.
We can also now understand why λ is referred to as a 'damping'. In the α = 1 case, the oscillatory Matérn process becomes identical to the complex Ornstein-Uhlenbeck process, as previously mentioned. The Green's function for this process is e iΩt−λt for non-negative t, and zero elsewhere. This Green's function is also the solution to the first-order ordinary differential equation which is the equation for a damped, one-sided oscillator forced by a delta function. This equation appears, for example, in the study of oscillations of the ocean surface layer forced by the wind (Pollard and Millard, Jr., 1970), in which λ parameterizes a physical drag. In the Green's function, λ sets the timescale of the decay of the oscillations, and it therefore also controls the decorrelation time in the autocovariance function (58). In the spectrum (59), λ removes the singularity at ω = Ω, replacing it with a 'bump' that becomes more spread out as λ increases. All of these factors support interpreting λ as a damping for α = 1. For other values of α, we see that λ still controls the decay of the Green's function (61), the long-term decay (51) of the autocovariance function (56), and the spreading out of the singular peak at ω = Ω in the spectrum (57). In the fractional differential equation (65) as well, λ appears as a quantity that can trade off against the rate of change. Thus, for α = 1, the parameter λ still acts in a way that supports its identification as a damping.
As shown in the next section, if the damping vanishes, the stochastic integral equation for the Matérn process becomes identical to that for fractional Brownian motion, apart from a modification that sets the initial condition for fBm.

Relationship to fractional Brownian motion
Having identified the stochastic integral equation for the Matérn process, we now examine its relationship with fractional Brownian motion. The Green's function of the oscillatory Matérn process (61) can be rewritten as where I(t) is the indicator function defined in (36), and where we explicitly specify the dependence of g(t) upon the Matérn parameters. In terms of this Green's function, the stochastic integral equation defining fBm (35) becomes in which g α,0,0 (t) = 1 Γ(α) I(t) t α−1 . The only difference between this and the equation for the undamped, nonoscillatory Matérn process (60) is the second term in the integral, which as shown earlier, serves the function of enforcing the initial condition z(0) = 0. This confirms that the standard Matérn process with λ > 0, and consequently with a Green's function of the form g α,λ,0 (t) = 1 Γ(α) I(t) t α−1 e −λt , is rightly thought of as damped fractional Brownian motion.
If fractional Brownian motion and the standard Matérn processes are essentially facets of the same process, one should be able to see this directly from their autocovariances. This is indeed the case. For time shifts τ that are very small compared to the global time t, the fBm autocovariance (13) is approximately given by (70) where σ 2 (t) ≡ R fBm zz (t, 0) = V α A 2 |t| 2α−1 is the timevarying fBm variance encountered earlier in (14). This matches (54) for the Matérn autocovariance at small |τ |/λ.
The intuitive interpretation of this result is that a Matérn process has a second-order structure that behaves for small time offsets τ in the same way as does fractional Brownian motion, considered for offsets τ that are small compared with the current global time t. Or, even more succinctly, the local behaviors of the Matérn process and fBm are the same; they differ from each other only for sufficiently large time offsets.
To look at this another way, imagine that a modified Matérn process were constructed with an integral matching the form of that for fractional Brownian motion (69). In other words, we define z(t) as in (69) but for arbitrary values of λ. Such a process would then by definition have z(0) = 0, and would therefore not be stationary. For nonzero λ, after a sufficiently long time this initial condition is 'forgotten' on account of the decaying exponential in the Green's function, and the process will eventually behave as if it were stationary. For λ = 0, however, this initial condition is never forgotten.
The qualitatively significant difference between the Matérn process and fBm-that the former is stationary, while the latter is non-stationary-can be seen as a consequence of the lack of damping in the latter case. In applications, we believe it would be unphysical to observe a process that remains nonstationary for all timescales. Rather, for sufficiently long observational periods, it is more likely that the process will eventually settle into stationary behavior. For the Matérn process, this occurs when the observational window is sufficiently long compared with the decay timescale λ −1 . Another difference is that the value of fBm at time t = 0 is fixed to an exact value of zero, while that of the Matérn process is random. However, since it is common practice to remove the sample mean prior to analyzing a data time series, and/or to add a constant offset to a generated process, this distinction makes little practical difference for applications such as the one presented here.

Generation
This section addresses means to simulate realizations of fractional Brownian motion and the Matérn process numerically. The main contribution is a new approach to simulating a diffusive process such as the Matérn in O(N log N ) operations, by relying on the knowledge of its Green's function. Readers not interested in these numerical details may feel free to proceed to the application in Section 6.

The Cholesky decomposition
The standard approach to simulating a Gaussian random process with a known covariance matrix is a method called the Cholesky decomposition, which we discuss here. In this section, as we will be dealing with vectors and matrices, a change of notation is called for. We now let z n ≡ z(n∆) with integer n denote a discretely sampled random process, sampled at N times separated by the uniform interval ∆.
This sequence is arranged into a length N random column vector denoted z. We define the expected N × N covariance matrix of z as R ≡ E zz H , where the superscript "H" denotes the conjugate transpose, having components Here n∆ plays the role of global time t, and (m − n)∆ that of the time offset τ , in the evaluation of the nonstationary covariance function R zz (t, τ ) = E {z(t + τ ) z * (t)}. Thus variation in R of the time offset τ with fixed global time t occurs in the direction perpendicular to the main diagonal, while variation of t with fixed τ occurs along the main diagonal.
In the case of a stationary process, there is no variation parallel to the main diagonal, and R is then said to be a Toeplitz matrix. The Cholesky decomposition factorizes the covariance matrix as R = LU, where L is lower triangular and U is upper triangular. It follows from the Hermitian symmetry of R that L = U H . Now let w be an N -vector of unit-variance, independent, complex-valued Gaussian random variables. Forming the sequenceẑ = Lw, we find the covariance matrix R ≡ E ẑẑ H associated withẑ is given by where I is the N × N identity matrix. Note while we could have also chosen to use U to generate the random sequence, the use of L is more natural as it corresponds to a causal filter. Thus to simulate a length N sequence of a possibly nonstationary Gaussian random process, one simply populates an N ×N matrix with the known values from the autocovariance function, applies the Cholesky decomposition to generate a lower triangular matrix, and multiplies the result by a vector of white noise. The resulting sequence has the identical covariance structure to a length N sample of the original random process.
A limitation of this approach is that the Cholesky decomposition requires, in its most straightforward implementation, O(N 3 ) operations. Computational costs therefore increase steeply with increasing N . However, it is the case that many realizations of sequences of a fixed length can be generated quickly, because one only needs to form the Cholesky decomposition once for a given autocovariance matrix. For simulation of stationary processes, the Toeplitz matrix structure can in principle be used to accelerate the Cholesky decomposition to O(N 2 ) or even O (N log N ), see Yagle and Levy (1985) and Dietrich and Newsam (1997) respectively. The latter method, termed circulant embedding, while O (N log N ), involves embedding the covariance matrix of interest within a larger matrix, and may lead to somewhat unpredictable tradeoffs between minimizing error and increasing the matrix size (Percival, 2006). The method presented here has the advantages that it is very straightforward to implement, and that the error terms are well understood provided the Green's function is known.

Discretization effects in fast generation
To devise our generation method, we will first renormalize the Green's function so that we may use σ rather than A to parameterize the process amplitude. A modified Green's function is defined as where the subscripts ong(t) will be dropped unless explicitly needed. The stochastic integral equation for the Matérn process (60) then becomes recalling that σ and A related by σ 2 = c α A 2 /λ 2α−1 . Next we introduce a temporal spacing∆ ≡ ∆/k that is finer than the sampling interval ∆, where k is a positive integer termed the oversampling parameter. We then have by splitting the integral in (74) into contributions from smaller integrals over segments of duration∆. Here we have replaced the upper limit of integration with t, asg(t−s) vanishes for negative values of its argument.
For each of these integrals over a short segment, we approximate the Green's function by a constant, namely the value of the Green's function at the segment midpoint, which occurs when t−s = (p+1/2)∆. Employing this approximation and evaluating the result at the discrete times t = n∆ defines a discrete series for all integers n = −∞, . . . , −2, −1, 0, 1, 2, . . . ∞. Because b a dW (s) is a zero-mean Gaussian random variable with variance (b − a), the integral in the above expression simplifies to where w n , defined for integer n, is a sequence of complexvalued, unit variance, independent Gaussian random variables. Introducing an oversampled version of the discrete Green's function as our expression (76) forz n becomes This is a discrete convolution, but modified by the fact that the output will have a temporal resolution that is k times more coarse than that of the two input series. The numerical evaluation of the oversampled Green's function can be simplified by noting the behavior ofg(t) with respect to a rescaling of the time axis by some factor r, which replaces the time rescaling with a rescaling of the damping λ and frequency shift Ω, resulting in a cancellation of the factor ∆/k. The autocovariance function ofz n is very close to the sampled autocovariance function of the Matérn process, and can be made arbitrary close by a suitable choice of oversampling rate k, as will now be shown. The autocovariance sequence associated withz n is found to be and since E{w m w * n } = δ m,n where δ m,n is the Kronecker delta function, all terms in the summation vanish except for when mk−p = (m−n)k−q or equivalently q = p−nk. Thus which is clearly an approximation to (63) for an autocovariance function in terms of its Green's function. The discretely sampled autocovariance sequence can therefore be approximated to arbitrary precision by a choosing a suitable degree of oversampling. However, notice that the summations in (79) and (83) extend to infinity, which is not possible in practice. In the next subsection we examine the impact of additional errors resulting from finite sample size effects.

Sample size effects in fast generation
In practice, the summations over the duration of the Green's function must be truncated at some point. It is tempting to truncate the Green's function after a relatively short time. However, for spectra having a large dynamic range, this truncation leads to undesirable leakage effects, just as in spectral analysis, that degrade the spectrum of the generated sequences. Instead, we will utilize a Green's function that is longer than entire length of the time series. Firstly we need to determine a suitable cutoff for limiting the long-term influence of the Green's function. We denote by T the time such that the magnitude of the Green's function, integrated to this time, rises to within a fraction of the value it obtains when integrated over all times: Using the definition of the Matérn Green's function (61), one may readily show that this occurs when where γ(α, t) is the incomplete gamma function of order α evaluated at time t.
Anticipating transforming to the Fourier domain, we will define sequences that are periodized. Because we intend to employ a periodic convolution, yet wish to prevent noise values at the end of the time series from influencing the beginning, we will create a longer sequence of length N ≡ N +N where N ≡ ceil(T /∆) with ceil(·) being the ceiling function. Let w n be a version of the noise that is periodic with period N , and g {k} n be a version ofg {k} n that is set to zero for n > N − 1. Form a length-N vectorẑ with entries given bŷ and now decompose this vector into two parts,ẑ = [ẑ ẑ o ] T where the superscript "T " is the transpose operator. In the initial portionẑ , of length N , the decaying Green's function is interacting with noise wrapped around from the end of the periodic noise sequence. This portion is discarded, while the second portionẑ o is of length N and is the simulated series we desire. The N × N covariance matrix associated with the latter sequence, R = E ẑ oẑ H o , has components given by To simplify this expression, observe that the covariance of the periodized noise sequence w n is with the sum indicating that the periodized noise is correlated with copies of itself from the future and the past. Thus in (87), all terms vanish except for when mk − p = (nk − q) + N or equivalently q = p − (m − n)k + N . We then have for the terms in the N × N covariance matrix R. Note that this consists only of the = 0 and = 1 terms from (88). The first term in (89) is due to the = 0 term. The second ( = 1) term arises from the Green's function interacting with a copy of itself shifted byN due to the periodization of the noise, and is expected to be much smaller than the first term. Note that contributions from negative do not appear due to the fact that g {k} n vanishes for negative n; but all contributions from > 1 also vanish because g {k} n has been truncated to vanish for n > N − 1.

Comparison of fast and Cholesky methods
The advantage to the Green's function approach is that (86) is a discrete, periodic convolution that can be implemented using a Fast Fourier Transform in O( N log N ) operations; if N ≈ N , this is approximately O (N log N ). In the numerical implementation described in Appendix A, we set = 0.01, such that T gives the time at which the timeintegrated Green's function reaches one percent of its total time-integrated magnitude. We also set the oversampling parameter k such that there will be at least 10 points per damping timescale λ −1 , which is accomplished by choosing k = ceil (10 × λ∆) since 1/(λ∆) is the number of sampled points in one damping timescale. These settings are observed to give fast but accurate performance for a broad range of parameters.
If desired, the matrix R m,n in (89) can be computed in order to explicitly check the errors in computing the covariance matrix, although this will necessarily slow down the algorithm. The terms in the true, discretely sampled autocovariance matrix are given exactly by (90) whereT = (N − 1)∆; this follows from the form of the Matérn autocovariance function in terms of the Green's function (63). We may observe that discretizing the first integral corresponds to the first summation in (89). There are therefore three error terms between R m,n and R m,n : errors associated with this discretization, which are minimized by choosing the oversampling rate k to be sufficiently large; and errors from the second integral in (90) and the second summation in (89), both of which are minimized by choosing N sufficiently large. Thus error can be computed by comparing the difference between the true discretely sampled autocovariance matrix R m,n and the autocovariance matrix R m,n that is satisfied by the process generated through the Green's function method. While this is numerically expensive, it need only be computed one time for a given set of parameters α, λ, N , k, and .
As an example, in Fig. 8 we present spectra of 25 samples of Matérn processes generated using both the Cholesky decomposition and the fast Green's function algorithm. The estimated spectrum for each realization is computed using Thomson's adaptive multitaper algorithm (Thomson, 1982;Park et al., 1987) using 15 orthogonal Slepian tapers having a time-bandwidth product of eight. The adaptive algorithm employs frequency-domain smoothing only to the extent that it can be achieved without the expense of broadband bias.
No substantial difference between spectra computed with the two different algorithms is seen over many decades of structure, indicating that fast algorithm is able to simulate the Matérn process to a very high degree of accuracy. In generating this plot for time series of length 1000, 2000, 4000, and 8000 points (as shown here), the Green's function method executes respectively 3, 7, 11, and 45 times faster than the Cholesky algorithm on a Mac desktop. Note that the Green's function method does not depend on any special properties of the Matérn process, apart from the particular definition of the cutoff time T for the initial time period (84). The method is therefore suitable for any Gaussian random process having a decaying and sufficiently smooth autocovariance for which the Green's function has an analytic expression. A more detailed comparison between the Green's function method of generation, and other methods such as circulant embedding (Dietrich and Newsam, 1997;Percival, 2006), is outside the scope of this paper, and is a natural direction for further work.

Application
This section presents the details of an application of the Matérn process to modeling particle velocities in a numerical simulation of two-dimensional fluid turbulence, a preview of which was presented in Section 2.4. Details of the numerical model are given in Section 6.1, the estimation of parameter values is discussed in Section 6.2, and the means by which realizations of the stochastic models are obtained is described in Section 6.3.

Numerical simulation of 2D turbulence
A system called forced-dissipative quasigeostrophic turbulence is created by integrating an equation for the streamfunction Φ(x, y, t). For nondivergent flows, the streamfunction is a scalar-valued quantity at each point giving the velocity components through U (x, y, t) = − ∂ ∂y Φ and V (x, y, t) = ∂ ∂x Φ. The equation to be integrated is where J(a, b) ≡ ∂a ∂x ∂b ∂y − ∂b ∂x ∂a ∂y is the Jacobian operator, L D is a spatial scale termed the deformation radius, F is a forcing function, and D is a damping. This equation is derived from a conservation law following particle trajectories. This simple system is considered an idealized representation of turbulence in planetary fluid dynamics, on scales large enough that the rotation of the planet is important, but not so large that the planet's curvature needs to be taken into account.
An integration of (91) is carried at 1024 2 resolution in a doubly periodic domain of dimension 2500 × 2500 km. As is typical in such problems, the forcing F consists of random fluctuations of a particular spatial scale imposed everywhere in the domain at each time step. A characteristic forcing scale of 117 km is chosen here such that the scale of the forcing is intermediate between the grid scale and the domain scale. The damping is chosen to take the form D = r∇ 2 Φ where r is set to 1.5 × 10 −8 s −1 . After an initial spin-up period, during which an equilibration of the energy level is achieved, the simulation is run for three years or 3*365=1095 days.
A snapshot of current speed from the first day of the simulation after the end of the spin-up period is shown in the left panel of Fig. 1. As mentioned previously, the circular areas of high speed represent long-lived vortices (see e.g McWilliams, 1990b;Scott and Dritschel, 2013), which are not the subject of this study. Instead we are interested in the behavior of particles that inhabit the spaces between the vortices.
The analysis here is based on a set of 1024 particle trajectories that are tracked throughout this experiment, shown in the right panel of Fig. 1. The trajectories are output at high temporal resolution, decimated to a six hour sampling interval, and first central differenced to produce velocities. Position and velocity records are then decimated again to daily resolution, which we find to be sufficient to capture meaningful variability. One-half of the trajectories are then discarded in order to exclude those most directly effected by vortices, as described next, leaving 512 trajectories of length 1095 to be analyzed.
The simplest way to remove the effects of vortices is simply to discard those trajectories which conspicuously exhibit the effects of vortex trapping. A common measure of the impact of vortices on a trajectory is the so-called spin parameter (Sawford, 1999;Veneziani et al., 2005b, a), defined as in which " " is the imaginary part. In our implementation, these time derivatives are adequately approximated by first central differences at daily resolution. The overbar here is a temporal average over the extent of a trajectory; note that since the mean velocity is zero, the denominator is the velocity variance along the trajectory. We take the modulus of the time-averaged spin, |Ω|, as a measure of the overall impact of vortices. Because of the long-term persistence of particles within vortices (see e.g. Pasquero et al., 2002), it is unlikely that a small value of |Ω| would result from cancellation of positive and negative contributions within the same time series for the three-year lengths we consider. Conservatively, we keep the half of the 1024 trajectories having the lower values of spin magnitude. The resulting 512 trajectories, offset to begin at the origin in Fig. 2a, exhibit a meandering character in addition to their dispersion. The omitted trajectories typically present a dense and regular looping structure, some of which may be seen in the right-hand panel of Fig. 1.

Frequency-domain maximum likelihood
This section describes the method by which the Matérn parameters are estimated from a finite data sample, which necessitates some new notation. In reality one only observes a random process z(t) at a finite set of discrete times z[n] = z(n∆) separated by the fixed time interval ∆, and with n = 0, 1, 2, . . . , N − 1. In this subsection, we use square brackets for time series which take discrete arguments, thereby distinguishing a discretely sampled time series z[n] from its continuous-time analogue z(t). Based on this sample z[n], one wishes to estimate the parameters of stochastic model, conventionally denoted by the vector θ, which in the case of the Matérn model is θ = (σ, α, λ).
A standard approach would be to form a parametric estimate using the maximum likelihood method implemented in the time domain. However, this method involves a computationally expensive matrix inversion, which becomes a limit-ing factor when analyzing large datasets. An alternative approach to estimating the parameters is to do so in the frequency domain using a method called the Whittle likelihood (Whittle, 1953). This approach is considerably faster than time-domain maximum likelihood, with O(N log N ) versus O(N 2 ) computational cost, yet is known to give approximately the same results. It also has the advantage of letting us only fit the parametric model over a specified band of frequencies. The Whittle likelihood method proceeds as follows. The discrete Fourier transform of the length N sequence z[n] is given by for m = 0, 1, 2, . . . , (N − 1). The squared modulus of this sequence of N Fourier coefficients, renormalized by 1/N , de-fines a spectral estimate known as the periodogram This is to be compared with the discretely sampled theoretical spectrum (47) for a particular value of the parameters θ where 2πm/(N ∆) is recognized as the mth Fourier frequency.
The model parameters are estimated by finding the value of θ that maximizes the so-called Whittle log-likelihood in which F is a set of integers indicating the Fourier frequencies over which the fit is to be applied. For example, F could be chosen to be m = 0, 1, 2, . . . , (N −1), in which case the fit will be applied to all frequencies.
In turns out to be the case that in the inference of parameters for a steep spectrum, such as we are dealing with here, this approach is inadequate as it ignores potentially significant effects associated with the finite sample size. In particular, spectral blurring associated with the periodogram can lead to quite incorrect slopes at high frequencies. Instead we use the de-biased Whittle likelihood method recently developed by Sykulski et al. (2016b). In that approach, the periodogram S zz [m] in (96) is replaced with a tapered spectral estimate, and the theoretical spectrum S θ zz [m] is replaced with the expected tapered estimate for a Matérn process characterized by the particular value of θ. The de-biased Whittle likelihood allows the parameters θ to be more accurately estimated, as it correctly accounts for the effect of spectral leakage as well as aliasing.

Stochastic model realizations
Here we give details on how the realizations shown in Fig. 2b-d have been created. First, in preparing Fig. 3, tapered spectral estimates as well as periodogram estimates are formed. As discussed in Section 2.4, for data tapers we use the lowest-order Slepian taper (Slepian, 1978;Thomson, 1982;Park et al., 1987;Percival and Walden, 1993) with the time-bandwidth product set to 10. The average over all time series, and over both sides of the frequency spectrum, are shown for both estimates. In contrast with the tapered estimates, the periodogram (not shown) is seen to accurately estimate the spectrum over only about half of the dynamic range. This fact illustrates the potentially severe problems with using the standard Whittle likelihood for parameter inference involving steep spectra, and motivates our use of the de-biased method.
After forming the tapered spectral estimate for each of the 512 turbulence velocity time series, we apply the de-biased Whittle likelihood to infer the best fit Matérn parameters for each time series. Here the frequency set F is chosen to include frequencies up to 1.5 radians per day, as this corresponds to the upper limit of apparent structure in the spectra. For each set of parameters, we generate a realization of a Matérn process having these properties as described in Section 5, and then cumulatively sum these velocity time series to produce the trajectories shown in Fig. 2b. Estimation of the spectra for these Matérn realizations in the same manner as for the turbulence data leads to the black dashed line shown in Fig. 3, which is seen to be a very close match to the velocity spectra for the particle trajectories from the turbulence simulation.
To generate the trajectories shown in Fig. 2c and Fig. 2d, we proceed as follows. The parameter values from the fit to the Matérn form are converted to a diffusivity through κ = 1 4 σ 2 /(λc α ), which is then used to scale realizations of white noise. The spectra of the associated velocities in Fig. 3c are seen as matching the low-frequency values of the Lagrangian velocity spectra from our turbulence simulation. Cumulatively summing these white noise velocities produces the trajectories in Fig. 2c; note that these trajectories therefore consist of discrete samples of standard Brownian motion. These are seen to match well the dispersion characteristics of the turbulence trajectories, but to have far too high a degree of small-scale roughness.
For the power-law realizations, we cannot employ fractional Brownian motion because the observed slopes-which in this simulation is steeper than those found in the oceanare outside the fBm range. Instead we use the implied spectral amplitudes A 2 = σ 2 λ 2α−1 /c α and slope parameters α from the Matérn fit to fix the properties of a different Matérn process having a very small damping value, chosen as λ = 2π/T where T is the record duration. Realizations are then generated and cumulatively summed to give the trajectories shown in Fig. 2d. As mentioned before, these have vastly too much energy on account of extending the high-frequency slope to very low frequencies. The flattening of the estimated spectrum for these realizations seen in Fig. 3 is a result of the extreme dynamic range hitting the limit of numerical precision.
The point of the application is to show that Matérn process provides an excellent match to the turbulence data. This opens the door to investigating a number of interesting physical questions regarding the distributions and interpretations of those parameters, which must, however, be left to the future.

Discussion
This paper has examined the Matérn process as a stochastic model for time series, which we have shown to be equivalent to damped fractional Brownian motion (fBm). The damping is shown to be essential for permitting the phenomenon of diffusivity to arise in the temporal integral of the process, referred to here as the trajectory, which disperses from its initial location at a constant rate. The rate of diffusion of the trajectory is given by the value of the spectrum of the process at zero frequency. At higher frequencies, the spectrum transitions to a power-law slope, like fBm, with the location of this transition being controlled by the damping parameter.
Because damping is a common feature in physical systems, the Matérn process is expected to be valuable in describing time series which, when observed over shorter time intervals, appear to consists of fractional Brownian motion. The addition of a spin parameter leads to a still more general process that satisfies the stochastic integral equation for a damped fractional oscillator forced by continuous-time white noise, and that encompasses the standard Matérn process as well as the complex (Jeffreys, 1942;Arató et al., 1999) and standard (Uhlenbeck and Ornstein, 1930) Ornstein-Uhlenbeck processes within a single larger family. A simple algorithm for generating approximate realizations of this 'oscillatory Matérn' process in O(N log N ) operations was presented.
A categorization of stochastic processes as diffusive, subdiffusive, and superdiffusive was proposed, depending upon their value at zero frequency. These categorizations refer to the nature of the dispersion experienced by the trajectory associated with the process, assuming that the integral of the process is well defined. This categorization is related to, yet distinct from, the conventional designation of a random process as short-memory or long-memory (Beran, 1994). We have argued that the diffusivity categorization may prove to be a powerful way to describe stochastic processes in general.
The Matérn process was found to provide an excellent match to velocity time series from particle trajectories in forced/dissipative two-dimensional fluid turbulence that are not directly influenced by the presence of vortices. This is an important contribution, since we show that a power-law process such as fBm cannot hope to capture the diffusive behavior. Despite its simple three-parameter form, trajectories associated with the Matérn process were seen to be visually virtually indistinguishable from those from the numerical model. This suggests that the Matérn form may prove useful for describing similar trajectories taken by instruments tracking the actual ocean currents. Such 'Lagrangian data' is one of the main windows into observing the ocean circulation, yet surprisingly little work has been done to analyze the velocity spectra in major Lagrangian datasets (Rupolo et al., 1996;Elipot and Lumpkin, 2008). Apart from Rupolo et al. (1996), the spectral slope in oceanographic Lagrangian data is almost completely unexplored, although it is implicit in several fractal dimension studies (Osborne et al., 1989;Sanderson et al., 1990;Sanderson and Booth, 1991;Summers, 2002).
In this paper, we have taken essentially an observational approach, and sought to fit a parametric model to the trajectories as a descriptive analysis, without requiring a physical justification. A next step is to attempt to understand this model on physical grounds. A number of researchers have attempted to derive forms for the Lagrangian velocity spectrum (or, equivalently, the autocovariance function) under simplified dynamical assumptions (Griffa, 1996;Weiss et al., 1998;Majda and Kramer, 1999;Berloff and McWilliams, 2002;Veneziani et al., 2005a;Majda and Gershgorin, 2013). One promising avenue of comparison is with the work of Berloff and McWilliams (2002), who derive dynamical models roughly equivalent to integral orders of the Matérn process. Another is with Majda and Kramer (1999), see their Section 3.1.2, who construct idealized velocity fields that give rise to the diffusive, subdiffusive, and superdiffusive regimes of Lagrangian behavior. Exploring the relationship of the Matérn form to these dynamical models is a promising direction for future research.
fast generation method described in Section 5; maternfit, which performs a parametric spectral fit for the Matérn process and a number of variations, using the de-biased Whittle likelihood method discussed in Section 6.2; and blurspec, which accounts for the blurring and/or aliasing of the theoretical spectrum associated with truncation of a continuous random process or the tapering of a finite sample. All functions support the oscillatory Matérn process as well as the standard Matérn process. Finally, makefigs_matern generates all figures in this paper based on model output that can be downloaded from http://www.jmlilly.net/ftp/pub/materndata.zip.

Appendix B: Diffusivity in terms of the spectrum
Here we show that for a second-order stationary process, the diffusivity κ is the value of the spectrum at zero frequency, as stated in (9). This is done by beginning with the nonstationary case. The time-dependent diffusivity κ(t) of a nonstationary process can be expressed in terms of the nonstationary autocovariance function R zz (t, τ ) as after substituting (4) into (5) and making use of (1). Applying the Leibniz rule for differentiation of an integral, in the form the expression for the time-dependent diffusivity simplifies to where in applying (B3), f (τ, t) is taken to be the entire quantity in square brackets in (B2). The second line in (B4) follows from the symmetry R zz (t, τ ) = R * zz (t + τ, −τ ), with {·} denoting the real part. The time-dependent diffusivity can be understood in several different ways, see also LaCasce (2008). Substituting the definition of the autocovariance (1), the last expression in (B4) becomes which states that the time-dependent diffusivity κ(t) is the integral of the covariance between the velocity at time t and the velocity at all times between 0 and t. However, z * (t) can be pulled outside the integral, leading to so that κ(t) can equivalently be seen as the inner product of the velocity at time t and the displacement at time t.
In the case that z(t) is stationary, R zz (t, τ ) = R zz (τ ), and the long-time limiting diffusivity value κ is given by after a change of variables. One may invert the inverse Fourier transform (3) to give S zz (ω) = ∞ −∞ e −iωτ R zz (τ ) dτ , and we then see that κ = S zz (0)/4, as claimed in (9). Thus, while diffusivity is generally thought of as a time-domain quantity, it may also be expressed in the frequency domain.

Appendix C: Diffusiveness and memory
In this appendix we examine the relationship between the properties of memory and diffusiveness, by constructing examples of processes with different combinations of these two properties through modifying the Matérn process. Here we will make use of a number of quantities that are not defined until the Matérn process is examined in Section 4.
Spectra of stationary processes corresponding to different combinations of memory and diffusiveness are given in Table C1. These processes can be generated through the stochastic integral equation (60), and are most simply described by specifying modifications to the transfer function G(ω) defined in (62), with attendant changes for its Fourier transform, the time-domain Green's function g(t). As discussed in Section 2.3, the classification of a process as 'diffusive' means that its spectrum takes on a finite nonzero value at zero frequency, such that the integrated version of the process exhibits diffusive dispersion, with the expected squared distance from an initial location increasing at a constant rate.
Multiplying the Matérn transfer function given by (62), with the spin Ω set to zero, by ω multiplies the spectrum by ω 2 and thus leads to a short-memory subdiffusive process, with a spectrum shown at the lower left of Table C1. This process has finite variance provided we choose α > 3/2. Table C1. Examples of spectra for short-and long-memory processes of subdiffusive, diffusive, and superdiffusive types. The term in the box is the spectrum of the Matérn process, as given in (39). Note that the two spectra corresponding to diffusive processes have been normalized such that κ = Szz(0)/4 = 1. λ is a nonnegative constant, while Ω is here a nonzero constant of either sign. κ
These results show that diffusiveness and memory, while related, are distinct properties that can be varied independently. In this table we have also noted the parameter ranges required for the process spectrum to integrate to a finite variance, and therefore for the process to be stationarity. In general for a spectrum of the form |ω| −2α , the behavior of the singularity at zero contributes to unbounded variance for α > 1 2 , while the behavior at large frequencies contributes to unbounded variance for α < 1 2 . Ensuring that neither the singularities nor the large-frequency decay will contribute to unbounded variance leads to the parameter ranges for stationarity shown in the table.

Appendix D: The fBm Rihaczek distribution
Here we derive (25) for the Rihaczek distribution of fractional Brownian motion, an expression that was previously presented by Øigård et al. (2006), adding some additional details. For fBm, there arises a complication in defining the Rihaczek distribution as in (21), because the integral in (24) is divergent. Despite this, (25) may be derived by interpreting this integral in a limiting sense, as is now shown. For α > 1/2, consider the integral which is an example of what is termed an Abel limit, see Wong (1980, p. 407). Thus interpreting (D1) as an Abel limit leads to 1 2 cos (πα) Γ(2α) such that a decaying power law in the frequency domain is associated with a growing power law, of one lower order, in the time domain. Here we have noted that changing the sign of ω in (D2) is equivalent to a complex conjugation, since (−1) 2α = e 2iπα , this leading to the absolute value of ω. The coefficient of the integral in (D3) simplifies to −V α /2, as shown in Appendix E. One then finds which shows that A 2 /|ω| 2α is the Fourier transform, in the Abel limit sense, of that part of the nonstationary autocovariance function R fBm zz (t, τ ) depending only on τ . The Fourier transformed quantity on the left-hand side of (D4) is also recognized from (20) as the negative of the fBm variogram γ fBm zz (τ ). A change of variables gives − ∞ −∞ V α 2 A 2 |t + τ | 2α−1 e −iωτ dτ = e iωt A 2 |ω| 2α (D5) and substituting (D4) and (D5) into (24), and making use of ∞ −∞ e −iωτ dτ = 2πδ(ω), one obtains (25). From left to right in (25), we have the inverse Fourier transforms of the |τ | term, the |t + τ | term, and the |t| term from the fBm autocovariance function (13).
the two increments to differ. Our expression differs from that in Mandelbrot and Van Ness (1968) because we have chosen to apply the similarity scaling to remove the increment duration rather than the separation, for reasons to become apparently shortly; "T " in Mandelbrot and Van Ness (1968) refers to what we call τ here.
The normalized fGn covariance function R fGn zz,1 (τ )/(A 2 V α ) is shown in Fig. F1. Because fGn will generally be sampled, we are typically interested only in time offsets τ that exceed the sample interval ∆, corresponding toτ > 1. Analyzing R fGn zz,1 (τ )/(A 2 V α ) using (F4), one sees that forτ > 1 it obtains a maximum value of unity at α = 3/2, while it vanishes both for α = 1/2 and α = 1. It is found that R fGn zz,1 (τ ) is positive for α > 1, and negative for α < 1, see Mandelbrot and Van Ness (1968). The maximum positive value is at α = 3/2 for allτ , but the maximum negative value occurs at some intermediate value of α in the range (1/2, 1). For any fixed α, increasingτ leads to absolute values of R fGn zz,1 (τ ) that decay toward zero. The behavior of the fractional Gaussian noise covariance function allows us to discuss the property of persistence. For α > 1, fGn exhibits positive correlations, such that positive values will tend to be followed by positives value and negative values by negative values. However, for α < 1, fGn is anti-persistent, and positive values will tend to be followed by negative values and vice-versa. Note that R fGn zz,1 (τ ) is not symmetric about α = 1: the most positive correlations occur at α = 3/2, but the most negative correlations do not occur at α = 1/2. This may perhaps be seen as reflecting a difference between persistence and anti-persistence. Values of the same sign can follow one another indefinitely, for any timescale; but the same cannot be true for values of the opposite sign.
The persistence transition in fractional Gaussian noise at α = 1 is reflected in the behavior of fractional Brownian motion seen in Fig. 5. Values of α > 1 coincide with the tendency for the process to systematically drift away from an initial value, as differenced versions of the process will tend to keep contributing perturbations of one particular sign. Similarly, for α < 1, the anti-correlations of the differenced process tend to act to restore fBm toward a baseline, and therefore these process are more closely distributed around the mean value of zero. The important point is that for fractional Brownian motion, the spectral slope can be seen as being linked to the degree of persistence or anti-persistence associated with a differenced version of the process.
Appendix G: The Matérn autocovariance for small τ In this appendix we derive the form of the small-τ behavior of the Matérn autocovariance function, as was apparently first done by Goff and Jordan (1988), their p. 13,606. Here we follow those authors, paying particularly close attention to the α range over which the result is valid. For this we will make use of the identity 9.6.2 of Abramowitz and Stegun