Brief Communication: Breeding vectors in the phase space reconstructed from time series data

. Bred vectors characterize the nonlinear instability of dynamical systems and so far have been computed only for systems with known evolution equations. In this article, bred vectors are computed from a single time series data using time-delay embedding, with a new technique, nearest-neighbor breeding. Since the dynamical properties of the standard and nearest-neighbor breeding are shown to be similar, this provides a new and novel way to model and predict sudden transitions in systems represented by time series data alone.


Introduction
Prediction of sudden regime changes in the evolution of dynamical systems is a challenging problem. For systems with known dynamical models, such as the Earth's atmosphere, simulated trajectories under judiciously chosen, finite-size perturbations can provide useful information regarding regime changes by detecting fast-growing instabilities along the model representation of the system evolution, called the "control". Breeding is a technique to generate an ensemble of such perturbations, developed for operational ensemble forecasting of the numerical weather prediction Kalnay, 1993, 1997;Kalnay, 2003), and the resulting perturbations are called "bred vectors". Evans et al. (2004) demonstrated that the growth rate of the bred vectors could be used as a means of predicting the regime changes in the chaotic Lorenz (1963) system (Lorenz, 1963). They found that the appearance of high growth rate typically indicated that a regime change would occur upon completion of the current orbit and that the longer the duration of the high growth rate, the longer the next regime would last.
Models of most natural systems like the Earth's atmosphere are described by a very large number of dynamical variables and thus are high dimensional. However, the variables or degrees of freedom are nonlinearly coupled, and consequently in dissipative systems the dimensionality of the phase space is significantly reduced. This is the basis for the time-delay embedding method in the reconstruction of phase space (Packard et al., 1980;Takens, 1981). The ability of this method to yield the dynamics inherent in observational data, independent of modeling assumptions, has stimulated the modeling of dynamics using time series data and has led to the development of forecasting tools. For example, reconstruction of the dynamics of the Earth's magnetosphere using time series data has led to low-dimensional models and forecasts of space weather (Sharma et al., 1993;Sharma, 1995). The data-derived models of magnetospheric substorms and geospace storms (Vassiliadis et al., 1995;Ukhorskiy et al., 2002;Valdivia et al., 1996) now provide near-real-time forecasts using the solar wind data monitored by the Advanced Composition Explorer (ACE) spacecraft at the first Lagrange point (L1). This paper presents a novel extension of the original breeding technique to the phase space reconstructed from time series data using the time-delay embedding method. Because dynamic instabilities are intrinsically low dimensional (Patil et al., 2001), such an extension is an appealing approach for the systems known to exhibit sudden regime changes in their data. The predictive capabilities of this new breeding technique are tested using data taken from the chaotic Lorenz system (Lorenz, 1963).

Phase space reconstruction
To extend the breeding technique to a system represented by a time series x(t) at discrete times t = t 1 , t 2 , . . ., t N , we first give a brief introduction of the time-delay embedding and set the notation used in this study. The state of the system at t i in the reconstructed phase space is defined by an m component vector: is the time-delay coordinate for k = 1, . . ., m; τ is the delay time; and t is the temporal resolution of the time series. The vectors x i specify the dynamical behavior inherent in the data in the form of a trajectory. The time-delay embedding (Sauer et al., 1991;Abarbanel et al., 1993;Kantz and Schreiber, 1997) requires only two parameters, viz., the dimension m and time delay τ . Among the several techniques available for obtaining these parameters from the data, the convergence of the correlation among the trajectories (Sauer et al., 1991;Abarbanel et al., 1993;Kantz and Schreiber, 1997) and the minimum in the mean-square prediction error (Chen and Sharma, 2006) yield the optimum values of m. The time delay τ also depends on the correlations, and the minimum of the mutual information function yields reliable values.

Bred vectors in reconstructed phase space
Having defined the reconstructed phase space by the timedelayed embedding, the new approach to breeding is in essence a matter of selecting the perturbed trajectories that capture the unstable directions along the control. Over a breeding cycle with window size n, the control starting from x i evolves to x i+n , and its neighbor (perturbed) x j to x j +n . The corresponding growth rate of the bred vector is given by where δx 0 i = x j − x i and δx f i = x j +n − x i+n are the initial and final perturbations of the breeding cycle, respectively. To select the perturbed trajectory of the next breeding cycle around the control starting from x i+n , we follow the spirit of the standard breeding in which the initial perturbation is given by δx 0 i.e., the final perturbation of the previous cycle is rescaled and bred as the new perturbation. Here the perturbation size α is constant for all breeding cycles. In the reconstructed phase space, however, the trajectory is defined by discrete points, and the rescaled position x i+n + αδx f i /||δx f i || may not be a trajectory point. We thus search and select the nearest point x j * and refer to the distance between these two points as the displacement distance. In order to avoid selecting nearest neighbors that are on the control trajectory, points x i+j ±l immediately adjacent to x i+n for small l are excluded from the search for x j * so as to ensure that δx 0 i+n = x j * − x i+n captures the instability around the control. We call this technique the "nearestneighbor breeding". Like the standard breeding, it involves two parameters, viz., the window size n and the target perturbation size α. For successful applications of the nearestneighbor search, the density of the trajectory points must be high enough that, on average, the displacement distance in the nearest-neighbor search is small with respect to target initial perturbation size α and the correlation between δx f i and δx 0 i+n is nearly 1.

Results and discussion
To test whether the nearest-neighbor breeding shares with the standard breeding the ability to predict regime changes in the reconstructed phase space, we use the 3-D Lorenz (1963) system (Lorenz, 1963) and generate a time series data set. Along with its simplicity, this system possesses dynamical properties desirable for this study, namely, strong nonlinearity that manifests in a low-dimensional attractor and chaotic transitions between two regimes (Fig. 1). The model equations are given by with the commonly used parameter values r = 28, b = 8/3, and σ = 10. Forward integration of the model is performed using a fourth-order Runge-Kutta with time step 0.01. To reconstruct the phase space from a single time series x(t i ) with the temporal resolution t = 0.01, we use the embedding dimension m = 3 and time delay τ = 7, which correspond to the timescale of the mutual information function (Sauer et al., 1991;Abarbanel et al., 1993;Kantz and Schreiber, 1997;Chen and Sharma, 2006). The reconstructed system is a discrete set of trajectory points that exhibits the dynamical features of the attractor as shown in Fig. 1c.
The regime transitions are analyzed by performing and comparing the following three breeding experiments. Ex- Figure 1. The first column of figures depicts the growth rates of bred vectors in the Lorenz system using three different methods: (a) standard breeding in the phase space (x, y, z); (b) nearest-neighbor breeding in the phase space (x, y, z); and (c) nearest-neighbor breeding in the reconstructed phase space (x 1 , x 2 , x 3 ). The colored points correspond to negative (blue), low (green), medium (yellow), and high (red) growth. The second column of figures depicts the first coordinate of phase space as a function of time, with red stars indicating the points with high-growth-rate (g i ≥ 6.4) bred vectors for the three different methods: (d) standard breeding in the phase space (x, y, z); (e) nearestneighbor breeding in the phase space (x, y, z); and (f) nearest-neighbor breeding in the reconstructed phase space (x 1 , x 2 , x 3 ). periment (a) is the standard breeding in the 3-D model phase space using Eq. (3) as in Evans et al. (2004). Experiment (b) is the nearest-neighbor breeding applied to a discrete time series data in the model phase space, i.e., In all experiments, the growth rate g i is computed using the breeding window size n = 8 with t = 0.01 and (tar-geted) perturbation size α = 0.10 over 10 000 total breeding cycles after an initial spin-up to make sure that the trajectory has reached the attractor. To ensure sufficient data density for the nearest-neighbor search in Experiments (b) and (c), the respective data sets are constructed from 80 000 data points in the original phase space. By excluding l = 3 adjacent points on the control trajectory from the nearestneighbor search, the average displacement distance and vector correlation between the standard and the nearest-neighbor breeding are 0.17 and 0.97 in Experiment (b); they are 0.12 and 0.98 in Experiment (c).
The left column of Fig. 1 shows the growth rates along the respective controls in the three experiments. The magni-E. Lynch: Breeding vectors in the phase space reconstructed from time series data tudes of the growth rate are represented by colors using the same empirical thresholds as in Evans et al. (2004): negative growth points (g i < 0) in blue, low-growth points (0 ≤ g i < 3.2) in green, medium-growth points (3.2 ≤ g i < 6.4) in yellow, and high-growth points (g i ≥ 6.4) in red. As shown in Evans et al. (2004) for the standard breeding, all experiments show high growth at points concentrated in the regime transition region, while the regions with different growth rates are well separated. The nearest-neighbor breeding, in the original (Fig. 1b) and reconstructed (Fig. 1c) phase spaces, successfully captures the features found in the standard breeding (Fig. 1a), although the separation between the different growth rates is less sharp. The right column of Fig. 1 shows the time series of the first phase space coordinate (x) for the first 500 breeding cycles for each of the three experiments. Note that, by construction, the first coordinate x in phase space coincides with the first coordinate x 1 in the embedded space.
As pointed out by Evans et al. (2004) and apparent in the right column of Fig. 1, high bred vector growth rate, marked in red, is a very good predictor of regime change in the standard breeding in the Lorenz model. Thus we test the predictive capabilities of the bred vectors for the regime changes in terms of the binary (Yes-No) forecasts based on the rule suggested by Evans et al. (2004): the presence of a red star in the previous orbit renders the regime change (Yes), while the absence means the continuation of the current regime (No). We note that we slightly modified the empirical rule for the experiments using nearest-neighbor breeding, by excluding red stars when they occurred on orbits with extrema of x 1 whose absolute value was greater than 1, since they led to false alarms in the prediction of regime change. Table 1 is the contingency table (Wilks, 1995) of the forecast/event pairs for the three experiments, where individual forecasts (FCST) are made by the rule and the observed events (OBS) are based on the actual occurrence or nonoccurrence of the transition. Corresponding accuracy measures for these binary forecasts (Wilks, 1995) are shown in Tables 2 using the hit rate (HR), threat score (TS), and false alarm rate (FAR). It is apparent that the three experiments succeed in predicting with similar accuracy the change of regime. The HRs and TSs for the three methods are close, varying from 82 to 87, and 72 to 76 %, respectively. The FARs are about 6 % for the standard breeding but increase to 11 and 13 % when nearest-neighbor breeding is used in the original model phase space and in the reconstructed phase space, respectively.
We note that, in addition to large bred vector growth rate, two other methods have been also proposed to predict regime changes in the Lorenz three variable system. In his original paper (Lorenz, 1963), Lorenz pointed out that regime changes were associated with large values of the variable z. Yadav et al. (2005) showed that large absolute magnitudes of the x variable are also a good predictor. We have implemented the method used in Yadav et al. (2005) for the time Table 1. Contingency tables based on the rule that regime change will occur in the orbit following the appearance of high-growth-rate bred vectors using three different methods. In (b) and (c) using the nearest-neighbor breeding, high-growth-rate points in orbits with absolute values of extrema above 1 are excluded. OBS and FCST stand for observed and forecast, respectively; (a)-(c) are the same as in Fig. 1   series x(t) and got equally good results. However, the main objective of this paper is to determine whether bred vectors can predict stability from a single time series data, i.e., in the reconstructed phase space. The reasons for the choice of bred vectors as a predictor of fast growth in the dynamical system are two-fold. First, unlike the size of a particular variable, breeding can be tested in any dynamical model. Second, bred vector perturbations and their growth have a clear physical meaning in that they detect instabilities (Hoffman et al., 2009) and are akin to the leading local Lyapunov vector and their corresponding growth (Norwood et al., 2013). Thus while predictions based on threshold values of a single variable work well for the Lorenz model, bred vector growth rate may be suitable for making predictions in a broad range of dynamical systems. The bred vectors, by their ability to capture nonlinearity in instability growth, can characterize instability in dynamical systems in a more general manner than Lyapunov vectors and provide a way to obtain regime change in cases where the latter are computed from the time series data (Vassiliadis et al., 1991). For the binary forecasts, Evans et al. (2004) noted that the next regime tended to be longer-lasting when the duration of the high growth rate is longer using the standard breeding (Fig. 1d). This tendency is also found using the nearestneighbor breeding in the reconstructed phase space in Experiment (c). However, in Fig. 1f, the cases of high growth rate may be separated by slower growth rates (e.g., at time t ∼ 23) due to the time delay involved in the construction of the embedded time series.

Conclusions
The ability to predict regime change in a dynamical system using the time series data of just one of its many variables, demonstrated in this paper, has important implications. For most systems in nature and in laboratory, the time series observations of only a limited number of physical variables, often a single variable, are available. In many cases even the actual number of variables is not known. This paper presents and demonstrates that the nearest-neighbor breeding enables the prediction of regime change in systems for which regime change follows the appearance of instabilities, thus extending the predictive capability beyond the cases whose time evolution equations are known. Further, when regime change is associated with large changes in the dynamical states, this technique can lead to the prediction of large or extreme events in the cases where nonlinear dynamical predictions are made using time series data, e.g., in the Earth's magnetosphere and space weather (Chen and Sharma, 2006).