Nonlinear Processes in Geophysics Nonlinear chaotic model for predicting storm surges

Abstract. This paper addresses the use of the methods of nonlinear dynamics and chaos theory for building a predictive chaotic model from time series. The chaotic model predictions are made by the adaptive local models based on the dynamical neighbors found in the reconstructed phase space of the observables. We implemented the univariate and multivariate chaotic models with direct and multi-steps prediction techniques and optimized these models using an exhaustive search method. The built models were tested for predicting storm surge dynamics for different stormy conditions in the North Sea, and are compared to neural network models. The results show that the chaotic models can generally provide reliable and accurate short-term storm surge predictions.


Introduction
Storm surge is a meteorologically forced long wave motion which is pushed toward the shore. It is generated by a combination of meteorological forces of the wind friction and low air pressure due to a storm (Gonnert et al., 2001), and oscillates in the period range of a few minutes to a few days. In the ocean, local wind waves can add to the water level, and the storm surge can be amplified (or reduced) by interference with the strictly regular astronomical tides. Extreme coastal floods can be related to extreme storms, like cyclones or hurricanes which attack the open coast. In some coastal areas, such floods can be generated by unusual sequences of wind set-up and air pressure variations. In addition, wind driven waves can be superimposed on the storm tide. This rise in sea level can cause severe flooding in coastal areas, particularly when the storm tide coincides with the high tides (Battjes et al., 2002).
Correspondence to: M. Siek (m.siek@unesco-ihe.org) Astronomical tides generally have the large contribution to the ocean water level variations in open oceans and many well-exposed coasts. Traditionally, the analysis of water levels usually employs linear methods that decompose sea levels into tides and other (usually meteorological) components. The amplitudes and phases of the tidal constituents driven by the astronomical motion of the Earth, Moon and Sun (with known periods) can be estimated by using Fourier analysis, response analysis or linear regression methods. In particular, the weakly nonlinear shallow water waves can be characterized by the Korteweg-de Vries (KdV) equation (Korteweg and de Vries, 1895) which is an exact solvable partial differential equation. Zabusky and Kruskal (1965) found that the KdV equation can be obtained in the continuum limit of the Fermi-Pasta-Ulam Experiment (Fermi et al., 1955). They showed that the solitary wave solutions had behavior similar to the superposition principle, despite the fact that the waves themselves were highly nonlinear. In real applications, however, the water level dynamics in coastal and estuarial swallow-water areas, such as the Dutch coast, may differ significantly from the astronomical estimated constituents (superposition principle) -due to the nonlinear effects that include meteorological forcing, tidalcurrent interactions, tidal deformations due to the complex topography and river discharges (Prandle et al., 1978).
Coastal floods due to storm surges can be predicted with an accuracy that depends on the accuracy of the meteorological forecasts. An appropriate numerical weather model can predict the motion of atmospheric depression with a satisfactory accuracy in a range of several days. The wind and air surface pressure fields predicted by this model can be utilized as some driving forces of the sea motion in a shallow water model allowing for storm surge predictions. Some recent updates on the operational storm surge numerical model with data assimilation in the Netherlands have been studied by Verlaan et al. (2005).
Published by Copernicus Publications on behalf of the European Geosciences Union and the American Geophysical Union.
Nonetheless, the analyses of the risk of coastal floods are not straightforward, because an observed flood is not a single independent event in statistical terms. Rather, the flood is a consequence of a set of different determinants, like tides, wind and air pressure, or of a set of sequences of these factors. Both the mean sea level and the flood height will vary along the coast and the risk of coastal flood depends on emergency preparedness planning and the design of coastal facilities and structures, such as flood embankments. The ocean water level variations due to various determinants and their complex interactions show long-term persistence leading to the correlated extreme events (Alexandersson et al., 1998;Butler et al., 2007).
Complexity of the described phenomena prompts for adequate methods to describe them, and one of them is chaos theory. The most direct link between the concept of deterministic chaos and the real world is the analysis of data (time series) from real systems in terms of the theory of nonlinear dynamics (Tsonis, 1992;Donner and Barbosa, 2008). Note that this approach is, in fact, data-driven, since it is purely based on the analysis of the observation data. The initial nonlinear analyses of the ocean water levels at the Florida coast have been conducted by Frison et al. (1999). The earlier attempts of the use of chaotic model (CM) for storm surge predictions were done by Solomatine et al. (2000), Walton (2005) using univariate local models. Velickov (2004) extended the method using multivariate chaotic models and showed that it has reliable and accurate short-term predictions. This model has been further improved by applying dimensionality reduction in the phase space (Siek et al., 2008). This paper presents the use and implementation of the nonlinear dynamics and chaos theory for predicting storm surges. If compared to our earlier work, we advanced the procedure of building chaotic model by incorporating several new features: using Cao's method (Cao, 1997) for better estimation on dynamical invariants; implementing multistep iterative predictions, applying the neighbors distance cut-off to avoid inclusion of the false dynamical neighbors, adding water level variable into multi-variate chaotic models, finding the proper number of neighbors using sensitivity analysis for stormy and non-stormy conditions, and optimizing the chaotic model parameters (time delay and embedding dimension). Furthermore, we compare the prediction performances of the proposed chaotic model with other models, including artificial neural network (ANN) models.
The following sections present the basics of storm surge modeling, nonlinear dynamics and chaos theory, chaotic model prediction, case study, model results and conclusion.

Storm surge modeling
Storm surge modeling has advanced significantly over the past 30 years which turns out to be very essential to anticipate the occurrence of coastal flooding. Some advances on physically-based storm surge modeling have been reported by Bode and Hardy (1997), Battjes et al. (2002) and Verlaan et al. (2005). They include: refining computational grids, utilizing more accurate calibration of models with better data, using an improved numerical schemes and incorporating data assimilation technique into the model.
Primary links between the nonlinear dynamics and chaos theory, and the storm surge model can be described as follows. The basis of a physically-based storm surge model which is widely used is the Navier-Stokes shallow water equations, stating the physical laws of mass and momentum conservations (Dronkers, 1964). These equations are inherently nonlinear. The sensitive dependence on the initial and boundary conditions of the dynamical evolution of such systems, and the broadband and continuous power spectra are the indicators of deterministic chaos. A mathematical proof on the existence of chaotic behavior in Navier-Stokes equations and turbulence has been conducted by Li (2004). On other hand, Simonnet et al. (2009) analyzed the presence of bifurcations in ocean, atmospheric and climate models for understanding the variability of oceanic and atmospheric flows as well as the climate system. As models, chaos dynamical systems show rich and even surprising variety of dynamical structures and solutions. Most appealing for researchers and practitioners is the fact that the deterministic chaos provides a prominent explanation for irregular behavior and instabilities in dynamical systems (including storm surges), which are deterministic in nature.

Study area: the North Sea
The North Sea lies between Norway, Denmark, Germany, the Netherlands, Belgium, France and Great Britain. It links up with the Atlantic Ocean to the north and also the southwest, via the English Channel. The total surface area is approximately 750 000 km 2 and the total volume 94 000 km 3 . The North Sea has a dynamically active regime dominated by strong tides and frequent passages of mid-latitude synoptic weather systems (Droppert et al., 2001). The waters are mostly shallow (depth <150 m) in the region. As tides from the deep Atlantic Ocean enter the North West European shelf, they propagate around the coast in the form of long gravity waves. High tides occur approximately every twelve hours. The main tidal stream enters the North Sea along the Scottish coast. As a result, the level difference between high tide and low tide is not the same everywhere. The actual tidal difference depends not only on the positions of the Sun and the Moon, but also determined by the weather, and primarily by the wind and surface air pressure. North-westerly storms are notorious. The rise on any particular occasion depends on the direction, the force and the duration of the storm.
The extreme storm surges in the North Sea is much affected by the role of the North Atlantic Oscillation (NAO) variability (Woodworth et al., 2007). The North Atlantic atmospheric variability is mainly driven by the NAO with an NAO index defined by the difference between normalized sea-level pressures between the Azores High and Icelandic Low. The periods with large positive index correspond to strong westerly winds which often strikes the Dutch Coast.
For the Netherlands, the accurate forecasting of storm surges is very important because of the possible coastal floods since the large areas of the land lie below sea level. These areas are most densely populated and important for the economy. Since the disastrous storm of 1953, the dikes and storm surge barriers in the delta area and along the Dutch coast have been systematically improved. These dikes and barriers should be operated (i.e. open/close) properly on the right time to avoid the barrier breaking. For this purpose, the accurate and reliable predictive models for the ocean water level and surge are critically required. Another important reason for having accurate forecasts of sea levels is the needs of navigation: the vessels waiting to enter the Port of Rotterdam need these to ensure safe passage through the entrance channel since draft of some of them is close to the depth of the channel.

Data description
Water level, surge, atmospheric pressure and wind speed/direction data from seven coastal stations along the Dutch coast are monitored and provided by the North Sea Directorate (Directie Noordzee, DNZ). The water levels are sampled at 0.0167 Hz and averaged over period of 10 min. In the data we had at our disposal each time series begins on 1 January 1990 and is available until 31 March 1996, which results in 337 249 continuous samples in total for 10 min times series, and 54 768 for the averaged hourly times series. Due to the experienced limitations of the used software to handle extremely large data sets of 10-min data, hourly data is used for all further analysis and building the chaotic model. The surge data is obtained by subtracting the observed water level with tide (astronomical forces) based on harmonic analysis, formulated as: In this paper, we concentrate on predicting the surges at the Hoek van Holland (HVH) tidal station, which is located at the entrance of the Rotterdam harbour. The possible inclusion of the spatial information from neighboring stations is also investigated for building the multivariate chaotic model. We explored the information from Europlatform (EPF) and K13 because in practice the observations from these two stations often become reference for the expert judgement concerning the possible extreme storm surges at HVH location in relations to the forecasts produced by the Dutch operational storm surge model (DCSM/WAQUA).   Figure 1 shows the North Sea region and the locations of the main tidal stations. Geographically, the EPF station is closer to the HVH station than the K13 station. The storm surge moves from the English Channel (South) to the North striking the western part of Dutch Coast, hence the EPF location is a good position in open sea to measure the storm surges or water levels before they reach to the Port of Rotterdam (Hoek van Holland). This information is required for the preparation before a coastal flood occurs.
Relationship between the long-shore winds, surge, water level, and air pressure difference time series data at HVH location is presented in Fig. 2. Table 1 illustrates the statistical description of the data from the three tidal stations used in this work. In order to evaluate the model performance Table 2. Data separation for surge time series into training, cross-validation and testing data sets for non-stormy and stormy conditions. Cross-validation data sets are used to find the optimal parameters of chaotic model using exhaustive search method.

Date
Non for various conditions, the surge data is divided into crossvalidation (CV) and verification data sets for non-stormy and stormy conditions as listed in Table 2 and depicted on Figs. 3 and 4, respectively. Each of these data sets consists of training and testing data sets. The cross-validation data sets are utilized for finding the optimal parameters of chaotic model using exhaustive search method. After being optimized, the prediction performance of the chaotic model was investigated using verification data sets for various stormy conditions. The rest of observed time series (1 September 1995till 31 March 1996 is not used for model prediction.    This data separation was made based on the analysis of recurrence plot and visual inspection in time-domain time series. The recurrence plot is used to visualize the recurrences in a dynamical system, which it has capabilities to detect the presence of homogeneity, intermittency and transition in a time series (Marwan et al., 2007).

Dynamical system
A dynamical system can be defined as a set of rules or mathematical equations that describe the time evolution of the system states given some initial conditions or knowledge of its previous history. Some examples of dynamical system are the Navier-Stokes equations and Newton's equations for the motion of a particle with suitably specified forces. These dynamical systems can often be expressed by m-first order ordinary differential equations dx/dt = f(x(t)) or in discrete time t = n t by maps of the form x n+1 = f(x n ). This time evolution is defined in some phase space. Such nonlinear systems can exhibit deterministic chaos which comprises a class of a signal intermediate between regular sinusoidal or quasi-periodic motions and unpredictable or truly stochastic behavior (Lorenz, 1963). The main reason for applying chaos theory is the existence of methods permitting to predict the future positions of the system in the state space.

Phase space reconstruction: method of time-delay embedding
The most important phase space reconstruction technique is the method of time delays, which is known as Takens' embedding theorem (Takens, 1981). Vectors in a new space or embedding space are formed by the time delayed values of the scalar measurements. According to Takens' theorem, the dynamics of a time series can be fully embedded in the m-dimensional phase space defined by the delayed vectors: where τ is the delay time. The lowest possible dimension of such manifold is called an embedding dimension.

Finding appropriate time delay
In real applications, the delay time τ needs to be appropriately chosen in order to fully capture the structure of the attractor. This can be achieved by embedding the attractor in a smooth manifold. The straightforward choice of τ is usually made with the help of the zero-crossing autocorrelation function. However, in the terms of nonlinear methods, the choice of τ associated with the first minimum of the time delayed mutual information based on the Shanon's entropy (Fraser and Swinney, 1986) demonstrates good performance in reconstructing the system dynamics from the observables. period at Hoek van Holland location.  Figure 5 shows the autocorrelation and mutual information of the surges at HVH. The autocorrelation function has slow decay until time lag 18 and further oscillates in a small range r=[0.25, 0.35]. This implies the presence of longterm correlations in the analyzed time series data. The first minimum of the mutual information which characterizes the nonlinear relationships between time-lag variables is found to be a better criterion for the choice of optimal time delay than the zero-crossing autocorrelation (that measures linear dependency only). The first minimum of mutual information happens at delay τ = 10 h.

Self-similarity: dimension
Attractors of deterministic chaotic systems exhibit an unusual kind of self-similarity and show structure on all length scales, thus possessing non-integer or fractal dimensions. A proper embedding dimension has to be searched, such that the structure of the attractor becomes invariant. The most widely used fractal dimension quantifier is the correlation dimension d c , which is based on the correlation integral or function analysis (Grassberger and Procaccia, 1983). Correlation function C m (r) for the distance range of r available from the time series and several embedding dimensions m is inspected for the signatures of self-similarity by estimating the slope of logC(r) versus logr plot. If the time series describes the dynamics of an attractor, then for positive values of r, the correlation integral C(r) scales to the radius r by the power low C(r) ∼ = αr ν , where ν is called correlation exponent and α is a constant. For a random process, the correlation exponent ν varies linearly with the increasing of m, without reaching a saturation value. On the other hand, for deterministic process, the value of ν saturates and becomes independent for increasing  A large size of data set is commonly needed to compute the correlation dimension d c . However, there is no consistent agreement on how many data can sufficiently provide the accurate estimation of the correlation dimension. Some authors like Smith (1988), Theiler (1990) and Ruelle (1990) suggest differently on the minimum size of data set required for estimating correlation dimension. For correlation dimension 8.5, the size of data set used here is sufficient to estimate the correlation dimension (Ruelle (1990)). The size of 54 768 data points of the hourly surge time series from 1 January 1990 till 31 March 1996 is larger than the minimum data set size 10 d c /2 suggested by Ruelle (1990) which is about 17 783 data points. Please also note that the data set in this work was obtained from the real observations representing the physical processes in nature, and not merely on the basis of the uniform-random model. Nonetheless, we consent that the larger size of data sets might be needed for better estimation of correlation dimension. Figure 6 visualizes the power law scaling between the correlation integral/sum C(r) and the length scales r. The correlation integral was computed for different embedding dimensions (the line with squares corresponds to m = 2 and the line with star corresponds to m = 20). After embedding dimension m ≈ 12 − 14 lines become parallel and thus the slope (correlation exponent) saturates. There is no anomalous or wide fluctuation found in the slope of the scaling region in the correlation integral plot. Figure 7 shows at Hoek van Holland. The correlation integral was computed for different embedding dimensions (the l squares corresponds to embedding dimension 2 and the line with star corresponds to embedding dimens that the correlation exponent increases with an increase of the embedded dimension up to a certain value and further saturates. The saturation value of the correlation exponents/dimensions using the proper time delay of 10 h is 8.5. This indicates the presence of an attractor in the surge dynamics. Reconstruction of an attractor from a time series of observables should be embedded in the proper dimensionality of the manifold such that the structure of the attractor becomes invariant. According to Taken's embedding theorem (1981), if the dimension of the manifold containing the attractor is d c , then the embedding dimension is m 2d c + 1 to preserve the topological properties of the attractor in the phase space. This implies that the embedding dimension of 18 is required to unfold the attractor of the surge dynamics. On other hand, Kennel et al. (1992) recommends the minimum embedding dimension of m d c . This specifies that the embedding dimension of 8 or 9 is sufficient. These results, however, need to be verified by other embedding dimension estimators as described in the following sections.

False nearest neighbors
A method to determine the minimal sufficient embedding dimension m, so called the false nearest neighbor (FNN) method, was proposed by Kennel et al. (1992). The false neighbors are the points projected into neighborhoods of other points to which they do not belong as neighbors in higher dimensions. . Minimum embedding dimension estimated by Cao's method with τ = 10 and k = 1. the embedding dimension. This result is consistent with the estimation using correlation dimension based on the rule of Kennel et al. (1992) suggesting that the minimum embedding dimension is m d c .
The FNN algorithm has a drawback associated with the subjective choice of the threshold in order to ensure a correct distinction between low-dimensional chaotic data and noise. Different time series data may have different threshold values. These imply that it is very difficult and even impossible to give an appropriate and reasonable threshold value which is independent of the dimension m and each trajectory point, as well as the considered time series data. To avoid this, Cao proposed a modified algorithm, sometimes called the averaged false neighbors (AFN) method (Cao, 1997). Cao's approach is based on the estimation of two parameters E1 and E * which are basically derived from the quantities defined by the FNN method. Based on the construction of the time delay vectors from the time series x 1 , x 2 ,...,x N , an m-dimensional vector is defined by y i (m) = (x i ,x i+τ ,x i+2τ ,...,x i+(m−1)τ ), where i = 1,2,...,N − (m − 1)τ and τ is the time delay. Similarly to the FNN method, the AFN approach defines the quantity where . is the maximum norm, y i (m + 1) is the i-th reconstructed vector for embedding dimension m and n(i,m) is an integer such that the m-dimensional time-delay vector y n(i,m) (m) is the nearest neighbor of y i (m). Subsequently, the quantity of E1 is formulated as the mean value of all FNN distance a(i,m)  The E(m) depends only on the dimension m and the time delay τ . The variation from m to m+1 can be investigated by E1(m) = E(m+1)/E(m). The E1(m) stops changing when m is greater than some value m 0 if the time series comes from an attractor. Then m 0 + 1 is the minimum embedding dimension to be obtained. It is necessary to define another quantity E * which is useful to distinguish deterministic from stochastic time series, formulated as These quantities are computed for different, progressively increasing values of the embedding dimension m. Then the global behaviors of E1 and E * as functions of dimension m are respectively used to estimate the minimum embedding dimension and to determine the nature (stochastic vs. deterministic) of the underlying dynamical process generating the time series. Figure 9 shows the saturated line of E1 can be obtained starting from the dimension m ≈ 10 − 12 for surges at HVH. We set the number of neighbors k = 1. There is no existence of the straight lines of E * implying that the surge dynamics is driven by deterministic behaviors.

Lyapunov exponents
The Lyapunov exponents characterize the exponential instability or the average rates of divergence or convergence of nearby trajectories in phase space, and therefore, measure how predictable or unpredictable the dynamical system is. The spectrum of Lyapunov exponents (λ i ) can be computed based on the work of Sano and Sawada (1985). If at least one Lyapunov exponent is positive, then the dynamical system is characterized by deterministic chaos. If no positive Lyapunov exponent exists, then there is no exponential divergence, and thus the long-time predictability of the dynamical system is guaranteed. The Lyapunov spectrum for the surge time series data is presented in Fig. 10. The largest value of Lyapunov exponents is estimated to be λ 1 = 0.08 which indicates a loss of information of 0.08 bits/hour during the dynamical evolution of the system, and thus loss of predictive capabilities. The Lyapunov spectrum contains a large negative exponent λ 6 = −0.75 which indicates presence of strong dissipation mechanisms in the surge dynamics. The presence of positive Lyapunov exponents and the fact that sums of Lyapunov exponents are negative (−1.48), provide strong evidence that the surge dynamics in the North Sea is driven by deterministic chaos. Furthermore, Lyapunov dimension of a strange attractor can be approximated from the Lyapunov spectrum based on Kaplan-Yorkes conjecture (Kaplan and Yorke, 1979). The existence of a fractal Kaplan-Yorke dimension which is estimated to be 4.1 indicates deterministic chaos in the surge dynamics.
However, the existence of two positive Lyapunov exponents ( Fig. 10) denotes the hyperchaotic behavior of the attractor in the surge dynamics. The first hyperchaotic 4-D flow system was introduced by Rössler (1979). This kind of dynamical system requires at least a 4-D phase space to unfold the attractor. The reasons of the presence of hyperchaotic behavior in the surge dynamics can be due to some noise in the measurement and the inherently complex interactions between forces which induces storm surges. We could not find any reference about a reliable method for distinguishing between chaos and hyperchaos in a noisy time series, so this aspect would need further research.

Chaotic model prediction
The ultimate goal of constructing a chaotic model in this work is to use it for prediction, which in terms of phase space representation of dynamics means extrapolation of the trajectory by modeling the dynamical evolution of the system in time. Therefore, in this context, the concept of learning models from data, like a nonlinear regression is typically utilized to estimate the reconstructed trajectory in phase space. To model a deterministic chaotic system, one has to accurately reconstruct an m-dimensional phase space with time delay τ from univariate or multivariate time series of the observables. Since the time series data are discretely sampled over time, the underlying dynamics is described by a deterministic model in phase space, which is always a map of the following form: where Y t are the delayed vectors (states) of x t ,x t−τ ,x t−2τ ,...,x t−(m−1) τ formed by the time series of observables embedded into an m-dimensional phase space with a proper time delay τ = υ t (υ is time index in integer values). In order to predict the next state of a dynamical system, one needs to find an estimator of the regression function so that the prediction of x t+1 can be estimated bŷ After these general considerations, the next step is to find the proper approximation of the model, expressed through its structure, capacity and a criterion for the quality of the model which is to be learned from the data in the reconstructed phase space. Global and local models are two possibilities to consider when choosing the model structure for approximating the true mapping function (Casdagli, 1989). The basic idea of the local approximation methods is to use only the states close to present state in phase space in order to make predictions (Farmer and Sidorowich, 1987). Thus, such models have to learn neighborhood relations from the data and map them forward in time. In order to predict the value of the observable x t+T , which is part of the state vector Y t+T where T is some time horizon in the future, based on the state vectors Y t and past history embedded in the reconstructed phase space, k nearest neighbors of Y t are found on the basis of some norm Y t − Y t * , with t * < t (t is a discrete time step). Depending on the number of the neighbors considered, and the type of the local mapping chosen, several variations of the local approximation method are possible. The local zeroth, linear, quadratic and 3rd-order polynomial models have been implemented and used in this work. Based on the identified and reconstructed dynamics of the surges at HVH, an attempt was made to build an accurate predictive model utilizing local modeling approach and the notion of dynamical neighbors as depicted in Fig. 11.
In addition, a multi-step iterative prediction method was developed and utilized in this work. The multi-step prediction technique consists of making repeated one-step predictions up to the desired horizon. It predicts only onestep ahead using the estimate of the output of the current prediction as the input to the prediction of the next time Phase space reconstruction and the description of searching dynamical neighbors and their dynamical evolution in the past allowing for predicting the future evolution of the dynamics using local modeling. In this example, the real sea level data reconstructed in 3-D phase space with τ = 3 h is utilized. Three data points ("star and diamond" markings) in time-domain time series are represented as a single point in phase space. Prediction is made by searching the dynamical neighbors (triangle and box markings) of the current point (black circle marking) in phase space and extrapolating the future state by using a local predictive model constructed based on dynamical neighbors. step until the prediction k-steps ahead is made. The multistep prediction technique demonstrates better prediction performance than the direct prediction method (Box et al., 1994;Kugiumtzis et al., 1998). One of the benefits of using the multi-step prediction is that the dynamical neighbors can be selected iteratively for each one-step prediction. Thus, in most cases, this procedure is able to avoid taking the false neighbors which may produce larger deviations of the neighbor trajectory projections into the future states. Figure 12 illustrates a comparison between direct and multi-step predictions for the surge dynamics in mdimensional phase space. In this example, we notice beforehand from the observed data that the surge in the next 2 h would raise up. Suppose trajectory b is the one to be predicted for 2-steps ahead and trajectories a, c and d are the neighbor candidates of trajectory b. The k-NN procedure used for finding the neighbors is executed once in direct prediction and h-times (h is the prediction horizon) in multi-step prediction techniques. The trajectory a is a true neighbor and being chosen by both k-NN procedures. On other hand, the trajectory c is a false neighbor which is actually close to trajectory b and selected in the first k-NN procedure, but not in the second k-NN procedure. The trajectory d is the reverse case of trajectory c. Hence, the projection of trajectory b into 2-steps (hours) ahead using direct prediction method produces incorrect prediction (predicting the decreasing surge). This happens due to the inclusion of false neighbor c which subsequently results in  building a "false" local model. In contrast, the multi-step prediction is able to predict the increasing surge correctly because the false neighbor c can be avoided (not selected) in the second k-NN procedure.

Neural networks
Backpropagation multi-layer perceptron (MLP) with Levenberg-Marquardt training rule (Haykin, 2008) was utilized and trained using the same input structure as the chaotic model inputs. The number of hidden neurons of ANN was selected using the exhaustive search in the range [1∼10]. The optimal MLPs structures are listed in Table 3 for univariate NN and Table 4 for multivariate NN.

Univariate chaotic model
The nonlinear analysis of the surge time series as described in Sect. 3 recommends the appropriate values of time delay and embedding dimension to be τ = 10 and m = 8 as identified by the analysis of correlation dimension and false nearest neighbors, and m ≈ 10 − 12 by the Cao's method. These proper values of τ and m obtained by the nonlinear analysis becomes a reference for the procedure to exhaustively search for the optimal values of time delay and embedding dimension for predictions. The reason of using exhaustive search optimization is that the objective of building a chaotic model is not only to identify and characterize the chaotic behavior of the surge dynamics, but also to predict the future states of surges using local modeling. Furthermore, the accuracy of some existing estimators for determining the proper values of τ and m is argued by some researchers (Cao, 1997;Schreiber, 1999;Hegger et al., 1999). The discrete time series used in this work, which is obtained from the real observations in nature, may not be perfect (incomplete and noisy). This results in imprecise estimations of the τ and m and becomes subjective. The other reason is that most of the methods in nonlinear time series analysis originate from the analysis of the continuous dynamical system described by the differential equations with some assumptions and simplifications. In contrast, the observed time series is obtained from the real natural phenomena and may contain the information of more complex system and physical interactions. In addition, the presence of hyperchaotic behavior in the surge dynamics requests more care in determining the proper values of τ and m. Therefore, we utilized an exhaustive search optimization technique to find the optimal values of τ and m based on the predictive performance of the chaotic model. The other important parameter for building a chaotic model is the number of neighbors (k). Sensitivity analysis was performed to find the proper k values for non-stormy and stormy conditions. The sensitivity analysis was performed by setting up the chaotic model parameters for the surges (with fixed τ = 10 and m = 8) and k run from 1 to 2000. Subsequently, the exhaustive search optimization was executed using the optimal k-value for finding the optimal values of τ and m. The 3rd-order polynomial local models were built based on the dynamical neighbors. This model shows better predictive performance for the local model in comparison to the zeroth, linear and quadratic models. Additionally, we use also filtered out the neighbors that are far (in Euclidean sense) from the current point -treating them as false neighbors. The procedure is as follows: 1. Define the number of neigbhours (k).
2. Find the k-nearest neigbhours of the current state in the phase space.
3. Remove the discovered neighbors if these neighbors have distance to the current state larger than twice the distance for the 1-nearest neighbor. Figure 13 depicts the six-hours prediction RMS errors of the chaotic models as a function of the number of neighbors (k) for non-stormy and stormy conditions. It is seen that the suitable number of neighbors for predicting surges for stormy condition is a small value of 13 neighbors and this value should be smaller than the one (80 neighbors) for nonstormy condition. This is due to the fact that less appropriate dynamical neighbors (representing similar surge behavior in the past) can be found especially during extreme storms. If more neighbors are considered, the model performance becomes low due to the inclusion of false neighbors in constructing local models.
The exhaustive search optimization was performed with the following settings: time delay in a range of [1 ∼ 24], embedding dimension in the range of [2 ∼ 30], the 3 rdorder polynomial local model and the number of neighbors k = 13 (for stormy conditions) and k = 80 (for non-stormy conditions). The prediction horizons are 1, 3, 6, 10, and 12 h.   values of time delay and embedding dimension. The optimization result is the most accurate chaotic model which has the lowest RMS error on cross-validation data set. The cross-validation data sets have a data set of 400 data points (see Table 2): 19 January 1994 03:00 till 4 February 1994 19:00 (time indices of 35500-35900) for stormy condition and 11 May 1994 15:00 till 28 May 1994 07:00 (time indices of 38200-38600) for non-stormy condition. This small size of cross-validation data sets was employed with considerations of the intensive computation required for the exhaustive search optimization. Figure 14 shows the RMS errors of the univariate chaotic model for 1 and 10 h prediction horizons during stormy period as a function of time delay and embedding dimension for surges (τ = 10, m = 8) at HVH location. These plots are different for each prediction horizon and for stormy and non-stormy conditions. This denotes that the proper reconstruction of a phase space from a time series does not only depends on the choice of time delay and embedding dimension, but also on the prediction horizons.
For example, the optimal time delay and embedding dimension for univariate chaotic model for predicting 6 h ahead surges stormy period was obtained: Y hvh t+6 = sur hvh t ,sur hvh t−3 ,sur hvh t−6 ,sur hvh t−9 ,sur hvh t−12 ,sur hvh where Y hvh t+6 is the phase space structure for predicting surges 6 h ahead at HVH location, sur hvh t and sur hvh t−3 is the surges at time t and t − 3, respectively. The complete optimal univariate chaotic model structures are listed in Table 3.

Multivariate chaotic model
Multivariate chaotic models incorporating the information about surges at HVH and neighboring stations (EPF and K13), change in air pressure and wind components were employed, with the main objective to improve the prediction accuracy for longer prediction horizons. The relationship between surges at HVH and EPF/K13 are measured by the cross-correlation and mutual information as shown in Fig. 15. Both methods specify that the EPF surge precedes the surge at HVH about 1 h and the K13 surge has less relationship with HVH surge and the HVH surge would reach to K13 around 1-1.5 h later. Similarly, it is indicated by mutual information. The cross-correlation and mutual information functions on the observed surge data show that the EPF surges have higher correlations with the HVH surges than the K13 surges. The data from K13 station was not utilized in the building of multivariate chaotic model because the inclusion of this information does not improve or even decrease the chaotic model performance. Our experience indicates that the use of less-correlated information into a chaotic model can produce a high-dimensional and more complex structure of the reconstructed phase space and this often destructs the smoothness of the trajectories and attractor resulting in low predictability of the model. Thus, we only include the information from EPF as inputs to the multivariate chaotic model.
The other variables that require more analysis are wind speed and direction. We applied the cross-correlation and mutual information for acquiring the principal wind component which has the largest influence to the surges at HVH location. The various wind directions from 0 to 180 degrees from north were investigated. The strongest influence of the winds to the observed surges at HVH location is generated by wind component 120 • angle from north with the correlation coefficient −0.65 (see Fig. 16). This component corresponds to the north-westerly wind which blows from the sea and is approximately perpendicular to the coastline (Fig. 1). Likewise, this relationship is indicated by the mutual information function (Fig. 16).
Multivariate phase space reconstruction of the surge dynamics using the hourly time series data was solved technically using a multivariate embedding. The exhaustive search optimization was employed to find the optimal values of τ and m for each variable. The general phase space structure for the surge dynamics at HVH location is defined as where Y sur,hvh t+T is the phase space structure for predicting surges with prediction horizon T at HVH location, sur hvh τ,m , wl hvh τ,m and dp hvh τ,m are the surge, water level and pressure difference at HVH location, wind hvh,120 • τ,m is the wind speed at HVH location with direction 120 • angle, and sur epf τ,m is the surges at Euro Platform station.
The next task is to find the optimal values of τ and m for each variable in Eq. (9) for building a multivariate chaotic model using the exhaustive search optimization. The parameter settings for the exhaustive search optimization were set as follows. The values of τ and m for sur hvh τ,m and wl hvh τ,m are set with the optimal values as obtained from the nonlinear analysis, sur hvh τ,m has τ = 10 and m = 8, whereas wl hvh τ,m has τ = 4 and m = 6. These optimal values for the important variables sur hvh τ,m and wl hvh τ,m were used as a basis for optimizing the other variables. This approach reduces the optimization parameter space and subsequently needs less computing time. The inclusion of all variables into the optimization space considerably increases computation time, but the optimal solution may not provide much better improvement of the model performance. The optimization settings for other variables (sur epf τ,m , wind hvh,120 • τ,m , dp hvh τ,m ) were set for the following value ranges: τ = [1 ∼ 5] and m = [2 ∼ 5]. We employed the 3rd-order polynomial approximation function as the local model and the number of neighbors of k = =13 and k = 80 for stormy and non-stormy conditions, respectively. The prediction horizons are 1, 3, 6, 10 and 12 h. The optimization outcome is the most accurate chaotic model which has the lowest RMS error on cross-validation data set. The optimal multivariate chaotic model structures are summarized in Table 4 for different prediction horizons and stormy and non-stormy conditions. For instance, the optimal multivariate phase space reconstruction for predicting surges with prediction horizon 3 h for stormy condition was ,dp hvh t ,dp hvh t−1 ,...,dp hvh t−3 where Y sur,hvh t+3 is the phase space structure for predicting surges 3 h ahead at HVH location.
For non-stormy condition, the optimization results indicate that the most appropriate time delay for the variable sur epf τ,m is 1 h for all prediction horizons (see Table 4). This coincides with the analysis as depicted in Fig. 15 showing that the EPF surges precede surges at HVH about 1-1.5 h. Table 3 summarizes the univariate chaotic model and ANN model prediction performances. The optimal parameters for chaotic models (τ , m, k) and neural network models (number of hidden neurons) were obtained based on the model performance (RMS error) on cross-validation data sets for non-stormy and stormy conditions. Subsequently, we verified these models on verification data set with various optimal values of τ , m, k and number of hidden neurons based on the prediction horizons and stormy and non-stormy conditions. These comparison results (up to 12 h prediction) show that both models have similar prediction accuracy, for both stormy and non-stormy surge conditions. However, the chaotic model is able to predict the extreme surges better than ANN. This is depicted in Fig. 17. The chaotic model errors are more dampened and stable (also during the surge peaks) than the neural network errors.

Model results
Nonetheless, the predictive chaotic model with local modeling may include some false neighbors due to the fact that the trajectories are very close to each other and the nearest neighbors found may have different or reverse directions of the trajectories. This is a practical problem occurring due to the use of nonlinear discrete time series from the observables and the use of integer (not fractal) values of τ and m in phase space reconstruction. In the continuous dynamical system derived by the differential equations many less false neighbors are to be found. The possible solutions for these issues are: to use smaller sampling time of the surge data (e.g. 10 min) for reducing the sharp oscillations and thus providing enough points for producing better local models to handle these oscillations; to implement a mixture of various local models (e.g. ANN) in the phase space which may perform better for predicting future trajectories of a particular condition or regime; to reconstruct the phase space from a time series using non-equidistance time delay method which can unfold the attractor better; and to select a larger size of cross-validation data set. A multivariate chaotic model (Table 4) with inclusion of various variables (wind, air pressure and neighboring station information) does not significantly improve the accuracy of predictions in comparison to a univariate chaotic model. This is in a way surprising and disappointing. We see the reason for that in the fact that adding more variables (observed non-smooth data) to the phase space distorts the smoothness of trajectories and attractor of the surge dynamics. This distortion may produce more false nearest neighbors resulting in less accurate predictions by local models.
For the location at Hoek van Holland, the overall 3 h ahead surge prediction errors (RMSE) on verification data sets during stormy period for univariate CM, univariate NN, multvariate CM and multivariate NN are 11.94,19.46,11.99 and 16.78 cm,respectively. In this respect, the multivariate chaotic model generally outperforms ANN models because it uses the other variables, spatial information and local models for predicting the future surges by identifying the past behavior of the surge dynamics.

Conclusions
Based on the nonlinear chaos analysis, the surge dynamics along the Dutch coast can be characterized as deterministic chaos. The presence of the chaotic dynamics together with the positive Lyapunov exponents implies that there are limits of predictability for any model. However, short-term reliable predictions are possible. The chaotic behavior occurs because the sea levels and surges are the result of a complex, coupled nonlinear dynamical system. If compared to our earlier work (Solomatine et al., 2000;Velickov, 2004;Siek et al., 2008, see Sect. 1) in the present paper we employed extended data analysis methods and improved algorithms for predictive models. A chaotic model with local modeling and multi-step prediction technique demonstrates a good predictive performance and can serve as an effective tool for accurate and reliable short-term storm surge predictions. Further improvements in accuracy are expected if more sophisticated methods to identify the adequate nearest neighbors are used, and if a data assimilation scheme is added to the presented method.