Predicting climate extremes – a complex network approach

Regional decadal predictions have emerged in the past few years as a research field with high application potential, especially for extremes like heat and drought periods. However, up to now the prediction skill of decadal hindcasts, as evaluated with standard methods, is moderate, and for extreme values it has rarely been investigated. In this study, we use hindcast data from a regional climate model (CCLM) for 8 regions in Europe to construct time-evolving climate networks and use the network correlation threshold (link strength) as a predictor for heat periods. We show that the skill of the network measure in predicting the low-frequency dynamics of heat periods is similar to that of the standard approach, with the potential of being even better in some regions.


Introduction
Decadal prediction is a relatively new field in climate research. Skillful prediction of climate from years up to a decade would be beneficial for our society, economy and for a better adaptation to a changing climate. Within the large international CMIP5 project (Coupled Model Intercomparison Project Phase 5, Taylor et al., 2012) global decadal predictions of key climate variables like temperature and precipitation have been performed with state-of-the-art Earth system models. In order to validate the prediction skill of the models, so-called hindcast experiments are conducted. That means the models are initialized with observations, e.g. in 1961, and then run freely for 10 years and stop at the end of 1970. In 1971, the models are again initialized and start to run for another 10 years, and so on. More advanced approaches of initializing every year have also been followed. These hindcasts can be evaluated against observational data to quantify the prediction skill of the models depending on the lead time, which is the time range between the initialization and the forecast date of interest. In recent years, several studies on decadal predictions have shown the potential of these initialized (global) model runs (e.g. Keenlyside et al., 2008; Müller et al., 2012; Matei et al., 2012; van Oldenborgh et al., 2012; Corti et al., 2012; Doblas-Reyes et al., 2013; García-Serrano et al., 2013; Smith et al., 2013; Meehl et al., 2014; Chikamoto et al., 2015). However, most studies concentrate on regions like the Tropical Pacific or North Atlantic and on slowly evolving variables like sea-surface temperature. These regions receive their predictability from large-scale processes like the AMOC (Atlantic Meridional Overturning Circulation) or PDO (Pacific Decadal Oscillation), which allows predictable signals to be extracted out of the noise. To be useful for society and climate change adaptation, regional climate predictions are required which should provide skillful forecasts for smaller regions and shorter periods, and include climate extreme events on populated land areas like the European continent. The European climate is more connected to short-term processes like the NAO (North Atlantic Oscillation), which is to a certain extent predictable on seasonal scales, whereas the decadal predictable signal is weak (Scaife et al., 2014), which has also been shown for temperature and precipitation in large projects like ENSEMBLES (MacLeod et al., 2012). Further, the complex orography with the Alps in the center contributes to a manifold of general weather situations and hence to a complex climate (e.g. CORDEX-EU, Jacob et al., 2013; Giorgi et al., 2009). Nevertheless, the European continent is influenced by the AMOC and thus this process may yield a certain predictability, although the signal-to-noise ratio is most probably small. Up to now, the prediction skill for Europe is weaker than for regions such as the South Pacific or North Atlantic. Mieruch et al. (2014) used a regional decadal hindcast ensemble for Europe and detected moderate prediction skill for summer and winter temperature and summer precipitation anomalies in the order of five years. On the other hand, an innovative approach in climate research has been established in recent years, namely the complex climate network approach.
The general idea of climate networks is to consider climate time series, e.g. at the grid points of a climate model, as nodes of the network and the statistical connection between the time series as links of the network. A link between two arbitrary time series (geolocations) exists if the correlation measure between the time series exceeds a certain threshold.
The climate network community has been very active in recent years. Tsonis et al. (2007) proposed "A new dynamical mechanism for major climate shifts" and explained e.g. decadal shifts in global mean temperature (Tsonis and Swanson, 2012). Radebach et al. (2013) discriminated different El Niño types using the network approach, Ludescher et al. (2013) developed a network method to improve El Niño forecasting and Donges et al. (2011) revealed a connection between (paleo-)climate variability and human evolution using recurrence networks, which are similar to the complex climate networks. Generally, it has been shown that climate networks contain useful information for climate applications, e.g. the relation between climate and topography found by Peron et al. (2014), the dynamics of solar activity using visibility graphs (Zou et al., 2014) and the prediction of extreme floods (Boers et al., 2014).
Extremes like heat periods are defined as events which coherently exceed a threshold over a certain time-space domain. From a complex network perspective, the node degree describes correlation (above a threshold) of data, also on a time-space domain. Therefore, the area-averaged node degree and thus the link strength could be an indicator for extreme events like heat periods. In this paper, we exploit this idea and show that its skill is similar to the skill of the standard approach and that it has the potential to improve the prediction of heat periods on time scales up to a decade.
In Sect. 2 we introduce the daily maximum temperature data used in this study. Section 3 describes our approach, which includes the preparation of the data, the definition of heat periods and the construction of time-evolving climate networks. The results, shown in Sect. 4, indicate that the network measure is equally skillful as the standard approach and in some regions in Europe can be the better estimator of heat periods. Finally, we give the conclusions and an outlook in Sect. 5.

Data
We apply the climate network approach to a decadal prediction ensemble generated within the German research project MiKlip. The regional climate model COSMO-CLM (Consortium for small scale modeling in climate mode), called CCLM hereafter, is described in Doms and Schättler (2002). CCLM has been used in numerous recent studies, e.g. in Kothe et al. (2014) and Dosio et al. (2015); a comprehensive overview can be found at http://www.clm-community.eu. Here, CCLM has been used to downscale global decadal predictions from the MPI-ESM (Stevens et al., 2013) global model.
From a suite of different decadal prediction experiments we have selected the so-called regional baseline 0 ensemble. This ensemble consists of 10 members each, covering the period 1961-2010 for the European region (according to CORDEX-EU; Jacob et al., 2013; Giorgi et al., 2009) on a 0.22° grid. This ensemble has already been used by Mieruch et al. (2014).
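The regional baseline 0 ensemble (based on the global MPI-ESM model) has been initialized every 10 years (1961, 1971, 1981, 1991, 2001). Within a decade the CCLM model runs freely, except for the prescription of the atmospheric boundary conditions by the global MPI-ESM model. More details on the development of the ensemble and the initialization can be found in Matei et al. (2012), Müller et al. (2012) and Mieruch et al. (2014). In the study presented here we use daily maximum near-surface temperatures from the CCLM model and from the E-OBS v8.0 gridded climatology (Haylock et al., 2008) for the European continent. For our comparison, we use the so-called Prudence regions (http://prudence.dmi.dk/), namely British Isles, Iberian Peninsula, France, Central Europe, Scandinavia, Alps, Mediterranean and Eastern Europe, shown in Fig. 1.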

Method
Our hypothesis is that complex network measures may be complementary or even better estimators for climate extremes than standard measures like absolute threshold exceedances.
As mentioned before, we use the case of heat periods to illustrate the method.
The standard estimator for heat periods according to the WMO is that the daily maximum temperature is 5 K above the 1961-1990 mean maximum temperature on at least five consecutive days (Frich et al., 2002). Thus, the standard approach would be to count the heat periods, e.g. for each year, in an observational reference data set and similarly in the model data, both according to the WMO definition.
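The counting step of the standard approach can be illustrated with a minimal sketch. It assumes that the input is a daily series of temperature anomalies (in K) relative to the 1961-1990 mean maximum temperature; the function name `count_heat_periods`, the strict "greater than" comparison and the example values are hypothetical choices made for illustration, not taken from the study.

```python
from typing import Sequence


def count_heat_periods(anomalies: Sequence[float],
                       threshold: float = 5.0,
                       min_days: int = 5) -> int:
    """Count runs of at least `min_days` consecutive days above `threshold`."""
    count = 0
    run = 0  # length of the current run of days above the threshold
    for value in anomalies:
        if value > threshold:
            run += 1
            if run == min_days:  # count each run once, when it first qualifies
                count += 1
        else:
            run = 0
    return count


# Hypothetical anomalies (K): one qualifying 5-day run, then a 2-day run.
example = [1.0, 6.0, 6.5, 7.0, 5.5, 6.1, 2.0, 5.2, 5.3, 0.0]
print(count_heat_periods(example))
```

Applying the same function to observational and model series, year by year, would give the two heat period counts that the standard approach compares.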
As an alternative heat period estimator, we propose to use the time-varying link strength $W_\tau$ (τ represents the years) of a network, based on modeled daily maximum temperature time series. The link strength $W_\tau$ is the correlation threshold between time series which is needed to construct a network of a given edge density (more details below). Accordingly, we want to show that $W_\tau$ has the potential to be at least as good as, or even better than, the standard estimator for observational heat periods. This approach is similar to that used by Ludescher et al. (2013), who forecasted El Niño events using the link strength of a network and showed its superiority to standard sea surface temperature predictions by state-of-the-art climate models.
Figure 2 illustrates the motivation schematically, assuming that one heat period has actually occurred, and assuming that the model has a certain prediction skill to detect the signal out of the noise. Figure 2a depicts that, using the standard approach, the model correctly detects one heat period above the threshold. In Fig. 2b the model detects a signal, but this signal is too weak to cross the threshold; thus no heat period would be detected and the model underestimates the number of heat periods. [...] ones in Fig. 2. Now, the link strength of a network would be given by the correlation between these coherent time series. Generally, the signals in Fig. 2a-c look quite similar and the link strength of the network would thus be very similar in all three cases. Whereas the standard approach would correctly predict the heat period in only one case (Fig. 2a), the network's link strength would correctly predict the heat period in all three cases, given a proper relation between link strength and heat periods.
To test the relation in principle, we created 100 artificial time series (Gaussian noise) and successively included 0-9 heat periods. Figure 3a shows such a time series with three artificial heat periods indicated by the dashed lines. In a following step, we calculated the mean correlation (link strength) between these 100 coherent time series as a function of the number of included heat periods, depicted in Fig. 3b. As can be seen, the relation is nearly linear; thus more heat periods are connected with a larger link strength. Note that Fig. 3b is not a calibration curve for real data, because we simply used Gaussian noise to create the time series.
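A toy version of this experiment can be sketched as follows. The series length, the amplitude and duration of the artificial heat periods, their placement and the random seed are all assumptions chosen for illustration; the text only specifies 100 Gaussian noise series with 0-9 heat periods. The sketch is not meant to reproduce Fig. 3b.

```python
import random
from itertools import combinations
from math import sqrt


def standardize(series):
    """Return the series with zero mean and unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    sd = sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / sd for x in series]


def mean_link_strength(n_periods, n_series=100, n_days=120,
                       amplitude=3.0, period_len=5, seed=0):
    """Mean pairwise Pearson correlation of noise series sharing heat periods."""
    rng = random.Random(seed)
    # Assumption: heat periods occur on the same days in every series.
    signal = [0.0] * n_days
    for k in range(n_periods):
        start = k * (n_days // 10)
        for day in range(start, start + period_len):
            signal[day] = amplitude
    series = [standardize([rng.gauss(0.0, 1.0) + s for s in signal])
              for _ in range(n_series)]
    # For standardized series the Pearson correlation is the mean product.
    pairs = list(combinations(series, 2))
    total = sum(sum(a * b for a, b in zip(x, y)) / n_days for x, y in pairs)
    return total / len(pairs)


for k in (0, 3, 6, 9):
    print(k, round(mean_link_strength(k), 3))
```

With these settings the mean correlation grows with the number of shared heat periods, which is the qualitative behavior the paragraph above describes.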
It is clear that the argumentation above concerning the link strength as a heat period estimator is quite simplistic, but it elucidates our approach and the main idea.
To apply the method we proceed as follows. Suppose we have initialized our climate model in the year 2001 with the ocean, soil, ice and atmospheric state at that time. The climate model then runs freely for 10 years, i.e. it produces a retrospective decadal climate prediction. Now we are interested in the capability of the model to represent heat periods in summer. Based on the standard approach of counting heat periods according to the WMO definition, we could determine the prediction skill of the model in forecasting (hindcasting) the number of heat periods. Our approach, in contrast, is to create a time-evolving complex network with fixed edge density (Berezin et al., 2012; Radebach et al., 2013; Ludescher et al., 2013; Hlinka et al., 2014) from the modeled daily maximum temperature time series and to use, as mentioned, the dynamics of the link strength $W_\tau$ as a heat period estimator.
Before using the complex network approach it is necessary to remove the stationary biases and variabilities from the climate time series (Donges et al., 2009). We remove the bias, trend and the average annual cycle by subtracting a standard linear regression including a Fourier series from the time series (Eq. 1), where $y_i(t)$ represents daily maximum temperature from 1961 to 2010, $\mu_i$ is the intercept, $\omega_i$ is the linear trend and $\alpha_{i,j}$ and $\beta_{i,j}$ represent the Fourier coefficients.
Equation (1) is evaluated individually at each grid point i = 1, ..., N. Then, the months from June to September are selected because we are interested in summer heat periods.
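For orientation, a regression of the kind described (intercept, linear trend and Fourier series) could take the following general form. This is only a plausible sketch built from the symbols defined in the text: the number of harmonics $J$, the period $T$ (for example one year in days) and the residual $\varepsilon_i(t)$ are assumed notation and are not specified in the source.

```latex
y_i(t) = \mu_i + \omega_i\, t
       + \sum_{j=1}^{J} \left[ \alpha_{i,j} \sin\!\left(\frac{2\pi j t}{T}\right)
                             + \beta_{i,j} \cos\!\left(\frac{2\pi j t}{T}\right) \right]
       + \varepsilon_i(t)
```

Under this reading, the anomaly time series used in the network construction would be the residuals $\varepsilon_i(t)$ that remain after the fitted intercept, trend and annual cycle are subtracted.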
In this study, we define a heat period for E-OBS observational data as a time range in which the maximum temperature anomaly exceeds a threshold of 3 K on at least five consecutive days and which additionally includes not less than 20 % of the grid points in the area of interest. This choice has been made to observe events frequently enough for reliable statistics while simultaneously ensuring important impacts. To account for the inherent model bias it is essential to adjust the temperature threshold to the model climate. Thus, we estimate the percentile $P_{3\,\mathrm{K}}$ corresponding to the 3 K E-OBS threshold for the complete time from 1961 to 2010 and the area of interest. Accordingly, we use this percentile as the threshold for heat periods for the model data. Table 1 shows this threshold in K for the 8 Prudence regions, estimated from the CCLM ensemble means. As could be expected, the threshold is higher for low latitudes.
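The percentile matching described above can be sketched as follows. The function name `model_threshold`, the empirical-rank convention and the toy data are assumptions made for illustration; the study does not state how the percentile was estimated.

```python
from bisect import bisect_right


def model_threshold(obs_anomalies, model_anomalies, obs_threshold=3.0):
    """Map an observational threshold to the model value at the same percentile."""
    obs_sorted = sorted(obs_anomalies)
    model_sorted = sorted(model_anomalies)
    # Number of observed values that do not exceed the threshold.
    rank = bisect_right(obs_sorted, obs_threshold)
    # Integer ceiling of rank * len(model) / len(obs): same percentile in the model.
    model_rank = -(-rank * len(model_sorted) // len(obs_sorted))
    index = min(max(model_rank, 1), len(model_sorted)) - 1
    return model_sorted[index]


# Hypothetical anomalies (K): the model is 2 K warmer than the observations.
obs = [0.0, 1.0, 2.0, 3.0, 4.0]
model = [2.0, 3.0, 4.0, 5.0, 6.0]
print(model_threshold(obs, model))
```

In this toy case the 3 K observational threshold maps to a higher model threshold, which is the kind of bias adjustment the paragraph motivates.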
Following our aim to use a network measure as a heat period estimator, we construct a complex network from the daily maximum temperature model data. Here we use a simple undirected and unweighted network. Thus, the network consists of vertices V, which are the spatial grid points of our temperature data, and edges (connections) E, which are added between vertices and represent the statistical interdependence between the anomaly daily maximum temperature time series. This complex climate network can be represented by the symmetric adjacency matrix A, whose element $A_{ij}$ equals 1 if vertices i and j are connected by an edge and 0 otherwise, where i and j represent the vertices, i.e. time series at grid points i, j = 1, ..., N.
Two grid points are connected if the correlation between their time series exceeds a predefined threshold. The statistical interdependence between pairs {i, j} (self-loops {i, i} are not allowed) of time series is measured using the Pearson (standard) correlation coefficient (Donges et al., 2009). From sensitivity studies we found that correlation thresholds in the order of 0.7-0.9 yield acceptable results. That means that, using these thresholds, we observe patterns with not too few and not too many connections. This is important in order to resolve the temporal dynamics of the network. However, since we want to analyze different regions in Europe and to generate comparable results, we decided to alternatively create our networks with a constant edge density ρ (ratio of the number of actual connections to the maximum number of connections). The edge density is determined by the number of edges E or, equivalently, by the mean node degree, where the node degree $k_i$ gives the number of connections of a vertex i.
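For reference, the standard textbook definitions of these quantities for an undirected, unweighted network without self-loops can be written as below. The symbol $r_{ij}$ for the Pearson correlation between the time series at grid points i and j, and the use of the link strength $W_\tau$ as the threshold in the first line, are notation assumed here for illustration.

```latex
A_{ij} =
\begin{cases}
1 & \text{if } i \neq j \text{ and } r_{ij} > W_\tau \\
0 & \text{otherwise}
\end{cases}
\qquad
k_i = \sum_{j=1}^{N} A_{ij}
\qquad
\rho = \frac{E}{N(N-1)/2} = \frac{\langle k_i \rangle}{N-1}
```

The last identity follows because every edge contributes to the degree of two vertices, so that $\sum_i k_i = 2E$ and hence $\langle k_i \rangle = 2E/N$.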
We implemented an iterative correlation threshold adaption method, which creates networks for each area in Europe and each year with a constant edge density of ρ = 0.3 ± 0.0005.
In a similar way as Berezin et al. (2012) we analyze the temporal variation of the link strength $W_\tau$, i.e. the correlation threshold between time series (grid points) for a single year τ (summer) from 1961 to 2010. Thus, instead of using the node degree as an estimator of heat periods we use the link strength $W_\tau$.
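One way to find the correlation threshold that yields a prescribed edge density is a bisection search, sketched below. The source only states that an iterative threshold adaption was used; the bisection scheme, the function names and the synthetic correlation values are assumptions for illustration.

```python
def edge_density(correlations, threshold):
    """Fraction of vertex pairs whose correlation exceeds the threshold."""
    return sum(1 for r in correlations if r > threshold) / len(correlations)


def link_strength(correlations, target_density=0.3, tolerance=0.0005,
                  max_iter=100):
    """Bisect on the threshold until the edge density matches the target."""
    low, high = -1.0, 1.0  # Pearson correlations lie in [-1, 1]
    for _ in range(max_iter):
        mid = 0.5 * (low + high)
        density = edge_density(correlations, mid)
        if abs(density - target_density) <= tolerance:
            return mid
        if density > target_density:
            low = mid   # too many edges: raise the threshold
        else:
            high = mid  # too few edges: lower the threshold
    return 0.5 * (low + high)  # best estimate if the tolerance was not reached


# Hypothetical pairwise correlations (one value per pair of grid points).
corrs = [-1.0 + 2.0 * k / 999 for k in range(1000)]
w = link_strength(corrs)
print(w, edge_density(corrs, w))
```

The returned threshold plays the role of the link strength $W_\tau$ for one region and one summer; repeating the search for each year would give its time evolution. With few grid-point pairs the density is coarsely quantized, so a tolerance as tight as 0.0005 may not be reachable in small regions.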
Using the definitions above, we finally construct a network for the summer months of each year based on anomaly maximum temperature model data. The quantity whose year-to-year variation we are interested in is the link strength $W_\tau$; however, since we are interested in decadal variability, and since we do not expect the model to represent the year-to-year fluctuations, we applied a 10-year moving average filter to the data. Since the CCLM model has been initialized every decade (1961, 1971, ..., 2001), we apply the filter only within a decade in order to avoid transferring information between decades.
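A filter that respects the decade boundaries could look like the sketch below. How the window is handled near the edges of a decade is not described in the text; truncating the window at the decade boundary is an assumption made here, as are the function name and the example values.

```python
def decadal_moving_average(values, window=10, block=10):
    """Moving average whose window never crosses a block (decade) boundary."""
    smoothed = []
    half = window // 2
    for t in range(len(values)):
        block_start = (t // block) * block
        block_end = min(block_start + block, len(values))
        # Window covers t-half .. t+half-1, truncated to the current decade.
        lo = max(block_start, t - half)
        hi = min(block_end, t + half)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical yearly values: two decades with different levels.
yearly = [1.0] * 10 + [3.0] * 10
print(decadal_moving_average(yearly))
```

Because each window is confined to its own decade, the step between the two decades in the example is preserved and no information leaks across the initialization dates.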
To quantify the prediction skill, we calculate the absolute mean difference between E-OBS heat periods o and CCLM heat periods m, and between E-OBS heat periods o and CCLM link strength $W_\tau$. To be comparable, we normalized the time series to the range [0, 1] by subtracting the minimum of the time series and then dividing by the maximum for the whole time span. The absolute mean difference between heat periods for a region r and a decade d is given by $M_{rd}(m)$ for the CCLM heat periods (Eq. 6) and by $M_{rd}(W)$ for the CCLM link strength (Eq. 7), where τ represents the years within a decade and the bars in the equations denote temporal averages. Thus, if the absolute mean difference is about 0, observations and model agree well, whereas a difference of about 1 denotes the maximum discrepancy.
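The normalization and the skill measure can be illustrated with a short sketch. Two points are assumptions here: the maximum is taken after the minimum has been subtracted, so that the result spans [0, 1], and the skill is computed as the temporal mean of the absolute differences. The function names and toy values are hypothetical.

```python
def normalize(series):
    """Scale a series to the range [0, 1] (minimum subtracted, then divided by maximum)."""
    lowest = min(series)
    shifted = [x - lowest for x in series]
    highest = max(shifted)
    return [x / highest for x in shifted] if highest > 0 else shifted


def absolute_mean_difference(obs, estimate):
    """Temporal mean of |obs - estimate| for two normalized series."""
    return sum(abs(o - e) for o, e in zip(obs, estimate)) / len(obs)


# Hypothetical heat period counts and link strengths for three years.
obs_counts = normalize([0, 2, 4])
link = normalize([0.70, 0.75, 0.80])
print(absolute_mean_difference(obs_counts, link))
```

A value near 0 indicates that the estimator tracks the observed heat periods, while a value near 1 indicates opposite behavior, in line with the interpretation given above.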

Results
Figure 4 shows that the link strength $W_\tau$ is a suitable estimator of heat periods for France (Prudence region 3). Figure 4 depicts the number of observed heat periods (solid line) and the corresponding link strength (dashed line) retrieved from the complex evolving network, both from E-OBS data. Figure 4 shows that the network contains climate information in the sense that the dynamics of the link strength $W_\tau$ is very similar to the dynamics of heat periods, both based on the same data. So, the link strength can here be considered as equivalent to the standard heat period estimator.
Prudence region 5 (Scandinavia) is a region where the network method performs better than the standard method (Figs. 5 and 6). Figure 5 shows the E-OBS number of heat periods (black) and the CCLM ensemble mean number of heat periods (blue) for Scandinavia together with the interquartile range (IQR, 25th and 75th percentiles). As can be seen, the model does not follow the observational reference well, especially from 1970 to 2010. By contrast, the CCLM link strength, shown in Fig. 6 (red), follows the decadal variability of the E-OBS heat period dynamics well. Thus, in this case the network measure is the better heat period estimator.
In order to see how the prediction skill of the standard as well as the network heat period estimators varies with the region considered, we performed the same analysis as above for the 8 Prudence regions in Europe and for the 1960s, 1970s, 1980s, 1990s and 2000s. To summarize our results, we calculated as the prediction skill the absolute mean difference within a decade between E-OBS heat periods and CCLM heat periods (Eq. 6) and between E-OBS heat periods and CCLM link strength (Eq. 7), based on normalized time series. The prediction skill "M" is also included in Figs. 5 and 6. Figure 7 shows which method performs better regarding the 8 regions (columns) and 5 decades (rows). Blue color in Fig. 7 indicates that the network approach performs better ($M_{rd}(W) < M_{rd}(m)$) and red color stands for a better performance of the standard method ($M_{rd}(W) > M_{rd}(m)$). White boxes in Fig. 7 denote a tie between the methods in the case of too small differences ($|M_{rd}(W) - M_{rd}(m)| \le 0.05$). Interpreting the matrix of Fig. 7, we conclude that the network method is superior in 3 regions (4, 5, 6), the standard approach is superior in 2 regions (3, 8) and in 3 regions (1, 2, 7) we observed a tie, i.e. no clear result.
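The classification rule behind each box of the rank matrix can be written compactly. The function name `rank_entry`, the labels and the skill values below are hypothetical and only illustrate the comparison with the 0.05 tie tolerance stated in the text.

```python
def rank_entry(m_network, m_standard, tie=0.05):
    """Classify one region/decade box: smaller mean difference means better skill."""
    if abs(m_network - m_standard) <= tie:
        return "tie"        # white box
    if m_network < m_standard:
        return "network"    # blue box
    return "standard"       # red box


# Hypothetical skill values (M(W), M(m)) for three region/decade combinations.
for m_w, m_m in [(0.20, 0.40), (0.40, 0.20), (0.30, 0.33)]:
    print(m_w, m_m, rank_entry(m_w, m_m))
```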

Conclusions and outlook
We presented a novel approach examining climate predictions using a complex network analysis. We have investigated the predictability of the slow dynamics of the occurrence of heat periods in Europe based on daily maximum near-surface temperature data.
We found that the network approach has similar skill to the standard method and has the potential to improve the predictability of heat periods for some European regions. Picking up our hypothesis and simplified argumentation from Sect. 3, the crucial point why we detect heat periods with the network link strength is that heat periods are cooperative events in space and time. Thus, the link strength can be used as an estimator of heat periods. The drawback of the standard method is most probably the inflexible threshold for the detection of heat periods (cf. Fig. 2). If the climate model contains the signal of a heat period, but with a slightly too small amplitude, the threshold will not be crossed and no heat period will be detected. In contrast, the complex climate network does not depend on such fixed thresholds and can use this information, which makes it the more robust estimator of heat periods. At present, we have no explanation for the dependence of the skill on the region.
The general prediction skill of climate in Europe using standard measures is still moderate. In this sense our work adds new aspects to our previous study (Mieruch et al., 2014) and also to the work of Eade et al. (2012), who found a strong variation of skill with region and decade. In essence, we found regions and decades in Europe where our climate model output, or more specifically the network estimator used, follows the slowly evolving dynamics of observed heat periods. We also found regions and decades where the network estimator is not able to represent the observational reference.
In conclusion, our study shows that the complex climate network approach yields meaningful climate information and can complement standard skill measures within the framework of climate prediction. Furthermore, our study has given examples showing that the complex climate network approach has the potential to improve climate predictions of extremes.
Several research questions remain and arise. From the network perspective it would be interesting to analyze other network measures like clustering, similarities or path lengths and how they are connected to climate evolution. Additionally, the incorporation of other relevant variables like precipitation, wind or soil moisture into the network is an appealing aspect. From a physical or climatological point of view it is important to understand why the network measures are able to represent climate dynamics, which could also contribute to a better understanding of the sources of decadal predictability.
Thus, the incorporation and investigation of processes like the AMOC, PDO or NAO together with complex networks and climate prediction might be an option for the future.

Figure 7. Rank matrix of the performance of the two methods. Blue: network approach performs better, red: standard approach performs better, white: tie.
Eade et al. (2012) analyzed the predictability of temperature and precipitation extremes in a global model and found a moderate but significant skill (correlation) for seasonal extremes. They also find skill beyond the first year, but this skill arises from external forcing. Thus, Eade et al. (2012) compared initialized climate predictions with uninitialized projections to evaluate the skill gained by initializing and excluding the external forcing. They found that the "... impact of initialization is disappointing".

Table 1. Ensemble mean variation of the temperature threshold calculated for heat periods in CCLM data.