Characterizing the Evolution of Climate Networks

Complex network theory has been successfully applied to understand the structural and functional topology of many dynamical systems from nature, society and technology. Many properties of these systems change over time, and, consequently, networks reconstructed from them will, too. However, although static and temporally changing networks have been studied extensively, methods to quantify their robustness as they evolve in time are lacking. In this paper we develop a theory to investigate how networks change over time based on the quantitative analysis of dissimilarities in the network structure. Our main result is the common component evolution function (CCEF), which characterizes network development over time. To test our approach we apply it to several model systems: Erdős–Rényi networks, analytically derived flow-based networks, and transient simulations from the START model for which we control the change of single parameters over time. Then we construct annual climate networks from NCEP/NCAR reanalysis data for the Asian monsoon domain for the time period of 1970–2011 CE and use the CCEF to characterize the temporal evolution in this region. While this real-world CCEF displays a high degree of network persistence over large time lags, there are distinct time periods when common links break down. The timing of these events coincides with years of strong El Niño/Southern Oscillation phenomena, confirming previous studies. The proposed method can be applied to any type of evolving network where the link set but not the node set is changing, and may be particularly useful to characterize nonstationary evolving systems using complex networks.


Introduction
Networks are practical representations for complex systems with interacting components and have been used to study phenomena in sociology, engineering and natural systems (Barthélemy, 2011; Menck and Kurths, 2012; Palla et al., 2005; Holme et al., 2004). Complex network techniques, based on statistical associations between climate parameter time series at different points on Earth, have yielded new insights in the investigation of climate dynamics (Tsonis and Swanson, 2008; Donges et al., 2009; Paluš et al., 2011). Such climate networks have been used for detecting long-range correlations, or teleconnections (Martin et al., 2013; Barreiro et al., 2011), and for studying phenomena such as the El Niño/Southern Oscillation (ENSO; Gozolchiani et al., 2008; Deza et al., 2013) and the Indian Monsoon system (Rehfeld et al., 2013; Malik et al., 2011; Stolbova et al., 2014). In particular, Tsonis and Swanson (2008) found changes in the global network topology during El Niño events, with significantly fewer links and lower clustering coefficients, and inferred a lower predictability for El Niño than for La Niña years. Using climate networks, Yamasaki et al. (2008) and Gozolchiani et al. (2008) also found ENSO influence on regional atmospheric processes in non-ENSO regions. Temporal and spatial variability of climate, and thus of climate network structure, are of increasing interest considering ongoing environmental changes, and climate networks evolving in time remain an open subject. The spatio-temporal developments in a given network set can be too complex to be captured by eye, and systematic approaches to quantify changes are needed. While Berezin et al. (2012) investigated the origins of climate network stability, such as the spatial embedding and the physical coupling between climate in different locations, using the correlation between correlation matrices, other studies describe how the network graph changes over time to understand the behaviour of the underlying dynamical system (e.g. Rehfeld et al., 2013). Various aspects of temporally changing networks have been considered for sociological and biological networks. Albert and Barabási (2000) analysed random network growth and evolution in response to the addition or rewiring of links between nodes and found that the graph topology changed depending on the frequency of link changes. Fu et al. (2009) tracked node function changes using a stochastic blockmodel for evolving networks to investigate evolutionary effects in email networks and gene regulation.
One of the most common dissimilarity measures used for network comparison is the Hamming distance. It was introduced by Hamming (1950) as a measure for comparing strings of symbols and has since been used to measure the distance between networks. Given the adjacency matrices $A^N$ and $A^M$ of two graphs N and M, their Hamming distance is the number of links which are found in one, but not the other network:
$$H(N, M) = \sum_{i<j} \left| A^N_{ij} - A^M_{ij} \right| \tag{1}$$
However, although the Hamming distance can be generalized to directed networks with possibly differing node numbers, two networks M and N may have the same Hamming distance to a fixed network K while having different topologies themselves. The Hamming distance alone may therefore not be enough to detect topological changes.
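The limitation of the Hamming distance can be illustrated with a minimal sketch. The three four-node networks K, M and N below are hypothetical examples, not data from this study, and each network is stored as a set of undirected links (i, j) with i < j, which is equivalent to the upper triangle of its adjacency matrix.

```python
def hamming_distance(edges_a, edges_b):
    """Number of links present in exactly one of the two networks."""
    return len(edges_a ^ edges_b)  # symmetric difference of the link sets


# Hypothetical networks on the same four nodes.
K = {(0, 1), (1, 2), (2, 3)}          # path 0-1-2-3
M = {(0, 1), (1, 2), (2, 3), (0, 3)}  # K plus one link, closing a ring
N = {(0, 1), (1, 2)}                  # K minus one link, node 3 isolated

print(hamming_distance(K, M))  # 1
print(hamming_distance(K, N))  # 1
print(hamming_distance(M, N))  # 2
```

M and N are equally far from K, yet one is a closed ring and the other has an isolated node, so the distance to a fixed reference does not identify the topology.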
Here we propose a common component evolution function (CCEF) based on the common set of links in pairs of networks to evaluate graph changes quantitatively in space and time. We characterize the method using Erdős-Rényi networks (Erdős and Rényi, 1959), analytically derived flow networks (Molkenthin et al., 2014a, b) and transient simulations from the START model (Rehfeld et al., 2014) for which we control changes of individual parameters over time. Then we construct annual climate networks from NCEP/NCAR reanalysis data for the Asian monsoon domain and use the CCEF to characterize the temporal evolution in the monsoon system.

We consider unweighted and undirected networks, for which n nodes are joined in pairs by edges, or links. The linking structure is given by the adjacency matrix A, a binary n × n matrix with zeroes on the diagonal, as we do not allow for self-loops. An element is non-zero, $A_{ij} = 1$, if and only if the vertices i and j are connected, and zero otherwise. Let us consider a linearly ordered set of T networks evolving in time: $N_1, \ldots, N_T$. The common component network of two of these networks $N_i$ and $N_j$, $CC(N_i, N_j)$, is a network on the same nodes whose edge set consists of the edges present in both original networks. If $N_i$ and $N_j$ have adjacency matrices $A_i$ and $A_j$, the number of edges in the common component network $CC(N_i, N_j)$ is the number of non-zero elements above the diagonal in the element-wise product (logical AND) of $A_i$ and $A_j$. This common component network can be generalized to any k+1 networks by induction:
$$CC(N_1, \ldots, N_{k+1}) = CC\big(CC(N_1, \ldots, N_k), N_{k+1}\big) \tag{2}$$
The common component function, CCF, counts the links in the common component network,
$$\mathrm{CCF}(N_i, N_j) = \sum_{k<l} (A_i)_{kl} \, (A_j)_{kl} \tag{3}$$
and is in the following normalized to [0, 1] using the maximal number of links in the networks.
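As a minimal sketch of the common component construction, a link is kept only if the corresponding entry equals 1 in both adjacency matrices (an element-wise logical AND). The matrices A1, A2 and A3 are hypothetical four-node examples, the function names are illustrative, and adjacency matrices are stored as plain nested lists.

```python
from functools import reduce


def common_component(adj_a, adj_b):
    """Adjacency matrix of the network whose links occur in both inputs."""
    n = len(adj_a)
    return [[adj_a[r][c] * adj_b[r][c] for c in range(n)] for r in range(n)]


def count_links(adj):
    """Number of non-zero entries above the diagonal."""
    n = len(adj)
    return sum(adj[r][c] for r in range(n) for c in range(r + 1, n))


# Hypothetical networks on four nodes.
A1 = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]  # links 0-1, 1-2, 2-3
A2 = [[0, 1, 0, 1], [1, 0, 0, 0], [0, 0, 0, 1], [1, 0, 1, 0]]  # links 0-1, 0-3, 2-3
A3 = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]  # links 0-1, 0-2

pairwise = common_component(A1, A2)
# Folding the pairwise operation over the list mirrors the inductive definition.
all_three = reduce(common_component, [A1, A2, A3])

print(count_links(pairwise))   # 2 (links 0-1 and 2-3)
print(count_links(all_three))  # 1 (link 0-1)
```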
In analogy to covariance estimation (Chatfield, 2004) and similar to Berezin et al. (2012), we take the mean over the CCFs with the same time lag to estimate the non-normalized common component evolution function, CCEF*, as
$$\mathrm{CCEF}^*(\delta) = \frac{1}{T-\delta} \sum_{i=1}^{T-\delta} \mathrm{CCF}(N_i, N_{i+\delta}) \tag{4}$$
where δ ∈ [0, T − 1] is the time lag between the networks. The maximum value of the CCEF* is given by CCEF*(0) for zero lag, the average number of links in the set of networks,
$$\mathrm{CCEF}^*(0) = \frac{1}{T} \sum_{i=1}^{T} \mathrm{CCF}(N_i, N_i) \tag{5}$$
and we use it to obtain the normalized common component evolution function
$$\mathrm{CCEF}(\delta) = \frac{\mathrm{CCEF}^*(\delta)}{\mathrm{CCEF}^*(0)} \tag{6}$$
which we will use exclusively in the following. As an estimate of the CCEF uncertainty we use the standard deviation over all CCEF values.
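The lag-averaging described above can be sketched in a few lines. This is an illustrative reading of the definition rather than the authors' code: networks are stored as sets of links (i, j) with i < j, the CCF is taken as the number of shared links, the set is assumed to contain at least one link, and the three example networks are hypothetical.

```python
def ccf(edges_a, edges_b):
    """Common component function: number of links shared by two networks."""
    return len(edges_a & edges_b)


def ccef(networks):
    """Normalized CCEF for all lags 0 .. T-1 of a linearly ordered network set."""
    n_networks = len(networks)
    raw = []
    for lag in range(n_networks):
        # Mean CCF over all pairs of networks separated by this lag.
        values = [ccf(networks[i], networks[i + lag])
                  for i in range(n_networks - lag)]
        raw.append(sum(values) / len(values))
    # Normalize by the zero-lag value, the mean number of links per network.
    return [value / raw[0] for value in raw]


# Hypothetical ordered set of three networks on four nodes.
networks = [
    {(0, 1), (1, 2), (2, 3)},
    {(0, 1), (1, 2), (0, 3)},
    {(0, 1), (0, 2), (0, 3)},
]
values = ccef(networks)
print(values)  # lag 0 gives 1.0, lag 1 gives 2/3, lag 2 gives 1/3
```

Only one pair of networks contributes at the largest lag, so estimates at large lags rest on fewer comparisons than those at small lags.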

Testing the method on random networks
To test our method we generate a set of T Erdős-Rényi graphs (Erdős and Rényi, 1959) with a fixed number of nodes n and a fixed connection probability p. We artificially impose a linear ordering on the set, such that we can index the graphs with i ∈ {1, ..., T}. We compute the CCEF for Erdős-Rényi graphs with 100 nodes and link probabilities of 0.3, 0.5 and 0.9. The resulting functions, shown in Fig. 1, drop from 1 at δ = 0 to a plateau at CCEF(δ) ≈ p for δ > 0 for each link probability p. For this example we can compute the expected CCEF analytically. For δ = 0 each network is compared with itself and therefore CCEF = 1. For all other values of δ two independent random networks with n nodes and connection probability p are compared. The total number of possible links is n(n − 1)/2, and the expected number of links in each of the networks is pn(n − 1)/2. As the probability that an edge of one network also appears in the other network is p, the expected number of common links is p²n(n − 1)/2. With the normalization this gives f(p), the ratio of the expected number of common links to the expected number of links:
$$f(p) = \frac{p^2 \, n(n-1)/2}{p \, n(n-1)/2} = p \tag{7}$$
The CCEF for each linking probability therefore lies close to the expected value p.
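The plateau at p can be checked numerically with a small sketch. The node number and link probabilities follow the text, while the number of networks (T = 20), the random seed and all function names are assumptions made for illustration.

```python
import random
from itertools import combinations


def random_network(n_nodes, p, rng):
    """Erdős-Rényi network as a set of links (i, j) with i < j."""
    return {pair for pair in combinations(range(n_nodes), 2) if rng.random() < p}


def ccef(networks):
    """Normalized CCEF: mean number of shared links per lag, divided by lag 0."""
    n_networks = len(networks)
    raw = []
    for lag in range(n_networks):
        values = [len(networks[i] & networks[i + lag])
                  for i in range(n_networks - lag)]
        raw.append(sum(values) / len(values))
    return [value / raw[0] for value in raw]


rng = random.Random(42)  # fixed seed, chosen arbitrarily
plateaus = {}
for p in (0.3, 0.5, 0.9):
    networks = [random_network(100, p, rng) for _ in range(20)]
    values = ccef(networks)
    # Average over all non-zero lags to estimate the plateau level.
    plateaus[p] = sum(values[1:]) / len(values[1:])
    print(p, round(plateaus[p], 3))
```

For independent random networks the plateau estimate should scatter closely around p, in line with the expectation f(p) = p derived above.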

Test models
To characterize our approach further we investigate simulations of more complex, spatially embedded processes. We obtained networks (i) analytically from flow fields, as described in Molkenthin et al. (2014a), and (ii) from the Spatio-Temporally Autocorrelated Time series model START.

Networks from flows
The flow networks are constructed directly from a velocity field using a correlation measure based on the temperature profiles resulting from a temperature peak via advection and diffusion (Molkenthin et al., 2014a, b). The velocity function considered here is: v … For each value of c we obtain a correlation matrix, giving C_1, ..., C_10, and by thresholding these matrices at different critical values we obtain sets of adjacency matrices.

Networks from the START model
As a more complex test case we consider two transient simulations for the START model (Rehfeld et al., 2014). Networks generated from START undergo a distinct transition when the forcing parameter F is changed: For F = −1 the network is partitioned into two vertical connected areas. For F = 0 horizontal cross-links have appeared and link the two sections. At maximal forcing, for F = 1, there is one large, horizontally oriented component. We performed two transient simulations with a 6 × 7 sampling grid (Rehfeld et al., 2014) for 20000 timesteps and 100 ensemble members each. In the first run the forcing parameter was increased linearly from the start to the end of the simulation. In the second run we periodically changed the forcing parameter, F(t) = sin(t/2000). Networks were constructed based on the 20% strongest links in the correlation matrices obtained for each 100-step-long time window. Due to the stochastic component, networks constructed for different ensemble members but for the same time period may be quite different, while networks for different periods of the same ensemble member may be quite similar.
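The windowed network construction can be sketched as follows. This is not the START model itself: the input is hypothetical Gaussian noise, link strength is assumed to be the absolute Pearson correlation, and the function names are illustrative.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def network_from_window(series, density):
    """Keep the given fraction of node pairs with the strongest correlation."""
    pairs = list(combinations(range(len(series)), 2))
    # Assumption: strength is measured by the absolute correlation.
    strength = {(i, j): abs(pearson(series[i], series[j])) for i, j in pairs}
    n_links = round(density * len(pairs))
    return set(sorted(pairs, key=strength.get, reverse=True)[:n_links])


def windowed_networks(series, window, density):
    """One network per non-overlapping time window."""
    length = len(series[0])
    return [network_from_window([s[start:start + window] for s in series], density)
            for start in range(0, length - window + 1, window)]


rng = random.Random(0)
# Hypothetical data: 10 nodes, 300 time steps of Gaussian noise.
series = [[rng.gauss(0.0, 1.0) for _ in range(300)] for _ in range(10)]
networks = windowed_networks(series, window=100, density=0.2)
print(len(networks), [len(net) for net in networks])
```

Fixing the link density rather than the correlation threshold keeps the number of links equal across windows, which the normalization of the CCEF relies on.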

Asian monsoon data
In a real-world application we used daily NCEP/NCAR reanalysis temperature anomaly data (NOAA) for the Asian monsoon domain for the years 1970-2010 C.E. The spatial resolution was 2.5° × 2.5°, covering the area from 2.5° S to 42.5° N and 57.5° E to 122.5° E, resulting in time series for 468 nodes. Networks were constructed using Pearson correlation in windows for each full year and by thresholding the correlation matrix such that we obtain a link density of 5%. The same dataset and time period were used in Molkenthin et al. (2014b) to investigate the influence of changing node topologies in space on the estimates of node degree and betweenness.

Results

Flow-networks
We computed the CCEF for flow-networks with linearly increasing flow-width parameter c. As Fig. 2 shows, the common component size decreases monotonically with the width parameter difference of the networks. The higher the threshold of the correlation matrix is, the faster the CCEF decays, but the general shape does not change.

START-model networks
The START model undergoes a more distinct transition from a network with two distinct parts through a connected stage with three regions to one single component (Rehfeld et al., 2014) in response to a single forcing parameter F. To characterize the CCEF response to different network evolution patterns we use two test cases, in which we vary F from its minimum to its maximum. In the first example, the forcing parameter is varied linearly in time. The CCEF response is a slow decline from its maximum CCEF(0) = 1 to a minimum value CCEF(99) ≈ 0.4, as shown in Fig. 3. In the second test the forcing parameter was varied periodically as a function of time, F = sin(2πt/P), with P = 10. In response to the sinusoidal forcing, periodic behavior is also observable in the CCEF, with the same period as the forcing parameter.

Application to the Asian Monsoon domain
Finally, we used the CCEF to investigate the evolution of climate networks from observations. The networks were constructed using a link density of 5% on an annual basis for 41 years, 1970-2011 C.E. The obtained CCEF in Fig. 4 is reminiscent of that of the Erdős-Rényi networks in Sec. 2.1, with an initial quick decline followed by a plateau. However, while in the case of Erdős-Rényi networks (Fig. 1) the baseline is equal to the set link density, it is significantly higher than the link density here. We thus conclude that a high degree of persistence and a low amount of spatio-…
The spatial domain selected for this study is the host of very distinct seasonal dynamics during the Asian Monsoon seasons, see e.g. Wang (2006). At inter-annual timescales, however, teleconnections such as that to the ENSO phenomenon play a significant role (Turner and Annamalai, 2012; Clarke, 2008). In order to identify reasons for the variability of climate networks in this region we compared the variation of the common component functions $\mathrm{CCF}(N_i, N_j)$ for j ∈ [1970, 2011], where i is kept fixed. This way, we obtained a common-links recurrence diagram, illustrated in Fig. 5a, with maximum values on the diagonal. Each pair (i, j) for i, j ∈ [1970, 2011] corresponds to the value of the common component function $\mathrm{CCF}(N_i, N_j)$, as in Eq. 3. Fig. 5a shows rows and columns with distinctly lower values for the CCF. In these lines the overall sum, $S_i = \sum_j \mathrm{CCF}(N_i, N_j)$, takes smaller values in the years i ∈ {1971-1973, 1975, 1984, 1989, 1993, 1999}. We compared this sequence with a list of strong El Niño phenomena according to the Niño 3.4 index (Trenberth, 1997), and observed that 1972, 1982, 1988, 1992 and 1997 were the strongest ENSO event years in this time period. At the same time, the correlation between the CCF functions of these years and all others, given in Fig. 5b, also takes on very low values.
Around stronger El Niño years the surface temperature networks have fewer common links, and the correlation of their CCF with all others is considerably lower. ENSO events occur during the northern hemisphere winter season, and thus the main effect of the link breakdown in our networks occurs in the year after the event started.
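The common-links recurrence diagram and its row sums can be sketched as follows. The five small networks stand in for annual climate networks and are purely hypothetical, with the third one deliberately disturbed. The normalization by the maximal link number follows the Methods, and the function names are illustrative.

```python
def ccf_matrix(networks):
    """Matrix of normalized common component functions CCF(N_i, N_j)."""
    max_links = max(len(net) for net in networks)
    return [[len(a & b) / max_links for b in networks] for a in networks]


def row_sums(matrix):
    """S_i: sum of CCF(N_i, N_j) over all j."""
    return [sum(row) for row in matrix]


# Hypothetical ordered set of five networks on five nodes.
networks = [
    {(0, 1), (1, 2), (2, 3), (3, 4)},
    {(0, 1), (1, 2), (2, 3), (0, 4)},
    {(0, 2), (1, 3), (2, 4), (0, 4)},  # disturbed network
    {(0, 1), (1, 2), (2, 3), (3, 4)},
    {(0, 1), (1, 2), (3, 4), (0, 4)},
]
matrix = ccf_matrix(networks)
sums = row_sums(matrix)
lowest = sums.index(min(sums))
print(sums)    # [3.5, 3.5, 1.5, 3.5, 3.5]
print(lowest)  # 2, the index of the disturbed network
```

A row with a low sum marks a network that shares few links with all others, which is how the years listed above stand out in Fig. 5a.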

Discussion
The CCEF enables us to investigate the evolution of linearly ordered, or evolving, network sets quantitatively. We tested its response to three different types of model networks and find that their response enables us to characterize their evolution. Unlike for random networks, the flow network CCEF level is, within limits, not related to the threshold value but displays a deterministic decrease of network similarity: two flow-networks separated by a larger index lag have fewer links in the intersection, hence the common component function $\mathrm{CCF}(N_i, N_{i+\delta})$ decreases with growing δ, and the intersection of two flow-networks decreases with the difference in the width parameter. Furthermore we find that the links in a network set with higher threshold are more persistent.
The START model examples, on the other hand, illustrate the distinct difference between slow, linear changes of the processes generating the networks over time and periodic, rapid transitions. While in the first case the CCEF decreases slowly, and considerably only for large time differences, in the case of periodic and rapid transitions the CCEF response is also periodic over the time lag. In this case it is particularly important that the time window a single network corresponds to is sufficiently small compared to the ongoing evolution, both to avoid aliasing effects, which would occur if the window width were a multiple of the forcing period, and to be able to detect the changes at all.

The year-long daily temperature anomaly networks of the Asian Monsoon domain show a high degree of spatio-temporal persistence. This is consistent with the results of Berezin et al. (2012), who found similarly high values over large regions of the South Atlantic and the Equatorial Pacific. While the general shape of the CCEF in Fig. 4 agrees with that for the Erdős-Rényi random networks (Fig. 1), the CCEF fluctuates around a much higher level compared to the link density. This points towards a highly non-random, deterministic general structure in the network on which the inter-annual variability is imprinted. Links in this network are comparatively stable but lose some of their stability when the external disturbance of an El Niño event is added. This agrees well with the findings of Gozolchiani et al. (2008) and Tsonis and Swanson (2008), who showed that, for global networks, fluctuations, or "blinking", of links could be related to the global signature of ENSO variability. Therefore, despite the large persistence in the monsoon network, the monsoon-ENSO teleconnection is also visible in the common link recurrence (Fig. 5b).
To check whether the main changes in the climate networks investigated here occurred due to changes in the degrees of "supernodes" (nodes with higher degree), we plotted the variability of the degree for each node in Fig. 6a and find that, indeed, degree variability is high (low) where node degree is high (low). Computing the correlation between time series of node degrees, we obtain a network of degree variability. Fig. 6b shows the degree of the resulting graph. Where the degree variability is low, over mainland India, we find high degrees in the degree network, which means that links in this region are mostly persistent. Where the degree variability is high, over the adjacent Indian Ocean and the South China Sea, we also observe high degree values in the network of the degrees, suggesting that degree changes here are large, but synchronized. In the northern part of the Asian monsoon domain considered here, spanning from Afghanistan through Pakistan and the Himalayas to China, we find a higher degree variability with less synchronized degree changes. Desynchronization in this region may occur due to the additional effect of the continental Westerlies and the large altitudinal gradients.
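The degree variability used in this check can be sketched as follows, assuming it is measured as the standard deviation of each node's degree over the network set; the text does not state the exact estimator. The three networks are hypothetical and the function name is illustrative.

```python
from statistics import mean, pstdev


def degree_series(networks, n_nodes):
    """Degree of every node in every network, as one time series per node."""
    series = [[0] * len(networks) for _ in range(n_nodes)]
    for t, edges in enumerate(networks):
        for i, j in edges:
            series[i][t] += 1
            series[j][t] += 1
    return series


# Hypothetical ordered set of three networks on four nodes; node 0 is a hub.
networks = [
    {(0, 1), (0, 2), (0, 3)},
    {(0, 1), (0, 2)},
    {(0, 1)},
]
series = degree_series(networks, n_nodes=4)
mean_degree = [mean(s) for s in series]
# Assumption: variability is the population standard deviation over time.
variability = [pstdev(s) for s in series]

print(series)       # [[3, 2, 1], [1, 1, 1], [1, 1, 0], [1, 0, 0]]
print(mean_degree)
print(variability)
```

In this toy set the hub has both the highest mean degree and the highest degree variability, the same qualitative pattern reported for Fig. 6a.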

Conclusions
We have presented a generic approach to characterize the evolution of networks. With model tests we established that it is possible to use it to distinguish random, deterministic and periodic evolution behaviors in a set of networks. The new quantity to measure variability and persistence in networks is suitable for different network types. For example, the network set may be linearly ordered by time or by parameter difference. The method can be extended in a straightforward manner, but it currently requires that the node structure and link density remain constant. Applying the CCEF analysis to data from the Asian Monsoon domain we found that El Niño years are accompanied by a distinct network imprint, leading to small common components with non-ENSO years and high agreement with ENSO years. In the future, the CCEF could be a particularly useful tool in the investigation of change points in network evolution.