Nonlinear Processes in Geophysics Record-breaking earthquake intervals in a global catalogue and an aftershock sequence

For the purposes of this study, an interval is the elapsed time between two earthquakes in a designated region; the minimum magnitude for the earthquakes is prescribed. A record-breaking interval is one that is longer (or shorter) than preceding intervals; a starting time must be specified. We consider global earthquakes with magnitudes greater than 5.5 and show that the record-breaking intervals are well estimated by a Poissonian (random) theory. We also consider the aftershocks of the 2004 Parkfield earthquake and show that the record-breaking intervals are approximated by very different statistics. In both cases, we calculate the number of record-breaking intervals ( nrb) and the record-breaking interval durations 1trb as a function of “natural time”, the number of elapsed events. We also calculate the ratio of record-breaking long intervals to record-breaking short intervals as a function of time, r(t), which is suggested to be sensitive to trends in noisy time series data. Our data indicate a possible precursory signal to large earthquakes that is consistent with accelerated moment release (AMR) theory.


Introduction
A record-breaking event is defined to be one that is larger (or smaller) than all previous events.A typical example is the sequence of record-breaking temperatures (either highest or lowest) on a specified day of the year at a specified monitoring station.The rate at which records are broken is an important characteristic of the sequence; studies involve both the number of record-breaking temperatures and their values.The ratio of the number of record-breaking high temperatures to record-breaking low temperatures has been been interpreted as a measure of global warming (Meehl et al., 2009).
Correspondence to: M. R. Yoder (yoder@physics.ucdavis.edu)Tata (1969) introduced a basic theory of record-breaking statistics for events that occur randomly.Tata's paper addressed record-breaking statistics for a sequence of variables drawn from a continuous, independent identically distributed (iid) process.Glick (1978) published several applications of Tata's method, including a brief study of daily temperatures.Benestad (2004Benestad ( , 2008) ) and Redner and Petersen (2006) further developed meteorological applications in the context of global warming; Vogel et al. (2001) applied the method to flooding in the United States, and Van Aalsburg et al. (2010) applied the method to global earthquake magnitudes.
Time series, such as maximum or minimum temperatures on a specified day of the year, are not truly random (iid) sequences.Important deviations include temporal correlations and temporal trends.Temporal correlations, in many naturally occuring time series, exhibit long-range correlations and self affinity (Turcotte, 1997).A standard measure of these correlations is the power-law dependence of the power spectral density S on frequency f S ∼ f −β (1) If β = 0, the time series is a white noise, comprised of a random (iid) sequence of values.In the range 0 < β < 1 the correlations are weak and the time series is weakly stationary.For the daily time series of temperatures, we typically observe β ≈ 0.5, a Hurst exponent, H u ≡ (β+1) 2 = 0.75 (Pelletier and Turcotte, 1999).Simulations show that for this value of β the iid theory of record-breaking statistics is a good approximation for the weakly correlated time series.
A second deviation of a time series from a random (iid) sequence involves a trend in the expected values.Simulations show that record-breaking statistics are sensitive to such trends.A specific example is the association of recordbreaking temperature statistics with global warming.Benestad (2004Benestad ( , 2008) ) studied monthly maximum temperatures on a global basis.The number of record-breaking temperatures were determined both with time running forward and with time running backwards.Significantly more forward recordbreaking temperatures were found than backward recordbreaking temperatures.This ratio can be quantitatively related to global warming.Redner and Petersen (2006) conducted a similar study, calculating the numbers of record breaking maximum and minimum temperatures in Philadelphia for each day of the year over a 120-year period.They present a framework for record-breaking climatological analysis.
The concept of record-breaking events can also be applied to earthquakes.A catalogue, study area, starting time and minimum magnitude must be specified.Van Aalsburg et al. (2010) considered record-breaking magnitudes of earthquakes in the global Centroid Moment Tensor (CMT) catalogue with moment magnitude M W ≥ 5.5 over fifteen sequential, non-overlapping two-year periods between 1977 and 2006.For their study, a record-breaking earthquake magnitude is greater (or smaller) than the magnitude of any previous earthquake in the study region since the chosen starting time.Van Aalsburg et al. (2010) showed that the mean numbers of record-breaking earthquake magnitudes (n rb ) and mean record-breaking magnitudes (M rb ), determined from the CMT catalogue, agree closely with the iid theory.
The primary purpose of this paper is to determine whether record-breaking statistics can distinguish background sequences of main-shocks from correlated aftershocks.Because the frequency-magnitude distributions of main shocks and aftershocks are very similar, if not identical, recordbreaking magnitude statistics cannot be used to separate the two classes of earthquakes.Instead, we will utilize the record-breaking interval statistics.
We construct catalogues by selecting all earthquakes within a region, with magnitudes greater than a specified minimum value.We consider the sequence of interval times between successive earthquakes in our catalogues.Recordbreaking long intervals are the sequence of interval times longer than any previous interval times.Record-breaking short intervals are the sequence of interval times shorter than any previous interval times.The interval between the first and second earthquake is, by definition, the first recordbreaking long interval.The next interval, longer than this interval, is the second record-breaking long interval, and so on.Similarly, the first interval, between the first and second earthquakes, is by definition also the first record-breaking short interval.The next interval shorter than this interval is the second record-breaking short interval, and so on.Intervals can be taken either forward or backward in time.We first consider global earthquakes with magnitudes greater than 5.5 and show that the record-breaking intervals are well estimated by a Poissonian (random) theory.In this case, the number of record-breaking long intervals (n rb-long ) are statistically identical to the number of record-breaking short intervals (n rb-short ).
We also consider the aftershocks of the 2004 Parkfield earthquake and show that the record-breaking intervals are characterised by very different statistics.Because of the applicability of Omori's law, the interval times in an aftershock sequence become systematically longer.Thus, after the main shock, there are many more record-breaking long intervals than record-breaking short intervals (n rb-long > n rb-short ).
For both the sequence of global earthquakes and the sequence of aftershocks, we first determine the number of record-breaking intervals n rb , both long and short (n rb-long and n rb-short , respectively) as a function of "natural time" n, the number of elapsed intervals.Second, we determine the record-breaking interval durations, t rb , both long and short ( t rb-long and t rb-short ) as a function of "natural time" n.Third, we calculate the ratio of the number of recordbreaking long intervals to the number of record-breaking short intervals as a function of time r(t), which is suggested to be sensitive to trends in noisy time series data (Benestad, 2004(Benestad, , 2008;;Redner and Petersen, 2006).We will also show that our data indicate a possible precursory signal to large earthquakes that is consistent with the accelerated moment release (AMR) theory.

Record-breaking intervals in the CMT global catalog
We first calculate the record-breaking statistics of earthquake intervals for global earthquakes during the period January 1977-December 2006.We consider earthquakes with M W ≥5.5 from the CMT catalogue over windows of n=1024 intervals.Initially, we consider the 1024 intervals between the first 1025 earthquakes.
Within this subsequence, we calculate the number of record-breaking long and record-breaking short intervals (n rb-long and n rb-short ) separately for each n i = 2 i (i=0,1,2,...,10) elapsed intervals.Similarly, we calculate the record-breaking long and record-breaking short interval durations ( t rb-long and t rb-short ) as a function of n i .We advance the window one event at a time and repeat the above procedure to obtain 10 592 values for n rbi and t rbi (both long and short) for each n i .We then determine the means and standard deviations of n rbi and t rbi , for both the longest and shortest intervals in the sequences.The mean values of n rb as a function of the number of elapsed events (natural time) n are given in Fig. 1. Results are given for both longest record-breaking and shortest record-breaking interval times.The two results are almost identical and n rb ∼ ln(n) appears to be a good approximation.The mean lengths of both longest and shortest recordbreaking intervals t rb and their standard deviations are given as a function n in Fig. 2. Again, t rb ∼ ln(n) appears to be a good approximation.
Next, we generate a synthetic catalogue of random event intervals.To do this, we utilize the cumulative distribution of the interval times in a homogeneous Poisson process (HPP) given by We consider a sequence of intervals with the same number of events as the CMT catalogue that we used.For each interval, we determine F ( t) as a random number in the range 0 to 1 and solve Eq. ( 2) for the corresponding random interval t.The mean record-breaking numbers, n rb , and time intervals, t rb , are calculated from the synthetic catalogue as described above and included in Figs. 1 and 2. The CMT catalogue results are in good agreement with the random simulations.
3 Record-breaking events of an independent, identically distributed (iid) process We will now show that the numbers of record-breaking intervals n rb (n) , as a function of the number of intervals n, as given in Fig. 1, is well approximated by the statistical analysis of a random process.The basic theory for random record-breaking events was developed by Tata (1969) and was clearly explained by Glick (1978).Their results are valid for any random process that has a continuous distribution of values; the results are independent of the particular distribution of values.We will apply this iid analysis both to the distribution of maximum intervals and to the distribution of minimum intervals.We consider a sequence of random values, x i (i = 1,2,...,n), selected from a continuous distribution.The first element is always a record-breaking event.The second variable is larger or smaller than the first, with equal probability: Because the sequence is random, the probability that any one element x i occupies the j -th (j ∈i) position is 1/n.For n = 2, the probability that the larger element terminates the sequence is 1/2.For n = 3, the probability that the final element is the largest (or smallest) value is 1/3.Accordingly, the expected number of record-breaking events n rb in an iid sequence is: For large n, we have approximately where γ = 0.577215 is the Euler-Mascheroni constant.For example, with n = 100 we have n rb (100) = 5.187 from Eq. ( 4) or n rb (100) = 5.183 from Eq. ( 5).Even for n = 4, we have n rb (4) = 2.08 from Eq. ( 4) and n rb (4) = 1.96 from Eq. ( 5).It must be emphasized that the values given in Eqs. ( 4) and ( 5) represent the expected mean values for many realizations; Vogel et al. (2001) provide a more thorough analysis of the statistical moments of record-breaking sequences.The values from Eq. ( 4) are also included in Fig. 1 and are seen to be in good agreement with the global earthquake values and the random simulations.Next, we consider the statistics of record-breaking maximum intervals in an aftershock sequence.As a specific example, we consider the 2004 Parkfield, CA earthquake.We expect record-breaking behaviour in an aftershock sequence to deviate substantially from the stationary Poisson process.Specifically, we expect intervals to increase in time, when time is measured forward, according to the modified form of Omori's law (Shcherbakov et al., 2006) where n is the number of aftershocks as a function of time t after the main shock and τ (m c ) and c(m c ) are characteristic times obtained empirically from the data and depend on the minimum magnitude, m c , considered.We use the same definition of Parkfield aftershocks given by Shcherbakov et al. (2006).These are illustrated in Fig. 3; for this study, we obtained interval data from the Advanced National Seismic System (ANSS) catalogue.We start counting events from 0.01 day after the main-shock, to mitigate the effects of early aftershocks being masked by the coda of the main-shock, and measure time forward through 6 April 2009.We consider the record-breaking maximum intervals from a single pass through the entire aftershock sequence for minimum aftershock magnitude thresholds of m c = 1.5,2.0,and 2.5.The numbers of record-breaking maximum intervals n rb and the lengths of the record-breaking maximum intervals t rb are given as functions of the number of aftershocks n (natural time) in Figs. 4 and 5, respectively.To a first approximation, n rb ∼ n and t rb ∼ exp(αn) (where α is a fitting constant), in strong contrast to n rb ∼ ln(n) and t rb ∼ ln(n) shown for global earthquakes in Figs. 1 and 2.

Non-Homogeneous Poisson Processes (NHPP)
The probability of some interval duration t = t i+1 − t i is equal to the probability that zero events occurred between times t i+1 and t i .For a Poisson process, the cumulative probability distribution function (CDF) can be expressed as (Ross, 2003;Shcherbakov et al., 2005;Yakovlev et al.): where λ(v) is a rate.For the special case where λ(v) = λ 0 , a constant, we recover the homogeneous Poisson process (HPP), Eq. ( 4), where λ 0 = 1/ t .When λ is not constant, we say the Poisson process is non-homogeneous.Substituting Omori's Law, Eq. ( 6), for λ(v) into Eq.( 7) and assuming for simplicity p = 1, we integrate with the result Solving for t and replacing F ( t,t) with u, a random number in the range 0 to 1, we generate a time series from the relation: Taking the values τ (m c ) and c(m c ) for m c = 1.5, 2.0, and 2.5 from Shcherbakov et al. (2006), we produce a NHPP series of interval times for the period considered above (Shcherbakov et al., 2005(Shcherbakov et al., , 2006)).For each m c , we find the mean and standard deviation of n rb (n) and t rb (n) over 1000 simulations and compare the results with the Parkfield data in Figs. 4 and  5.In Fig. 4, the general dependence of n rb on n is approximately the power law with n rb (n) ∼ n 0.5±0.1 for the number of longest record-breaking intervals as a function of the number of intervals.The deviation between the observed data and the NHPP simulations can be attributed, in part, to aftershocks of aftershocks which we have not included in the application of Omori's law, Eq. ( 6).Clearly, a strong, late aftershock can introduce a sequence of short intervals which will delay the occurrence of the next record-breaking long interval.In Fig. 5 the general dependence of t rb on n is approximately exponential, t rb (n) ∼ exp(αn), where α is a fitting constant for the length of longest record-breaking intervals as a function of the number of elapsed intervals.

Record-breaking temperatures at the Mauna Loa Observatory (MLO), Hawaii
As an example of relating record-breaking events to trends, we will consider the statistics of the record-breaking maximum high and minimum low temperatures observed at the NOAA MLO, Big Island, Hawaii for the period 1977-2006.This observatory, remotely located and situated at an altitude of 3397 m a.s.l., provides a wide range of atmospheric data relatively unperturbed by local tropospheric, biospheric and anthropogenic activities (NOAA, 2008).Keeling et al. (1976) showed, from observations at MLO, that atmospheric levels of CO 2 are systematically increasing.
In this context, a new record-breaking high temperature occurs when the maximum temperature for a given day is higher than all subsequent maximum temperatures on that calendar day.Similarly, a new record-breaking low temperature occurs when a day's minimum temperature is lower than all preceding minimum temperatures.For each calendar day of the year, excluding leap years, we calculated the number of record-breaking high and record-breaking low temperatures since the same calendar day in 1976, our chosen starting date.We then averaged over the 365 days of each year to produce a mean number of record-breaking events as a function of time in one year increments: n rb,day (year) (10) In Fig. 6, we show the mean numbers of recordbreaking maximum high temperatures n rb,max (t) and record-breaking minimum low temperatures n rb,min (t) as a function of time measured forward from 1977 to 2006.Also included in Fig. 6 are the predicted values for an iid process from Eq. ( 4).We see that n rb,max (t) is systematically greater than n rb,min (t) .Following Redner and Petersen ( 2006), we introduce the ratio r of the values Values of this ratio as a function of time measured forward are given for the MLO data in Fig. 7.We find a near constant value in the range 1.13 < r < 1.15 for the period 1977-2006, indicating systematic global warming.

Ratios of record-breaking earthquake intervals
We now extend the concept of a ratio of record-breaking temperatures introduced in Eq. ( 11) to sequences of earthquake interval times.We introduce the ratio r(n,t) of record-breaking longer intervals to record-breaking shorter intervals where n is the number of events in the sequence, t is the time at which the sequence terminates and, unless otherwise  specified, we assume that time is measured forward.For our global study, illustrated in Fig. 1, we see that n rb-long (n,t) is statistically identical to n rb-short (n,t) for background seismicity, so r(n,t) is expected to fluctuate around 1.
For our Parkfield study, illustrated in Fig. 4, we see that n rb-long (n,t) is systematically greater than n rb-short (n,t), so r(n,t) is expected to be predominantly greater than 1.It may be possible to use this difference to separate aftershocks from background seismicity.
To explore this possibility, we consider earthquakes with minimum magnitude m c 1.5 and use a moving window of 256 intervals.We pick an event, an earthquake that occurred at t = t 0 , and consider the 256 intervals between the preceding 256 earthquakes.Starting with the first of these earthquakes, we obtain the number of record-breaking intervals Fig. 8.The ratios of record-breaking longest intervals to recordbreaking shorter intervals, r(t), in the Parkfield aftershock sequence (illustrated in Fig. 3) with minimum magnitude m c 1.5 are shown as a function of date.Large and small record breaking intervals are counted forward starting from the 256-th interval preceding the interval ending at the time t.Each value, r(n = 256,t i ) is smoothed by averaging over the preceding 16 values.Blue regions, where r(n,t) > 1, imply aftershocks, or at least some form of decreasing mean seismicity; red regions, where r(n,t) < 1, suggest an increasing rate of seismicity.n rb-long (256,t 0 ) and n rb-short (256,t 0 ).We obtain the value of r(256,t 0 ), from Eq. 12, and assign it to time t 0 .We repeat this process with subsequent earthquakes, t > t 0 , to obtain a time series, r(256,t 0 ).We first consider the area of Parkfield aftershocks defined in Fig. 3.The values of r as a function of time are given in Fig. 8.We see that, after the Parkfield earthquake, the values of r are predominantly greater than unity, as expected.
We next consider earthquakes for the same period of time and the same minimum magnitude in a 4 • ×4 • region centered on the Parkfield epicenter.The values of r as a function of time are given in Fig. 9; a random-like behaviour is clearly illustrated.Using the broader catalogue, the Parkfield aftershock is obscured, presumably by aftershocks from uncorrelated earthquakes.This is consistent with our expectations from Figs. 1 and 4, which suggest that aftershock sequences produce more large than small interval records compared to a broader, background catalogue.
The variability and short periods of r < 1 after the mainshock, in Fig. 6, presumably indicate aftershocks of events in the primary aftershock sequence (aftershocks of aftershocks).Of particular interest is the period preceding the Parkfield main-shock where consistently r < 1.Early results indicate that r < 1 indicates some sort of precursory seismic acceleration; further study will consider whether this is a systematic phenomenon that can be quantified.It may be possible to tune this method to resolve events of different magnitudes by varying the number of events in the recordbreaking window or the spatial geometry of the area being considered.

Conclusions
We have studied the record-breaking statistics of two very different earthquake catalogues -a global dataset taken from the CMT catalogue and an isolated aftershock sequence, taken from the ANSS catalogue for the 2004 Parkfield earthquake.In each case, we find that the number of recordbreaking intervals is consistent with an established theory and with simulations.Specifically, we find that the global catalogue produces record-breaking behaviour that is well estimated by a homogeneous Poisson process (HPP), where n rb (n) ∼ ln(n).For an isolated aftershock sequence, in which intervals become systematically longer with time, we find that n rb (n) ∼ n 0.5±0.1 , which is in reasonable agreement with a non-homogeneous Poisson process (NHPP).Our results are consistent with the sensitivity of record-breaking statistics to trends, as discussed in the introduction.For the examples given above, we have shown a strong distinction between the record-breaking statistics of a large global earthquake catalogue and a well-defined aftershock sequence.
This method shows promise as a simple, computationally efficient test for trends in time series data.In particular, we suggest that we can characterise a given catalogue of earthquakes as being dominated by background, aftershock, or possibly AMR seismicity.To separate correlated aftershock sequences from a broader catalogue, one requires a method to systematically select likely sub-catalogues, which can then be tested by the methods described in this paper.In the case of our Parkfield example, knowledge of the location and geometry of the rupture was important for our anlysis.To first order, aftershock sequences of many past events can be isolated visually, by simply selecting spatially clustered earthquakes in the vicinity of the epicentre and excluding seismicity clustered around neighbouring large events; elliptical regions appear to be a reasonable first estimate.Preliminary record-breaking interval studies of the 1999 (M=7.1)Hector Mine event suggest that this simple approach is sufficient to produce convincing signal to noise in Eq. ( 12).Numerical two point correlation methods might also be employed, and of course more complex geometries, for example, ruptures on multiple faults, present greater challenges.It may be possible to develop a method by which one starts with a simple geometry and systematically adds and removes events in order to optimize a correlation metric and converge upon a complete aftershock catalogue.
Forecasting applications require more sophisticated, efficient methods to locate possible rupture epicentres and geometries.Clearly, our successful retrospective forecast of the 2004 Parkfield earthquake, as one might interpret Fig. 8 to suggest, benefits from advance knowledge of the rupture epicenter and geometry.Anticipating this geometry, however, is non-trivial and has so far proven an unresolved obstacle to AMR based forecasts.Calculating all rupture geometries at all locations of a large map is computationally impractical, and fault models are incomplete and unreliable indicators of the locations of future epicentres.It might be possible to use seismic rate-based hazard maps, for example, Relative Intensity (RI) or Pattern Infomatics (PI) (Holliday et al., 2006), to reduce the problem to one that is computationally feasible.Epicentres can be too constrained to "hot-spot" regions likely to experience seismicity; the size of the rupture region, where aftershocks and AMR are clustered, can be estimated as a function of mainshock magnitude.Presumably, the recordbreaking sequence n is also related to the magnitude of the event being forecast.Likely rupture geometries can be inferred by convolving along clusters or contours of the hazard map.Again, by adjusting the geometry of the region being analysed to optimize some metric, for example, Eq. ( 12), it may be possible to converge upon the precise parameters of likely rupture area geometries in advance of mainshocks.In both the forecasting and retrospective cases, the simplicity and computational efficiency of record-breaking methods may contribute to the computational feasibility of comprehensive solutions.

Fig. 1 .
Fig.1.Mean numbers of record-breaking intervals n rb and their standard deviations are given as a function of event number n (natural time).Results are given for the longest and shortest intervals from the CMT catalogue and the synthetic random catalogue.Also included is the theoretical prediction for an iid process from Eq. (4).

Fig. 2 .
Fig. 2. Mean lengths of record-breaking intervals t rb and their standard deviations are given as a function of event number n (natural time).Results are given for the longest and shortest intervals from the CMT catalogue and the synthetic random catalogue.

Fig. 4 .
Fig. 4. Numbers of record-breaking longest intervals, n rb , are given as a function of event number n (natural time).Results are given for the Parkfield aftershock sequence, starting 0.01 days after the main-shock, compared to a simulated non-homogeneous Poisson process (NHPP).The blue, green and red points represent data from the Parkfield aftershock sequence for m c =1.5,2.0,2.5, respectively.The solid lines and error bars of the same colour represent the mean and standard deviation, over 1000 simulations, from the corresponding NHPP.

Fig. 5 .
Fig. 5. Lengths of record-breaking longest intervals, t rb , are given as a function of event number n (natural time).Results are given for the Parkfield aftershock sequence, starting 0.01 days after the mainshock, compared to a simulated non-homogeneous Poisson process (NHPP).The blue, green and red points represent data from the Parkfield aftershock sequence for m c = 1.5,2.0,2.5, respectively; these data points are shifted 50 events to the right to further mitigate seismographic anomalies in the coda.The solid lines and error bars of the same colour represent the mean and standard deviation, over 1000 simulations, from the corresponding NHPP.

Fig. 6 .
Fig. 6.Mean number of record-breaking maximum temperatures n rb-max and record-breaking minimum temperatures n rb-min from MLO are given as a function of time measured forward from 1977 to 2006.Also included is the theoretical prediction for an iid process from Eq. (4).

Fig. 7 .
Fig. 7. Values of the ratio r(t), defined in Eq. (11), from MLO are given as a function of time measured forward from 1977 to 2006.