Earthquake source parameters that display the first digit phenomenon

We study the main parameters of earthquakes from the perspective of the first digit phenomenon: the nonuniform probability of the first nonzero digit, in which lower digits occur more often than higher ones. We found that source parameters such as the coseismic slip distribution on the fault and the coseismic inland displacements show the first digit anomaly. We also found that the tsunami runups measured after the earthquake display the phenomenon. Other parameters found to obey the first digit anomaly are related to the aftershocks: we show that seismic moment release and seismic waiting times also display the anomaly. We explain this finding by invoking a self-organized criticality framework. We demonstrate that critically organized automata show the first digit signature, and we interpret this as a possible explanation of the behavior of the studied parameters of the Tohoku earthquake.


Introduction
With the advent of modern seismological and geodetical instrumentation, the study of the earthquake process has experienced great advances, much of it punctuated by the occurrence of giant earthquakes. Since 2004, three of these spectacular events have occurred, each producing large tsunamis, followed by human and material losses. These are the 2004 Northern Sumatra (Lay et al., 2005), the 2010 Central Chile (Vigny et al., 2011) and the 2011 Tohoku, Japan (Simons et al., 2011), earthquakes, each of them representing an opportunity to advance in the comprehension of geophysical phenomena. Three key elements in the understanding of these events are (a) the source process, a highly nonlinear and heterogeneous phenomenon regarding the initiation, growth and stopping of the earthquake itself, (b) the postseismic relaxation effects, which comprise the later perturbations at the crust and fault itself after the stop of the slip phase, and (c) the water wave produced by the sudden uplift of the ocean floor, its propagation through the ocean and the, often destructive, arrival inland.
Of the aforementioned events, the Tohoku earthquake is by far the best recorded, at the local and global level. The Japanese and worldwide effort, led by universities and public institutions, gathered a great bulk of information regarding this event, most of it public. We use data and models of this event to assess and establish a regularity of the source and later events known as the first digit anomaly (Benford, 1938).
The phenomenon consists in the nonuniform statistical distribution of the first nonzero digit in a (usually large) population of data d = {d_1, ..., d_n} coming from natural systems. The law states that the probability of finding 1 as the first nonzero digit in d is higher than the probability of finding 2, and so on, according to the formula P_D = log10(1 + 1/D) for D = 1, ..., 9. The formula was first proposed in the 19th century (Newcomb, 1881), prompted by the observation that the first pages of logarithm tables were more worn than the last ones.
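As a minimal sketch of the law in its usual form, P_D = log10(1 + 1/D), the following Python fragment tabulates the expected probabilities and extracts the first nonzero digit of a number. The function names are illustrative assumptions and do not come from the original study.

```python
import math
from collections import Counter


def benford_probabilities():
    """Return {D: log10(1 + 1/D)} for the digits D = 1..9."""
    return {d: math.log10(1.0 + 1.0 / d) for d in range(1, 10)}


def first_digit(x):
    """First nonzero digit of a nonzero real number.

    Scientific notation places the leading significant digit first,
    so the digit is read from the formatted string (assumed helper).
    """
    return int(f"{abs(x):.15e}"[0])


def first_digit_frequencies(data):
    """Relative frequency of each first digit among the nonzero values."""
    digits = [first_digit(x) for x in data if x != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    for d, p in benford_probabilities().items():
        print(f"P_{d} = {p:.3f}")
```

The nine probabilities sum to one because the sum of log10((D + 1)/D) telescopes to log10(10).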
The first digit phenomenon has received considerable attention (Hill, 1998) and is known to be found in diverse areas like physics (Berger et al., 2005; Tolle et al., 2000), mathematics (Cohen and Katz, 1984), computation (Knuth, 1981), and the economy (Nigrini, 1996), and recently it has found some application in geophysics (Sambridge et al., 2010; Nigrini and Miller, 2007; Geyer and Martí, 2012). From a statistical point of view, theorems providing conditions for a first digit anomaly to occur include scale invariance (Hill, 1995a) and random sampling (Hill, 1995b) properties, and data sets whose range spans several orders of magnitude are known to display the anomaly too (Fewster, 2009). It has been known since the beginning of the 20th century that processes governed by geometric laws are equidistributed over the circle as long as an irrational base is considered, a property known as ergodicity of the geometric maps (Arnold, 2014), which itself implies the first digit anomaly. We interpret these facts as conditions a dynamical system must meet, and we look for a general mechanism accomplishing them. We note that there is no simple explanation covering all aspects at play (see the Berger and Hill (2011) theorems). An in-depth review of these properties is outside the scope of this work; for more information, we recommend the paper of Sambridge et al. (2010) and the book of Berger and Hill (2015).
It has been argued by Tarantola (2006) that the effect appears related to the so-called Jeffreys pairs: physical variables endowed with the property of being as meaningful as their inverses, for instance period and frequency, conductivity and resistivity (hydraulic, electric or thermal), or compliance and stiffness. We must note that Jeffreys pairs usually display large dynamical ranges, a fact evident from the regular use of logarithmic scales when working with these variables. From now on, we will work under the hypothesis that an underlying physical process connects these elements (random sampling, scale invariance, broad dynamical range) and could explain the ubiquitous presence of the first digit phenomenon.
Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.

The Tohoku earthquake from the point of view of the first digit
As we pointed out, the first digit phenomenon has recently found some applications in geophysics. Sambridge et al. (2010) demonstrated that the earthquake signal in a time series displays Benford's anomaly while the noise before does not. This effect was proposed as a seismic event trigger, because the noise was shown to display a Gaussian behavior very different from the first digit phenomenon. The practical consequences of this property are far reaching, because seismic localization algorithms and early warning methods depend on the ability to detect with precision the arrival time of seismic signals (Allen and Kanamori, 2003; Lancieri et al., 2011; Lomax et al., 2012). Our insight is that this property of the earthquake arrival is directly related to the seismic source and related processes. To go further into this track, we revisited some published data from the Tohoku earthquake in terms of the first digit distribution.
Leaving aside seismograms, we studied physical parameters closely related to the coseismic and postseismic processes. We chose to review data from the Tohoku earthquake mainly because of data quality. The data used come from direct measurements of earthquake effects and from indirect estimations as well. Care was taken in regard to the statistical significance of the samples selected, and we left out interesting data sets with few samples.
In Table 1 we present first digit statistics of various parameters closely related to the seismic source process in the case of the Tohoku earthquake. For each parameter, a χ² goodness of fit test with 9 degrees of freedom is presented. Each test is also accompanied by its p value, assessed at the 5 % significance level; values of p close to one indicate statistical agreement with the hypothesis that the empirical data follow the first digit anomaly.
First, we show the finite fault model, as regularly published by the US Geological Survey (Hayes, 2011a). This set is shown in Fig. 1; the data correspond to an inversion of P wave, SH wave and long-period surface waves of the source from globally located stations (Ji et al., 2002; Hayes, 2011b). We collected the 240 slips (Δu) and seismic moments (M_0) that give form to the finite fault model of the earthquake. We found the slips to follow the first digit anomaly. Seismic moments present the anomaly as well, and this is expected because seismic moment is an affine scaling of the slips at the fault. The high dynamical range of the data can be clearly seen in Fig. 1, and, as mentioned above, this is one of the known characteristics of parameters showing the first digit anomaly.
Second, we used a GPS inversion of the coseismic inland deformation. This inversion uses data from the GPS Earth Observation Network (GEONET, Ozawa et al., 2011) and represents an ensemble of 357 inverted points. Shown in Fig. 2 are the total displacement magnitudes, which describe the effect of the slip distribution on the fault and the effects over the Earth's surface from geodetic data; observe the high dynamical range of the displacements. The absolute value of the deformation |u_c| shows a clear first digit anomaly, as shown in Table 1.
Third, from the same data set, we studied the first digit distribution of the postseismic relaxation process |u_p| proposed by the authors. The data shown in Fig. 3 present the broad dynamical range expected of data that agree with the Benford probabilities.
Fourth, Sambridge et al. (2011) showed that the waiting times between earthquakes presented a first digit anomaly. Also in Table 1, we show selected events of the aftershock series of the Tohoku earthquake as recorded by the Global Centroid Moment Tensor project (GCMT, Ekström et al., 2012). We collected data from 11 March 2011 until 31 January 2012, considering a restricted geographic location of the earthquake, to avoid sophisticated filtering of events. The aftershock series is composed of 172 events, located between 12 and 80 km depth and ranging from moment magnitude 4.9 to 9.1; a representation of the aftershock series can be seen in Fig. 4. The colors of the circles clearly show the high dynamical range reached by the waiting times. From this set, the first digit distribution of the seismic moment released M_0(t) is remarkable, and the waiting times τ between aftershocks were found to obey a first digit anomaly that is only weakly statistically significant at the 5 % level.
Fifth, regarding the tsunami phenomenon, we analyzed runup (r) data measured by Mori et al. (2011). This data set comprises 5260 points, each of them representing the maximum height inland reached by the water wave generated by the dislocation in the ocean floor. The data are mapped in Fig. 5, where the color scale shows the different scales present in the tsunami data. This data set also presents a first digit anomaly.
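The goodness of fit comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the samples are synthetic stand-ins rather than the Tohoku data, the helper names are hypothetical, and only the Pearson χ² statistic is computed and compared with tabulated 5 % critical values (p values are not evaluated). Nine digit classes give 8 degrees of freedom under the textbook convention, whereas the text quotes 9, so both critical values are listed.

```python
import math

# Tabulated upper 5 % critical values of the chi-square distribution,
# keyed by degrees of freedom (8: textbook convention, 9: as quoted in the text).
CRITICAL_5PCT = {8: 15.507, 9: 16.919}


def first_digit(x):
    """First nonzero digit of a nonzero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def benford_chi2(data):
    """Pearson chi-square statistic of the observed first digits against Benford's law."""
    digits = [first_digit(x) for x in data if x != 0]
    n = len(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1.0 + 1.0 / d)
        observed = digits.count(d)
        chi2 += (observed - expected) ** 2 / expected
    return chi2


# Synthetic stand-ins, NOT the Tohoku data.
geometric = [10 ** ((i + 0.5) / 40) for i in range(240)]  # log-uniform over six decades
uniform_digits = list(range(1, 10)) * 30                   # every first digit equally frequent

for name, sample in (("geometric", geometric), ("uniform digits", uniform_digits)):
    stat = benford_chi2(sample)
    verdict = "is consistent with" if stat < CRITICAL_5PCT[8] else "rejects"
    print(f"{name}: chi2 = {stat:.2f}, {verdict} Benford's law at the 5 % level")
```

The geometric sample spans a broad dynamical range and stays far below the critical value, while the sample with equally frequent digits exceeds it by a wide margin.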
As a summary, parameters closely related to the source process display Benford's effect, and those parameters include slip and moment distribution on the fault inverted from seismic data, surface deformation inverted from geodetic data, tsunami heights (possibly related to the source itself) surveyed directly and the GCMT aftershock series' moment release and waiting times.

A possible explanation of the ubiquity of Benford's law
As has been shown, the first digit anomaly appears in various variables regarding the process of seismic rupture. The earthquake, now viewed not just as the slip phase, contains this signature, and it seems natural to search for a single mechanism that could explain the anomaly. Indeed, a model capable of accounting for global features of earthquakes has already been proposed, and it is known as self-organized criticality, SOC (Bak and Tang, 1989; Ito and Matsuzaki, 1990; Sornette and Sornette, 1989). We will not try to demonstrate that SOC is the mechanism behind earthquakes, as there is considerable debate about the relation between SOC and earthquakes (Ramos, 2010), but we will show that the paradigm of SOC, the two-dimensional sand pile cellular automaton (Bak et al., 1988), shows a remarkable first digit anomaly.
A SOC state is a special equilibrium reached by extended systems that are governed by nonlinear rules, generally under dissipative conditions. This regime is characterized by power laws and fractal geometries. The existence of various laws of this type in seismology (Gutenberg-Richter, Omori, Båth and, lately, the decay of aftershock density with distance; Felzer and Brodsky, 2006) is the strongest evidence of some critical mechanism at work, although the exact conditions are still unknown. For a recent view of current research, see Pruessner (2012), and for a thorough exposition of the subject, see Christensen and Moloney (2005) and Jensen (1998).
We tested two cellular automata known to present very different behaviors: the one-dimensional sand pile (Bak et al., 1988) and the two-dimensional Bak-Tang-Wiesenfeld (BTW) automaton (Bak et al., 1987, 1988). The one-dimensional pile consists of an array of L integers subjected to random forcing. When a threshold is reached, the forced cell yields, transferring its burden to the next neighbor. Those rules are applied asynchronously for a period of time T until meaningful statistics reveal the special equilibrium reached. This automaton does not present the properties of SOC, since the correlation between cells is weak. Therefore the pile's global energy distribution (the number of consecutive transfers, or avalanche size) presents exponential decay. On the other hand, the BTW automaton is formed by a two-dimensional grid of L × L points. Again the cells are subjected to random forcing, a threshold is set, and when a cell yields, it transfers its burden to its four neighbors. In both automata the borders of the grid are the dissipative points (1/4 of the burden is lost in the two-dimensional case). After the rules are applied asynchronously, the BTW automaton reaches a state of dynamic equilibrium characterized by avalanches of all sizes. These simple rules give rise to a state that is highly correlated in time and space. The global energy distribution of the automaton is a self-similar power law.
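The two-dimensional rules described above can be sketched in Python as follows. This is a minimal illustration, not the code used for Table 3: the threshold of 4, the transfer of one unit to each of the four neighbors, the lattice size, the number of added grains and the function name are all assumptions made for the example, and the initial transient is not discarded.

```python
import random


def btw_avalanches(L=20, grains=10000, seed=0):
    """Drive an L x L BTW-style sand pile with random forcing.

    Returns the avalanche sizes (number of topplings triggered by each
    added grain, zero-size events omitted) and the final stable grid.
    """
    threshold = 4  # assumed yielding threshold
    rng = random.Random(seed)
    z = [[0] * L for _ in range(L)]
    sizes = []
    for _ in range(grains):
        i, j = rng.randrange(L), rng.randrange(L)
        z[i][j] += 1                      # random forcing of one cell
        size = 0
        stack = [(i, j)]
        while stack:
            a, b = stack.pop()
            if z[a][b] < threshold:
                continue                  # already relaxed
            z[a][b] -= 4                  # the cell yields
            size += 1
            if z[a][b] >= threshold:
                stack.append((a, b))
            for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                na, nb = a + da, b + db
                # Units pushed across the border are lost (dissipation).
                if 0 <= na < L and 0 <= nb < L:
                    z[na][nb] += 1
                    if z[na][nb] >= threshold:
                        stack.append((na, nb))
        if size > 0:
            sizes.append(size)
    return sizes, z


if __name__ == "__main__":
    sizes, _ = btw_avalanches()
    counts = [0] * 10
    for s in sizes:
        counts[int(str(s)[0])] += 1       # first digit of a positive integer
    for d in range(1, 10):
        print(d, round(counts[d] / len(sizes), 3))
```

The printed frequencies can be compared with the Benford probabilities; how closely they agree depends on the assumed lattice size and run length.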
In Table 2, we show the results of the one-dimensional sand pile. We present statistics of automata of different sizes, ranging from the small 11-point grid automaton to the larger 301-point one. It is shown that the released energy E does not present a first digit anomaly. The explanation is simple: the pile does not reach whole size avalanches. The weak space correlation between cells produces events of size one or two, giving the digits 1 and 2 high frequencies. Regarding waiting times, the one-dimensional automaton shows a first digit anomaly, just as in the case of the Tohoku data. We note that this parameter is not known to present a universal power law behavior, and it has been reported that short and long waiting times display different critical exponents (Davidsen and Goltz, 2004). This relation cannot be studied here, because our control parameter is the dimension of the pile; in other words, we are exploring differences in space rather than differences in time (we stress that space maps into a bounded segment of the real plane and time to the unbounded real line; both sets present very different geometrical features).
In Table 3 we show the BTW sand pile. Again the statistics are shown for automata of different sizes. The range of sizes is wider because of the higher dimension of the automata. The waiting times and the energetics of the automaton show a remarkable Benford effect. It should be noted that the smallest automaton presents a weak correlation effect, like the one-dimensional sand pile. This is related to the finite size of the grid (Bak et al., 1987); the larger the automaton, the better the first digit anomaly.
There are other models that are better suited to modeling seismicity. One of the more severe criticisms of the BTW model is the lack of aftershocks, a common and well-established property of earthquakes. However, Ito and Matsuzaki (1990) showed an automaton with minor changes in relation to the BTW model that displays aftershocks and is capable of reproducing Omori's law. There are even models with no stochastic mechanisms, like the Carlson-Langer model (Carlson and Langer, 1989), in the tradition of the well-known Burridge-Knopoff model (Burridge and Knopoff, 1967). Moreover, there are automata with nonconservative rules like the Olami-Feder-Christensen (OFC) model (Olami et al., 1992). All of them are believed to present self-organized critical equilibria. Whether they show Benford's effect will be the subject of future studies, but we believe that the first digit anomaly is a symptom of this kind of equilibrium.
Recent studies on OFC automata (Sarlis et al., 2011) revealed a striking similarity of the fluctuations of the order parameter with seismicity. Focusing on these fluctuations before the Tohoku earthquake, it has been found that they exhibit an unprecedented minimum almost 2 months before the main shock, i.e., the beginning of January 2011 (Sarlis et al., 2013, 2015; Varotsos et al., 2012, 2014). These fluctuations could be mapped to the exponent in the Gutenberg-Richter law (see Fig. 6 and Sect. 4); therefore, the first digit anomaly may be used as a bridge relating the dynamics of these phenomena.

Discussion
Concerning the actual relationship between earthquakes and SOC, we are bringing new information to light. What we have learnt is that if SOC is the underlying mechanism behind the complexity of earthquakes, revealed in power laws, then its first digit imprint is translated into the main observables of seismicity like the energy, displacements and tsunami runups. That is the case of the earthquake source parameters presented. The aftershocks are an interesting matter, as it is believed that the heterogeneous stress drop at the fault generates barriers, which in the end generate the complex patterns found in aftershock series, with Omori's law as one of the main characteristics (Aki, 1979). That the first digit phenomenon appears in stable parameters, like the released seismic moment, is a strong indication that SOC is at work not only in the generation process, but also in the later release of energy at the fault itself.
How far could this mechanism be pushed? Actually, the spectral analysis at the core of criticality (the so-called pink noise fingerprint) offers a very general explanation. In Fig. 6 we present a theoretical variable K that describes some parameter of the natural phenomenon at hand; it may be dissipated energy or some other observable. The controlling parameter is the power law behavior with respect to the variable k, modeled as K ∼ k^(−ζ), with the exponent ζ a real number. If we observe the first decade only, we may find the geometrical roots of the first digit anomaly: on a logarithmic axis, the space between 1 and 2 (populated with numbers all starting with 1) is 30.1 % of the total decade, the space between 2 and 3 is 17.6 %, and so on. Therefore the uniform sampling of the process with respect to k implies the first digit anomaly in K, as long as the power law scaling is valid. More important is the repetitive nature of this process: what happens in the first decade happens all over the available range in K. In mathematical terms, we may map a process ranging over various orders of magnitude to the behavior in the first decade; this implies a map from the real line into the circle, generally known as periodicity. This fact may be connected with Poincaré's recurrence theorem (see Sect. 3 of Arnold (1989), and references therein), although we do not know the specific map for the case of earthquakes. The hypotheses of this theorem imply a first digit anomaly. The conditions imposed on this supposed system are very general; consequently, we expect this behavior to be common in nature, in concordance with the reported analysis of Sambridge et al. (2010).
Is it possible to recover a specific SOC model from a first digit anomaly alone? At this stage we cannot distinguish between them. As discussed, the scaling structure of a critical model is mapped into a periodic space where first digit statistics are calculated, so with the anomaly alone it is not possible to retrieve the original SOC model. With respect to the studied parameters, we consider two kinds of data: observed and recorded. As long as the models or the instrumentation do not filter out the scaling of the phenomena, turning the power law into something else, we expect the first digit anomaly to be clearly recognized. How many features of the studied phenomena do we need to establish criticality? It is not clear to us whether there is a specific number of data to collect or a fixed number of models to run, but we expect the spectral content to be the key; i.e., we need to preserve the power law scaling.
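The geometric argument of this discussion (the share of each logarithmic decade occupied by a given first digit, and the folding of several decades onto one) can be illustrated with a short Python sketch. It rests on assumptions that are not in the original: k is sampled uniformly on a logarithmic axis, which is one reading of the uniform sampling in Fig. 6, and the exponent, the number of decades, the sample size and the function names are arbitrary illustrative choices.

```python
import math
import random

# Share of one decade, on a logarithmic axis, occupied by numbers starting with D.
band_width = {d: math.log10(d + 1) - math.log10(d) for d in range(1, 10)}


def first_digit(x):
    """First nonzero digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])


def mantissa_position(x):
    """Fold log10(x) onto one decade: the map from the real line onto the circle [0, 1)."""
    return math.log10(x) % 1.0


def power_law_digit_frequencies(zeta=1.5, decades=6, n=20000, seed=1):
    """First digit frequencies of K = k**(-zeta) for k sampled log-uniformly (assumed)."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(n):
        k = 10 ** rng.uniform(0, decades)  # assumption: uniform on a logarithmic axis
        K = k ** (-zeta)
        counts[first_digit(K)] += 1
    return {d: counts[d] / n for d in range(1, 10)}


if __name__ == "__main__":
    freqs = power_law_digit_frequencies()
    for d in range(1, 10):
        print(d, f"band width = {band_width[d]:.3f}", f"sampled = {freqs[d]:.3f}")
```

With these choices log10 K covers a whole number of decades, so the sampled frequencies are expected to approach the band widths as the sample grows.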

Conclusions
The first digit phenomenon has been taken for a simple mathematical property, but its true origins have proven hard to elucidate (Berger and Hill, 2011). We have demonstrated that the phenomenon is present not only in the seismic source process, but also in one of the most remarkable explanations of the earthquake phenomena. We claim that an imprint of the SOC mechanism could be traced back by way of Benford's effect, through the study of energy and space observables such as those indirectly derived or measured in situ.
The main properties seem to be (1) the stochastic nature of the earthquake phenomena under study, (2) a scale-independent mechanism, ranging in various orders of magnitude from short-period GPS source inversions to long-period seismic wave imaging, and (3) nonlinear laws of interaction powering the long-range correlations.

Figure 1. Finite fault model from NEIC (Hayes, 2011a). Color bar: slip magnitude in centimeters. Arrow sizes are proportional to slip; rake is represented by the direction of the arrows. The arrow sizes make clear that the displacements span a broad dynamical range, covering at least 6 orders of magnitude.

Figure 4. Selected events of the aftershock series, from 11 March 2011 until 31 January 2012, from the GCMT database (Ekström et al., 2012). The dynamical range of waiting times is at least 4 orders of magnitude.

Figure 5. Runup data measured by Mori et al. (2011). Color bar in meters. The different scales present in the runup data are evident from the populations shown in the figure, which span at least 4 orders of magnitude.

Figure 6. Theoretical power law behavior of a natural system.

Table 1. Benford's law probabilities P_D in conjunction with the first digit distribution of various parameters related to the source of the Tohoku earthquake and related phenomena.

Table 2. Sand pile one-dimensional cellular automaton (Bak et al., 1988). First digit statistics for various one-dimensional cellular automata of different sizes.

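Table 3. BTW two-dimensional cellular automaton (Bak et al., 1987, 1988). First digit statistics for various two-dimensional cellular automata of different sizes.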