Journal topic
Nonlin. Processes Geophys., 26, 371–380, 2019
https://doi.org/10.5194/npg-26-371-2019
Nonlin. Processes Geophys., 26, 371–380, 2019
https://doi.org/10.5194/npg-26-371-2019

Research article 23 Oct 2019

Research article | 23 Oct 2019

# Negentropy anomaly analysis of the borehole strain associated with the Ms 8.0 Wenchuan earthquake

Negentropy anomaly analysis of the borehole strain associated with the Ms 8.0 Wenchuan earthquake
Kaiguang Zhu1,2, Zining Yu1,2, Chengquan Chi1,2, Mengxuan Fan1,2, and Kaiyan Li1,2 Kaiguang Zhu et al.
• 1College of Instrumentation and Electrical Engineering, Jilin University, Changchun, China
• 2Key Laboratory of Geo-exploration Instrumentation, Ministry of Education, Jilin University, Changchun, China

Correspondence: Kaiguang Zhu (zhukaiguang@jlu.edu.cn)

Abstract

A large earthquake of 8.0 magnitude occurred on 12 May 2008, 14:28 UTC, with the epicentre in Wenchuan. To investigate the pre-earthquake anomalous strain changes, negentropy is introduced to borehole strain data for three locations, approximated by skewness and kurtosis, revealing the non-Gaussianity of recorded fluctuations. We separate the negentropy anomalies from the background by Otsu's method and accumulate the anomaly frequency on different scales. The results show that the long-term cumulative frequency of negentropy anomalies follows a sigmoid behaviour, while the inflection point of the fitting curve is close to the occurrence of the earthquake. For the short-term analysis before the earthquake, there are two cumulative acceleration phases. To further verify the correlation with the earthquake, we compare our findings for different time periods and stations and rule out the possible influence of meteorological factors. We consider the negentropy analysis to exhibit potential for studying pre-earthquake anomalies.

1 Introduction

Changes in crustal deformation fields over time have been recorded at least for some large earthquakes , such as the 2011 Tōhoku earthquake (Hirose2011) and the Ruisui earthquake in Taiwan in 2013 . Borehole strainmeters which detect the crustal changes provide an opportunity to investigate the preparation process prior to earthquakes. Many strain observations were of significance to research , although there were some unsuccessful detections, such as those of the 1987 Superstition Hills earthquake , 1989 Loma Prieta earthquake and 2009 L'Aquila earthquake . Various methods are used in identifying borehole strain anomalies based on a large amount of monitoring data. Experienced scholars extract borehole strain anomalies by discriminating patterns of waveform behaviours compared to those during the normal stage . In the time domain, identified abnormal strain changes by the overrun rate and wavelet decomposition for the Wenchuan earthquake. While in the frequency domain, thought that the signal with a period of 10 to 60 min might be anomalies through S transform compared with the background signal. In addition, statistical methods proved to be effective in distinguishing borehole strain anomalies with regard to large earthquakes, such as principal component analysis and correlation coefficients along with the consistency relation .

The probability density function (PDF) of observation data is also an informative way of extracting potential anomalies contained in earthquake generation processes. extracted variance anomalies of the probability density of the Earth's vertical velocity increments and successfully found a pronounced transition from Gaussian to non-Gaussian prior to 12 moderate and large earthquakes. Before the Wenchuan earthquake, the high-frequency fluid observational data deviated from Gaussian distributions at 16 water level and 14 water temperature stations .

Rather than the whole PDF, its moments are often utilized; moments may be estimated quite reliably from relatively small amounts of data . In 2016, Chen (2016) applied skewness and kurtosis (the third- and fourth-order moments) of the geoelectric data to pick up non-Gaussian distribution anomalies to predict impending large earthquakes in Taiwan. On the other hand, for turbulent or disordered systems, the non-Gaussian distribution of time series in the skewness–kurtosis domain attracts attention. Observation data series from various fields of geophysics indicate that a parabolic relation between skewness and kurtosis holds in fields such as seismology , oceanography and atmospheric science (Maurizi2006).

Thereby, it is possible that precursor anomalies lead to an increase in disordered components in observation data. proved that the pre-catastrophic stage could break the persistency and high organization of the electromagnetic field through studying the fractional-Brownian-motion-type model using laboratory and field experimental electromagnetic data. In view of Lévy flight and Gaussian processes, the Lévy flight mechanism prevents the organization of the critical state being completed before earthquakes, since the long scales are cut off due to the Gaussian mechanism .

Entropy can serve as a measure of the unknown external energy flow into the seismic system (Akopian2015). quantified and visualized temporal changes of the complexity by approximate entropy; they claimed that significant complexity decrease and accession at the tail of the preseismic electromagnetic emission could be diagnostic tools for the impending earthquake. Approximate entropy has also been studied in catastrophic events . detected earthquake activation precursors by studying the regional seismic information entropy on the earthquake catalogue.

Negentropy's definition is based on the entropy, and it is also widely used to detect non-Gaussian features. proposed a method for picking arrival time based on negentropy for microseismic data. In this study, the negentropy is applied to the borehole strain associated with the Wenchuan earthquake at the Guza station, approximated by skewness and kurtosis, which reveal the non-Gaussianity of borehole fluctuations. Subsequently we study the extracted negentropy anomalies on different scales to investigate correlations with crustal deformation. Furthermore, we did a comparison discussion for different time periods and stations.

2 Observation

YRY-4 borehole strainmeters, which are designed to record continuous deformation occurring over periods of minutes to years, have been deployed at depths of more than 40 m at more than 40 terrain-sensitive locations within China. These strainmeters are capable of resolving strain changes of less than one-billionth. The data sampling rate is once per minute.

The study period is from 1 January 2007 to 30 June 2009. The study area is shown in Fig. 1. We find that the Guza station stands on the southwestern end of the Longmenshan Fault zone. Besides this, the epicentre is about 150 km away from the station, which is within the monitoring capability of the borehole strainmeters (Su1991).

Figure 1Location map showing the epicentre and three stations. The epicentre was located at 31.01 N, 103.42 E, shown by yellow star. The blue triangles show the strainmeter stations. The red circles are aftershock distribution from the main shock on 12 May to 30 October 2008. The green curve is the schematic curve of the main rupture zone, and black curves are faults.

Because the four gauges of the YRY-4 borehole strainmeter are arranged at 45 intervals, this design has improved its self-consistency. This arrangement produces four observation values: Si, ($i=\mathrm{1},\mathrm{2},\mathrm{3},\mathrm{4}$) . The self-consistency is shown in Eq. (1), which can be used to test the reliability of the data among the four gauges:

$\begin{array}{}\text{(1)}& {S}_{\mathrm{1}}+{S}_{\mathrm{3}}={S}_{\mathrm{2}}+{S}_{\mathrm{4}}.\end{array}$

In practical application, the higher the correlation between both sides of the Eq. (1), the more reliable the data. In that case, we can use areal-strain Sa for describing the subsurface strain state instead of four component observations. Sa is expressed as

$\begin{array}{}\text{(2)}& {S}_{\mathrm{a}}=\left({S}_{\mathrm{1}}+{S}_{\mathrm{2}}+{S}_{\mathrm{3}}+{S}_{\mathrm{4}}\right)/\mathrm{2}.\end{array}$

The borehole strain of Guza station is highly consistent , as shown in Fig. 2. Then we processed the daily areal strain through two steps.

Figure 2Self-consistency of the borehole strain at Guza from 1 January 2007 to 30 June 2009.

Step 1: differential calculation. We set the areal-strain data as X(n) and differential areal-strain data as Y(n); we know that $Y\left(n\right)=X\left(n\right)-X\left(n-\mathrm{1}\right)$, where n is the sample point. The process can be equivalent to a filtering system; H1(ejω) is frequency responses of Step 1,

$\begin{array}{}\text{(3)}& |{H}_{\mathrm{1}}\left({e}^{j\mathit{\omega }}\right)|=\sqrt{\mathrm{2}\left(\mathrm{1}-\mathrm{cos}\mathit{\omega }}.\end{array}$

It can be seen that when ω is very small or 0, the frequency response is 0, indicating that Step 1 removes the low-frequency information of the signal, including the borehole trend and low-frequency effects of the air pressure and temperature on the signal.

Step 2: harmonic analysis. We remove the periodic term that still exists through daily harmonic analysis. We set the fitting function as Fourier series. The reserved signal Z(n) can be simplified as $Z\left(n\right)=Y\left(n\right)-{\sum }_{k=\mathrm{1}}^{n}A\left(k\right)\mathrm{sin}\left(k{\mathit{\omega }}_{\mathrm{0}}n+{\mathit{\phi }}_{i}\right)$, where kω0 and φi are frequencies and phases of the periods of the daily data, and A(k) is corresponding amplitudes.

H2(ejω) are frequency responses of Step 2 by minimizing Z(n) through the least-squares method in time domain; then ideally, for the frequency response,

$\begin{array}{}\text{(4)}& |{H}_{\mathrm{2}}\left({e}^{j\mathit{\omega }}\right)|=\left\{\begin{array}{ll}\mathrm{0}& \mathit{\omega }=k{\mathit{\omega }}_{\mathrm{0}},\\ \mathrm{1}& \mathit{\omega }=\mathrm{others}.\end{array}\right\\end{array}$

Step 2 removes the periodic terms in the signal. We think that the period terms mainly include the periods related to the solid tide and also include the periodic effects of air pressure. The residual high-frequency signals are shown in Fig. 3. In particular, small changes in the curve are amplified by the processing.

Figure 3High-frequency areal strain at Guza from 1 January 2007 to 30 June 2009.

3 Methodology

## 3.1 Negentropy and non-Gaussianity

The entropy-based negentropy is a statistically justified measure of non-Gaussianity . The entropy of a random variable $X=\mathit{\left\{}{x}_{\mathrm{1}},{x}_{\mathrm{2}},\mathrm{\dots },{x}_{i},\mathrm{\dots }\mathit{\right\}}$ is defined as

$\begin{array}{}\text{(5)}& H\left(X\right)=-\sum _{i}P\left(X={x}_{i}\right)\mathrm{log}P\left(X={x}_{i}\right),\end{array}$

where P is the probability density function. Entropy measures the randomness of a random variable. The Gaussian random variable has the largest entropy of all other random variables with equal variance . The definition of negentropy is given by

$\begin{array}{}\text{(6)}& J\left(X\right)=H\left({X}_{\mathrm{gauss}}\right)-H\left(X\right),\end{array}$

in which Xgauss is a Gaussian random variable with the same mean and covariance matrix as X. The entropy of a Gaussian random variable can be estimated by

$\begin{array}{}\text{(7)}& {X}_{\mathrm{gauss}}=\frac{\mathrm{1}}{\mathrm{2}}\mathrm{log}\mathrm{|}det\mathrm{\Sigma }\mathrm{|}+\frac{n}{\mathrm{2}}\left(\mathrm{1}+\mathrm{log}\mathrm{2}\mathit{\pi }\right),\end{array}$

where n is the dimension of the variable, and Σ is its covariance matrix.

However, the theoretical calculation of negentropy also depends on the prior probability density of random variables and other information which is difficult to determine accurately. In practical applications, higher-order statistics (HOS) and density polynomial expansion are usually used to approximate one-dimensional negentropy . The approximation results are as follows:

$\begin{array}{}\text{(8)}& J\left(X\right)\approx \frac{\mathrm{1}}{\mathrm{12}}{\mathrm{skewness}}^{\mathrm{2}}\left(X\right)+\frac{\mathrm{1}}{\mathrm{48}}{\mathrm{kurtosis}}^{\mathrm{2}}\left(X\right).\end{array}$

This definition suggests that any deviation from a Gaussian distribution will increase the negentropy J(x). The skewness and kurtosis are the third- and fourth-order statistics, respectively, which are defined as

$\begin{array}{}\text{(9)}& \mathrm{skewness}\left(X\right)=\frac{{\mathit{\mu }}_{\mathrm{3}}}{{\mathit{\sigma }}^{\mathrm{3}}}=\frac{E\left[\left(X-\mathit{\mu }{\right)}^{\mathrm{3}}\right]}{E{\left[\left(X-\mathit{\mu }{\right)}^{\mathrm{2}}\right]}^{\mathrm{3}/\mathrm{2}}},\end{array}$

and

$\begin{array}{}\text{(10)}& \mathrm{kurtosis}\left(X\right)=\frac{{\mathit{\mu }}_{\mathrm{4}}}{{\mathit{\sigma }}^{\mathrm{4}}}=\frac{E\left[\left(X-\mathit{\mu }{\right)}^{\mathrm{4}}\right]}{E{\left[\left(X-\mathit{\mu }{\right)}^{\mathrm{2}}\right]}^{\mathrm{2}}}-\mathrm{3},\end{array}$

where μ is the mean of X and σ is the standard deviation of X. Skewness is a measure of asymmetry in a PDF. A symmetric distribution has zero skewness. Kurtosis is a measure of the heaviness of the tails. Distributions that are more outlier-prone than the normal distribution have kurtosis values greater than zero.

Moreover, the relation between the skewness and kurtosis is universal, and they approximately align along a quadratic curve (Sattin et al., 2009):

$\begin{array}{}\text{(11)}& \mathrm{kurtosis}\left(X\right)=A\cdot {\mathrm{skewness}}^{\mathrm{2}}\left(X\right)+B.\end{array}$

Here we calculate the normalized skewness and kurtosis in the study period, so Eq. (9) can be derived into

$\begin{array}{}\text{(12)}& \mathrm{kurtosis}\left(X\right)=A\cdot \left({\mathrm{skewness}}^{\mathrm{2}}\left(X\right)-\mathrm{1}\right),\end{array}$

indicating that the test day is super-Gaussian when the skewness is outside the range (−1, 1).

This relation is trivial in a Gaussian fluctuating system; it reduces to a fixed mass around zero (skewness=0 and kurtosis=0). In a turbulent environment where fluctuating quantities obey non-Gaussian statistics, the moments obey the above relation.

## 3.2 Otsu's thresholding method

To solve the negentropy anomaly detection problem, we designed a simple thresholding hypothesis test using the Otsu method (Otsu1979) that provides an optimal separation between background and seismic-related activities. For any given value k, we can separate the previously calculated J(x), as shown in Eq. (6), into the following two classes:

$\begin{array}{}\text{(13)}& \begin{array}{rl}& {C}_{\mathrm{0}}\left(k\right)=\left\{J\left(x\right)\le k\right\},\\ & {C}_{\mathrm{1}}\left(k\right)=\left\{J\left(x\right)>k\right\}.\end{array}\end{array}$

Using these classes, the weighted average value μT(x) of J(x) can be expressed as follows:

$\begin{array}{}\text{(14)}& \begin{array}{rl}& {\mathit{\mu }}_{T}\left(x\right)={\mathit{\lambda }}_{\mathrm{0}}\left(k\right){\mathit{\mu }}_{\mathrm{0}}\left(x;k\right)+{\mathit{\lambda }}_{\mathrm{1}}\left(k\right){\mathit{\mu }}_{\mathrm{1}}\left(x;k\right),\\ & {\mathit{\lambda }}_{\mathrm{0}}\left(k\right)+{\mathit{\lambda }}_{\mathrm{1}}\left(k\right)=\mathrm{1},\end{array}\end{array}$

where μ0(x;k) and μ1(x;k) are the mean values of the class Ci(k), $i=\mathrm{0},\mathrm{1}$, and λi(k) is the percentage of points belonging into each class. Following the thresholding scheme of Otsu (1979), we define the following cost function:

$\begin{array}{}\text{(15)}& \begin{array}{rl}{{\mathit{\sigma }}_{B}}^{\mathrm{2}}& ={\mathit{\lambda }}_{\mathrm{0}}\left(k\right){\left({\mathit{\mu }}_{\mathrm{0}}\left(x;k\right)-{\mathit{\mu }}_{T}\left(x;k\right)\right)}^{\mathrm{2}}+{\mathit{\lambda }}_{\mathrm{1}}\left(k\right)\left({\mathit{\mu }}_{\mathrm{1}}\left(x;k\right)\right\\ & {-{\mathit{\mu }}_{T}\left(x;k\right))}^{\mathrm{2}},\\ & ={\mathit{\lambda }}_{\mathrm{0}}\left(k\right){\mathit{\lambda }}_{\mathrm{1}}\left(k\right){\left({\mathit{\mu }}_{\mathrm{1}}\left(x;k\right)-{\mathit{\mu }}_{\mathrm{0}}\left(x;k\right)\right)}^{\mathrm{2}},\end{array}\end{array}$

where ${{\mathit{\sigma }}_{B}}^{\mathrm{2}}$ is the within-class variance of negentropy. Then, by finding the k* value searching for k when ${{\mathit{\sigma }}_{B}}^{\mathrm{2}}$ becomes the maximum,

$\begin{array}{}\text{(16)}& {k}^{*}=\mathrm{arg}\underset{k}{max}{{\mathit{\sigma }}_{B}}^{\mathrm{2}}\left(k\right),\end{array}$

where the optimal value k* here separates the background set and anomaly set.

In this test, our initial assumption is that the sliding window is composed of a Gaussian signal of non-seismic-related activities. When our test negentropy exceeds the critical value k*, this initial hypothesis is not valid, and the alternative is true, indicating the presence of a negentropy anomaly within the window.

4 Results

According to the empirical hypothesis that geophysical signals deviate from the Gaussian distribution when they record abnormal activities, and based on the results of previous studies, we perform the following investigation.

## 4.1 Extracting negentropy anomalies

As the negentropy is calculated using a 2 h sliding window, we assume that it reaches the maximum values when the time window contains anomalies from seismic-related activities. The negentropy during the study period is shown in Fig. 4.

Figure 4Negentropy at Guza from 1 January 2007 to 30 June 2009. The red dotted horizontal line is the optimal threshold k* calculated by Otsu method.

The within-class variance ${{\mathit{\sigma }}_{B}}^{\mathrm{2}}$ and negentropy value distribution are compared in Fig. 5. According to Eqs. (11) to (14), when ${k}^{*}=\mathrm{1.1130}$, ${{\mathit{\sigma }}_{B}}^{\mathrm{2}}$ reaches its maximum. Therefore, the negentropy was separated by k* into the quasi-Gaussian background and non-Gaussian anomalies from 2007 to 2009. The Otsu threshold k* here is consistent with the accuracy of the negentropy and the strain data. The YRY-4 borehole strainmeter has a measurement accuracy of 10−9; however, we usually cut off four digits after the decimal point in practical calculations.

Figure 5Within-class variance ${{\mathit{\sigma }}_{B}}^{\mathrm{2}}$ in the negentropy (red line) and negentropy histogram.

In the skewness–kurtosis domain, the statistical relationship of the borehole areal strain is consistent with parabolic behaviour as described in Eq. (10) (Fig. 6a), verifying that the turbulent system of borehole strain is significantly non-Gaussian. Besides this, the extracted negentropy anomalies are clustered strongly on the left side of the parabola, which exhibits similar characteristics different from the normal Gaussian distribution. Here, there are four points on the right side; one occurred in early 2007, and the others occurred after the earthquake. Therefore, we will not discuss them in the following.

Figure 6Negentropy distributions in the skewness–kurtosis domain in (a) 1 January 2007 to 30 June 2009, and (b) four shorter periods before the earthquake. Red denotes the negentropy being greater than k*, and grey indicates the negentropy being less than k*. The blue curve is the quadratic fit with 95 % confidence.

In addition, as shown in Fig. 6b, at times far from the earthquake, the negentropy distribution is basically Gaussian in the skewness–kurtosis domain. However, at times closer to the earthquake, the relatively stable state was broken due to the non-Gaussian mechanism, with more negentropy anomalies appearing on the left side of the parabola, while in 2008, almost all of the negentropy present was skewed to the left.

The phenomena prompt us to study its possible correspondence with the seismogenic process.

## 4.2 Negentropy anomaly frequency accumulation

The transition of negentropy anomalies in the skewness–kurtosis domain is quantified as the change of the anomaly frequency per unit time through a logarithmic-linear model. Logarithmic-linear models of interest are often used to estimate the expected frequency of the response variable at the original scale for a new set of covariate values, such as the Gutenberg–Richer law, in which a linear relationship exists between the logarithm of the cumulative number of seismic events of magnitude M or greater versus the magnitude M .

The logarithmic-linear regression model is proposed as

$\begin{array}{}\text{(17)}& \mathrm{log}N={\mathit{\beta }}_{\mathrm{1}}×{k}_{J}+{\mathit{\beta }}_{\mathrm{0}}+\mathit{\epsilon },\end{array}$

where kJ takes different threshold values according to the J(x) values, N is the number of occurrences in which J is greater than or equal to the threshold kJ, β1 and β0 are the regression coefficients, where a lower slope β1 indicates that there are more higher J values, implying that there are more anomalies at that moment, and ε is the random error that represents the model uncertainty.

We use the logarithmic-linear model to solve the relationship between the negentropy anomaly frequency and different thresholds each day using the ordinary least-squares (OLS) method. Afterwards, an optimal threshold k*, calculated by the Otsu method, is chosen for all models, where

$\begin{array}{}\text{(18)}& {N}_{J}\left(t\right)=\mathrm{exp}\left({\mathit{\beta }}_{\mathrm{1}}\left(t\right)×{k}^{*}+{\mathit{\beta }}_{\mathrm{0}}\left(t\right)+\mathit{\epsilon }\left(t\right)\right),\end{array}$

and the NJ(t) under the threshold k* is shown in Fig. 7. The model theoretically solves the problem of selecting the length of the time window. In addition, the estimated NJ(t) is considered to be the expected frequency of anomalies.

The goodness of fit for each logarithmic-linear model was evaluated using analysis of

$\begin{array}{}\text{(19)}& {R}^{\mathrm{2}}=\mathrm{1}-\sum _{i=\mathrm{1}}^{n}{\left({N}_{i}-\stackrel{\mathrm{^}}{{N}_{i}}\right)}^{\mathrm{2}}/\sum _{i=\mathrm{1}}^{n}{\left({N}_{i}-\stackrel{\mathrm{‾}}{{N}_{i}}\right)}^{\mathrm{2}},\end{array}$

and the root-mean-squared error (RMSE):

$\begin{array}{}\text{(20)}& \mathrm{RMSE}=\sqrt{\sum _{i=\mathrm{1}}^{n}{\left({N}_{i}-\stackrel{\mathrm{^}}{{N}_{i}}\right)}^{\mathrm{2}}/n}.\end{array}$

The R2 and RMSE values in the study period (912 d) show that the logarithmic-linear relationship can explain the relationship between the negentropy anomaly frequency and different thresholds. The mean of R2 is 0.9695, which is close to 1, and the variance of R2 is 0.0435. The mean and variance of the RMSE are also small (0.1098 and 0.1301, respectively).

Figure 7Estimated expected frequency NJ under the optimal threshold k*.

We calculate the negentropy cumulative frequency of the study period, as shown in Fig. 8. There is not only a long-term analysis of the whole period, but also a short-term analysis of the pre-earthquake process. In general, accumulated value of a typical random process usually has a linear increase. In particular, in case of critical phenomena, we would expect more frequent anomalies when they approach the critical point and less frequent anomalies after .

For the entire earthquake process, a 2-month sliding window is selected for accumulation. In Fig. 8a, after July 2007, the negentropy anomalies gradually accumulated. Qiu (2009) and Chi (2014) also observed anomalies of this period at the Guza station, and they speculated that abnormal strain may reflect small-scale rock formation rupture before the earthquake. In particular, we find more frequent negentropy anomalies in 2008 as the earthquake approaches and less frequent anomalies after the earthquake; thus a sigmoid function is used to fit the acceleration before the earthquake and to fit the deceleration after the earthquake. The sigmoid function is expressed as

$\begin{array}{}\text{(21)}& y=A\mathrm{2}+\frac{\left(A\mathrm{1}-A\mathrm{2}\right)}{\left(\mathrm{1}+{e}^{\frac{x-{x}_{\mathrm{0}}}{\mathrm{d}x}}\right)},\end{array}$

where A1, A2, x0 and dx are the inflection point. The sigmoid function is a power-law temporal behaviour, with an upper concavity, and a subsequent power-law behaviour after the inflection point, with an opposite concavity. The inflection point in this function is a reasonable estimation of the time of the significant change in the critical dynamical system (Santis et al., 2017). Also, the value of x0 (8.3337) obtained in the fitting almost coincides with the day of the Wenchuan earthquake. The actual time of the earthquake is the 8.3871th bimonth from January 2007 after conversion.

Figure 8(a) Results of the long-term negentropy anomaly frequency for the Wenchuan earthquake at Guza station from 1 January 2007 to 30 June 2009. Each circle represents an anomaly negentropy for 2 months. The cumulative frequency of negentropy anomaly is represented. The earthquake day is represented as a black vertical dashed line. The red line is a sigmoid fit that underlines an inflection point (vertical solid line) that is close to the occurrence of the earthquake. (b) Results of the short-term negentropy anomaly frequency prior to the earthquake. Every grey point is an anomaly for 1 d, and the blue and red lines are two segment sigmoid fit results. Two green lines represent the liner regression for the two phases; the first phase slope is 0.00147, and the second one is 0.0021.

When we narrowed the accumulated window to 1 d, we observed two negentropy anomalies before the earthquake, as shown in Fig. 8b. The first anomaly frequency increase occurred from August to October 2007. In March 2008, there was a second phase of anomaly increase, and the cumulative frequency then slowly increased to a plateau period near the time of the earthquake. This, probably due to the stress, is in a deadlocked phase. This is because before the Wenchuan earthquake, the elastic deformation of the crust reaches its limit and the deformation is resisted in the hypocentral region, which is measured by GPS data .

These two phases prior to the earthquake are also approximated with sigmoid functions. In order to further compare the anomalies of the two phases, we use linear regression to fit the central part of the two sigmoid curves. We find that the second acceleration is greater than the first acceleration.

Fault zones contain relatively weak and relatively strong parts. The former is the area where strain release begins, while the latter is the stress-locking part and the beginning of rapid instability . proposed that there is a sub-instability stage of fault deformation before earthquakes, which is manifested by two instability activities. The former is related to the release of weak parts, and the latter is related to the rapid release of strong parts during strong earthquakes. They thought that acceleration of the strain release in fault zone is a sign of entering the inevitable earthquake stage. Thus, we speculate that the two accelerations of the cumulative negentropy anomaly in Fig. 8b may be related to the strong earthquake.

5 Comparison discussion

## 5.1 Comparison of random time periods.

We randomly selected the strain data for 200 d before and after 20 March 2011 and 24 March 2014 at the Guza station. The selected data for the two periods are required to be in the absence of strong earthquakes and have higher quality. We performed negentropy analysis on these two observations and compared them with the results of negentropy analysis associated with the Wenchuan earthquake, as shown in Fig. 9.

Figure 9The comparative analysis of cumulative frequency of negentropy anomalies between earthquake period and random time periods. The zero point of green dots is 20 March 2011, and the zero point of blue dots is 24 March 2014.

As we can see in Fig. 9, the cumulative frequency of negentropy anomalies of random periods has a linear increase. However, in the Wenchuan-earthquake periods, as the earthquake approaches, the cumulative frequency of negentropy anomalies increases rapidly and recovered to a slow growth after the earthquake.

## 5.2 Comparison of different stations

We selected the Xiaomiao station and Renhe station to find out if their observations received strain changes. Their locations are shown in Fig. 1. Compared with the Guza station, we did the negentropy analysis of these two stations, as shown in Fig. 10.

Figure 10Cumulative frequency of negentropy anomalies of Xiaomiao station and Renhe station from 16 September 2007 to 30 June 2009. The negentropy analysis of Guza station is from 1 January 2007 to 30 June 2009 because of the different installation time of the instruments. The red vertical line is the inflection point of the fitting curve of Guza station. The blue vertical line is the inflection point of the fitting curve of Xiaomiao station. The black dotted line is the earthquake day.

As we can see in the Fig. 10, the cumulative frequency of negentropy anomalies of the Xiaomiao station is also well fitted by the sigmoid function. The accumulation curve grows rapidly before the earthquake and is concave downward after the earthquake, which is similar to the Guza station, although the inflection point of the Xiaomiao station precedes the earthquake moment by about 2 months. However, since the curve is approximately linear before and after the inflection point, we consider that the inflection point value is reasonable in the range from January to June 2008. Cumulative anomalies of the Renhe station are basically linear, indicating that the Renhe station may not detect pre-earthquake anomalies.

The Renhe station is far from the end of the Wenchuan-earthquake fault, so it is reasonable that no abnormal changes are observed. However, the Xiaomiao station is located between the Guza station and Renhe station, and the fitting result shows that there is a similar trend to the Guza station, with a weaker curvature. So, for the nearest station to the epicentre, the Guza station may be able to record more pre-earthquake anomalies.

Furthermore, found that the anomalies at the Ningshan station were similar to the anomalies at the Guza station. These two stations have observed similar Wenchuan-earthquake precursor anomalies, which may not be accidental, since the Ningshan station is actually located at the northeastern end of the Longmenshan Fault zone. This location corresponds with the southwestern end of the fault, where the Guza station is located.

## 5.3 Exclusion of meteorological factors.

The strain signals are sensitive to a few meteorological factors; therefore, we display the pressure variations, temperature variations recorded at the Guza station and the daily rainfall measured by Tropical Rainfall Measuring Mission (TRMM) satellite, which can be downloaded through NASA Giovanni-4 for the same period and the same area (http://giovanni.gsfc.nasa.gov/giovanni/, last access: 18 October 2019) in Fig. 11. There are clearly annual variations in the strain, air pressure, temperature and rainfall data. The air pressure and temperature have been steadily fluctuating within a certain range, and the rainfall has also proved to be more in summer and less in winter.

Then we calculated the differential data of the strain for negentropy analysis. We also make differential calculations for all three influencing factors, as shown in Fig. 12.

Figure 11Borehole stain, air pressure, temperature and rainfall variations during study period at Guza station.

Figure 12Differential borehole stain, air pressure, temperature and rainfall variations during study period at Guza station.

We observed that the air pressure, temperature and rainfall did not change abnormally during the period when the extracted anomalies increase whether we do differential calculation or not. Therefore, we consider that the abnormal variations on the processed strain signals are not caused by these factors.

6 Conclusions

In our work, the extracted negentropy anomalies of borehole strain associated with the Wenchuan earthquake are analysed. The cumulative frequency of negentropy anomalies are studied in both the long and short term. In the Comparison discussion section, we compare the cumulative anomalies of different time periods and different stations with those at the Guza station during the study period and preliminarily exclude meteorological factors. We suspect that the negentropy anomalies at the Guza station may have recorded abnormal changes related to the Wenchuan earthquake.

Since the tectonic dynamics of earthquakes during seismogenic and seismic processes are very complex, the mechanism of such abnormal changes undoubtedly needs to be discussed. In particular, borehole strain signals are sensitive to external influence. Besides this, because of the different characteristics and accuracy of different types of observations, a joint analysis has not been carried out yet. Further research is needed to decipher a potential precursory phase. However, we may be able to ensure that the negentropy analysis has great potential in the study of earthquake precursors.

Data availability
Data availability.

The borehole strain data, air pressure and temperature data are confidential information and therefore cannot be made publicly accessible. The rainfall data are downloaded through NASA Giovanni-4 (http://giovanni.gsfc.nasa.gov/giovanni/, last access: 18 October 2019).

Author contributions
Author contributions.

The authors contributed in accordance with their competence in the research subject. The first author, KZ, was responsible for the key technical guidance and ideas. ZY was responsible for method improvement, data analysis and paper preparation. CC helped to ensure the graph quality of the paper. MF and KL contributed through active participation in the paper preparation.

Competing interests
Competing interests.

The authors declare that they have no conflict of interest.

Acknowledgements
Acknowledgements.

The authors would like to thank the China Earthquake Networks Center for providing the borehole strain data and the NASA Giovanni team for rainfall data. Moreover, the authors are grateful to Zehua Qiu for his guidance and helpful suggestions.

Financial support
Financial support.

This research was supported by the National Natural Science Fund (grant no. 41974084) and the Institute of Crustal Dynamics, China Earthquake Administration (grant no. 3R216N620537).

Review statement
Review statement.

This paper was edited by Luciano Telesca and reviewed by two anonymous referees.

References

Agnew, D. C. and Wyatt, F. K.: The 1987 Superstition Hills earthquake sequence: strains and tilts at Pinon Flat Observatory, B. Seismol. Soc. Am., 79, 480–492, 1989. a

Akopian, S. T.: Open dissipative seismic systems and ensembles of strong earthquakes: energy balance and entropy funnels, Geophys. J. Int., 201, 1618–1641, 2015. a

Amoruso, A. and Crescentini, L.: Limits on earthquake nucleation and other pre-seismic phenomena from continuous strain in the near field of the 2009 L'Aquila earthquake, Geophys. Res. Lett., 37, L10307, https://doi.org/10.1029/2010GL043308, 2010. a

Canitano, A., Hsu, Y. J., Lee, H. M., Linde, A. T., and Sacks, S.: Near-field strain observations of the October 2013 Ruisui, Taiwan, earthquake: source parameters and limits of very short-term strain detection, Earth Planets Space, 67, 125, https://doi.org/10.1186/s40623-015-0284-1, 2015. a

Chen, H. J. E. A.: Testing the correlations between anomalies of statistical indexes of the geoelectric system and earthquakes, Nat. Hazards, 84, 877–895, 2016. a

Chi, S., Jing, Z., and Yi, C.: Failure of self-consistent strain data before Wenchuan,Ludian and Kangding earthquakes and its relation with earthquake nucleation, Recent Developments in World Seismology, 12, 3–13, 2014. a

Cover, T. and Thomas, J.: Elements of Information Theory, Wiley, New York, 1991. a

Cristelli, M.: Universal Relation Between Skewness and Kurtosis in Complex Dynamics, Phys. Rev. E, 85, 066108, https://doi.org/10.1103/PhysRevE.85.066108, 2012. a

Eftaxias, K., Contoyiannis, Y., Balasis, G., Karamanos, K., Kopanas, J., Antonopoulos, G., Koulouras, G., and Nomicos, C.: Evidence of fractional-Brownian-motion-type asperity model for earthquake generation in candidate pre-seismic electromagnetic emissions, Nat. Hazards Earth Syst. Sci., 8, 657–669, https://doi.org/10.5194/nhess-8-657-2008, 2008. a

Gutenberg, B. and Richter, C. F.: Seismicity of The Earth and Associated Phenomena, Princeton University Press, Princeton, 1954. a

Hirose, H.: Tilt records prior to the 2011 off the Pacific coast of Tohoku Earthquake, Earth Planets Space, 63, 655–658, 2011. a

Hyvarinen, A. and Oja, E.: Independent component analysis: algorithms and applications, Neural Networks, 13, 411–430, 2000. a

Jiang, Z. S., Fang, Y., Yan-Qiang, W. U., Wang, M., and Fang, D. U.: The dynamic process of regional crustal movement and deformation before Wenchuan MS 8.0 earthquake, Chinese J. Geophys.-Ch., 52, 505–518, 2009. a

Johnson, M. J. S., Linde, A. T., and Gladwin, M. T.: Near-field high resolution strain measurements prior to the October 18, 1989, Loma Prieta Ms 7.1 Earthquake, Geophys. Res. Lett., 17, 1777–1780, 1990. a

Johnston, M. J. S., Sasai, Y., Egbert, G. D., and Mueller, R. J.: Seismomagnetic Effects from the Long-Awaited 28 September 2004 M 6.0 Parkfield Earthquake, B. Seismol. Soc. Am., 96, 206–220, https://doi.org/10.1785/0120050810, 2006. a

Jones, M. C. and Sibson, R.: What Is Projection Pursuit, J. Roy. Stat. Soc., 150, 1–37, 1987. a

Karamanos, K., Peratzakis, A., Kapiris, P., Nikolopoulos, S., Kopanas, J., and Eftaxias, K.: Extracting preseismic electromagnetic signatures in terms of symbolic dynamics, Nonlin. Processes Geophys., 12, 835–848, https://doi.org/10.5194/npg-12-835-2005, 2005. a

Karamanos, K., Dakopoulos, D., Aloupis, K., Peratzakis, A., Athanasopoulou, L., Nikolopoulos, S., Kapiris, P., and Eftaxias, K.: Preseismic electromagnetic signals in terms of complexity, Phys. Rev. E Stat. Nonlin. Soft. Matter Phys., 74, 016104–016125, 2006. a

Kong, X., Su, K., Yukio, F., and Yoichi, N.: A Detection Method of Earthquake Precursory Anomalies Using the Four-Component Borehole Strainmeter, Open Journal of Earthquake Research, 7, 124–140, 2018. a

Linde, A. T., Johnston, M. J. S., Gladwin, M. T., Bilham, R. G., and Gwyther, R. L.: A slow earthquake sequence on the San Andreas fault, Nature, 383, 65–68, 1996. a

Ma, J. and Guo, Y. S.: Accelerated synergism prior to fault instability:evidence from laboratory experiments and an earthquake case, Seismol. Geol., 36, 547–561, 2014. a

Manshour, P., Saberi, S., Muhammad, S., Peinke, J., Pacheco, A. F., and M Reza, R. T.: Turbulencelike behavior of seismic time series, Phys. Rev. Lett., 102, 014101, https://doi.org/10.1103/PhysRevLett.102.014101, 2009. a

Maurizi, A.: On the dependence of third- and fourth-order moments on stability in the turbulent boundary layer, Nonlin. Processes Geophys., 13, 119–123, https://doi.org/10.5194/npg-13-119-2006, 2006. a

Nikolopoulos, S., Kapiris, P., Karamanos, K., and Eftaxias, K.: A unified approach of catastrophic events, Nat. Hazards Earth Syst. Sci., 4, 615–631, https://doi.org/10.5194/nhess-4-615-2004, 2004. a

Noda, H., Nakatani, M., and Hori, T.: Large nucleation before large earthquakes is sometimes skipped due to cascade-up-Implications from a rate and state simulation of faults with hierarchical asperities, J. Geophys. Res.-Sol. Ea., 118, 2924–2952, 2013. a

Ohsawa, Y.: Regional Seismic Information Entropy for Detecting Precursors of Earthquake Activation, Entropy, 20, 861, https://doi.org/10.3390/e20110861, 2018. a

Otsu, N.: A Threshold Selection Method from Gray-Level Histograms, IEEE Trans. Syst. Man. Cybern., 9, 62–66, 1979. a

Potirakis, S. M., Contoyiannis, Y., and Eftaxias, K.: Lévy and Gauss statistics in the preparation of an earthquake, Physica A, 528, 121360, https://doi.org/10.1016/j.physa.2019.121360, 2019. a

Qi, L. and Jing, Z.: Application of s transform in analysis of strain changes before and after wenchuan earthquake, J. Geodesy Geodynam., 31, 6–9, 2011. a

Qiu, Z., Lei, T., Zhou, L., and Kan, B.: Observed strain changes from 4-component borehole strainmeter network before 2008 wenchuan earthquake, J. Geodesy Geodynam., 29, 1–5, 2009. a

Qiu, Z., Lei, T., Zhang, B., and Guo, Y.: In situ calibration of and algorithm for strain monitoring using, Four-gauge borehole strainmeters (FGBS), J. Geophys. Res.-Sol. Ea., 118, 1609–1618, 2013.  a

Qiu, Z. H., Zhang, B. H., Chi, S. L., Tang, L., and Song, M.: Abnormal strain changes observed at Guza before the Wenchuan earthquake, Sci. Ch. Earth Sci., 54, 233–240, 2011. a

Qiu, Z. H., Lei, T., Zhang, B. H., and Mo, S.: Extracting anomaly of the Wenchuan Earthquake from the dilatometer recording at NSH by means of Wavelet-Overrun Rate Analysis, Chinese J. Geophys.-Ch., 55, 538–546, 2012. a

Santis, A. D., Balasis, G., Pavon-Carrasco, F. J., Cianchini, G., and Mandea, M.: Potential earthquake precursory pattern from space: The 2015 Nepal event as seen by magnetic Swarm satellites, Earth Planet. Sci. Lett., 461, 119–126, 2017. a

Sattin, F., Agostini, M., Cavazzana, R., Serianni, G., Scarin, P., and Vianello, N.: About the parabolic relation existing between the skewness and the kurtosis in time series of experimental data, Physica Scripta, 79, 45006–45009, 2009. a

Su, K. Z.: Earthquake-monitoring capability of borehole strainmeter Earthqauke, Earthquake, 5, 38–46, 1991. a

Sun, X., Wang, G., and Rui, Y.: Extracting high-frequency anomaly information from fluid observational data: a case study of the Wenchuan Ms 8.0 earthquake of 2008, Chinese J. Geophys.-Ch. 59, 1673–1684, 2016. a

Sura, P. and Sardeshmukh, P. D.: A Global view of non-Gaussian SST variability, J. Phys. Oceanogr., 38, 639–647, https://doi.org/10.1175/2007JPO3761.1, 2012. a

Thatcher, W. and Matsuda, T.: Quetarnary and geodetically measured crustal movements in the Tokai District, Central Honshu, Japan, J. Geophys. Res.-Sol. Ea., 86, 9237–9247, 1981. a

Yue, L., Zhuo, N., and Tian, Y.: Arrival-time picking method based on approximate negentropy for microseismic data, J. Appl. Geophys., 152, 100–109, 2018. a

Zhu, K. G., Chi, C. Q., Yu, Z. N., Xuan, F. M., and Li, K. Y.: Extracting borehole strain precursors associated with the Lushan earthquake through principal component analysis, Ann. Geophys., 61, 1593–5213, 2018. a