Nonlinear Processes in Geophysics Comparison of correlation analysis techniques for irregularly sampled time series

Geoscientific measurements often provide time series with irregular time sampling, requiring either data reconstruction (interpolation) or sophisticated methods to handle irregular sampling. We compare the linear interpolation technique and different approaches for analyzing the correlation functions and persistence of irregularly sampled time series, as Lomb-Scargle Fourier transformation and kernelbased methods. In a thorough benchmark test we investigate the performance of these techniques. All methods have comparable root mean square errors (RMSEs) for low skewness of the inter-observation time distribution. For high skewness, very irregular data, interpolation bias and RMSE increase strongly. We find a 40 % lower RMSE for the lag-1 autocorrelation function (ACF) for the Gaussian kernel method vs. the linear interpolation scheme,in the analysis of highly irregular time series. For the cross correlation function (CCF) the RMSE is then lower by 60 %. The application of the Lomb-Scargle technique gave results comparable to the kernel methods for the univariate, but poorer results in the bivariate case. Especially the highfrequency components of the signal, where classical methods show a strong bias in ACF and CCF magnitude, are preserved when using the kernel methods. We illustrate the performances of interpolation vs. Gaussian kernel method by applying both to paleo-data from four locations, reflecting late Holocene Asian monsoon variability as derived from speleothemδ18O measurements. Cross correlation results are similar for both methods, which we Correspondence to: K. Rehfeld (rehfeld@pik-potsdam.de) attribute to the long time scales of the common variability. The persistence time (memory) is strongly overestimated when using the standard, interpolation-based, approach. Hence, the Gaussian kernel is a reliable and more robust estimator with significant advantages compared to other techniques and suitable for large scale application to paleodata.


Introduction
Paleoclimate proxy data sample past regional and global climate variation. Through their analysis we can attempt to understand past environmental conditions and changes. In order to separate local from global effects, measures of association like linear correlation and cross spectral density estimation are traditionally employed to analyze these records. A crucial problem with these records it their irregular sampling in time due to the complex sedimentation/ accumulation rate. However, standard methods can not be applied when timescales and resolutions are different. This is not a problem in the geosciences only, as irregular observation of continuous-time processes also occurs in the detection of biomedical rhythms (Schimmel, 2001), astronomy (Edelson and Krolik, 1988;Scargle, 1981Scargle, , 1982Scargle, , 1989 or turbulence research, where the velocity of the flow can only be measured if seeding particles pass a measurement volume (Broersen et al., 2000;Harteveld et al., 2005). When our aim is to reconstruct the linear auto-or mutual dependencies of the underlying processes from the observations, we can estimate either (cross-) power spectra or correlation functions, as both are related to each other by the Fourier transform (Chatfield, Published by Copernicus Publications on behalf of the European Geosciences Union and the American Geophysical Union. 390 K. Rehfeld et al.: Comparison of correlation analysis techniques for irregularly sampled time series 2004). The irregular sampling of the time series makes direct use of the standard estimation techniques of association measures impossible, as they rely on regular observation times. For (cross-) power spectral density estimation, standard linear interpolation of these irregular observations onto a regular sampling causes an additional bias towards low frequencies in power spectral density (PSD) estimation (Schulz and Stattegger, 1997).
Historically, there are several approaches to overcome this problem. The concepts can be classified into four categories: (a) direct transform methods, (b) slotting techniques, (c) model-based estimators, and (d) time series reconstruction methods (Broersen et al., 2000).
The Lomb-Scargle (LS) periodogram, introduced for use in astronomy (Scargle, 1981(Scargle, , 1982, is a well-known direct transform method that computes a least squares fit of sine curves to the data. The obtained least squares spectrum detects peaks at high frequencies but turned out to be severely biased for turbulence spectra (Broersen et al., 2000) which do not possess periodic components. If the underlying assumption of least squares optimization, that the noise in the data is normally distributed, is fulfilled, then LS is equivalent to the Maximum-Likelihood estimate. Like all least squares techniques, the estimator is not robust in the presence of outliers. This is illustrated by the limitations of the method in the application to bimodal rhythms and signals with isolated outliers (Schimmel, 2001).
Standard slotting techniques determine the correlation function by binning all available products in the lag domain, so that observations only contribute to the correlation function at a lag if their observation time difference deviates less than half the lag bin width from the considered lag. This technique was proposed by Mayo in 1978 and further elaborated by Edelson and Krolik (1988). It has become popular in velocimetry (Broersen et al., 2000) and is frequently applied in astronomy (Böttcher and Dermer, 2010;Fan et al., 2010;Nieppola et al., 2009;Zhang et al., 2010). The disadvantage of this technique is that, without post-processing, the correlation function estimates are not necessarily positive semidefinite and the spectra computed from their Fourier transform can show negative power. Stoica et al. (2008), therefore, proposed a weighting technique for autocorrelation estimation which weighed observations based on a sinc kernel and claimed that it yielded positive semidefinite results. In their review,  also showed the application of other kernels in the time domain, including Laplacian and Gaussian kernels. The distribution of sampling time errors in time series reconstruction from paleo-archives is often assumed to be Gaussian, which, we believe, intuitively supports its use in time domain analysis.  proposed two techniques to estimate the correlation coefficient that he terms "binned correlation" and "synchrony correlation". "Synchrony correlation" consists of using the percentage of pairs of observations in the different time series that have the smallest measurement time difference, treat them as if they were observed coevally and calculate the correlation coefficient. "Binned correlation" essentially resamples the data into time bins on a regular grid that are assigned the mean values of the observations within these bins. Using these regular, reconstructed time series, the standard correlation estimator can be applied. We do not employ these two techniques because both do not utilize all available observations individually, which means loss of information. Also, since the standard estimator is used for calculation of the correlation coefficient, binning -or resampling -is problematic when data gaps are present and we want to estimate the correlation function.
Model-based estimators fit a model to the time series, the spectra or the ACF, which requires prior knowledge about the actual process (cf. Harteveld et al., 2005 and references therein), a prerequisite we typically cannot meet due to the heterogeneity and complexity of geophysical processes.
The fourth group of estimators resamples the data (through some kind of interpolation) in order to create time series on a regularly spaced grid, which then can be analyzed using the standard FFT-based estimators. The most frequently used technique in geophysical time series analysis is linear interpolation. Paleo data often has rather large data gaps and it is controversial if, when and how missing observations can be appropriately approximated. For standard interpolation (e.g. linear, akima-spline and cubic-spline) a significant reduction in variance toward the high-frequency range of the estimated power spectrum occurs in the analysis of irregularly sampled data (Schulz and Stattegger, 1997). When we are interested in phenomena on short timescales (compared to the mean sampling interval), such effects should be considered, and if possible, avoided.
Without objective performance tests of these estimators, application of specific methods is a matter of taste, but the chosen routine may not be the optimal method available. Therefore benchmark tests comparing various methods are crucial. One study, conducted for the estimation of power spectral density from flow velocimetry data in an engineering background, has been performed by Benedict et al. (1998). The test cases exhibited flat or simple exponentially decreasing spectra or contained a single deterministic sinusoidal component. They are therefore not nearly as complex as spectra in geophysical time series analysis typically are. Furthermore, they used a Poisson sampling scheme, which is reasonable in measurements with detector dead time, but less justified for paleo records.
In this paper we first review the methods that are or could reasonably be applied in the estimation of correlation functions of geophysical time series. This encompasses the standard approach, re-sampling by means of (linear) interpolation followed by a FFT-based routine, the LS periodogram, the slotting technique and kernel-weighted estimators.
We then -for the first time to our knowledge -compare and evaluate systematically the performance of methods suitable for estimating correlation functions of geophysical time series under the presence of varying sampling schemes, and we specifically quantify the extent and direction of estimator variance and bias due to sampling irregularity. We do this using a newly developed testing scheme, based on simulated time series with increasing inter-sampling time irregularity but constant mean sampling rate. In a last step we apply the methods to real proxy data from the Asian summer monsoon region, we evaluate the consistency of the results with respect to the synthetic tests and validate our ACF results further by the application of an independent least squares estimator for the persistence time of autoregressive processes of order 1 (AR(1)) (Mudelsee, 2002).

Methods
Assuming that two time series x t and y t were observed from stationary stochastic processes at unit time intervals, their sample CCFρ(k) gives an estimate of the strength of a possible linear association between the processes behind the observations at each possible lag number k. It is defined aŝ Here,γ xy (k) is the sample cross-covariance at lag k, N is the number of observations,σ x ,σ y the sample standard deviations of the processes andx,ȳ are the estimated mean values of the time series (Chatfield, 2004). The spacing of the CCF lags, τ , equals -in this standard definition -that of the time series x t and y t , τ = t x,y The discrete Fourier transform of the sample CCF is the sample cross spectral density function or cross spectrum and vice versa. The power spectrum can thus be estimated in two ways, either by computing the discrete Fourier transforms of the input time series and multiplying them after complex-conjugating one of them, or by estimating the CCF and Fourier transforming it (cf. Chatfield, 2004 for more details). We denote all estimators in the definitions in their respective sections byρ, for the sake of simplicity.

The resampling approach for irregular time series
In the case of irregularly sampled time series, the classical definition, as illustrated in Fig. 1a, can not be readily applied. An irregularly spaced time series is a pair (t x ,x) of tuples of common length N x , where t x 1 < t x 2 < ... < t x N x are the time points and x i is the value at time t x i . For simplicity we have transformed the time variable to get a normalized mean increment of 1 by dividing by the mean sampling period: t x i = t orig i / t x and we will use this notation in the following. The differences between observation times t x i = t x i − t x i−1 are not any more constant and the mean of their distribution is the mean sampling time t x . When we K. Rehfeld et al.  shows the classical estimator, where the correlationρ xy (k) is given by a mean over products of zero-mean observations a lag k apart. (B) For irregularly sampled time series, the slotted estimator computesρ xy (k) as the mean over all products in bins whose centers are a lag k apart. (C) Non-rectangular correlation uses the weighted mean over all available products with the weight maxima a lag k apart.

Method
Assuming th stationary st sample CCF sible linear a servations at Here,γ xy is the numb deviations o values of th the CCF lag of the time s The discr sample cross vice versa. two ways, e forms of th complex-con and Fourier details). We respective se

The re
In the case definition, a plied. An ir tuples of com the time poi ity we have ized mean i pling period in the follow ∆t x i = t x i −t of their distr we consider second-orde to be resamp constant tim n = 1,2,...N of the mean We restric polation tech are not much high-frequen 1997). A re duction in v the function the value of shows the classical estimator, where the correlationρ xy (k) is given by a mean over products of zero-mean observations a lag k apart. (B) For irregularly sampled time series, the slotted estimator computesρ xy (k) as the mean over all products in bins whose centers are a lag k apart. (C) Non-rectangular correlation uses the weighted mean over all available products with the weight maxima a lag k apart. consider irregularly sampled time series (t x ,x), (t y ,y) of second-order stationary processes with zero mean, these have to be resampled onto a common regular time grid (t x,y ) with constant time increments t x,y (n) − t x,y (n − 1) = t x for all n = 1,2,...N x,y . The grid spacing we will use is the larger of the mean sampling intervals of the time series.
We restrict ourselves in this analysis to the linear interpolation technique, as the effects of other standard routines are not much different in their variance reduction towards the high-frequency end of the spectrum (Schulz and Stattegger, 1997). A resampling method which does not result in a reduction in variance is the nearest neighbor technique, where the function is approximated at the desired grid points by the value of the observation closest in time. This leads to a shifting bias (Broersen, 2009) which, in the presence of large gaps in the data, can be rather large. We therefore do not employ this scheme. After resampling, the standard FFT-based routines can be employed.

Lomb-Scargle approach
The Lomb-Scargle approach to the spectral estimation of irregularly sampled data can be understood as a least squares fitting of sinusoids to data (Scargle, 1981). The Lomb-Scargle Fourier transform (LSFT) uses the explicit observation timest to ensure time invariance of the LSFT (Schulz and Stattegger, 1997). The coefficient F 0 allows for a time shift in the alignment of the two time series in bivariate spectral analysis. The amplitudes A and B are defined as In the univariate case, the well-known Lomb-Scargle periodogram is then given bŷ The (bivariate) cross spectrum can be estimated aŝ which can be inverted, using the Fourier transform (Scargle, 1989), to get the cross correlation coefficient estimatê The squared absolute value of the LSFT gives the widely known and used LS periodogram (Schulz and Stattegger, 1997). The choice of the frequencies ω is described in Scargle (1989) and we adopt the recommended values for the fundamental frequency ω 0 = ω min = π(N xy −1) (t max −t min )N xy and maximum frequency ω max = 2π t xy . In the bivariate case we define the observation times t min and t max as the lower and upper bounds of the overlapping part of both time series x t and y t , otherwise, in the univariate case, minimum and maximum observation time are used. t xy = max( t x , t y ) is the common sampling rate we define in the bivariate case. The number of frequencies N f = ofac·N xy determines the spacing of the frequency vector. According to Hocke and Kämpfer (2009) there is no principal limit, the oversampling factor ofac > 1 is regarded as a smoothing factor, although the number of independent frequencies is constant. We use ofac = 2, unless otherwise stated.
A thorough introduction to bivariate Lomb-Scargle spectral estimation was given by Schulz and Stattegger (1997). The use of the technique for correlation function estimation, however, has not yet been explored, though it was already proposed in Scargle (1989).

Correlation slotting
The sample correlation functionρ xy (k) at a lag k is calculated by averaging over the lagged products of the standardized observations. For irregular time series the inter-sampling times vary, and without resampling Eq. (1) cannot be applied.
An alternative is the slotting or Edelson and Krolik technique (Edelson and Krolik, 1988;Mayo, 1993). Its key idea is to calculate the cross-products of all available, standardized, observations and discretize them into bins according to their sampling time differences as can be seen in Fig. 1b. The technique was developed in fluid mechanics and applied in astrophysics.ρ(k τ ) at the lag k τ is then defined aŝ and the kernel b k (t y j −t x i ) selects the products whose time lag is not further than half the bin width from k τ : Note that the observations have to be standardized to zero mean and unit variance before the analysis. We set the lag bin width τ to be equal to t xy , and since we divide the observation times by this mean sampling interval, we can omit it in the formulae above, for easier readability (cf. Sect. 2.1). We do not choose this width arbitrarily but rather in the context of the desired time resolution of the CCF, more on this in Sect. 2.4.
There are, however, several disadvantages of this technique, primarily a high variance of the estimator Benedict et al., 2000;Harteveld et al., 2005) Fig. 2. Kernel-based estimators effectively "use" observations whose inter-sampling time difference is close to the lag for which linear correlation is estimated. Slotting (the rectangular kernel) chooses observations within an interval, Gaussian and sinc kernel weigh the products smoothly according to the difference between observation interval and desired lag. Kernels were scaled to the standard choice for width parameter h (cf. due to which we will not use this method in the following, but rather apply related, non-rectangular kernels. It also does not always provide positive semidefinite covariance matrix estimates, a problem which can be overcome by "fourier filtering". We discuss this further in Sect. 2.5.

Non-rectangular kernels
In analogy to the slotting approach, and taking it further, weighted averaging of the observations can be performed using symmetric, smooth density functions that tend to zero for time differences much larger or smaller than the desired lag k (Hall et al., 1994). The similarity is illustrated in Fig. 1c. These requirements are for example met by the sinc kernel (Stoica and Sandgren, 2006) but also the Gaussian kernel (cf. Table 1) as can be seen in Fig. 2. Instead of binning the observations into discrete sets, the weights prevent a sudden cutoff in the time domain.
There is no theoretical definition of the effective width of the weight functions. We decide to scale them to a kernel width of the mean sampling rate for two reasons. (i) This choice ensures that -for non-rectangular kernels -observations at (near-)regular times are rated higher than those that are further away, but are still included in cases where little information is available. (ii) In a trade-off between the loss of resolution and control of estimator variance, the desired resolution of the correlation function also plays a role, as a kernel width choice larger than the lag spacing would result in mixing information for adjacent lags. The width parameters for the kernels and their relation to the mean sampling rate were confirmed as empirical optima in case of irregular Table 1. Kernels b(d) used in this paper. d denotes the distance between the inter-observation time t xy ij and k τ , k denotes the k-th lag. The standard width parameter h is chosen to result in a main lobe width of t xy , the mean sampling interval or common sampling period in the bivariate case.

Standard choice for h
Rectangle; Edelson and Krolik (1988) Sinc; Stoica and Sandgren (2006) 1 N time series (cf. Fig. 3). Other parameter choices might, however, also be sensible, depending on the nature of the time series and the statistic to be estimated.

Positive semidefiniteness of the estimated function
In connection with the slotting-based covariance estimation, the issue with the possible lack of positive semidefiniteness of the correlation estimates has been discussed in Broersen (2002), Harteveld et al. (2005) and Stoica and Sandgren (2006). By Bochner's theorem, positive semidefiniteness of the correlation function is necessary and sufficient to ensure non-negativity of the Fourier transform estimate ofρ(t).
for all integrable functions w, and only if this holds trueρ(k) is a possible correlation function. For discrete, short, and regularly sampled time series, using Eq. (10) and a simple, integrable function for w, we can find this condition violated for all kernel methods. This problem can, amongst others, be solved by a technique called "Fourier filtering", which involves Fourier-transforming the correlation function estimate, setting any negative power estimates to zero and applying an inverse FFT afterwards to obtain a positive semidefinite correlation function estimate Hall et al., 1994). Another routine could involve using the absolute value of the power spectrum, instead of setting negative estimates to zero. Also, positive semidefinite matrices have non-negative eigenvalues, which is another means to test this property, and the same modifications as for the power spectra could be applied here. It should be kept in mind, however, that, due to numerical problems, even the "unbiased" 1/(N − 1) correlation estimator can result in negative power estimates. When the positive semidefiniteness of the correlation matrix is essential, Fourier filtering should be performed and/or the eigenvalues of the matrix should be checked.

Quality of performance measures
Our aim is to evaluate which of the approaches listed above yields the best results for the estimation of ACFs and CCFs for geophysical time series. The performance of the estimators can be evaluated with respect to the "true" expected functions. This can of course only be done for modeled or synthetic time series where we can calculate ACFs and CCFs exactly.
To evaluate the different estimators we calculate the root mean square error (RMSE) of the estimatorθ for a statistic θ. θ can be e.g. the cross correlation function at lag k, ρ xy (k). The RMSE is given by and incorporates both variance and bias of the estimator, i.e. its variability and its systematic deviation from the true value. To estimate the RMSE we generate a large number of time series of a given signal type and sampling scheme and compute the "target statistic"θ for each. The deviation between the mean of these many estimates and the 'true' function is the approximate bias of the estimator, together with the variance around this mean we can estimate the RMSE.
To evaluate the contribution of the sampling irregularity to the estimation error, we perform the analysis for different sampling schemes, first for regular sampling and then for more and more irregular sampling. This we do by drawing inter-sampling-time intervals from the Gamma-distribution and concatenating them into a time line for which we then generate a corresponding signal. Given the shape parameter α and the scale parameter β, the mean µ of the (α,β)distribution is given by µ = αβ, the variance by σ 2 = αβ 2 and the skewness by sk = 2/ √ α. For low skewness (in our case the lowest value was 0.1) the distribution is close to normal (cf. Fig. 7b). Since the higher order moments depend only on the shape parameter α, we can vary the scale parameter β in a way to keep the mean constant while increasing skewness and variance. We will only give the skewness parameter in the following, as the variance σ 2 = (2β/sk) 2 = (2µ/(sk α)) 2 is uniquely determined in our parameter configuration. A distribution with a skewness of 2.85 (Fig. 7b) results in a time series with large gaps, as large values become more likely in more and more skewed sampling interval distributions.

Comparison for synthetic records
To assess the adaptability and suitability of the different estimators, we perform a number of tests on artificially generated discrete signals for which we know the "true" ACFs and/or CCFs of the underlying processes. For each signal type we first estimate the RMSE in the case of regular sampling. Then we create time series with -distributed interobservation times with increasing skewness. Since the time vectors are artificial, they do not need to have an actual unit, but we assume that time is measured in years.

Sinusoids with random phase
Using techniques that are not (yet) fully established, our first concern is to make sure that the results for the standard, regularly sampled case are consistent with those from the standard estimators. Therefore we sample a simple signal, a superposition of three sinusoids: with ω i = 2π T i , T i = (18,21,41) yr at a regular rate of 1/4 years. The phase variable i,n is randomly drawn from a uniform distribution on (0,2π ), making this a sample from a stationary stochastic process. The true ACF is then a superposition of cosine functions ρ xx = 1/2 3 i=1 cos(ω i ), irrespective of the relative phases of the signal components. The length of the simulated time series is 1000 yr and we evaluate the function for 200 lags. Sample time series, mean ACF and power spectral density (PSD) of the mean ACF are depicted in Fig. 4. The kernel estimators, the LS periodogram as well as the "classical" method perform comparably well with a RMSE below 2 % (see Fig. 5, left columns) in the regularly sampled case.
We now use irregularly sampled observation times and perform a stepwise increase in sampling distribution  Fig. 4. Autocorrelation analysis of synthetic signals: for a regularly sampled combination of sinusoids (cf. Eq. 14) we give a sample time series (a), the sampling interval probability density (b), the expected correlation function (c) and the corresponding power spectrum (d) determined from 100 realizations of sinusoid time series with random phase arguments. Legends for each row are given in the right panels. All estimators perform equally well.

Fig. 5.
Mean RMSE for the ACF estimation (lags 1-3) using linear interpolation, Gaussian or sinc kernel or the inversion of the Lomb-Scargle periodogram of noise-free sinusoids given for regular, gamma-distributed and mildly irregular (skewness sk = 0.1) resp. very irregular (sk = 2.85) sampling. Errorbars give the standard deviation of the estimate, calculated using 1000 bootstrap iterations. skewness (as described in Sect. 2.6). For skewness sk = 0.1 the RMSEs are only slightly higher (Fig. 5, middle columns), but for a skewness sk = 2.85 the RMSE is as high as 40 % for interpolation and 35 % for the LS method. The estimated RMSE for the Gaussian kernel method, is rather small compared to that, with an approximate 12 %, lower than that of the sinc kernel method (23.5 %). We have increased the skewness in steps of 0.25 from sk = 0.1 to sk = 2.85 and note that the RMSE of the ACF seems to be increasing almost linearly for all the methods. For the LS estimate it jumps in the beginning, from 5 % to ≈ 20 %, and continues to increase at a rate of 9 % per unit skewness, with the breakpoint occurring at a skewness of 0.35. The RMSE of the interpolation followed by the FFT-based estimator (denoted "linint" in the figure legends) increases at a faster overall rate than all the other methods (6.5 % per unit skewness). The Gaussian kernel method has the lowest RMSE at high skewness and the lowest increase with respect to the estimate for regular sampling.
To investigate the reason for the differences between the methods further, we evaluate the RMSE of the power spectra obtained from the Fourier-transformed ACFs at the highest  Fig. 6. Autocorrelation analysis of synthetic signals: for an irregularly sampled combination of sinusoids (cf. Eq. 14) we give a sample time series (a), the sampling interval probability density (b), the expected correlation function (c) and the corresponding power spectrum (d) determined from 100 realizations of sinusoid time series with random phase arguments. Legends for each row are given in the right panels. High sampling irregularity leads to a variance reduction in the ACF for LS and interpolation. input signal frequency ω = 2π/18 (c.f. Figs. 4d and 6d). We find, that with increasing skewness, the RMSE of this peak increases from around 3 % to 10 % for interpolation and the LS correlation function estimate, while for sinc and Gaussian kernel it goes from < 1 % to approximately 2 %. Estimating the bias of this peak, we observe that the comparatively high RMSE for interpolation and LS method corresponds to a negative bias increasing linearly from 5 % to > 50 % with respect to the expected peak power at the high-frequency component. In contrast to that, the bias is nearly constant for the kernel methods, the slight increase in RMSE must therefore be due to an increase in variance. This lack of power in the high frequency component of the estimated spectrum is accompanied by a positive bias for the lowest frequency component ω = 2π/100 (results not shown).

Autoregressive processes
To understand the quantitative and qualitative effect of the different estimation techniques on the short-term correlative properties (e.g. the persistence time, the lag at which the ACF has dropped to t/e), we use AR(1) processes generated at high time resolution and then re-sample the observations onto the desired irregular sampling times. We perform the same simulations as before, first evaluating for regular sampling and then, for gamma-distributed inter-sampling intervals, where we subsequently increase the skewness of the interval distribution. The driving process is given by and for bivariate correlation analysis we sample a second process driven by the first at lag : ξ and ε are uncorrelated Gaussian distributed noise processes with a variance σ 2 ξ , σ 2 ε such that the overall process variances σ 2 x = σ 2 ξ /(1 − φ 2 ) and σ 2 y = σ 2 ε + (α 2 σ 2 x ). We choose the AR(1) coefficient as φ = 0.7, corresponding to a persistence time τ = − t/lnφ, the coupling strength α = 0.5, coupling lag = 1 and generate our time series (e.g. Fig. 7a) following the different sampling schemes (e.g. Fig. 7b). Then we set out to estimate φ and α from the time series. In the estimation of the AR(1) coefficient φ, the RMSE for interpolation increases from 2 % to 17 % and the error for the sinc-kernel increases from 6 % to 13.5 %. The LS technique results in the largest increase for high skewness with a RMSE of 52 %. The Gaussian kernel method remains more accurate with an increase from 2 % to 8.5 %.
The coupling strength α is the true value of the CCF at the coupling lag . A typical application in the geoscience context is the estimation of the degree of similarity for time series from different sources, with different sampling properties. Analyzing two time series of inter-sampling time distribution skewnesses sk x =0.1 sk y =2.85, we find that the CCF estimation at lag = −1 has a negative bias for all techniques. The bias of the LS technique is strongly negative, underestimating the true correlation by more than 65 %. Linear interpolation results in a 30 % lower estimated coupling strength, the sinc kernel method in 15 % and the Gaussian kernel estimate is negatively biased by 8 % with respect to the "true" coupling strength of 0.5 (Fig. 7e).
Looking at the performance under the increasing sampling time distribution skewness of time series y t (keeping sk x constant at 0.1), we find that the RMSE of the estimated α Fig. 8. Mean RMSE for the ACF estimation (lag 1) using linear interpolation, Gaussian or sinc kernel or the inversion of the LS Periodogram of time series from AR(1) processes (cf. Eqs. 15), given for regular, gamma-distributed and mildly irregular (skewness sk = 0.1) resp. very irregular (sk = 2.85) sampling. Errorbars give the standard deviation of the estimate, calculated using 1000 bootstrap iterations. increases for all methods, but least for the Gaussian kernel (Fig. 9).

Sinusoids with random phase in colored noise
For irregular time series, the effect of interpolation on the ACF estimation of noise-free sinusoids is that it seems to suppress high-frequency variability. For red-noise signals we find that it, similarly, leads to an overestimation of autocorrelation. To generate more "realistic" signals, we synthesize the above-mentioned sinusoidal signals (Eq. 14) with varying amounts of additive red (AR) noise: The sinusoidal components vary with the frequencies ω i = 2π T i , T i = [18,21,41] years. The time vector t is concatenated into a time line from random variables drawn from a gammadistribution with µ = 4 and sk = 0.1. The phase variable i,n is, for each realization n, randomly drawn from a uniform distribution on (0,2π). This makes the time series samples from stationary stochastic processes. ζ i represents a red noise process (cf. Eq. 15) whose variance s we vary in the range [0,1]. The persistence time τ is, for this intercomparison, fixed at τ = 4 (corresponding to φ ≈ 0.78). Since we adjust the overall variance of the process to equal unity, the signalto-noise ratio varies in proportion with s. The "true" ACF is then given by Varying s and using irregular time series (sk = 2) we find that the mean RMSE ofρ x (1) estimated for the Gaussian kernel method increases slightly from 5 % for sinusoidal signals (s = 0, cf. Fig. 10), to 7 % for pure red noise (s = 1). At the same time, the RMSE for the interpolation-based routine rises from 10 % to 15 %, that for the LS-technique decreases from 27 % to 19 %. The sinc kernel performs similar to the interpolation routine for sinusoidal signals with up to 30 % of noise, but has a higher RMSE for noise-dominated signals. For irregular time series with low inter-sampling-time distribution skewness (sk = 0.1) we find that the RMSE is maximal for medium signal-to-noise ratios, i.e. it is lower for purely deterministic and purely random time series than for the mixture of both (results not shown). For mostly deterministic time series, s 0.5, the LS technique has then the highest RMSE, while sinc and Gaussian kernel-based methods give more accurate results. For dominant red noise s 0.5, the LS technique gives good results with low RMSE, where at the same time the performance of the sinc kernel deteriorates. The interpolation-based FFT-routine is not the best choice for irregular time series, irrespective of the signal-tonoise ratios of the processes generating the time series. The increased RMSE for interpolation observable for the ACF estimates is due to a positive bias for ρ x (1). The RMSE of the kernel-based methods is lower and the ACF bias is constant and negligible. The high-frequency variability is systematically underestimated when using interpolation. The higher the persistence time τ in the AR(1) component, the lower are the advantages of the Gaussian-kernel based estimator, since the high-frequency variability in the signal is lower.

Summary of the synthetic tests
In all tests we performed in this section, we find that linear interpolation comes with two systematic effects. Firstly, it has a positive bias for ACF estimation and secondly, it has a negative bias in CCF estimation. Both effects become more severe with increasing sampling time distribution skewness. The LS technique performed well for the ACF estimation of slightly irregular autocorrelated time series but not for sinusoids. We find the opposite pattern for the sinc kernel: its RMSEs are low in the application to sinusoidal data -but high for the ACF of autocorrelated noise processes. The Gaussian kernel estimates are consistent and have the, or close to the, lowest RMSEs in all tests. Therefore we recommend the use of the Gaussian kernel-based estimator instead of -or in addition to -the standard interpolation routine for irregular time series with positive inter-sampling time distribution skewness, and especially in the presence of observation gaps.

Comparison for paleo data
We will now apply the Gaussian kernel estimator and interpolation followed by the standard FFT-routine to paleo records from the Asian Monsoon domain, to evaluate possible differences between the CCF/ACF estimates of these datasets, depending on the analysis technique. The Asian monsoon system (cf. Fig. 11) affects a large share of today's world population. Zhang et al. (2008) find its strength in the past 1800 yr to be correlated with agricultural and cultural prosperity, its weakening with periods of unrest and instability. It can be divided into the Indian and the East Asian monsoon subsystems (ISM and EASM), that transport moisture from different sources. Oxygen isotope ratios (δ 18 O) from cave records have been used to study the Holocene variability of monsoonal precipitation over China and India. While most of them show a millennial-scale trend, believed to be linked with the decreasing solar irradiation through the Holocene (Maher, 2008;Wang et al., 2005), sources for variability on shorter time scales are debated (Berkelhammer et al., 2010). In an inter-comparison of four published, acclaimed records of monsoonal precipitation from four different geographical locations we want to investigate the spatial and temporal consistency of linear dependencies among these time series. Cross-correlation analysis of monsoon records could give clues to the interrelationships between the different monsoon branches and their development with time. Autocorrelation analysis can, amongst other methods, give insights into the persistence inherent to the time series and is believed to increase before certain dynamical transitions (Scheffer et al., 2009). Persistence time (cf. Eq. 15) is a characteristic parameter for the time scales on which these climate processes operate.  For the late Holocene time span of 387-1100 BP, we estimate cross correlation and persistence time of four speleothem δ 18 O records (cf. Fig. 11), reconstructed from Dongge cave in southern China (Wang et al., 2005), Heshang cave in central China (Hu et al., 2008), Wanxiang cave in north-central China (Zhang et al., 2008) and from Dandak cave in southern India (Berkelhammer et al., 2010). The sample locations lie in different branches of the Asian monsoon and therefore enable us to assess spatial variability of the monsoon system. The data sets have quite different intersampling time distributions, with rather high time resolution (0.5a-3.9a) and considerable time uncertainties. The details of the overlapping part of the records, which we will use, are given in table 2. For all four records, δ 18 O variations are interpreted as mainly dominated by precipitation amount changes, thus reflecting summer monsoon strength (Wang et al., 2005;Hu et al., 2008;Zhang et al., 2008;Berkelhammer et al., 2010).
Prior to correlation analysis we subtract (nonlinear) trends from the records, that we estimate using a 500a wide Gaussian kernel smoother (high-pass filter), adapted for irregular sampling. For the standard approach of CCF (ACF) estimation the time series are then interpolated linearly to a regular grid with spacing of the larger of the mean sampling intervals (a spacing equalling the mean sampling period) of the respective two time series involved. This means that in case of the CCF comparison of Dandak and Wanxiang records, this CCF has a lag resolution of 3.31a, in case we compare the Wanxiang to the Dongge record the resolution is at 3.92 a.

Results from ACF analysis
First we look at the individual ACFs (e.g. Fig. 12c and d) and find that the Gaussian estimate shows a much stronger initial decline than that resulting from interpolation. To investigate whether this more pronounced decline, this lower persistence time τ (cf. Eq. 15), is due to a negative bias of the kernel method or to a positive bias of the interpolation we perform the additional least squares analysis (LSq). The estimator, implemented similar to that in Mudelsee (2002), fits a simplified Ohrnstein-Uhlenbeck process, a continuous-time AR(1) analog, to the time series. Its estimates are robust with respect to variations in sampling rates, irregularity and persistence time and show a small, but constant, bias (−10 %) and variance. We compare four results: from interpolation, followed by ACF estimation involving the FFT; from interpolation, followed by the LSq estimation; from the Gaussian kernel ACF estimate and from LSq analysis of the original record (cf. Fig. 13). We find a pronounced overestimation, up to a factor of two, when interpolation is involved. This is irrespective of whether τ was estimated via ACF or the LSq fit. The Gaussian kernel estimate is generally lower than that of LSq analysis, but differs by not much, except in the estimate for the Heshang cave record where it is 50 % lower. We could relate this to the differences in the respective sampling time distributions: The Heshang sampling time distribution shows high skewness and a large mean sampling period, both combining into a source of estimation error. Neither high skewness (Dandak) nor a lower sampling rate (Dongge) alone lead to such a deviation between the LSq and Gaussian kernel estimates, which is in agreement with the results from Sect. 3. It follows from this, that interpolation causes a strongly positive bias on persistence and the kernel-based estimate is slightly negatively biased. Thus, if in an analysis the two estimates coincide we could assume the result to be unbiased. On the other hand, we should exercise caution when the results from different methods disagree. Persistence times give a measure of memory in processes and are thus important to characterize time scales on which climate processes operate. As we see in this section, interpolation leads to a strong overestimation of persistence for irregular time series, with a bias changing also in relation to the skewness of the observation time distribution. Caution should therefore be exercised and additional methods employed when performing autocorrelation analysis of irregular time series.

Results from cross correlation analysis
Next, pairwise cross correlation functions were calculated for all four records. Only two combinations resulted in significant correlation at zero lag (Fig. 14a). The correlation coefficient of 0.29 (−0.17,0.21) for interpolation resp. 0.295 (−0.19,0.27) for the Gaussian kernel at lag zero between the Wanxiang and Dandak records is significant to the 95 % level in the two-sided test for zero correlation under the null hypothesis of the time series being sampled from autocorrelated red noise processes. In the brackets we give the estimated critical values of the test that were determined using AR(1) processes (with persistence times based on the LSq estimate) on the original time axis of the records.
The late Holocene section (387-1325 BP) of the record from Wanxiang cave correlates also significantly with that from Heshang cave with a lag zero correlation coefficient of 0.28 (−0.2, 0.23) based on interpolation and 0.28 (−0.2,0.19) from the kernel estimator. We find that the highfrequency variability of the estimated correlation function is more pronounced in the kernel estimate. However, the overall shapes of the functions agree well.
The lack of significant correlation between the other records could have several reasons. One may be that our estimators did fail to capture the "true" underlying monsoon variability common to all records. This is not unlikely, since there are time uncertainties and local influences to be taken into account, especially when analyzing records that are spaced so far apart and reconstructed over such a time span. It may well be that the strongest commonality between the records are trends on centennial to millennial time scales that cannot be reconstructed from a less than 1000 yr long overlapping section and that possible links operating on shorter time scales were obscured in the generation process. On the other hand, our time section includes the Northern hemispheric Medieval Warm Period (MWP, ca. 700 BP-1000 BP) and parts of the Little Ice Age (LIA, ca. 100 BP-400 BP), periods where monsoonal circulation seemed to be stronger (MWP) or weaker (LIA) according to Zhang et al. (2008).  The Asian summer monsoon is a large-scale atmospheric circulation phenomenon. During northern hemisphere cold phases, less energy available for its generation might have led to a weakening of the monsoon. In contrast, warm phases should have led to a strong circulation which results in an increased influence of the ISM on Chinese precipitation. This should be observable in an increased correlation between Indian and Chinese rainfall variation and, at the same time, an increased correlation between the δ 18 O records.
We therefore analyze two time slices (389 BP-700 BP and 700 BP-1100 BP) of the records separately. After significance testing -and considering lags of 0 to 30 yr absolute value -, we find a contrasting picture: while the Northern Chinese records correlate with the Indian Dandak record during the warm phase (MWP), this correlation is insignificant in the colder phase (towards the LIA) that followed (cf. Fig. 14c). On the other hand, while the southern Chinese Dongge record correlates with the more northern records from Heshang and Wanxiang caves during the LIA, this correlation is not significant during the MWP.
This points us towards a more differentiated interpretation of these correlations, emphasizing the geographical origins of these cave records. According to the "isotopic zones" shown in Maher (2008), Feng et al. (1999 and references therein, Wanxiang cave is located in a zone that is, at present, dominated by the ISM. Heshang and Dongge cave lie in an varying sampling irregularity. Compared to this, interpolation leads to an overestimation of this persistence time by a factor of two. This difference is especially unnerving as the frequency of observations recorded through paleo archives varies in dependence on climatic parameters (e.g. lower accumulation rates through less precipitation). A change in the inter-observation time distribution could then lead to an artificial change in the estimated persistence.
In the cross correlation analysis of the paleo records, the kernel-based lag-zero cross correlation functions are consistent for interpolation and cross correlation. We believe that this is due to the short time scales on which the interactions of the monsoon systems are recorded, as the advantages of the kernel-based method are not as pronounced for records with high persistence. The kernel-based cross correlation functions show more high-frequency variability which could be investigated through cross spectral analysis. We do not attempt to characterize it here, since this is outside the scope of this paper. The bias effects from interpolation could cause problems in the evaluation of phenomena emerging on time scales close to the actual mean sampling rate. This is where the kernel methods show significant advantages and especially the Gaussian kernel correlation method can provide high-resolution, robust estimates of time-dependent cross correlation coefficients.
In our cross correlation analysis of four Asian monsoon records in the time interval of 387BP-1100BP, we have found significant evidence that the Indian summer monsoon circulation influenced Chinese rainfall variability during the northern hemispheric MWP, as then the δ 18 O record from Dandak cave in India correlates with the central China records from Wanxiang and Heshang caves. During the colder phase after the MWP and into the LIA, significant cross correlation coefficients are found amongst the Chinese records, indicating a spatially more homogeneous moisture source. At the same time these records do not correlate with the Indian record during the LIA cold phase, pointing to less ISM impact on Chinese precipitation.
To summarize, we have shown that in correlation estimation of irregularly sampled time series, caution should be exercised when these records have an inter-observation time distribution that is strongly skewed. In the CCF estimation we found a strongly negative bias for the standard approach with interpolation for processes with little persistence. The advantages of the kernel-based estimators are higher for coupling on short time scales, compared to the sampling rate. This is especially interesting for the investigation of proxy data with low resolution. Our results indicate that the bias properties of the Gaussian kernel and the interpolation techniques have different signs in ACF estimation, indicating when sampling irregularity causes problems in the analysis. "isotopic zone" where both monsoonal branches are influential. However, Dongge cave lies closer to the southern zone that is, at present, dominated by monsoonal precipitation from the south east (South China Sea) but not from the south-west (ISM). Recent investigations show, that even within southern China, moisture sources and their isotopic signature, differ orthogonally to these "isotopic zones" , pointing at a stronger influence of the South East monsoon in direction of Dongge cave. We conclude that during warm phases our results are consistent with these isotopic zones, since the records from central China correlate with the Indian Dandak record. In the cold phases, the atmospheric circulation might have been different, emphasizing the south east moisture source for allover China, evident through a correlation between the Dongge cave record and the more northern Chinese records and, since we observe no significant correlation with the Dandak record, less ISM impact.
Interpolation and kernel-based estimation give similar results. The CCF estimates at and around lag zero were not -or not significantly -lower for interpolation where a significant correlation was detected. We believe that this is due to the long time scales on which these correlations are recorded, as the advantages of the kernel-based method are larger for low persistence (cf. Sect. 3).

Conclusions
Comparing different methods for analyzing correlations from irregularly sampled time series, we have found that the kernel-based method is robust and has a comparable -and often even lower -RMSE and bias than the traditionally employed schemes using interpolation in the application to synthetic records, for regular and irregular sampling.
For the interpolation and FFT-based routine we find a four to seven times increase in RMSE, predominantly caused by an increase in the absolute value of the bias. This bias is positive for ACF estimation but negative for CCF quantification and its magnitude scales linearly with sampling irregularity.
In all synthetic test cases we studied the Gaussian kernel was close to or was the estimator with the lowest RMSE. Its performance was slightly inferior to that of the sinc kernel for sinusoidal time series but significantly better for red noise ACF and CCF estimation, especially in the application to records with disparate sampling rates.
We find that the sinc-kernel performs well for ACF estimation of sinusoidal signals. It shows, however, alternating bias patterns in the ACF for red noise time series, resulting in a high RMSE comparable to the FFT-based result. This might be due to the shape of the kernel with its positive and negative weights, thus emphasizing regular, deterministically recurrent structures that are not present in stochastic processes. Another reason for the mixed performance could be cutoff effects, since the kernel effectively presents a rectangular filter in the frequency domain.
The performance of the Lomb-Scargle periodogram-based routine showed advantages over interpolation for low skewness time sampling. For very irregular time series, in ACF as in CCF tests, we found a strong sampling effect resulting in a large bias.
In all tests we performed on synthetic data, we have found that linear interpolation comes with two systematic effects: It shows a positive bias for ACF estimation, and it has a negative bias in CCF estimation, emphasizing low-frequency variability at the cost of high-frequency components. Both effects become more severe with increasing sampling time distribution skewness and lower persistence in the processes from which we generate the time series. The Gaussian kernel estimates are consistent with those from interpolation for regular sampling and have the, or close to the, lowest RMSEs in all tests.
The estimated persistence time using the Gaussian kernel shows generally only a small negative bias with respect to the least squares estimate on the original record. The least squares persistence time estimator, which is, fitting an AR(1) process to the observations, has a constant and low bias for varying sampling irregularity. Compared to this, interpolation leads to an overestimation of this persistence time by a factor of two. This difference is especially unnerving as the frequency of observations recorded through paleo archives varies in dependence on climatic parameters (e.g. lower accumulation rates through less precipitation). A change in the inter-observation time distribution could then lead to an artificial change in the estimated persistence.
In the cross correlation analysis of the paleo records, the kernel-based lag-zero cross correlation functions are consistent for interpolation and cross correlation. We believe that this is due to the short time scales on which the interactions of the monsoon systems are recorded, as the advantages of the kernel-based method are not as pronounced for records with high persistence. The kernel-based cross correlation functions show more high-frequency variability which could be investigated through cross spectral analysis. We do not attempt to characterize it here, since this is outside the scope of this paper. The bias effects from interpolation could cause problems in the evaluation of phenomena emerging on time scales close to the actual mean sampling rate. This is where the kernel methods show significant advantages and especially the Gaussian kernel correlation method can provide high-resolution, robust estimates of time-dependent cross correlation coefficients.
In our cross correlation analysis of four Asian monsoon records in the time interval of 387 BP-1100 BP, we have found significant evidence that the Indian summer monsoon circulation influenced Chinese rainfall variability during the northern hemispheric MWP, as then the δ 18 O record from Dandak cave in India correlates with the central China records from Wanxiang and Heshang caves. During the colder phase after the MWP and into the LIA, significant cross correlation coefficients are found amongst the Chinese records, indicating a spatially more homogeneous moisture source. At the same time these records do not correlate with the Indian record during the LIA cold phase, pointing to less ISM impact on Chinese precipitation.
To summarize, we have shown that in correlation estimation of irregularly sampled time series, caution should be exercised when these records have an inter-observation time distribution that is strongly skewed. In the CCF estimation we found a strongly negative bias for the standard approach with interpolation for processes with little persistence. The advantages of the kernel-based estimators are higher for coupling on short time scales, compared to the sampling rate. This is especially interesting for the investigation of proxy data with low resolution. Our results indicate that the bias properties of the Gaussian kernel and the interpolation techniques have different signs in ACF estimation, indicating when sampling irregularity causes problems in the analysis.