Estimation of sedimentary proxy records together with associated uncertainty

Sedimentary proxy records constitute a significant portion of the recorded evidence that allows us to investigate paleoclimatic conditions and variability. However, uncertainties in the dating of proxy archives limit our ability to fix the timing of past events and interpret proxy record intercomparisons. While there are various age-modeling approaches to improve the estimation of the age–depth relations of archives, relatively little focus has been placed on the propagation of the age (and radiocarbon calibration) uncertainties into the final proxy record. We present a generic Bayesian framework to estimate proxy records along with their associated uncertainty, starting with the radiometric age–depth and proxy–depth measurements, and a radiometric calibration curve if required. We provide analytical expressions for the posterior proxy probability distributions at any given calendar age, from which the expected proxy values and their uncertainty can be estimated. We illustrate our method using two synthetic data sets and then use it to construct the proxy records for groundwater inflow and surface erosion from Lonar lake in central India. Our analysis reveals interrelations between the uncertainty of the proxy record over time and the variance of proxies along the depth of the archive. For the Lonar lake proxies, we show that, rather than the age uncertainties, it is the proxy variance combined with calibration uncertainty that accounts for most of the final uncertainty. We represent the proxy records as probability distributions on a precise, error-free timescale that makes further time series analyses and intercomparisons of proxies relatively simple and clear. Our approach provides a coherent understanding of age uncertainties within sedimentary proxy records that involve radiometric dating. It can be potentially used within existing age modeling structures to bring forth a reliable and consistent framework for proxy record estimation.

the fact that every proxy record has, within it by construction, errors related to the dating of the archive from which it was obtained. Clearly, proxies cannot be measured directly along time. They are measured along the depth of an archive, and the archive depths must then themselves be dated in a separate set of observations. In this sense, the proxy record as a function of time is a derived estimate, and it is one in which the time axis is not error free, because the radiometric ages of the archive depths have non-negligible errors of measurement. As a consequence of the latter, it is challenging to estimate the uncertainty of proxies as well as to carry out proxy record intercomparisons.
Quantifying the uncertainty of the proxy at a given time in the past is of paramount importance. Even though the proxy record is a derived measurement, it is still, in its essence, a measurement, and the representation of any measured quantity without an uncertainty (error) of measurement can lead to misleading conclusions. For instance, the assessment of correlations between data sets can be dramatically influenced by whether or not the uncertainty of the data is properly represented (see Rehfeld and Kurths, 2014;Heitzig, 2013, Intro.).
Recent studies have raised this issue along with the fact that -although there are various approaches to estimating age-depth relationships for a given archive and constraining their uncertainties -relatively little focus has been directed towards the propagation of the age uncertainties in the proxy record from the age model (Blaauw et al., 2007). Notably, in Fig. 2 of Blaauw et al. (2007), we find one of the first ever representations of the uncertainty of a proxy record (shown as a grayscale of possible proxy values), which quickly draws attention to the parts of the record that are more reliable and those parts that are not. This idea was further extended and applied in later studies (Parnell et al., 2008;Charman et al., 2009;Blaauw et al., 2010;Swindles et al., 2012). More recently, Mudelsee et al. (2009Mudelsee et al. ( , 2012 estimated the uncertainty of the proxy record with regard to trend and periodicity estimation of the proxy. Breitenbach et al. (2012) have put forth COPRA -Constructing Proxy Records from Agemodels a heuristic numerical method that treats age modeling as an intermediate step in the construction of the proxy record, and thereafter presents the final proxy records along with their uncertainties of estimation. However, a general, mathematical, and non-numerical framework (based on thorough estimation-theoretic principles) that can guide the estimation of the uncertainty of proxy record estimations is still lacking.
The main objective of this paper is to provide mathematical expressions for the probabilities of possible proxy values at a given point in time that can then be used to estimate the proxy record and its uncertainties. We achieve this by using a Bayesian analysis that tells us which depths are more likely than others for every calendar age -and not, as is the case in many studies, the other way round. Using conditional probabilities, we represent the proxies on an error-free timescale with the intent of allowing easier intercomparison of proxies.
We test our approach using synthetic examples and then apply it to reconstruct Holocene proxy records of groundwater inflow and surface erosion from India.
Our approach is generic in the sense that it is valid for all radiometric dating information. Furthermore, to make it more widely applicable, we do not use any sedimentation model. In cases where a convincing and more specific sedimentation model is available, it can be included in our Bayesian framework as prior distributions, as is typically done in several age-modeling approaches (Blaauw et al., 2007;Bronk Ramsey, 2008;Parnell et al., 2011). In this regard, the approach presented in this study also has the potential to be incorporated into existing age modeling frameworks such as COPRA , Sta-lAge (Scholz and Hoffmann, 2011), clam (Blaauw, 2010), or OxCal (Bronk Ramsey, 2008. Moreover, it could also be modified and extended to be applicable to other dating methods such as dendrochronlogy, luminescence dating and tephrochronology. Lastly, we stress that the fundamental result of our analysis is the probability distribution of proxy values at individual time points in the past (we return to this point in more detail in Sect. 4.4). As such, the two distributions derived for the proxy values at two distinct time points tell us almost nothing about the relationship between these two proxy values, for which one would rather need to estimate the joint distribution of the two -a task that is not attempted here, but that could be dealt with with similar methods. A similar situation occurs in the estimation of the radiocarbon calibration curve, and is discussed in Blackwell and Buck (2008). Still, in visualizing the distributions (and mean/median estimates derived from them) obtained at numerous time points over a contiguous time period, the observer might "see" trends and patterns of variability, and may be tempted to interpret these as trends and variations in the actual paleoclimatic conditions. In order to quantify trends and variations in the proxies, the analysis presented in the following sections has to be modified with the specific goal of arriving at, let us say for example, the probability distributions of the amount of change in the proxy values at individual time points in the past. Conclusions about different aspects of the proxy record based on the current analysis, on anything other than the probability distribution of the proxy at an individual point in time, can at best be qualitative, and is strictly limited by the level of uncertainty involved. The greater the uncertainty, the less we can infer.

Preliminary considerations
We distinguish between two types of radiometric dating methods: (i) those that do not require calibration, e.g., U/Th dating, and (ii) those that require radiometric calibration, The box presents a rough sketch of the derivation of the posterior proxy probability densities. We denote the proxy, depth, time and radiometric age with the random variables X, Z, T , and R respectively. The essential quantity of interest to be estimated is the posterior conditional density P (x|t) which is finally expressed in terms of the measured/given quantities using the above steps. For a detailed explanation, please refer to Sections 2.3, and 2.4. the second one, i.e., with a trivial diagonal calibration curve 160 without uncertainty. The information that is given is: (i) a set of radiometric ages against the depth of an archive, viz., a dating table, (ii) a much larger set of proxy values against the depth of the archive, and (iii) for the case of radiocarbon dated archives, 165 a 14 C calibration curve, viz., IntCal13 (Reimer et al., 2013). All three datasets have errors of measurement, although the errors in proxy measurements are negligible in several cases.
Using the above data, we wish to answer the question: What were the most likely climatic conditions at a given point 170 in time in the past? We can rephrase this in terms of proxies as: At a given time point in the past, which proxy values are more likely to have occurred? The answer is obtained using a probabilistic framework, i.e., we want to arrive at probability distributions for the proxy value at all past time points. 175 We use a Bayesian framework, since this allows us quite naturally to transform partial knowledge about one type of information (here: about the errors in measurement, dating, and calibration) into partial knowledge about other types of information (here: the age-depth and age-proxy relation-180 ship we are interested in). The relevance of a Bayesian approach is also exemplified by its use in most analytical investigations into radiocarbon dating and radiocarbon calibration (e.g. Niklaus et al., 1994;Niu et al., 2013). Readers may also refer to the introductory sections of Bronk Ramsey (2008) 185 and Parnell et al. (2011) for further explanatory notes on the use of Bayesian statistics in age modeling approaches.
The goal of this analysis is to arrive at the posterior distributions of the given proxy at all points in time. A gist of the derivation of the posterior proxy probability distributions 190 is presented in the box in Figure 1. A detailed explanation follows in the subsequent sections and also in the Appendix.

Necessary assumptions
Before we proceed, it is important that we state and review the assumptions inherent in our approach. We assume that:

195
-The 14 C ages, as well as the proxy measurements, are sufficiently well-described by a normal distribution with the mean at the measured value and a standard deviation equal to measurement error. This is motivated by the fact that, in most cases, observations (and associated 200 errors) are adequately described by the mean (and standard deviation) of a Gaussian process. Still, in principle, our method can be used for any probability distribution.
-The errors in depth measurements are negligible. The precision of depth measurements motivate this second 205 assumption and it is made for the sake of analytical simplicity. We note that most of the existing age modeling techniques involve this assumption, and also that our approach would have to be modified in cases where depth uncertainties cannot be neglected.
210 Figure 1. Outline of posterior proxy probability derivation. The box presents a rough sketch of the derivation of the posterior proxy probability densities. We denote the proxy, depth, time and radiometric age with the random variables X, Z, T , and R, respectively. The essential quantity of interest to be estimated is the posterior conditional density P (x|t), which is finally expressed in terms of the measured/given quantities using the above steps. For a detailed explanation, please refer to Sects. 2.3 and 2.4. e.g., 14 C dating. In the following subsections, we restrict ourselves to the more complicated case of 14 C dating, since the first type can formally be considered to be a special case of the second one, i.e., with a trivial diagonal calibration curve without uncertainty. The information that is given is (i) a set of radiometric ages against the depth of an archive, viz., a dating table, (ii) a much larger set of proxy values against the depth of the archive, and, (iii) for the case of radiocarbon dated archives, a 14 C calibration curve, viz., IntCal13 (Reimer et al., 2013). All three data sets have errors of measurement, although the errors in proxy measurements are negligible in several cases.
Using the above data, we wish to answer the question what were the most likely climatic conditions at a given point in time in the past? We can rephrase this in terms of proxies as at a given time point in the past, which proxy values are more likely to have occurred? The answer is obtained using a probabilistic framework; i.e., we want to arrive at probability distributions for the proxy value at all past time points.
We use a Bayesian framework, since this allows us quite naturally to transform partial knowledge about one type of information (here, about the errors in measurement, dating, and calibration) into partial knowledge about other types of information (here, the age-depth and age-proxy relationships we are interested in). The relevance of a Bayesian approach is also exemplified by its use in most analytical investigations into radiocarbon dating and radiocarbon calibration (e.g., Niklaus et al., 1994;Niu et al., 2013). Readers may also refer to the introductory sections of Bronk Ramsey (2008) and Parnell et al. (2011) for further explanatory notes on the use of Bayesian statistics in age-modeling approaches.
The goal of this analysis is to arrive at the posterior distributions of the given proxy at all points in time. A gist of the derivation of the posterior proxy probability distributions is presented in Fig. 1. A detailed explanation follows in the subsequent sections and also in the Appendix.

Necessary assumptions
Before we proceed, it is important that we state and review the assumptions inherent in our approach. We assume that -The 14 C ages, as well as the proxy measurements, are sufficiently well described by a normal distribution, with the mean at the measured value and a standard deviation equal to measurement error. This is motivated by the fact that, in most cases, observations (and associated errors) are adequately described by the mean (and standard deviation) of a Gaussian process. Still, in principle, our method can be used for any probability distribution.
-The errors in depth measurements are negligible. The precision of depth measurements motivate this second assumption, and it is made for the sake of analytical simplicity. We note that most of the existing agemodeling techniques involve this assumption, and also that our approach would have to be modified in cases where depth uncertainties cannot be neglected.
-All radiocarbon ages and depths are assumed to be equally likely a priori. This prior belief is then www.nonlin-processes-geophys.net/21/1093/2014/ Nonlin. Processes Geophys., 21, 1093-1111, 2014 updated within the analysis using the set of radiocarbon age-depth measurements.
-The radiocarbon ages provided do not contain outlying values (age reversals) caused, e.g., by contaminated material. Dealing with outliers is beyond the scope of this analysis.
-All obtainable growth-related information about the archive is contained in the given radiocarbon dating points. We note that, although at times, additional sources of growth-related information are available, we do not consider such situations in this preliminary exposition.
-The radiocarbon age of a certain position in the archive is an unknown, smooth, continuous function of its depth, denoted here as the radiometric (RM) age model. The presence of a hiatus is a crucial issue regarding this. We advise that if a hiatus is known to have occurred at a particular depth (from sources other than the radiometric ages), the RM age model can be split at the hiatus depth into two smaller, independent RM age modelsan approach used in Breitenbach et al. (2012).

Depth-spanning weight functions
In his discussion on deposition models for chronological records of paleo-archives, Bronk Ramsey (2008) has aptly articulated the fundamental idea behind the construction of age models as "What we are aiming to do mathematically is [. . . ] to find a representative set of possible ages for each depth point in a sedimentary sequence." Mathematically speaking, the fundamental idea behind our approach is similar, as we too seek to establish a probabilistic relation between the depth and age as a first step. However, our approach differs from the idea above in that we obtain this relation in the opposite direction; i.e., we aim to find an ensemble of depths for each calendar age. Let Z, R, and T be three variables that denote depth, radiocarbon age, and calendar age, respectively. Also, let the unknown proxy variable be denoted by X. We begin the task of finding such an ensemble of depths for each calendar age by asking: what is the probability that a depth Z = z corresponds to a given calendar age T = t? By the law of total probability, this is proportional to the probability of finding a radiocarbon age R = r at T = t multiplied by the probability that the depth Z corresponds to r, integrated over all possible values r of R. Formally, P (z|t) ∝ drP (r|t)P (z|r). (1) Now, using Bayes' theorem, we see that the second term on the right-hand side of Eq. (1) is a form of a posterior distribution, and is thus proportional to the product of some prior probability P (z) and the likelihood P (r|z) (for a more detailed discussion, see Sect. A). Assuming a priori that all depths are equally likely, we use the "flat" prior P (z) ∝ const. Thus, if the proxy is measured at depths Z = z x j for j = 1, 2, . . . , N, using the example of Eq. (1), we can define a depth-spanning weight function (DWF) at all proxy measurement depths z x j as The term P (r|t) is simply the radiocarbon calibration information that gives the probabilities of possible radiocarbon ages r given a particular calendar age t. The second term in Eq.
(2), P (r|z x j ), is the RM age model that gives the probability of the radiocarbon age r given a particular depth z x j . In combining these two terms under the integral, the DWF w t (z x j ) constructed is thus proportional to the probability that a depth z x j will correspond to a given calendar age t in the archive. For each different value t of the calendar age T , a new DWF has to be constructed over the depths z x j . A schematic of two representative DWFs for a simulated archive is shown in Fig. 2 along with the information used in their construction. A couple of points to note from Fig. 2 are (i) that the shape of the DWF can be quite different for different calendar ages, and (ii) that the peak of the DWF denotes the most probable value of Z for a given RM age model and a given value of T . This approximately corresponds to the value of Z at which the expected value of R will equal the expected value of R for the given value of T . We illustrate this point with the help of the arrowed dashed lines in Fig. 2.
Note: conventionally, the radiocarbon calibration involves estimating the probabilities of all possible calendar ages for a given radiocarbon age along with the uncertainty in estimation. This is analogous to proceeding in an counterclockwise direction in Fig. 2. In our analysis, we avoid "calibration" in this sense by choosing to estimate a relationship between calendar ages and corresponding depths in the other directionas shown by the clockwise sense of the arrows in Fig. 2.
At this stage, it becomes important that we elaborate on the construction of the RM age model and its relevance.

The RM age model
The construction of the DWFs involves the term P (r|z x j ), which has to be well defined at each depth where the proxy is measured. However, RM age observations are limited to significantly fewer numbers of depth points. Let us say that the number of radiocarbon age measurements is M, where M N , and the corresponding depths are denoted by z r k , k = 1, 2, . . . , M. Then, from this set of measurements, we get M conditional probability distributions P (r|z r k ) for R, k = 1, 2, . . . , M. Since M is much less than N, we need to be able to use the set of RM age-depth observations to construct a data set that gives us the N probability distributions A. Radiocarbon calibration curve representing the mean (red) and ±2 standard deviations (blue) of the distribution P (r|t). Inset: Calibration curve around 5.1 kBP detailing non-monotonicities. B. (i) 14 C measurements from a simulated archive (circles with error bars representing the distribution P (r|z r k )) and the RM age model obtained from them: mean (red) and ±2 standard deviations (blue) of the distribution P (r|z x j ) that results from the used regression method. (ii) DWFs for the calendar ages 2.2 kBP (purple) and 9.5 kBP (orange). The corresponding colored dashed lines with arrows indicate how a given calendar age is related to a distribution over probable depths via the DWF. All uncertainty bounds correspond to ≈ 95% confidence. (Color online.) k = 1, 2, . . . , M . Since M is much less than N , we need to be able to use the set of RM age-depth observations to construct a dataset that gives us the N probability distributions P (r|z x j ), j = 1, 2, . . . , N . This is achieved by estimating an RM age-depth relation, i.e., the RM age model.

315
In principle, the role of the RM age model can be seen as: where k = 1, . . . , M , and j = 1, . . . , N,. This resonates with the conventional framework of age modeling, except 320 that we use radiocarbon ages instead of calendar ages.
In this study, we use a non-parametric Taylor-polynomialbased regression method given by Heitzig (2013), a datadriven approach that uses a combination of Bayesian updating and Taylor expansion about a point of interest to provide 325 an estimate of the smooth curve from which the observations have been sampled. However, any regression method that estimates a posterior probability distribution could be used equally, as well as any Monte-Carlo-based method that generates ensembles of interpolated RM age-depth relations 330 from the set of observations to arrive at a mean/median estimate along with a standard deviation.
Note: In the case of dating methods that do not require radiometric calibration (U/Th dating being one such example), the first term of the right-hand side in (2) is equal to 1 if r = t 335 and 0 if r = t. In such cases, even though the fundamental idea behind (3) remains intact, the age model involved in it is no more the RM age model but is, in fact, the (calendar) age model-or what is simply known as the age model of the archive.

Estimating the proxy and its uncertainty at an individual time point
Once we have estimated the set of w f functions, it is straightforward to estimate the proxy and its uncertainty. For this, we need to consider the probability encoded in w f t (z x j ), for each 345 depth z x j , as a weight for the corresponding proxy measured at that depth. Since X denotes the unknown proxy at a given calendar age t, we thus estimate the probability P (x|t) as,

350
where the weights are derived from the corresponding DWFs at T = t (see Appendix A4 for details).
We now have a probability distribution for the proxy values at any given time t and using this, we can estimate the mean/median, as well as uncertainty bounds constructed us-355 ing percentiles or variance. In this study we restrict ourselves to median proxy values and represent the associated uncertainty of estimation with (i) a 95% confidence band constructed from the region lying between the 2.5 th and 97.5 th percentiles, and (ii) a 50% confidence band constructed from 360 the region lying between the 25 th and 75 th percentiles of the distribution P (x|t). Appendix A4 provides a more thorough treatment of the above. Also, in the following sections, the mean/median curves are represented as dotted lines rather than continuous curves to emphasize that they are derived  Radiocarbon calibration curve representing the mean (red) and ±2 standard deviations (blue) of the distribution P (r|t). Inset: calibration curve around 5.1 kBP detailing non-monotonicities. (b) (i) 14 C measurements from a simulated archive (circles with error bars representing the distribution P (r|z r k )) and the RM age model obtained from them: mean (red) and ±2 standard deviations (blue) of the distribution P (r|z x j ) that results from the used regression method. (ii) DWFs for the calendar ages 2.2 kBP (purple) and 9.5 kBP (orange). The corresponding colored dashed lines with arrows indicate how a given calendar age is related to a distribution over probable depths via the DWF. All uncertainty bounds correspond to ≈ 95 % confidence. (Color online.) P (r|z x j ), j = 1, 2, . . . , N. This is achieved by estimating an RM age-depth relation, i.e., the RM age model.
In principle, the role of the RM age model can be seen as where k = 1, . . . , M, and j = 1, . . . , N. This resonates with the conventional framework of age modeling, except that we use radiocarbon ages instead of calendar ages.
In this study, we use a non-parametric Taylor-polynomialbased regression method given by Heitzig (2013), a datadriven approach that uses a combination of Bayesian updating and Taylor expansion about a point of interest to provide an estimate of the smooth curve from which the observations have been sampled. However, any regression method that estimates a posterior probability distribution could be used equally, as well as any Monte Carlo based method that generates ensembles of interpolated RM age-depth relations from the set of observations to arrive at a mean/median estimate along with a standard deviation.
Note: in the case of dating methods that do not require radiometric calibration (U/Th dating being one such example), the first term of the right-hand side in Eq. (2) is equal to 1 if r = t, and 0 if r = t. In such cases, even though the fundamental idea behind Eq. (3) remains intact, the age model involved in it is no longer the RM age model, but is, in fact, the (calendar) age model -or what is simply known as the age model of the archive.

Estimating the proxy and its uncertainty at an individual time point
Once we have estimated the set of w f functions, it is straightforward to estimate the proxy and its uncertainty. For this, we need to consider the probability encoded in w f t (z x j ), for each depth z x j , as a weight for the corresponding proxy measured at that depth. Since X denotes the unknown proxy at a given calendar age t, we thus estimate the probability P (x|t) as where the weights are derived from the corresponding DWFs at T = t (see Appendix A4 for details). We now have a probability distribution for the proxy values at any given time t and, using this, we can estimate the mean/median, as well as uncertainty bounds constructed using percentiles or variance. In this study, we restrict ourselves to median proxy values and represent the associated uncertainty of estimation with (i) a 95 % confidence band constructed from the region lying between the 2.5th and 97.5th percentiles, and (ii) a 50 % confidence band constructed from the region lying between the 25th and 75th percentiles of the distribution P (x|t). Appendix A4 provides a more thorough treatment of the above. Also, in the following sections, the mean/median curves are represented as dotted lines rather than as continuous curves, to emphasize that they are derived from the probability distribution at individual time points and do not have any relation to values at other time points.

Incorporating monotonic growth
The DWFs constructed in Sect. 2.3 relates any given calendar age probabilistically to different depths, based on how likely they are to correspond to that age. This relation does not, however, include one specific feature of sedimentary records: the constraint of stratigraphically ordered growth. In other words, a stratigraphically deeper layer cannot be younger than any layer above it. We thus need to incorporate this constraint into the initial DWFs derived in Eq. (2), henceforth denoted as w i , and obtain a final set of DWFs (henceforth w f ) that takes it into account.
The DWFs are essentially a set of probability distribution functions, and to impose an unambiguous monotonicity constraint on such a set is non-trivial. To overcome this, we consider the cumulative probabilities of each DWF. A cumulative depth-spanning weight function (CDWF) is the probability that a given calendar age t corresponds to any depth less than or equal to a depth d x j . Formally, after ordering depths The CDWF, W i t (z x j ), by definition, increases along the depth axis z x j from 0 to 1. (Note: we denote the cumulative probability distribution with an uppercase letter "W " to distinguish it from the corresponding probability distribution "w" denoted by the lowercase letter.) Our task now is to ensure that it is monotonically decreasing along the age axis T . This means that if we take a depth z x j and two ages t 1 and t 2 such that t 2 is greater (i.e., older) than t 1 , the total probability that t 2 will correspond to a depth ≤ z x j cannot be more than the total probability that t 1 will correspond to a depth ≤ z x j -which is the condition of monotonic growth. Formally: we would like that W t 2 (z x j ) ≤ W t 1 (z x j ). The above condition for monotonic growth along the age axis is violated slightly but noticeably on many occasions. This is shown in Fig. 3a, where the wiggles in the white grid lines parallel to the calendar age axis illustrate the nonmonotonicity. Since it would be quite difficult to enforce the desired monotonicity already in the step where P (r|z x j ) is estimated from P (r|z r k ), we instead fix the slight nonmonotonicities after having derived W i . This results in a final CDWF W f that can then be transformed into the final DWF w f via To estimate the final W f that adheres to monotonicity, we use the principles of relaxation dynamics, and the details of this estimation process are discussed in Sect. A3. In short, we start with a suitably chosen set of CDWFs that are already monotonic in T , and then iteratively drag (pull and push) this function in order to minimize its distance from the initial CDWF, W i , as far as it is possible to do so without violating its monotonic nature. The final equilibrium set of CDWFs is that which cannot be moved any closer to W i by any form of dragging without compromising its monotonicity. We denote this set as W f (shown in Fig. 3b). 6. We use the posterior proxy probability distribution at each calendar age to estimate quantities of interest such as the mean or median proxy values for that age. Furthermore, we also estimate uncertainty estimates such 470 as quantile ranges or variance.

Applications
We first consider two synthetic examples in which we know the actual proxy record as well as the age model and test the performance of our approach in estimating proxy records.

475
Next, we estimate the groundwater inflow and surface erosion proxies from the Lonar lake in central India and com-

The age-depth sea cliff
The visualization of the CDWFs in Fig. 3 is likened to a sea cliff where W = 0 is shown as the blue sea, and W = 1 as the green highland. All intermediate values of W are contained in the sudden rise of the brown cliffs. As stated earlier, the fundamental idea behind age modeling is to arrive at a relationship between the calendar ages and the depths of a paleo-archive. A set of functions that perform this function can be thought of, in a broader sense, as an "age model". The CDWFs visualized in Fig. 3 are thus analogous to an age model in our analysis. However, we wish to emphasize that the construction of the CDWF relations did not involve assumptions about the growth and/or sediment accumulation of the archive, and was entirely a data-driven Nonlin. Processes Geophys., 21, 1093-1111, 2014 www.nonlin-processes-geophys.net/21/1093/2014/ estimation, with the sole input of the principle of monotonic growth of the core. In this sense, the age-depth sea cliff is a formal age model that, in future studies, could be developed further to incorporate specific growth conditions, leading to a better estimation of the (calendar) age-depth relation of the archive.

A review of the steps involved in proxy estimation
Before we present the applications and results, it is useful briefly to summarize the salient steps involved in estimating the proxy records and their associated uncertainty using our approach.
1. We construct the RM age model using an appropriate regression method that provides the posterior distributions of radiometric ages at the proxy measurement depths.
2. We estimate the DWF that relates any given calendar age to the proxy measurement depths in terms of the likelihood that they correspond to the chosen calendar age.
3. Next, we construct cumulative weight functions (CD-WFs) from the initial set of DWFs obtained in the previous step. The CDWFs are used to impose the constraint of stratigraphically ordered growth of the archive.
4. We thus obtain a final set of CDWFs that are consistent with such monotonic growth over time, and we derive a final set of DWFs from them.
5. For each chosen calendar age, we use the corresponding stratigraphically ordered DWF to weight the proxy measurements over depth and thus obtain a posterior proxy probability distribution.
6. We use the posterior proxy probability distribution at each calendar age to estimate quantities of interest such as the mean or median proxy values for that age. Furthermore, we also estimate uncertainty estimates such as quantile ranges or variance.

Applications
We first consider two synthetic examples in which we know the actual proxy record as well as the age model, and test the performance of our approach in estimating proxy records. Next, we estimate the groundwater inflow and surface erosion proxies from the Lonar lake in central India and compare our results with those obtained by using an age model generated using OxCal. A discussion of the results follows in the next section.

Synthetic examples
To illustrate our method as well as to test its efficacy, we consider two types of paleo-archives: (i) a stalagmite extending Fig. 4. U/Th dated synthetic stalagmite. Legend as in Fig. 2. A. The straight line of slope one used in place of a calibration curve. Inset: Unlike a real calibration curve, this line has no error. B. U/Th measurements from a synthetic stalagmite (circles with error bars) and the estimated radiometric age model obtained from them by regression (red line with blue ≈ 95% confidence band), along with actual age model (black dashes). C. The proxy curve along stalagmite depth obtained from noise-free proxy measurements. D. The actual proxy record (black), shown alongside the estimated median proxy record (red, dotted) along with associated uncertainty of estimation (sky blue denotes the interquartile range, i.e., 50% confidence, whereas light blue denotes the region between the 97.5 th and the 2.5 th percentiles, i.e., 95% confidence). The estimated record in D is demarcated into three distinct regions in terms of the frequencies it resolves: (a) when it resolves both frequencies of the true sinusoidal proxy record (

Holocene proxies from central India
As an application to a real-world scenario, we consider the set of age-depth 14 C measurements from the Lonar lake in  over 0-28 kBP dated with U/Th, and (ii) a lake sediment core extending over 0-11 kBP dated with 14 C. From our perspective, the crucial difference between the two is that for the lake sediment, the radiocarbon ages have to be calibrated using IntCal13, whereas this is not needed for the U/Th ages. To simulate sediment growth, we follow Blaauw (2010), such that the sediment accumulates with an initial growth rate of 20 yr cm −1 . At subsequent depths, a non-negative growth rate is chosen from a normal distribution that has the growth rate of the previous year as its mean and a fixed standard deviation of 7 yr cm −1 . In both cases, the proxy values are simulated as a sinusoidal signal consisting of two components with different time periods. Also, the proxy data sets were generated annually, i.e., with a proxy value for every year. We simulate a few noisy radiometric age measurements and a much higher number of almost perfect proxy measurements (error of 0.001). These "observations" are then used to estimate the proxy record with our method.

U/Th-dated archives
The results for the synthetic stalagmite are shown in Fig. 4. In this case, the calibration curve (as shown earlier in Fig. 2) www.nonlin-processes-geophys.net/21/1093/2014/ Nonlin. Processes Geophys., 21, 1093-1111, 2014 D is demarcated into three distinct regions in terms of the frequencies it resolves: (a) when it resolves both frequencies of the true sinusoidal proxy record (green), (b) when it resolves only the lower frequency (orange), and (c) when it is unable to resolve either of the two frequencies (purple). (Color online.) Z Value Fig. 5. 14 C dated synthetic lake sediment. Legend as in Fig. 4. (Color online.)

Holocene proxies from central India
As an application to a real-world scenario, we consider the set of age-depth 14 C measurements from the Lonar lake in We note that due to the difficulties of representing errors of XRF measurements, we consider the proxy observations along depth to be error-free. This, however, does not change the fundamental objective of our analysis, which is to estimate the final proxy uncertainties in an analytical fashion 560 and investigate how they are impacted by proxy-depth variability. If the proxy measurements were to have error, these would simply be added to the final errors as is indicated by Eq. A15. The final proxy records estimated are shown in Fig. 6. 565 We compare our results with proxy records obtained from a typical mean age model of the archive. The age model involved OxCal P-Sequence modeling with three sedimentological boundaries imposed a priori. Fig. 7 (panels A and B) compares the final proxy estimates obtained using the Ox-

570
Cal P-sequence age model with those obtained using our approach -for both Ca-area and Al-area.

Proof of concept
The synthetic examples shown in Sec. 3.1 illustrate the va-575 lidity of our approach. In panel D of Figs. 4 and 5, the first finding to note is that the 95% confidence band consistently contains well over 95% of the black curve, and the 50% band consistently contains about half of the black curve. In addi- Figure 5. 14 C-dated synthetic lake sediment. Legend as in Fig. 4. (Color online.) is replaced by a straight line of slope 1, without any error (Fig. 4a). This is possible because the U/Th radiometric ages can be identified with the calendar ages. The observational noise for the U/Th age measurements increases with the depth of the stalagmite to a maximum of 5 % (Fig. 4b).
The proxy signal has two components with time periods of 2000 and 400 years (Fig. 4d). Note that the proxy signal can be distorted in the depth domain, depending on the nature of the actual age-depth relation ( Fig. 4c and d). Figure 5 shows the results of our method as applied to the synthetic lake sediment core. In Fig. 5a, we see the irregularities of the radiocarbon calibration curve and its estimation uncertainty. The error in radiocarbon age measurements in Fig. 5b increases with depth, as in the previous case. The proxy signal used in this case has two time periods of 1000 and 200 years. Here too, one can see that the proxy signal is distorted in Fig. 5c, when compared to the one in Fig. 5d; however, the distortion in this case is mediated not only by the irregular RM age-depth relation, but also by the calibration curve.

Holocene proxies from central India
As an application to a real-world scenario, we consider the set of age-depth 14 C measurements from the Lonar lake in central India (Anoop et al., 2013;Prasad et al., 2014). The radiocarbon ages used for the analysis are tabulated in Appendix B1 and labeled in Fig. 6b. This included two 14 C ages after 1950, L21 and L20a, for which we use the Northern Hemisphere 3 (NH3) "post-bomb" calibration curve (Hua et al., 2013), and 17 pre-1950 ages, L19-L1, for which we use IntCal13 (Reimer et al., 2013). For the proxy records, we take the Ca-area proxy for groundwater inflow and the Al-area proxy for surface erosion

Un
From th it is app high pre uncerta 625 not neg lier stud  . (d, f) The corresponding proxy record estimates as obtained using the Bayesian approach detailed in the text. The legend in these panels is the same as that of Fig. 4d; "kcps" denotes "kilo counts per seconds". (Color online.) from the same archive at Lonar. The links of both the Ca to groundwater inflow (evaporitic carbonate (CaCO 3 ) formed during periods of low lake levels) and that of the Al to surface erosion (lithogenics brought on by rain events) have been validated in .
These were obtained from a continuous down-core Xray fluorescence (XRF) (Avaatech XRF Core Scanner III) scanning of the Lonar lake sediment core surface. The relative abundances of the elements (Ca, Al, Ti, Si, and K) were recorded every 5 mm with the X-PIPS SXP5C-200-1500 detector from Canberra, while the tube voltage was kept at 10 kV . The Al counts were found to be strongly correlated with the Ti, Si and K counts obtained from the XRF scanning (see Appendix B2). Due to this, and combined with the findings of Basavaiah et al. (2014), where they show the relation of the Al abundance to catchment erosion as well as the lithogenic contents, we choose this as a representative proxy for the Lonar lake surface erosion. We note that due to the difficulties in representing errors in XRF measurements, we consider the proxy observations along depth to be error free. This, however, does not change the fundamental objective of our analysis, which is to estimate the final proxy uncertainties in an analytical fashion and to investigate how they are impacted by proxy-depthvariability. If the proxy measurements were to have errors, these would simply be added to the final errors, as is indicated by Eq. (A15).
The final proxy records estimated are shown in Fig. 6. We compare our results with proxy records obtained from a typical mean age model of the archive. The age model involved OxCal P-sequence modeling with three sedimentological boundaries imposed a priori. Figure 7a and b compares the final proxy estimates obtained using the OxCal Psequence age model with those obtained using our approach -for both Ca area and Al area.

Proof of concept
The synthetic examples shown in Sect. 3.1 illustrate the validity of our approach. In Figs. 4d and 5d, the first finding to note is that the 95 % confidence band consistently contains well over 95 % of the black curve, and the 50 % band consistently contains about half of the black curve. In addition to this general fit between the true record and the estimated confidence bands, one can compare the median estimate (red dotted curve) with the true record (black curve), and distinguish three broad regions: (a) the youngest portion of the proxy records (green region), where the median estimates follow the true proxy series closely, and reproduce even faster oscillations of 1/400 yr −1 (for the stalagmite) and 1/200 yr −1 (for the lake sediment core) accurately; (b) the intermediate portion of the proxy records (orange region), where, for the most part, the median estimates show only the slower sinusoidal component due to larger dating uncertainties, but follow the lower frequencies of the true proxy curves (1/2000 yr −1 for the stalagmite, and 1/1000 yr −1 for the lake sediment core) closely; and (c) the oldest portion of the records (purple region), where the median estimates are almost flat curves, due to the high uncertainties. The differences between these three regions are due to the associated uncertainty in estimation of the proxy record, which increases (as seen from the confidence bands) progressively from the youngest to the oldest portions of the record, becoming as large as the range of values in the end. The proxy uncertainty depends strongly on the errors of the corresponding RM age models (Figs. 4 and 5b), which increase towards the oldest portion of the cores as well, and the errors of the RM age models are themselves influenced by the errors of the radiometric age measurements. We would like to emphasize that the objective of our method is not to estimate frequencies or variability, but to represent the available knowledge about the proxy value itself at each given point in time in the best possible way. The proxy records estimated by the present approach (red, dotted) compared to proxy records obtained by using an OxCal P-sequence age model (dark gray) for the groundwater inflow (Ca area, A) and surface erosion (Al area, B) proxies from Lonar lake. Legend for the confidence bounds to the Bayesian proxy estimate is same as in Fig. 6, panels D and F. (Color online.) oldest portion of the cores as well, and the errors of the RM age models are themselves influenced by the errors of the radiometric age measurements. We would like to emphasize 605 that the objective of our method is not to estimate frequencies or variability but to represent the available knowledge about the proxy value itself at each given point in time in the best possible way. The seeming inability of the proxy estimates to reproduce oscillations is a necessary consequence of the 610 posed research question when dating uncertainties are large. Still, in regions where the errors of measurement are small, the estimates capture the oscillations at both frequencies.
This discussion highlights two crucial factors: (i) proxy estimation errors depend on contingent errors of age mea-615 surements (and, to a large extent, the errors of calibration and proxy measurements); and (ii) the interrelation between the estimated proxy uncertainty and the variations that are resolved in the record. Both these issues are discussed in the subsequent sections.  The proxy records estimated by the present approach (red, dotted) compared to proxy records obtained by using an OxCal P -sequence age model (dark gray) for the groundwater inflow (Ca-area, a) and surface erosion (Al-area, b) proxies from the Lonar lake. The legend for the confidence bounds to the Bayesian proxy estimate is the same as in Fig. 6d and f. (Color online.) where the errors in measurement are small, the estimates capture the oscillations at both frequencies.
This discussion highlights two crucial factors: (i) proxy estimation errors depend on contingent errors of age measurements (and, to a large extent, the errors of calibration and proxy measurements), and (ii) the interrelation between the estimated proxy uncertainty and the variations that are resolved in the record. Both these issues are discussed in the subsequent sections.

Uncertainty of proxy estimations
From the proxy estimates shown in Figs. 4d-6d, it is apparent that even though the proxies are measured to high precision along the depth of the respective archives, the uncertainty in the proxy value for any given time point is not negligible. This is in agreement with the results of earlier studies, e.g., Blaauw et al. (2007) and Breitenbach et al. (2012). Also, the final uncertainty is not the same as the error of the proxy-depth measurements. The confidence bands span the whole range of values of the proxy for error levels of ≈ 5-10 % of the radiometric age measurements.
At a first glance at Figs. 4-6, it is obvious that the final proxy uncertainty is influenced by the calibration uncertainty, RM age model uncertainty, and the proxy measurement error (if any). However, a closer inspection of Fig. 6d and f around 3-4.5 kBP reveals an additional factor. At around 3-4.5 kBP, of ≈ 5-10% of the radiometric age measurements. At a first glance at Figs. 4-6, it is obvious that the final proxy uncertainty is influenced by the calibration uncertainty, RM age model uncertainty, and the proxy measurement error (if any). However, a closer inspection of Fig. 6 D and F 635 around 3-4.5 kBP reveals an additional factor. At around 3-4.5 kBP, we find that Al-area has much higher uncertainty in comparison to Ca-area even though both have the same calibration curve and age model. Moreover, even the proxy measurement error for both were considered negligible in the 640 analysis. Then why is the uncertainty much higher for Alarea?
To answer this, we proceed clockwise in Fig. 6 from the interval around 3-4.5 kBP to find that this range of calendar ages would roughly correspond to the depth range of around 645 500-800 cm using the given calibration curve and RM age model. The critical difference between the two proxies in this depth range is that the Al-area has a much larger variance in comparison to the Ca-area. Thus, given the same DWF, the Al-area proxy estimates would have a much larger spread 650 relative to the Ca-area.
To understand this in more detail, consider the schematic in Fig. 8. The proxy curve shown in the figure has distinctively high fluctuations in the purple portion of the curve and is then confined within a narrow band of values in the or-655 ange portion. On the left side of the figure, we consider two points t 1 and t 2 on the calendar age axis, such that the DWF w f t1 (d x j ) of t 1 covers mainly the high variability region A in Fig. 8 and similarly the DWF of t 2 covers the low variability region B in Fig. 8. Thus, we get a smaller uncertainty for the 660 proxy at t 2 than at t 1 when we weight the proxy values with the height of the respective DWFs. Hence, the variability of the proxy measured along the depth also contributes to the final uncertainty of its estimation.
The final uncertainty thus depends on four contingent fac-

Variability of the proxy record
From Section 4.2, it is evident that the uncertainty of proxy estimation is influenced by its own variability in the depth domain. However, the proxy uncertainty is also closely related to the variability of the median proxy estimate as well.

675
This is seen in Figs. 4-6, where regions of high estimation uncertainty are associated with low variability of the median estimates along time, and less resolution of higher frequencies, and vice versa. This does not mean that the proxy itself does not have faster variations. It simply implies that, in re-680 gions with high proxy uncertainty, we cannot reliably com- we find that Al area has much higher uncertainty in comparison to Ca area, even though both have the same calibration curve and age model. Moreover, even the proxy measurement errors for both were considered negligible in the analysis.

Then why is the uncertainty much higher for Al area?
To answer this, we proceed clockwise in Fig. 6 from the interval around 3-4.5 kBP to find that this range of calendar ages would roughly correspond to the depth range of around 500-800 cm, using the given calibration curve and RM age model. The critical difference between the two proxies in this depth range is that the Al area has a much larger variance in comparison to the Ca area. Thus, given the same DWF, the Al-area proxy estimates would have a much larger spread relative to the Ca area.
To understand this in more detail, consider the schematic in Fig. 8. The proxy curve shown in the figure has distinctively high fluctuations in the purple portion of the curve, and is then confined within a narrow band of values in the orange portion. On the left side of the figure, we consider two points t 1 and t 2 on the calendar age axis, such that the DWF w f t 1 (d x j ) of t 1 covers mainly the high-variability region A in Fig. 8 and, similarly, the DWF of t 2 covers the low-variability region B in Fig. 8. Thus, we get a smaller uncertainty for the proxy at t 2 than at t 1 when we weight the proxy values with the height of the respective DWFs. Hence, the variability of the proxy measured along the depth also contributes to the final uncertainty of its estimation.
The final uncertainty thus depends on four contingent factors: -calibration uncertainty (if any), -uncertainty of the RM age model, -errors in measuring the proxy along depth, and -variability of the proxy signal along the depth domain.

Variability of the proxy record
From Sect. 4.2, it is evident that the uncertainty in proxy estimation is influenced by its own variability in the depth domain. However, the proxy uncertainty is also closely related to the variability of the median proxy estimate as well. This is seen in Figs. 4-6, where regions of high estimation uncertainty are associated with low variability of the median estimates along time and less resolution of higher frequencies, and vice versa. This does not mean that the proxy itself does not have faster variations. It simply implies that, in regions with high proxy uncertainty, we cannot reliably comment on the fast variations of the proxy. In order to quantify analytically the various fast/slow varying components of the proxy and their uncertainties in a thorough fashion, we should, in principle, as stated before, proceed with a separate analysis. This is because the knowledge of the probability distributions of the proxy at a given time are not sufficient to comment on the variations, especially in the presence of non-negligible uncertainty in estimation.
The best one could say about the variability of the proxyvs.-time in regions of high dating uncertainty would be to estimate some aggregate measures of variability such as the slope or curvature of the proxy curve or the momentary amplitude of a certain sinusoidal component (at each calendar age). For example, a simple way of obtaining a "central" estimate of the slope dx/dt would be to use MoTaBaR to find the mean estimates R = r(t), D = d(r), and X = x(d) that correspond to given values T = t, R = r, or D = d, and then to use the chain rule to calculate dx/dt ≈ x (d(r(t))) d (r(t)) r (t). Equivalently, in graphical terms, follow the dashed lines from t via r and d to x and multiply the corresponding slopes of the calibration curve r(t), the RM age model d(r), and the proxy-vs.-depth curve x(d) that you encounter on the way (the calibration curve might have to be smoothened for this). In a similar fashion, the second derivative can be estimated by applying the product rule: d 2 x/dt 2 ≈ x (d(r(t))) d (r(t)) 2 r (t) 2 + x (d(r(t))) d (r(t)) r (t) 2 + x (d(r(t))) d (r(t)) r (t). Finally, if the proxy-vs.-depth curve shows a sinusoidal component of amplitude ξ and period length d around depth D = d(r(t)) (as could be seen, e.g., from a wavelet analysis), one can conclude that the true climate-vs.-time curve contains a sinusoidal component around time T = t of the same amplitude ξ and a period length that can be estimated as t ≈ d/d (r(t)) r (t).

Interpreting the posterior probabilities: a note on proxy variations
A critical point arising out of the previous subsections is that the final proxy estimate -such as the mean/median -when visualized over a period of time, may not reveal short-time variations. For paleoclimatic studies focussed on transitions taking place over short timescales, this can be a major hurdle. Even in studies that wish to address climatic patterns operating in the higher-frequency region, a proxy estimate that does not resolve such frequencies is of little practical utility. We stress that it is misleading to conclude that the proxy record does not contain high-frequency components based on figures such as Figs. 4-6. As already stated in the previous subsection, the fast varying components of the proxy are not ruled out by the probability distributions. Rather, only in estimating the mean or median might we be unable to say anything about them with confidence.
To understand how this is possible, note that the primary and foremost result of our approach is a probability distribution of the proxy values at each value of calendar age (shown in Fig. 9a, b as a color map). Such a visualization is in principle similar to Fig. 2 of Blaauw et al. (2007) -only that we obtain the visualization from mathematical expressions, and not as a histogram of ensemble members.
We interpret the distributions as representing the probability densities of an ensemble of possible proxy records; i.e., each member of this ensemble is a record of one of many possible past climatic histories that fit the available set of measurements and data. This is shown in Fig. 9c and d, in which two such members of the ensemble are shown for each of the two proxies from Lonar lake. They are constructed by drawing random numbers from each proxy probability distribution at every calendar age. It is immediately clear that the individual members of the ensemble retain the high-frequency components as well. However, since we have no way of knowing which of the infinite possible ensemble Goswami et el.: Estimation of sedimentary proxies 11 ment on the fast variations of the proxy. In order to analytically quantify the the various fast/slow varying components of the proxy and their uncertainties in a thorough fashion we should in principle, as stated before, proceed with a separate 685 analysis. This is because the knowledge of the probability distributions of the proxy at a given time are not sufficient to comment on the variations, especially in the presence of non-negligible uncertainty of estimation. The best one could say about the variability of the proxy-690 vs-time in regions of high dating uncertainty would be to estimate some aggregate measures of variability such as the slope or curvature of the proxy curve or the momentary amplitude of a certain sinusoidal component (at each calendar age). For example, a simple way to ob-695 tain a "central" estimate of the slope dx/dt would be to use MoTaBaR to find the mean estimates R = r(t), D = d(r), and X = x(d) that correspond to given values T = t, R = r, or D = d, and then use the chain rule to calculate dx/dt ≈ x (d(r(t))) d (r(t)) r (t). Equivalently, in graphi-700 cal terms, follow the dashed lines from t via r and d to x and multiply the corresponding slopes of the calibration curve r(t), the RM age model d(r), and the proxy-vs-depth curve x(d) that you encounter on the way (the calibration curve might have to be smoothened for this). In a similar 705 fashion, the second derivative can be estimated by applying the product rule: Finally, if the proxy-vs-depth curve shows a sinusoidal component of amplitude ξ and period length ∆d around 710 depth D = d(r(t)) (as could be seen, e.g., from a wavelet analysis), one can conclude that the true climate-vs-time curve contains a sinusoidal component around time T = t of the same amplitude ξ and a period length that can be estimated as ∆t ≈ ∆d/d (r(t)) r (t). A critical point arising out of the previous subsections is that the final proxy estimate -such as the mean/median -when visualized over a period of time, may not reveal short-time 720 variations. For paleoclimatic studies focussed on transitions taking place over short time scales this can be a major hurdle. Even in studies that wish to address climatic patterns operating in the higher frequency region, a proxy estimate which does not resolve such frequencies is of little practical utility. 725 We stress that it is misleading to conclude that the proxy record does not contain high frequency components based on figures such as Figs. 4-6. As already stated in the previous subsection, the fast varying components of the proxy are not ruled out by the probability distributions. Rather, only in 730 estimating the mean or median, we might be unable to say anything about them with confidence.
To understand how this is possible, note that the primary and foremost result of our approach is a probability distribu- Fig. 9. Posterior probability distribution and proxy ensemble members. A, B. The posterior proxy distributions for Ca-area (in A) and Al-area (in B) obtained by our approach. At every chosen value of the calendar age on the vertical axis, our method provides a probability distribution for the proxy along the horizontal proxy axis. The probabilities are indicated by the color-bar, with white representing zero probability. C, D. Two randomly chosen ensemble members (blue and green curves) for Ca-area (in C) and Al-area (in D) out of all possible proxy records given the probability distributions in A and B respectively. Such individual ensemble members retain the high frequency components as well, indicating that the high frequency information is still contained in the posterior probabilities. (Color online.) tion of the proxy values at each value of calendar age (shown 735 in Fig. 9 A, B as a colormap). Such a visualization is in principle similar to Figure 2 of Blaauw et al. (2007) -only that members actually constituted the actual climatic history, we estimate the mean/median climatic history and our confidence in it. The uncertainty bounds shown in this study represent the impossibility (given a set of measurements) of determining precisely the mean proxy value and hence, by extension, the mean paleoclimatic condition that it would represent.
In order to be able to have a very narrow uncertainty range, efforts must be made to reduce the various sources of error that contribute to the final proxy error. We discuss the possibilities and limitations of this in the next section.

Reduction of uncertainty
Let us take the example of the Lonar lake observations and ask: how can we reduce the final proxy uncertainty? For this, we have to look at the four factors that determine it. Among these, the calibration uncertainty cannot be reduced until a more tightly constrained calibration curve is released, the proxy-depth variance is beyond our control, and the proxy measurement error is already set to zero in our analysis. Thus, we are left with the sole option of reducing the RM age model error. This can be achieved with additional radiometric dating of the archive, or by incorporating layer-counted segments of the record that have relatively less error. However, since we do not consider layer-counted data in our approach presently, we will consider below the effect of adding more radiocarbon dating points.
We might plan to make a few more measurements, especially around those depths where the RM age model is not very precise, e.g., at around 700 cm (Fig. 6b). Still, a significant portion of the final uncertainty might also be due to the intrinsic variance of the proxy along depth, and we thus need to understand fully exactly how much of the final uncertainty is contributed by the age measurement errors. The highly non-trivial way in which the final uncertainty is related to the RM age model uncertainty (via the DWFs) makes it almost impossible to find a precise analytical answer to questions of the type: if we make two 14 C age measurements at depths d 1 and d 2 with a maximum error of ε, by what fraction z will the uncertainty at calendar age t going to go down?
We can nevertheless get some insight into how much error is contributed by the age uncertainty by considering a simple thought experiment. Let us assume that we are able to reduce the RM age model uncertainty to zero by taking N errorfree radiocarbon age measurements at the precise depths of proxy measurements. The variance of the DWF will then depend solely on the calibration uncertainty and, in conjunction with the proxy's intrinsic variance, this will determine the final proxy uncertainty. We can compare the uncertainty levels of the proxy before and after setting the age model error to zero. This is shown in Fig. 10a and b for the Ca-area and Al-area proxies ( Fig. 10c and d). We can see from the figure that the final uncertainty of the proxy is not reduced by a great amount ( Fig. 10b and d) -even when the uncertainty of the RM age model had been set to zero. Among the two proxies, the reduction of uncertainty in the Ca-area record is more than that of the Al-area record. This is because the Al-area signal has relatively higher variability than the Caarea signal (cf. Fig. 6c and e), and so the relative contribution of the age uncertainties to the final proxy uncertainty is less for the Al-area record than the Ca-area record. For both records, a reduction in age uncertainty resolves more higherfrequency variations than before, but not by a great amount.

12
Goswami et el.: Estimation of sedimentary proxies we obtain the visualization from mathematical expressions, and not as a histogram of ensemble members. We interpret the distributions as representing the probabil-740 ity densities of an ensemble of possible proxy records, i.e., each member of this ensemble is a record for one of many possible past climatic histories that fit the available set of measurements and data. This is shown in Fig. 9 C and D, in which two such members of the ensemble are shown for each 745 of the two proxies from Lonar lake. They are constructed by drawing random numbers from each proxy probability distribution at every calendar age. It is immediately clear that the individual members of the ensemble retain the high frequency components as well. However, since we have no way 750 of knowing which of the infinite possible ensemble members actually constituted the actual climatic history, we estimate the mean/median climatic history and our confidence in it. The uncertainty bounds shown in this study represent the impossibility (given a set of measurements) of determining 755 precisely the mean proxy value and hence, by extension, the mean paleoclimatic condition that it would represent. In order to be able to have a very narrow uncertainty range, efforts must be taken to reduce the various sources of error that contribute to the final proxy error. We discuss the possi-760 bilities and limitations of this in the next section.

Reduction of uncertainty
Let us take the example of the Lonar lake observations and ask: how can we reduce the final proxy uncertainty? For this, we have to look at the four factors that determine it. Among 765 these, the calibration uncertainty cannot be reduced until a more tightly constrained calibration curve is released, the proxy-depth variance is beyond our control, and the proxy measurement error is already set to zero in our analysis. Thus, we are left with the sole option of reducing the RM age 770 model error. This can be achieved with additional radiometric dating of the archive, or by incorporating layer counted segments of the record that have relatively less error. However, since we do not consider layer counted data in our approach presently, we will consider below the effect of adding more 775 radiocarbon dating points.
We might plan to make a few more measurements especially around those depths where the RM age model is not very precise, e.g., at around 700 cm (Fig. 6B). Still, a significant portion of the final uncertainty might also be due to the 780 intrinsic variance of the proxy along depth and we thus need to fully understand exactly how much of the final uncertainty is contributed by the age measurement errors. The highly non-trivial way in which the final uncertainty is related to the RM age model uncertainty (via the DWFs) makes it almost 785 impossible to find a precise analytical answer to questions of the type: if we make two 14 C age measurements at depths d 1 and d 2 with a maximum error of ε, by what fraction z will the uncertainty at calendar age t going to go down? We can nevertheless get some insight into how much error 790 is contributed by the age uncertainty by considering a simple thought experiment. Let us assume that we are able to reduce the RM age model uncertainty to zero by taking N error-free radiocarbon age measurements at the precise depths of proxy measurements. The variance of the DWF will then solely de-795 Figure 10. Contribution of age uncertainty to proxy estimation uncertainty. (a, c) The proxy records for Ca area (in a) and Al area (in c) for the original set of observations (red curves for the median, light blue area for 95 % confidence bands) along with the proxy records after making the RM age model uncertainty identical to zero (dark gray curves for the median, orange area for 95 % confidence bands). (b, d) The uncertainties for each time point (95 % confidence bands shown in a and c) and for the two cases: original observations (light blue), and after setting the RM age model error to zero (orange). (Color online.) Furthermore, even if age uncertainty is reduced to zero, the proxy records still differ a great deal from the records constructed by using the OxCal P-sequence model as shown in Fig. 7. Coming back to the issue of improving the Lonar proxy records with the help of additional measurements at around 700 cm, we first note that these depths would roughly correspond to calendar ages of around 4-5 kBP (see Fig. 6B, starting at around 700 cm, and going counterclockwise from the depth axis to the RM age model curve, to the calibration Nonlin. Processes Geophys., 21, 1093-1111, 2014 www.nonlin-processes-geophys.net/21/1093/2014/ curve and finally to the calendar age axis). From Fig. 6, we find that the region around 4-5 kBP shows almost no improvement from setting the age uncertainty to zero! This indicates that the Lonar lake radiocarbon age measurements are not the primary source of the final proxy uncertainties. Rather, it is more likely that the major part of the proxy uncertainties in the Lonar records are due to the proxy fluctuations in the depth domain and the calibration uncertainties. The thought experiment illustrates several points: -Even though the final uncertainty of proxy estimation is linked to the age uncertainty in a complicated manner, it is possible to understand the relative contributions of the age errors by setting them to zero.
-The final proxy uncertainty can, in some cases, be determined more by its own variability in the depth domain, rather than the age uncertainty.
-Given a set of observations and an RM age model, it is possible to obtain a limit to the precision with which the proxy can be estimated.
-The variability of a proxy signal is inherently linked to the kind of paleoclimate variations that it will allow to be investigated, and also to the level of precision with which such studies can be carried out.

Precise, error-free timescale
One important consequence of having an age-uncertain timescale for representing proxy records is that the intercomparison of records from different archives becomes difficult and ambiguous. In the approach outlined in this study, we overcome this difficulty because of the use of conditional probabilities. The use of conditional probability implies that every time we consider a particular calendar age and then estimate the DWF for it, we "fix" the calendar age precisely and then obtain the likelihoods of the proxy depths for that age. This means that the final proxy estimate we obtain is represented on a timescale that is error free; i.e., it is without any uncertainty. Such a notion of a precise timescale has already been introduced for speleothems in Breitenbach et al. (2012), where it is termed an "absolute" timescale that corresponds effectively to the time of deposition of the proxy material on the archive. They illustrate the utility of an errorfree timescale with the help of Monte Carlo age modeling approaches. In the present paper, we generalize this idea for all radiometrically dated archives, and also provide an analytical framework for it. Typically, if we represent paleoclimate proxies on an age-uncertain timescale, the interpretations of paleoclimate events are constrained by not being able to know when exactly an event took place in the past. This is overcome if the uncertainty is transferred from the age axis to the proxy axis, ensuring that the timescale of representation is always error free. An analogy to visualize this process is to think of the uncertainty as a bag of errors that can be carried either by the time axis or by the proxy axis. In paleo-investigations till now, this bag of errors had been left on the shoulders of the time axis, but we choose to transfer it to the proxy axis instead. This comes at a price, because in doing so, we are not certain any longer about the high-frequency variability of the proxy. Thus, it is not a question of whether this particular representation of proxies is more correct than the conventional age-uncertain one -the choice of representation is context dependent and is determined by the goals of the paleoclimatic investigation. One can conceive a study in which the interest in the high-frequency variability is outweighed by the problems of vagueness induced by an age-uncertain time axis. In such a scenario, it is reasonable to use an existing framework of age modeling that establishes a representative set of ages for each depth level of the core, provided the approximations used for dealing with the irregularly calibrated age distributions are reasonable.
A methodological advantage of having an error-free time axis is that time series analysis methods are more readily applicable to them. Although there has been recent work that was able to extract the climate spectrum (Mudelsee et al., 2009) as well as timescale-dependent trends with errors (Mudelsee et al., 2012), interpretations of time series analyses such as the construction of paleoclimate networks  have to be cautious of the time uncertainties that were inherent in the original data and that were left unresolved. Our approach provides a clear platform to carry out such analyses and interpret the results.

General comments
We note here a few additional points relevant to our approach that deserve attention.
1. The RM age model need not be monotonic if a nonmonotonic radiometric calibration curve is involved. This is apparent in Fig. 6, but we note that the true RM age model shown as a black curve in Fig. 5 is itself not monotonic. This is due to the fact that the up-down wiggles of the calibration curve carry over to the RM age-depth curve. This however does not imply that the true age model is non-monotonic.
2. In Fig. 7, the general trends of both curves are similar, with the only difference that the proxy curve obtained with the OxCal age model contains high-frequency variations. This indicates that the age-depth interrelations underlying these curves are fairly similar. We highlight this point as we do not expressly use an age model, nor do we involve assumptions about the sedimentation process. The MoTaBaR regression method used to get the RM age model is a data-driven regression method that is not specific to any kind of sedimentation process.
3. It is also important to note that our approach in its present form however presents several limitations.
www.nonlin-processes-geophys.net/21/1093/2014/ Nonlin. Processes Geophys., 21, 1093-1111, 2014 Firstly, it cannot deal with discrete proxy variables and with data sets in which the depth measurement errors are not negligible. Moreover, it assumes a perfect sedimentation record without hiatuses or age reversals, and is not equipped to deal with data sets possessing these features. It also does not consider the incorporation of additional dating information such as layer counting of well-laminated sections. It thus needs to be placed within a larger framework of proxy record construction, involving approaches such as COPRA, StalAge, clam, or OxCal, that can deal with such issues as well.

Conclusions
We present a Bayesian approach for the estimation of sedimentary proxy records as well as their associated uncertainty. This is a novel approach that, for the first time, details analytical relations between the different uncertainties involved in paleoclimatic proxy record estimation. We circumvent approximations typically required to deal with the irregularities of the age distributions of calibrated radiocarbon ages by finding the likelihood of depths for each calendar age rather than the other way round. We establish an RM age model to obtain the relation between a time point and its related set of depths, and avoid constructing an age model as is understood conventionally. We provide analytical expressions for posterior proxy probability distributions for any given calendar age that can be used for further analyses. We test our approach on two synthetic examples designed to mimic U/Th and 14 C dating methods, and demonstrate its validity. We then reconstruct groundwater inflow and surface erosion proxy records from the Lonar lake in central India.
Our analysis shows how the variability of the proxy in the depth domain is an intrinsic factor determining proxy uncertainty and how the final proxy uncertainty in turn reduces the resultant variability of the median proxy estimate. We extricate the contribution of the age uncertainty alone to the final proxy uncertainty from the rest of the involved errors. We show that for the Lonar lake set of observations, the age measurements are already quite close to the limit of precision in terms of the final proxy estimates. Such an exercise could be used to test the efficacy of a set of age measurements or that of the proxy itself for being a suitable record for the kinds of questions that the investigator wishes to pursue. Lastly, we successfully extend the notion of a precise timescale for representing proxy records to radiocarbon-dated archives.
Keeping in mind that the uncertainty (error) of any measurement is critical to the proper use of that measurement, our analysis provides a way of deriving the uncertainty of a proxy measurement at chosen points in time. This allows for a more critical, and still more valuable, understanding of how much, and how precisely, we know about the paleoclimate via proxy measurements. Even though our analysis does not focus explicitly on estimating other characteristics of the proxy such as its variability, power spectra, transitions, etc., we have sketched how it can provide a general direction in which such aspects can be estimated and the corresponding uncertainties of estimation quantified.
This study is not without its limitations. It needs to be part of a more expansive agenda of paleoclimate proxy record construction to be of practical utility. Moreover, it has the potential to be extended to estimate the proxy records for archives dated with procedures such as dendrochronological methods and luminescence techniques. In the case of dendrochronology, it would be possible to consider the time axis as the control variable (without error) and the depth assigned to the time increments as the source of error, and in the case of luminescence dating, a reasonable approximation of the error distribution around the reported luminescence ages would suffice to adapt the method suitably.
The RM age model involved in our approach is a critical step where further information could possibly be included. For instance, deposition models that are available for different types of paleo-archives, such as the P-, V-, and Usequence models of OxCal, could be combined with the radiocarbon age-depth measurements to constrain the RM age model uncertainties further. This can have a significant impact in reducing the final proxy uncertainty, especially in cases where the dating measurements are poor.
-the 14 C age data specify the conditional density of r given Z = z r k as P (r|z r k ) ∝ exp −(r − r k ) 2 /2 σ R k 2 .
-the proxy data specify the conditional density of x given Z = z x j as P (x|z x j ) ∝ exp − x − x j 2 /2 σ X j 2 .
We only state proportionalities (∝) here, taking care of proper normalization only at the end.

A2 Estimating depth-spanning weight functions
To answer the question, which among the proxy measurement depths are more likely than others to correspond to a given true age, we apply the law of total probability and Bayes' theorem to combine the calibration curve distribution P (r|t) and the output of the RM age model (which gives P (r|z x j ) from the measured data P (r|z r k )): P z x j |t = drP z x j |r P (r|t) = dr P z x j P r|z x j P (r) P (r|t).
Assuming all ages and depths are equally likely a priori, we use the (flat) prior distributions P (r) ∝ P (z x j ) ∝ 1, and see that P (z x j |t) is proportional to the weight . Al counts as a representative proxy for Lonar lake. Al counts (top panel, in blue) are strongly correlated to the Ti (orange), Si (green) and K (red) counts obtained from XRF scanning of the Lonar lake sediment. This correlation with the other elements, combined with the fact that the Al counts arise due to the catchment erosion forms the basis of choosing it as a representative proxy for this analysis. This choice also helps us to illustrate the impacts of proxydepth variability on the final proxy estimate uncertainties. (Color online.) advisable to treat such data with caution as XRF measurements give only a qualitative overview of the elemental abundances. However, as shown in the earlier study by Prasad et al. (2014), the Al counts are strongly correlated to the Si, Ti and K counts obtained from the same core (shown in Fig.B1).

1190
This determined our choice of taking Al as a representative proxy for the whole core. Even though the actual magnitude of the Al XRF estimates might be low, its variations mimic the variations of the Si, Ti and K counts, which have relatively higher magnitudes of XRF estimates. We note that this 1195 fact, combined with the study of Basavaiah et al. (2014) validates the use of the Al counts as a proxy for the surface erosion in the Lonar catchment area. Figure B1. Al counts as a representative proxy for the Lonar lake. Al counts (top panel, in blue) are highly correlated with the Ti (orange), Si (green) and K (red) counts obtained from XRF scanning of the Lonar lake sediment. This correlation with the other elements, combined with the fact that the Al counts arise due to the catchment erosion, forms the basis for choosing it as a representative proxy for this analysis. This choice also helps us to illustrate the impacts of proxy-depth variability on the final proxy estimate uncertainties. (Color online.)