The Stochastic Calculus Reformulation of Data Assimilation: on Scale

The understanding of uncertainties in Earth observations and simulations has been hindered by the spatial scale problem. In addition, errors caused by spatial scale change are an important part of uncertainty in data assimilation (DA). However, these uncertainties exceed the abilities of current theory. We attempted to address these problems. First, measure theory was used to propose a mathematical definition in which spatial scale is the function output of a measure, given that its referential element and representative region are confirmed, and the Jacobian matrix was then used to describe the change of scale. Second, the scale-dependent variable was defined to further account for heterogeneities. Last, under the Bayesian framework of DA, the scale-dependent uncertainty was studied based on stochastic calculus. The result formulates the scale-dependent error in DA. If we restrict the scale to a one-dimensional variable, the variation range of this type of error is proportional to the scale gap. Furthermore, assuming the observation operator is stochastic, we developed an example by introducing the stochastic radiative transfer equation. The new methodology will extend the recognition of the uncertainty in DA and may be able to address the scale problem.


Introduction
Scientists have devoted considerable attention to understanding uncertainties in Earth observations and simulations. However, uncertainties caused by spatial scale changes have yet to be fully addressed. Empirical studies have been conducted only recently (Crow et al., 2012; Gruber et al., 2013; Hakuba et al., 2013; Ran et al., 2016; Huang et al., 2016).
Studies found that the absolute bias between point- and footprint-scale measurements of soil moisture is highly significant (Li and Liu, 2016) and that the uncertainty increases as the difference between spatial scales increases (Famiglietti et al., 2008). This uncertainty problem, also known as the "spatial scale problem", may result in significant errors in understanding geographical parameters.
The uncertainty caused by the scale problem (for brevity, the term "scale" will be used to refer to spatial scale below) is actually derived from the strong spatial heterogeneities (Miralles et al., 2010; Li, 2014) and irregularities (Atkinson et al., [...]
[...] formed geographical parameters with respect to the change of their scales. Second, the DA system is reformulated based on the above concepts and stochastic calculus. In Sect. 4, we establish the Bayesian description of DA with time- and scale-dependent stochastic processes and investigate the impact of scale changes on the posterior probability of the system state. An example of this impact is also presented by introducing a stochastic radiative transfer equation (SRTE). Comments and prospects are summarized in the last section.

Scale
An accurate conception of scale is extremely beneficial for the study of Earth observations and simulations. However, except for an intuitive conception (Goodchild et al., 1997) and certain qualitative classifications of scale (Vereecken et al., 2007), it is difficult to find a strict definition of scale. This gap partially reflects the complexity of the problem and calls for corresponding mathematical tools to illuminate "scale". Because scale is highly dependent on the geometric features of the observational region, we first introduce several basic concepts of measure theory.

Measure and measure space
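Let $\Omega$ be an arbitrary nonempty space. $\mathcal{F}$ is a σ-field of subsets of $\Omega$ and satisfies (i) $\Omega \in \mathcal{F}$, and the empty set $\emptyset \in \mathcal{F}$; (ii) $A \in \mathcal{F}$ implies its complementary set $A^c \in \mathcal{F}$; (iii) $A_1, A_2, \ldots \in \mathcal{F}$ implies their union $A_1 \cup A_2 \cup \cdots \in \mathcal{F}$.
If a set function $\mu$ on $\mathcal{F}$ is nonnegative, satisfies $\mu(\emptyset) = 0$ and is countably additive, $\mu$ is then called the measure. In addition, if $\mu(\Omega) = 1$, $\mu$ can be replaced by the probability measure $P$, and if $\mu$ is finite, $P$ can be calculated as $P(A) = \mu(A)/\mu(\Omega)$. $(\Omega, \mathcal{F}, \mu)$ and $(\Omega, \mathcal{F}, P)$ are the measure space and probability measure space, respectively. Additional details on measure theory can be found in the literature (Billingsley, 1986).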
In measure theory, $\Omega$ is typically a one-dimensional space, and a multi-dimensional space is formulated by product spaces. In this study, to avoid an overly complex expression, we define the observational region as a one-dimensional measure space whose space $\Omega$ lies in $R^2$, in which the abstract information about the region of interest is conveyed. In this case, the measure correspondingly takes values in $[0, \infty)$ and should also obey countable additivity. This simplification will not negatively affect the definition of scale.
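The measure-space notions above can be illustrated on a small discretized region. The following is a minimal sketch, not the construction used in this study: the 4 x 3 grid, the unit cell area and the helper names `measure` and `probability` are hypothetical, and the normalisation $P(A) = \mu(A)/\mu(\Omega)$ is the usual one for a finite measure, assumed here.

```python
from fractions import Fraction


def measure(region, cell_area=Fraction(1)):
    """Area-like measure of a region given as a set of grid cells."""
    return len(region) * cell_area


# Hypothetical 4 x 3 observational region divided into unit cells.
omega = {(i, j) for i in range(4) for j in range(3)}
a1 = {(0, 0), (1, 0)}            # two disjoint subsets of omega
a2 = {(2, 1), (3, 1), (3, 2)}


def probability(region):
    """Normalised measure P(A) = mu(A) / mu(Omega), assumed for a finite mu."""
    return measure(region) / measure(omega)


# Additivity on disjoint sets and the complement rule.
assert a1.isdisjoint(a2)
assert measure(a1 | a2) == measure(a1) + measure(a2)
assert probability(omega - a1) == 1 - probability(a1)

print(measure(omega), probability(a1), probability(a2))  # 12 1/6 1/4
```

Exact fractions are used so that the additivity checks are not affected by floating-point rounding.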

Definition of scale
To connect the measure and scale, the following measures in Earth observation are considered first.
(i) [...], and probability measure [...].
(ii) Measure of a single point measurement: This measure is always zero under the Lebesgue measure. However, in the real world, the representative space of a single point measurement cannot be ignored, and thus a single point measurement with a measure of exactly zero cannot exist.
(iii) Measure of the remote sensing observations: Assuming that a pixel of a remote sensing image is a rectangle [...], the measure of an Earth observational region is [...]. Here, the measure is an equivalent expression of the spatial resolution of the remote sensing image.
(iv) Measure of the disc measurement: Due to conditions (ii) and (iii) of $\mathcal{F}$, it is better to count the circumscribed square of the disc as the subsets of $\Omega$. Using the same notation as in measure (iii), this measure of disc measurement is [...]. This measure is proportional to the area of the disc.
(v) Measure of the footprint measurement: The representative space of this measurement is any bounded closed domain. As with measure (iv), we also take the area of the domain as its measure [...], where the Jacobian determinant [...], respectively.
The value of the measure depends strongly on the design of the referential element. Therefore, if the two-dimensional referential element is counted as [...], respectively. In addition, by introducing the discrete point measure (Billingsley, 1986), which is based on a distinctive referential element, the measure of a single point measurement will be greater than zero, but the other measures are invalid. Additionally, if the spatial coordinates are replaced by latitude and longitude coordinates, all of the above measures should be adjusted accordingly.
According to the above analysis, we define "scale" as the function output of a measure, given that its referential element and representative region are confirmed. That is, let $A_0$ be a given rectangular referential element and $\mu$ a measure; for any representative region $A \in \mathcal{F}$, the scale is $s = \mu(A)$. From a geometric perspective, the measure refers to the shape of the observation region, and the scale further indicates the size of the region; therefore, the scale increases as the value of the measure increases.
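To make the definition concrete, scale can be computed as the area of a representative region counted in referential elements. This is only a sketch under assumptions: the function names, the 30 m and 1 km pixel sizes and the 5 m disc radius are hypothetical, and the disc is measured through its circumscribed square as described for measure (iv).

```python
import math


def rectangle_scale(width, height, ref_width=1.0, ref_height=1.0):
    """Scale of a rectangular pixel, counted in referential elements."""
    return (width * height) / (ref_width * ref_height)


def disc_scale(radius, ref_width=1.0, ref_height=1.0):
    """Scale of a disc measurement via its circumscribed square (side 2r)."""
    side = 2.0 * radius
    return rectangle_scale(side, side, ref_width, ref_height)


pixel_30m = rectangle_scale(30.0, 30.0)          # 900.0 unit elements
pixel_1km = rectangle_scale(1000.0, 1000.0)      # 1000000.0 unit elements
coarse_ref = rectangle_scale(30.0, 30.0, 10.0, 10.0)  # 9.0 with a 10 x 10 element

# The circumscribed-square measure is proportional to the disc area:
# the ratio is 4 / pi for every radius.
ratio = disc_scale(5.0) / (math.pi * 5.0 ** 2)
```

The third value shows why the referential element has to be fixed before two scales can be compared: the same pixel has scale 900 or 9 depending on that choice.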
We further demonstrate the change of scale. For $A_1, A_2 \in \mathcal{F}$, if there are two measures $\mu_1$, $\mu_2$ [...], and two different scales [...].
If two scales follow the one-dimensional law, they are geometrically similar. This law simplifies scale to a one-dimensional variable, which corresponds to the scale differences between most remote sensing images with various spatial resolutions. For example, for the measure of remote sensing observations, if the Jacobian matrix is $\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ [...]. Then, these two scales are in a one-dimensional law.
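The remote sensing example above can be checked numerically. The sketch below assumes one reading of the one-dimensional law, namely that the Jacobian matrix of the change of region is a positive multiple of the identity (a uniform rescaling, hence geometric similarity); this reading, and the function names, are assumptions rather than the formal definition.

```python
import numpy as np


def scale_ratio(jacobian):
    """Ratio between the two measures: absolute Jacobian determinant."""
    return abs(np.linalg.det(np.asarray(jacobian, dtype=float)))


def is_uniform_scaling(jacobian, tol=1e-12):
    """True if J = k * I with k > 0 (assumed reading of the one-dimensional law)."""
    j = np.asarray(jacobian, dtype=float)
    k = j[0, 0]
    return bool(k > 0 and np.allclose(j, k * np.eye(j.shape[0]), atol=tol))


doubling = [[2, 0], [0, 2]]   # pixel side doubled in both directions
stretch = [[2, 0], [0, 3]]    # diagonal, but not geometrically similar

print(is_uniform_scaling(doubling), scale_ratio(doubling))
print(is_uniform_scaling(stretch), scale_ratio(stretch))
```

Under this reading, the doubling matrix changes the scale by a factor of four while keeping the shape, so a single number is enough to describe the change.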
Figure 1 demonstrates the relationships between the referential element, measure and scale. The measure space is regularly divided by a referential element defined with unit area. Let [...] represent the standard scale. In the next section, the system state and observations of DA will be presented as stochastic processes to investigate their evolution with the one-dimensional infinitesimals $ds$ and $dt$.
In Figure 1, if the pixel intensity is the value of a geographical parameter, the entire region is heterogeneous. A variable represents an ensemble average in a specific observational region with a specific scale. Therefore, the variables of $C_1$ and $C_3$ are not equal because their observational regions are different, and the variables of $C_2$ and $C_3$ are also different because the scale changes. The former introduces variables that vary with location, and the latter is a scale-dependent variable. Therefore, from the Earth observational perspective, a variable is a nonlinear and heterogeneous mapping function from the observational region to the set of real numbers at a given time.
The dynamic process of the variable clearly depends on time, and we further assume that the variable changes with spatial scale in view of the scale effects. Furthermore, due to the uncertainties in Earth observations and simulations, it is reasonable to assume that the variable is random both in time and across spatial scales. Therefore, if the statistical properties of the variable are available, we can construct an explicit stochastic equation of the variable.
We introduce the time-dependent Ito process (see Sect. 4.2) as follows to define the variable process:
$$dV(t) = p(t)\,dt + q(t)\,dW(t), \qquad (2)$$
where $p(t)$, $q(t)$ and $W(t)$ are the transition probability, volatility and Brownian motion (see Sect. 4.2), respectively.
Similarly, the variable is supposed to evolve via a stochastic process for which the dynamic process and uncertainty are allowed to vary with scale:
$$dV(s) = \alpha(s)\,ds + \beta(s)\,dW(s), \qquad (3)$$
where $\alpha(s)$ and $\beta(s)$ are the scale-based transition probability and volatility, respectively. Note first that time is one-dimensional and unidirectional, whereas the scale can go forward or backward, provided that it follows the one-dimensional law. Second, the Brownian motion is valid only if the observation instrument is perfectly calibrated, which means that the instrument error of the geophysical parameter consists only of random error and is free of systematic bias.
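A scale-dependent process of this kind can be simulated with the Euler-Maruyama scheme. The sketch below is illustrative only: the function name, the callables `drift` and `volatility` (standing for the scale-based transition probability and volatility), and all numerical values are hypothetical. A shrinking scale is handled by letting the step `ds` be negative while the Brownian increment keeps variance `abs(ds)`, which is a modelling assumption of this sketch.

```python
import numpy as np


def simulate_scale_process(v0, drift, volatility, s0, s1, n_steps, rng):
    """Euler-Maruyama path of dV = drift(s) ds + volatility(s) dW(s) from s0 to s1."""
    s = np.linspace(s0, s1, n_steps + 1)
    ds = (s1 - s0) / n_steps                 # negative when the scale shrinks
    dw = rng.normal(0.0, np.sqrt(abs(ds)), size=n_steps)
    v = np.empty(n_steps + 1)
    v[0] = v0
    for k in range(n_steps):
        v[k + 1] = v[k] + drift(s[k]) * ds + volatility(s[k]) * dw[k]
    return s, v


rng = np.random.default_rng(0)
# Hypothetical variable (e.g. a soil moisture value) followed from scale 1 to scale 4,
# with zero drift and a constant volatility.
s, v = simulate_scale_process(
    v0=0.25, drift=lambda s: 0.0, volatility=lambda s: 0.05,
    s0=1.0, s1=4.0, n_steps=300, rng=rng)
```

With zero drift the path only accumulates random error, which corresponds to the perfectly calibrated instrument described above.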

The variable of a forecasting model in DA can be expressed as Eq. (2), and if the dynamic model is differentiable, [...], where the dynamic process is equivalent to the transition probability, and both $t_k$ and the subscript $k$ denote discrete time. If the future of the system state is based solely on the present state, i.e., [...], the variable process is Markovian. In the analysis step of DA, the system state does not pertain to time, and we assume that scale has a quantifiable impact on the uncertainties in this step; thus, both the system states and observations can be defined by Eq. (3). In the following sections, we will try to prove this hypothesis.

Reformulation of DA
The dynamic and observation operators of a DA system are typically deterministic models. Those models can be effectively understood in a specific case study but are not applicable in an integrated theoretical study. We offer a solution to this problem by introducing stochastic calculus. Recently, nonlinear dynamic models based on stochastic differential equations (SDEs), such as the double-well model and stochastic Lorenz model (Miller et al., 1999; Eyink et al., 2004), were studied in assimilation. In addition, a DA study based on stochastic processes (Miller, 2007) has been proposed. However, the theorems of calculus based on stochastic processes (or stochastic calculus) are distinct from those of ordinary calculus; therefore, a theoretical exploration of DA is essential.
In this section, based on the above definition of scale and the variables formed by the stochastic process, we reformulate the expression of DA by employing basic stochastic calculus laws and further study how the scale influences the uncertainties of DA.

Bayesian analysis of DA
We introduce the widely accepted Bayesian formulation of DA (Lorenc, 1995; Miller, 2007; Li and Bai, 2010) to investigate its time- and scale-dependent uncertainties. In the following, we suppose that there is only one system state and one observation in DA, and the results will be extended to a multi-variable DA at the end of this section.
Consider a nonlinear forecasting system described by [...], where [...] represent a forecasting operator that transits the system state from the discrete time $k-1$ to $k$, the true system state with prior PDF $p(X)$, and the white noise of the forecasting system at time $k$, respectively. In addition, if a new observation is available at time [...]. However, Eq. (5) cannot result in the same conclusion as Eq. (2) because the observation operator is not a time-dependent dynamic model. Based on the review in Sect. 1, when the observation operator maps the system state to the observation space, a remarkable change may arise if the observation and state space scales differ. Thus, we assume that the observation operator $H$ is scale dependent in this situation.
Based on Bayes' theorem, the posterior PDF of the state conditioned on the addition of a new observation into the system is
$$p(X \mid Y) = \frac{p(Y \mid X)\,p(X)}{p(Y)}, \qquad (6)$$
where [...]
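The Bayesian analysis step can be illustrated for a single state and a single observation by evaluating posterior ∝ likelihood × prior on a grid. This is a minimal sketch, not the formulation derived in this study: the Gaussian prior and observation error, the identity observation operator `h` and the numerical values are all assumptions chosen so that the result can be checked against the known Gaussian answer.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def posterior_on_grid(x, prior, y_obs, h, obs_std):
    """p(x | y) proportional to p(y | x) p(x), normalised on a uniform grid."""
    likelihood = gaussian_pdf(y_obs, h(x), obs_std)
    unnormalised = likelihood * prior
    dx = x[1] - x[0]
    return unnormalised / (unnormalised.sum() * dx)


x = np.linspace(-5.0, 5.0, 2001)
prior = gaussian_pdf(x, 0.0, 1.0)                      # prior p(X): N(0, 1)
post = posterior_on_grid(x, prior, y_obs=1.0, h=lambda state: state, obs_std=1.0)

dx = x[1] - x[0]
post_mean = (x * post).sum() * dx                      # analytic value: 0.5
post_var = ((x - post_mean) ** 2 * post).sum() * dx    # analytic value: 0.5
```

Replacing `h` with a nonlinear or scale-dependent function changes only the likelihood line, which is where the scale effect discussed in this section would enter.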

Basic knowledge of stochastic calculus
In this section, we introduce some necessary concepts and theorems of stochastic calculus. All the classic theorems are stated without proofs; their detailed derivations can be found in the literature (Itô, 1944; Karatzas et al., 1991; Shreve, 2005).
Compared with ordinary differential and integral calculus, stochastic calculus is defined for integrals of stochastic processes with respect to stochastic processes, such as Brownian motion. Brownian motion is one of the simplest stochastic processes and describes physical phenomena such as random perturbations and the irregular movements of microscopic particles subject to random forces. Brownian motion is named after the Scottish botanist Robert Brown, who observed the continuous jittery motion of pollen suspended in water under a microscope. The Brownian motion $W$ defined on a probability measure space $(\Omega, \mathcal{F}, P)$ is characterized as follows: [...]
The last two conditions mean that the increments [...] are independent Gaussian random variables. Additionally, because Brownian motion is based on a probability measure space, $W$ pertains to the probability measure $P$.
Lemma 1: [...]; therefore, in the following, we use Brownian motion with a parameter starting at $s_0$ to define the scale-dependent variables, and some classic expressions should be changed slightly.
The stochastic process Eq. (3) is the differential form of the Ito process, the integral form of which is [...].
Theorem 1: For any Ito process defined as in Eq. (3), the quadratic variation accumulated up to $s$ is [...].
Remark on Theorem 1: According to Lemma 1, Eq. (7) is [...], and the integral form of the Ito process (3) is [...].
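The quadratic variation can be checked numerically. The sketch below relies on the standard result that, for a Brownian motion started at $s_0$, the quadratic variation accumulated up to $s$ is $s - s_0$, and that a constant volatility multiplies it by the squared volatility; the interval, the volatility value and the function name are hypothetical.

```python
import numpy as np


def brownian_increments(s0, s1, n_steps, rng):
    """Independent Gaussian increments of a Brownian motion on [s0, s1]."""
    ds = (s1 - s0) / n_steps
    return rng.normal(0.0, np.sqrt(ds), size=n_steps)


rng = np.random.default_rng(42)
s0, s1 = 1.0, 3.0                     # scale starts at s0 > 0, not at zero
dw = brownian_increments(s0, s1, 200_000, rng)

# Sum of squared increments: approaches s1 - s0 as the partition is refined.
quadratic_variation = float(np.sum(dw ** 2))

# Ito process with zero drift and constant volatility: approaches vol**2 * (s1 - s0).
vol = 0.5
qv_ito = float(np.sum((vol * dw) ** 2))
```

Because the quadratic variation depends only on the gap $s_1 - s_0$, starting the parameter at a positive standard scale instead of zero does not change the result.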

Stochastic calculus for DA
Next, we deduce the stochastic calculus results for DA based on the above assumptions and theorems. We first make the following assumptions.
Assumption 1: The measures of the system state and observation in DA obey the one-dimensional law defined in Sect. 2.2.
In the forecasting step,
Assumption 2: The simulation units of the model equal the scale of the system state, and both are constant.
In the analysis step,
Assumption 3: The parameters (including the system state and observation) and observation operator are scale dependent; only one observation is added into the DA system at a time, and the system states and observations at different times are scale independent.
Considering Assumption 2, the forecasting step is explicitly free of scale, and thus Eq. (2) is adequate to describe this step.
Based on Assumption 3, the analysis step relates to scale; thus, some basic definitions should be presented in advance.
According to Eq. (3), the system state and observation in the analysis step are respectively expressed as [...] operator and its parameters are both susceptible to scale. According to the review in Sect. 1, the parameters can be regarded as Earth observational data, and thus they may vary with scale. The observation operator also depends on scale because it is a particular physical law. Even if the observation operator is scale invariant in essence, a scale-dependent parameter will still make it sensitive to scale.
Assumption 3 implies that when observational information is added in the analysis step, the system state and observation scales are invariant. Regardless of the scale-based transition probability of the system state and observation, [...]. In addition, because the noises are Gaussian, we have [...].
Based on the above discussion, the differential and integral forms of the system state are [...]. In addition, for the observation, we have [...].
Assumption 1 suggests that the observation and system state spaces have the same probability measure, and thus the Brownian motions in these two spaces are equivalent. Subtracting Eq. (19) from Eq. (18), we obtain [...].
Equation (20) can be regarded as an Ito process, and its drift is [...].
The integral term in Eq. (21) is the difference of the first-order differential of the observation operator between the system state scale $s_X$ and the standard scale $s_0$. This term illustrates that when mapping the system state to the observational space by introducing scale, the mapping should consider not only the function of the observation operator but also the first-order differential term. The former part is typically found in the literature, whereas the latter has been derived in this study for the [...]
The quadratic variation of Eq. (20) is [...], which means that the uncertainty of the observation error includes both the difference between scales $s_Y$ and $s_0$ and the change of the observation operator from scale $s_X$ to $s_0$. Therefore, from Eq. (21) and Eq. (22), we obtain [...]. $X$ and $Y$ are stochastic functions that depend on scale, and thus the posterior PDF of the system state is scale dependent as well.
In particular, if [...]. The quadratic variation (22) can be further described by the scale ranging from [...]. In the same manner, if [...], the quadratic variation of Eq. (20) becomes [...].
The significance of Eqs. (20)-(27) is that the effect of scale on the posterior PDF is identified quantitatively. In addition to the model error and instrument error, a new part of the uncertainty of DA has been identified in the analysis step. The expectation of the posterior PDF may vary with the scale of the system state if $Y$ is an indirect measurement of $X$, and the uncertainty of the drift depends on the difference between $s_Y$ and $s_X$ (based on Eq. (26) and Eq. (27)) or between $s_0$, $s_Y$ and $s_X$ (based on Eq. (22)). In addition, if $Y$ is a direct measurement of $X$ (Eq. (24) and Eq. (25)), the expectation of the posterior PDF is the difference between $Y$ and $X$, and the uncertainty is equal to the change of scale. Additionally, if the results are not derived from Assumption 1, i.e., the measure varies randomly, the posterior PDF should be more complex because its integral path is an arbitrary curve.

An example: the stochastic radiative transfer equation (SRTE)
As a concrete instance of the stochastic observation operator $H(s, X(s))$, the SRTE will be employed in the following discussion to further demonstrate how scale influences the uncertainties of the system state.
The SRTE is a stochastic integro-differential equation that describes radiative transfer through stochastically mixed immiscible media, and for which analytical or numerical methods are developed to find the stochastic moments of the solution, such as the ensemble average or variance of the radiation intensity (Pomraning, 1998; Shabanov et al., 2000; Kassianov et al., 2011). Consider the general expression of the SRTE, [...], where [...] are the radiation intensity, scattering integral term, emission source, coefficient of radiation direction, optical depth and brightness temperature, respectively. To tie into more substantial random optical properties of the transfer media, such as absorption and scattering, the optical depth is assumed to be stochastic. Because real transfer media are rarely homogeneous, we further suggest that the optical depth is a scale-dependent Ito process, [...], which causes the radiation intensity, scattering term and emission source to depend on scale as well.
Regardless of the scattering integral term and emission source, the SRTE simplifies to a homogeneous linear differential equation, [...], and its analytical solution is [...]. The radiation intensity is a scale-dependent Ito process, for which the rate of the transition probability is [...].
According to Fubini's theorem (Billingsley, 1986), [...] obey the one-dimensional law, we have [...], which proves that the product measure is also in a one-dimensional law.
The analysis for the single system state can also be applied to finite multiple states in the product measure space. However, the above results may not hold without the assumption that the system states $X_k$ are independent of each other.
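The effect of a stochastic, scale-dependent optical depth on the source-free radiation intensity of this section can be illustrated by Monte Carlo sampling. The sketch rests on several assumptions that are not taken from the equations above: the attenuation form $I = I_0 \exp(-\tau/\mu)$ and its sign convention, constant drift and volatility of the optical depth, and every numerical value. It compares the sampled mean intensity with the log-normal mean, whose extra variance term is the Ito correction.

```python
import numpy as np


def attenuated_intensity(i0, tau, mu_dir):
    """Source-free solution I = I0 * exp(-tau / mu_dir) (assumed sign convention)."""
    return i0 * np.exp(-tau / mu_dir)


rng = np.random.default_rng(1)
i0, mu_dir = 1.0, 1.0
tau0, drift, vol = 0.1, 0.5, 0.2      # hypothetical constants
delta_s = 1.0                         # scale gap s - s0
n_samples = 200_000

# With constant coefficients, tau(s) = tau0 + drift * (s - s0) + vol * (W(s) - W(s0)).
tau = tau0 + drift * delta_s + vol * rng.normal(0.0, np.sqrt(delta_s), n_samples)
mc_mean = attenuated_intensity(i0, tau, mu_dir).mean()

# Mean of the log-normal intensity; the vol**2 term is the Ito correction.
exact_mean = i0 * np.exp(-(tau0 + drift * delta_s) / mu_dir
                         + vol ** 2 * delta_s / (2.0 * mu_dir ** 2))

# Intensity obtained by ignoring the randomness of the optical depth.
naive = attenuated_intensity(i0, tau0 + drift * delta_s, mu_dir)
```

In this setting the mean intensity exceeds the value computed from the mean optical depth, and the difference grows with the scale gap. The Gaussian model also allows occasional negative optical depths, so the sketch should be read as qualitative only.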

Conclusions and outlook
In this study, we mainly addressed two basic problems: what is scale, and how should the impact of scale changes in a DA framework be evaluated? Instead of empirical and qualitative expressions, we employed measure theory and stochastic calculus to define the scale and thus the evolution of uncertainty with respect to scale in DA.
The first problem began with an introduction to measure theory. We showed that scale is the function of a measure, given that its referential element and representative region are confirmed. Because scale is related to the shape and size of a representative space, this definition regards scale as the measure value and accords with the spatial transformations between different observation regions. We then defined the variable, which depends on scale, and this step should further consider the heterogeneities of geographical parameters. A variable consequently expresses the ensemble average of a geographical parameter at a specific scale. This study marks a connection between the concepts of measure and scale, and we expect that the problems of scale can be further understood by utilizing the completeness and rigor of measure theory. The definition of the change of scale is as important as that of scale because the uncertainties in Earth observations and simulations are partially caused by scale transformations. The change in scale was described using the Jacobian matrix, and it can be further simplified by the one-dimensional law to suit stochastic calculus. This simplification is reasonable for a large portion of Earth observation data, including remote sensing data, because the representative space changes of those data are geometrically similar. However, an in-depth and comprehensive exploration should be conducted in the future to describe other situations in the real world.
[...] evolve regularly based on Assumptions 1-3. However, these situations may be more complex in the real world. Therefore, the above results, which are dominated by the structure of the Jacobian matrix and the integral path depending on scale change, will become intricate when adding concrete physical models (such as the SRTE).
This work contributes to the understanding of uncertainties in simulations and observations of land surface processes by seeking to define and quantify scale. A scale problem is caused by the scale variations of dynamic processes and basic physical principles, and also by the spatial distributional heterogeneity and irregularity of geographical parameters. Thus, we conducted an integrated study that considered both the geometric transformation of an observation space and the variation in geographical parameters. This integrated study included all possible situations and predictably conforms to each scale-related case study. However, a case study can be seen as a particular solution of the stochastic calculus equation, for which scales change and the scale-dependent Brownian motions evolve along their own integral paths. In addition, the uncertainties of case studies differ because their integral paths differ.
[...] is difficult to calculate its area directly, we use double integration by substitution to change the variables; define the substitution functions $x = x(u, v)$ and $y = y(u, v)$, which must be injective and continuously first-order differentiable, [...]

[...] are then a one-dimensional law change.
[...] be seen as an area calculation of an inscribed circle in a square region. However, the scale of $C_2$ is not equal to the two other scales because of their different areas. Nevertheless, their scales are in a one-dimensional law because the measures are identical and the Jacobian matrices are diagonal. Similarly, we have $D_1$ [...] $D_2$; their scales are also different but are in a one-dimensional law.

Figure 1. Diagram of the relationships between the referential element, measure, scale and variable.
Nonlin. Processes Geophys. Discuss., doi:10.5194/npg-2016-35, 2016. Manuscript under review for journal Nonlin. Processes Geophys. Published: 30 June 2016. © Author(s) 2016. CC-BY 3.0 License.
[...] equation of DA (Eq. (6)), one can obtain the posterior PDF [...]
[...] distributions of the system state and observation. Based on Theorem 1 and Eq. (15) [...] that the variances exist. In addition, because Assumption 1 states that the measures are in the one-dimensional law, the scales vary in one-dimensional space, which results in [...]
[...] first time. This result prompted us to further consider the first-order differential of the observation operator when calculating the observation error.

The difference between Eq. (31) and the common Ito process is that there is a primitive function term. Therefore, the other parts of the integral term can be regarded as an evolution rate of the primitive function, and the uncertainty of the radiation intensity is more complex because it is related not only to the change of scale but also to the primitive function. Integrating both sides of Eq. (31) yields the general solution of the radiation intensity, [...]
Suppose that the independent system states $X_k$ are the variables of the measure spaces [...]
For the second problem, we reformulated the expression of DA using a scale-dependent stochastic variable and investigated how the change in scale influences the evolution of uncertainties in DA. The results formulated a new scale-dependent error in DA and further supported previous qualitative knowledge that the analysis error is highly related to changes in scale. Separating the scale-dependent error from other errors is beneficial for understanding the uncertainty. Earth observations and simulations can also be improved based on a better description of uncertainty. The results can be derived from the one-dimensional simplification of scale change, and the variables in DA are supposed to [...]

A theoretical exploration was conducted in this work, but the study of the scale problem and of nonlinear and stochastic assimilation is far from complete. This reformulation is worthwhile because research on uncertainties in Earth observations and simulations cannot exclude scale-related errors.

Stochastic variables in DA
Let a real function $V(A)$ be the variable if it maps the measure space $(\Omega, \mathcal{F}, \mu)$ onto $R$. A variable is the measurement of a geographical parameter in a specific region, and $V(s, t) \in R$ denotes that the variable depends on time $t$ and has a scale [...]
This Lemma is practical because scale is certainly greater than zero, which does not fit the definition of Brownian motion whereby the parameter should start at zero. The standard scale $s_0$ [...].
Remark on Lemma 1: [...]
[...], there is only one product measure [...]