Data assimilation for moving mesh methods with an application to ice sheet modelling

. We develop data assimilation techniques for nonlinear dynamical systems modelled by moving mesh methods. Such techniques are valuable for explicitly tracking interfaces and boundaries in evolving systems. The unique aspect of these assimilation techniques is that both the states of the system and the positions of the mesh points are updated simultaneously using physical observations. Covariances between states and mesh points are generated either by a correlation structure function in a variational context or by ensemble methods. The application of the techniques is demonstrated on a one-dimensional 5 model of a grounded shallow ice sheet. It is shown, using observations of surface elevation and/or surface ice velocities, that the techniques predict the evolution of the ice sheet margin and the ice thickness accurately and efﬁciently. This approach also allows the straightforward assimilation of observations of the position of the ice sheet margin.


Introduction
From lava flows to tumour growth, including water floods, many time-evolving processes can be mathematically modelled by moving boundary problems.Predicting their evolution accurately requires not only the estimation of the state variables of the system over a moving domain, but also the estimation of the location of the moving domain itself.In this paper we propose to combine data assimilation with a moving mesh numerical model to estimate both the domain and the states of moving boundary problems.The application of the combination is demonstrated on a one-dimensional model of a grounded shallow ice sheet.Our approach is particularly relevant to the prediction of the dynamics of ice sheets and glaciers.Future evolution of ice sheet boundaries is closely linked with sea level rise (Church et al., 2013) and ice sheets are now relatively well observed bodies (Vaughan et al., 2013).
Data assimilation (or DA) aims to combine available observations with model predictions in order to provide optimal estimates of the state of a system and an estimation of the uncertainty of these estimates.DA has been applied succesfully in various contexts and is routinely used in operational systems such as numerical weather prediction systems (Lahoz et al., 2010;Blayo et al., 2014).In particular DA has already been used with fixed-grid models in the context of moving boundary problems.In these cases estimates outside the moving domain are generally non-physical and need to be reanalysed.For example, negative sea-ice concentration and thickness can be obtained where there is no sea ice (Mathiot et al., 2012).The same problem appears with negative estimated thickness of ice sheets (Bonan et al., 2014).In both cases non-physical negative variables are reset to zero after data assimilation.Furthermore, with fixed grids, DA does not provide any explicit estimate of the extent of 1 Nonlin.Processes Geophys.Discuss., doi:10.5194/npg-2016Discuss., doi:10.5194/npg- -45, 2016 Manuscript under review for journal Nonlin.Processes Geophys.Published: 12 September 2016 c Author(s) 2016.CC-BY 3.0 License. the domain.This can be only done by interpolation.By combining DA with a moving mesh numerical model, this paper shows that the extent of the domain can be estimated explicitly and non-physical estimates do not appear.
Genuine moving mesh methods use a fixed number of mesh points whose movement can be generated by various techniques (Budd et al., 2009;Baines et al., 2011).In this paper, we model the evolution of a radially-symmetrical continental ice sheet with a moving mesh-point method based on conservation of local mass fractions (Baines et al., 2005(Baines et al., , 2011;;Partridge, 2013;Lee et al., 2015;Sarahs, 2016).Such a method has no requirement in terms of initial distribution for mesh points and has been applied in various contexts (for example chemical spreading, Lukyanov et al., 2012, andtumour growth, Lee et al., 2013).This paper proposes to adapt two popular DA schemes: a 3D-variational scheme (or 3D-Var, see e.g.Lorenc, 1986;Nichols, 2010) and an Ensemble Transform Kalman Filter (or ETKF, see Bishop et al., 2001;Hunt et al., 2007) to the estimation of the state of an ice sheet modelled using the moving point method detailed in Bonan et al. (2016).The approach is validated by twin experiments using available classical surface observations (surface elevation and surface velocity, see Vaughan et al., 2013).
Observations of the position of the moving boundary (see e.g.Dyke and Prest, 1987 for observations of continental margins in palaeoglaciology) are also assimilated by using a straightforward observation operator.The paper is organised as follows: in Sect. 2 we recall the key points of the moving point ice sheet model, in Sect. 3 we describe how to use the 3D-Var and the ETKF for our state estimation problem, in Sect. 4 and 5 we validate our approach by performing several twin experiments before concluding in Sect.6.
2 Moving-point ice sheet model

Ice sheet dynamics
We consider a single phase, radially-symmetric, grounded ice sheet (no floating ice), centred on the origin r = 0 of the radial coordinates.The origin is called the ice divide.
The geometry of the grounded ice sheet is described by its surface altitude, s(t, r), the ice thickness, h(t, r) and the altitude, b(r), of the fixed bedrock on which the ice sheet lies (see Figure 1).These quantities are linked through the relation (1) The position of the edge of the ice sheet r l (t), also known as the ice sheet margin, is implicitly determined by the Dirichlet boundary condition h(t, r l (t)) = 0 . (2) The evolution of an ice sheet is governed by the balance between the mass exchanges at the surface (snow precipitation and surface melting) and the ice flow that carries the ice from the interior of the ice sheet towards its margins.This is summarised by the mass balance equation where m(t, r) is the surface mass balance and U (t, r) is the vertically averaged horizontal component of the ice velocity in the sheet.In the numerical experiments (see Sect. 4 and 5) we use two different surface mass balances: a function that only depends on the radius r and a more complex surface mass balance which depends on the atmospheric temperature that evolves with the geometry of the ice sheet.Both surface mass balances are described in detail in Appendix A.
The velocity of the ice is derived using the Shallow Ice Approximation (Hutter, 1983), which leads to the following analytical formulation of the vertically averaged horizontal component of the ice velocity U (t, r): where s is given by Eq. ( 1) and the parameters involved in the Shallow Ice Approximation (SIA) are summarised in Table 1.

Moving-point method
The moving-point numerical method we use in this paper relies on the computation of point velocities and point locations.This type of method belongs to the family of velocity-based (or Lagrangian) methods (Cao et al., 2003).Here the velocity of mesh points is obtained by conserving local mass fractions (Baines et al., 2005(Baines et al., , 2011)).To calculate the velocity we first define the total volume of the ice sheet θ(t) as Assuming that the flux of ice through the ice sheet margin is zero, its rate of change θ depends only on the surface mass balance, We now define the relative mass fraction µ(r) relative to the moving point r(t).Since the density of ice ρ i is assumed constant, volume fractions and mass fractions are equivalent and The velocity of the moving point r(t) is defined implicitly by keeping µ(r) constant in time.By differentiating Eq. ( 8) with respect to time using Leibniz' integral rule, we obtain the velocity of every interior point One of the points is dedicated to the static ice divide r = 0, while another point tracks the position of the margin r l (t), which moves at the velocity (Bonan et al., 2016) Once the velocity of each moving point has been obtained from Eq. ( 9) or Eq. ( 10), the moving points are moved in a Lagrangian manner using the explicit Euler scheme The total mass θ(t) is updated in the same way using θ(t) from Eq. ( 7).Finally, the ice thickness profile is updated by differentiating Eq. ( 8) with respect to r giving 4 Nonlin.Processes Geophys.Discuss., doi:10.5194/npg-2016Discuss., doi:10.5194/npg- -45, 2016 Manuscript under review for journal Nonlin.Processes Geophys.

Numerical model
From the equations detailed in Sect.2.2, a finite difference algorithm is derived (see Bonan et al., 2016, for the full algorithm).
The mesh consists of n r moving nodes with the positions No further assumption is made on the spatial distribution of the moving nodes.At each node ri there is an associated ice thickness h i and a fixed mass fraction µ i .By construction, µ 1 = 0, µ nr = 1 and the ice thickness at the ice sheet margin The user provides the initial mesh and the ice thickness at mesh points in order to initialise the numerical model.From these quantities, the initial total mass and the mass fractions are calculated by discretising Eq. ( 6) and Eq. ( 8) using the following 3 State estimation of a system modelled with a moving mesh We now recall the basics of data assimilation before explaining how to adapt the 3D-Var and the ETKF methods to our context.
We then clarify the form of the observation operator for various types of observations that we assimilate.

Data assimilation
We consider data assimilation in a discrete dynamical system evolving in time.We denote by x k the vector of size n x describing the state of the system at time t k .For example, in our numerical ice sheet model, ice thickness at mesh points are elements of the state vector.The state x k is propagated forward in time to a time t k+1 by the nonlinear model M k,k+1 .Assuming the model is perfect, we have Observations are available at times t k and are related to x k through the equation where y k is a vector of p k observations taken at time t k , H k is the, possibly nonlinear, observation operator and ε k is the observation error vector, which is assumed to be unbiased (zero mean) with covariance matrix R k .
The objective of DA is to provide an optimal estimate x a k of the system, called the analysis, by combining observations with information derived from the model.We consider in this paper two different DA schemes: a 3D-Var scheme and an ETKF.

3D-Var
3D-Var (see e.g.Lorenc, 1986;Nichols, 2010) aims to provide the optimal estimate x a k by minimising the cost function where x b k is a prior, or background, estimate of the state of the system (generally obtained by propagating forward in time the previous analysis x a k−1 with Eq. ( 16)).The error in the prior estimate is assumed to be unbiased with covariance matrix B k .
We take the observation operator H k to be linear around x b k , meaning that where H k is the linearisation of the observation operator about the background x b k .Under this assumption, the cost function has an explicit minimum where In theory the true background error covariance matrix B k should be updated at each time step.However, this process is extremely expensive for real-time applications and, instead, we use a matrix with a simplified structure specified by the user.
We will see in the numerical experiments (Sect.4 and 5) how setting B k appropriately is essential in order to obtain good estimates.

Ensemble Transform Kalman Filter
The Ensemble Kalman Filter (EnKF) introduced by Evensen (1994) aims to approximate the Extended Kalman Filter using a Monte Carlo method.At each time step, the state of the system is represented by an ensemble of N e realisations The state estimate is given by the ensemble mean and the state error covariance matrix by the ensemble covariance matrix where X k is the anomalies matrix defined as From the ensemble covariance matrix we can define the matrix Corr that contains an estimate of the correlation between the state variables to be where [Corr] i,j and [P e,k ] i,j denote the entry in the i-th row and j-th column of Corr and P e,k , respectively.
The forecast step propagates the ensemble from time t k to t k+1 with the nonlinear model M k,k+1 .For the analysis step we use the efficient Ensemble Transform Kalman Filter (ETKF) introduced by Bishop et al. (2001) and follow the implementation of the algorithm given by Hunt et al. (2007).
The ETKF may generate ensembles of analyses with underestimated spread, which can lead to the divergence of the filter.
We use an inflation procedure (Anderson and Anderson, 1999) here to avoid this potential degeneracy.In the rest of the paper the inflation factor is denoted by the parameter λ infla .
In the twin experiments performed in Sect. 4 and 5 we use a large number of ensembles to avoid producing spurious correlations in P e,k .Therefore, no localisation has been employed in this paper.
3.2 Form of the state vector in the moving mesh case Traditionally, in a data assimilation scheme the state vector includes all the physical variables of the given dynamical system.
For a fixed-grid numerical method the state variables are defined at fixed spatial positions.For example, for a grounded ice sheet modelled with a fixed-grid method (and assuming every parameter is perfectly known), the unknown variables are the ice thicknesses located at known positions (see e.g.Bonan et al., 2014).
In contrast, the primary characteristic of a moving-point method is that the numerical domain evolves in time.The positions of the nodes evolve jointly with the state variables according to the dynamical system equations and can be updated using the assimilation scheme.We therefore include the positions of the points in the state vector.As a consequence we define the state vector x as follows Estimates obtained by combining DA with this formulation of x using a moving-point numerical model provide more information on the state of the system than if we were using a fixed-grid method.
In particular, for an ice sheet model, this approach gives us a direct estimation of the position of the ice sheet margin that cannot be obtained in fixed-grid methods without interpolation.In this case, we do not include in x the ice thickness at the margin h nr or the position of the ice divide r1 as both are fixed to zero.The DA schemes must, however, provide estimates with strictly positive ice thicknesses h i , i = 1, . . ., n r − 1, and a preserved order for node positions to respect the assumption of the moving mesh scheme.
This can be achieved with 3D-Var if the specified background covariance matrix B k in Eq. ( 21) is prescribed carefully.At (we drop the time index k for clarity) as where B h is the background error covariance matrix between the state variables, B r is the error covariance between mesh point locations and B rh includes the cross-covariances between errors in point locations and errors in state variables.The different components of the state vector are then updated by the following analysis step The most difficult step with this form of analysis is, in general, to set appropriately the cross-covariances in B rh that are needed for the update stage.For example, if either H h or H r is zero, a non-zero B rh matrix is the only way to correct estimates of both x h and x r .However, we will see in the next section that in our assimilation systems for the ice sheet model, the observation operator depends explicitly on both ice thickness variables and mesh node locations and, therefore, by setting B rh to zero we can still obtain good estimates.
For the moving-point ice sheet model, the DA analysis step updates both ice thickness variables and node positions, but the total mass and mass fractions have to be updated as well, since they are not preserved by the analysis (and there is no reason to preserve them).Therefore these quantities need to be 'reset' from the analysed state vector.This is easily done by using Eq. ( 14) and Eq. ( 15).The adapted 3D-Var scheme is performed according to the following steps: 1. Calculate a forecast of the state vector x b by using the previous analysis solution to initialise the numerical moving point model.
2. Use the analysis scheme (Eq.( 28) and Eq. ( 29)) to produce the analysis x a .
3. From x a , calculate the analysed total mass θ a and update the mass fractions µ a using Eq. ( 14) and Eq. ( 15).
4. Evolve the analysis solution using the numerical moving point model to the next time where observations are available.
The adapted ETKF roughly follows the same path as 3D-Var except that, at step 1, we calculate the forecast for each member of the ensemble and, at step 3, the total mass and mass fractions have to be updated for each member of the ensemble (they are different for each ensemble member).The strict positivity of ice thickness variables and the order required in Eq. ( 13) for node positions are ensured by appropriately setting the initial ensemble in the ETKF.

Type of observations assimilated
In the twin experiments performed in Sect. 4 and 5, we use three different conventional types of observations of an ice sheet system that are available in reality (see e.g.Vaughan et al., 2013).The first is direct observations of the ice thickness.Assuming that we have an observation of the ice thickness located at position r o , we define the associated observation operator as which is merely a piecewise linear interpolation operator.Note that H depends on both ice thickness variables h i and node locations ri .We also assimilate observations of surface elevation and surface ice velocity.We again use a piecewise linear interpolation operator as in Eq. ( 30).For observations of surface elevation, we have with For observations of surface ice velocity, we have except for u s,1 = 0.
We may also assimilate observations of the position of the ice sheet margin.Using a moving point method allows the movement of boundaries to be tracked explicitly.In our context, the position of the ice sheet margin is represented by rnr .As a consequence the observation operator for such an observation is defined by The operator is continuous and linear.This makes the assimilation of the position of the margin straightforward in comparison with the same assimilation with a fixed grid model (see e.g.Lecavalier et al., 2014).9 Nonlin.Processes Geophys.Discuss., doi:10.5194/npg-2016-45,2016 Manuscript under review for journal Nonlin.Processes Geophys.
To demonstrate the efficiency of our DA approach, we perform twin experiments with two different configurations.In this section we consider experiments using an idealized system with a flat bedrock and the EISMINT surface mass balance detailed in Eq. (A1).

Experimental design
We first generate a model run with the moving point numerical model from known initial conditions.From this simulation observations are created with added error sampled from a Gaussian distribution.This run is used as a reference to measure the quality of the DA estimates.
We define the reference initial ice thickness profile by the function, where h 0 = 2000 m and r max = 450 km.This function gives a smooth interior profile with a steep snout at the ice sheet margin r max .This is in compliance with the physics involved in the ice sheet model and provides an initial state with a margin that is immediately in motion.The reference run is obtained with an initial mesh of n r = 28 points evenly spaced between r1 = 0 and rnr = 450 km.The model time step is ∆t = 0.02 yr, the bed elevation b is fixed to zero and the surface mass balance used is from the EISMINT benchmark (Eq.( A1)).The experiment starts at time t = 0 yr and ends at t = 2000 yr.The evolution of the reference ice thickness profile can be seen in Fig. 2.
Radius (in km) To evaluate the performance of our DA approaches, we compare the estimated ice thickness profiles with their reference counterparts.This is mostly done graphically.We also study the quality of the estimates of two variables: the ice thickness at the ice divide r = 0 and the position of the ice sheet margin.

Updating the ice thickness only
We begin by studying the performance of the DA schemes in the idealized configuration where we assimilate observations of ice thickness only.We start with an experiment using the 3D-Var algorithm in which only the ice thickness is updated at the assimilation times and the mesh point positions are not updated.
The background state is defined as follows: at initial time, the background ice thickness profile is set using the same profile as the reference (Eq.( 37)) but with h 0 = 2100 m (+5 % error from the reference) and r max = 472.5 km (also +5 % error), the background mesh consists of n r = 28 points, evenly spaced between r1 = 0 and rnr = 472.5 km at initial time, the model time step is ∆t = 0.02 yr.
As we are using a 3D-Var scheme in this experiment, the background error covariance matrix B needs to be prescribed at both times of assimilation (t 1 = 500 yr and t 2 = 1500 yr).In this first experiment we only update ice thickness variables so we set the background error covariance matrix for point positions B r and the cross-covariance matrix B rh to zero.We define B h the covariance matrix for ice thickness variables as with D h the diagonal variance matrix and C h the correlation matrix.
where [C h ] i,j denotes the entry in the i-th row and j-th column of C h , rb i the location of the i-th mesh point of the background state at the time of assimilation and L h is some correlation length scale to be fixed.The SOAR function is preferred to a Gaussian structure as the matrix C h is better conditioned for inversion in that case (Haben et al., 2011).We set L h to 100 km.This definition of B takes into account the flow dependency of the moving point locations, making our approach adaptive.
Figure 3 displays B h at assimilation times t 1 = 500 yr and t 2 = 1500 yr.As the distance between grid points increases in time in the experiment, the covariances tend to reduce between the two assimilation times.For example the covariance between the location of points rb Covariances between variables at distant locations tend to reduce between the two assimilation times.The distance between adjacent nodes also tends to be greater in the centre of the mesh than at the boundaries, leading to a decreasing covariance at t2 = 1500 yr in this area.
addition we note decreased correlations for points around the centre of the mesh due to a greater distance between adjacent nodes in the centre of the grid than at the boundaries.
The formulation of B forces the re-computation of the matrix at every assimilation time.This is a limiting factor of our 3D-Var approach, especially for high dimensional systems making it cost more than traditional 3D-Var for fixed-grid models in which B is only computed once.Nevertheless, this approach ensures that this moving-point framework produces positive estimates of ice thickness variables and a smooth interior profile in accordance with the physics of the system.Since we only update x h in this experiment, the position of the margin is not modified by our update.Nevertheless, by correcting the interior of the ice sheet, the forecast of the migration of the margin is improved (see central and right picture after t = 500 yr, Fig. 4), and at the second assimilation time, t = 1500 yr, the absolute difference between the position of the margin before analysis and its reference position is only 5.6 km (instead of 15.9 km without DA).

Updating ice thickness variables and node positions
We now use 3D-Var to update both ice thickness variables and node locations.The definitions of B h and B rh remain the same as in the previous experiment, but we set the covariance matrix for node positions B r to be B r = D where L r is a correlation length scale fixed to 100 km.The formulation of B r aims to ensure that the order between mesh points defined by Eq. ( 13) is preserved by the 3D-Var algorithm.Since the distance between nodes evolves in time, it is even more important than in the previous case to use a flow-dependent background error covariance matrix B. divide (r = 0) is reduced from 100 m to 60.2 m by the 3D-Var analysis at time t 1 = 500 yr, which is similar to the previous experiment.However, we now obtain at t 1 = 500 yr a very accurate ice thickness profile close to the margin and its estimated position has an absolute error of only 0.2 km.This shows that the estimated position of the ice sheet margin can be accurately corrected by only using standard observations (no observation of the position of the margin is involved in this experiment).At the second time of assimilation at t 2 = 1500 yr, the estimate is degraded, however, as a result of using fixed variances in the matrix B. This behaviour is discussed further in Sect.4.5.
In these experiments we have specified a fixed form for the the background error covariance matrices.We next show, using an ETKF, how the covariances are expected to evolve with time and the effects of this on the assimilation.

Using an ETKF
We now perform the same experiment as before except that we now use an ETKF.The key question is how to generate the initial ensemble composed of N e members.The easiest way is to add noise to a background state sampled from a Gaussian law N (0, B) with B the background error covariance matrix defined in Eq. ( 27).
In this experiment we generate an initial ensemble of N e = 200 members using: the same background state used in the experiments detailed in Sect. 4. To avoid this problem we have decided just to reduce the variance for the position of points near the ice divide using Eq. ( 41).
The new ensemble mean has, at the initial time, an estimated position of the margin of 472.9 km with an estimated standard deviation of 22.8 km (where the true value at t = 0 a is 450 km).
We do not use any inflation in this experiment (λ infla = 1).
Results are summarised in Fig. 6.At the first time of assimilation t 1 = 500 yr, the analysis step corrects the ice thickness profile well.The estimate of the ice thickness at r = 0 is of the same quality as in the previous experiments (absolute error of 46.9 m) and the estimate of the position of the margin is reduced from 483.1 km (forecast mean with estimated standard deviation 18.9 km) to 468.8 km (analysis mean with estimated standard deviation 7.1 km).The estimate obtained by the ETKF is in accordance with the true value (which is within the ensemble spread) and the absolute error of 7.5 km is of the same 14 The ETKF provides information on the covariance structures for ice thickness variables and mesh point positions.We display estimated standard deviations and an estimate of the correlation matrix Corr (see Eq. ( 25)) in Fig. 7 for the analysis ensemble at time t = 500 yr.close to the margin in this case.The standard deviation for the position of the margin is reduced from 18.9 km to 7.1 km by the analysis.The ETKF also produces strong anti-correlations between ice thickness variables and node positions, meaning that where ice thickness variables become larger associated nodes need to retreat to fit the observations of ice thickness.

Comparing 3D-Var and the ETKF
We now compare the results from applying the 3D-Var and ETKF assimilation schemes in the case where we observe only the ice thickness.We focus on the accuracy of the estimated ice thickness at r = 0 and the position of the margin.
Figure 8 shows the evolution of the absolute errors in the estimates of the ice thickness at r = 0 and in the position of the margin for the ETKF and for 3D-Var, with and without node updates.All three methods provide improved estimates at the first analysis time (t 1 = 500 yr), leading to good forecasts up to the next assimilation time.We find that the ETKF tends to perform better than the variational approach and that for 3D-Var the estimates obtained by updating both ice thickness variables and node positions are generally better than those where only ice thickness variables are updated.For 3D-Var without node updates, the analysis at the second time of assimilation (t 2 = 1500 yr) of the ice thickness at r = 0 is unfortunately degraded relative to the forecast, but the estimated position of the margin is still improved by the second analysis.In the case where ice thickness and nodes are updated, the estimates of both ice thickness at r = 0 and the position of the margin are degraded at the second time of assimilation.This weakens the confidence in the forecast and we partially lose what we had gained from the previous analysis.This effect would not necessarily appear with another set of observations of thickness, but the experiment shows the sensitivity of 3D-Var to current observations because of the use of fixed variances in the prescribed covariance matrix B.
Using the ETKF assimilation scheme, where the covariance matrix fully evolves in time, is seen to improve the overall estimates.At each assimilation time, the errors in the estimated ice thickness and the position of the margin are decreased.
Notably we do not observe any degrading of the estimates at the second time of assimilation.This demonstrates the better 16 through the ensemble ensures that the ETKF is a more reliable scheme than 3D-Var.This improvement has a computational cost, however, as we now need to run the model N e times instead of once for 3D-Var.

Assimilating observations of the position of the margin
In this section we perform the same experiments as previously, but we now assimilate not only the same observations of ice thickness as before but also observations of the position of the margin.We consider only the case of 3D-Var with grid update and the ETKF.
Absolute errors for the estimates of the ice thickness at r = 0 and the position of the margin are shown in Fig. 9.In both cases assimilating observations of the position of the margin is beneficial to our estimates of the margin and of the ice thickness profile close to the margin.For example the estimated position of the margin at time t = 500 yr has an absolute error of 4.2 km for the ETKF (compared to 7.5 km previously).Not surprisingly it does not change the results for the ice thickness at r = 0.

Absolute error position of the margin
No Assim 3D-Var grid update ETKF mean Figure 9. Evolution of the absolute error of the estimated ice thickness at r = 0 and the estimated position of the margin when we observe the ice thickness and the position of the margin.We compare the absolute errors obtained when we use 3D-Var with correction of the position of grid nodes and when we use an ETKF.In both experiments the results are improved with respect to the position of the margin (compared to results detailed in Fig. 8).No improvement (nor degradation) is observed for the ice thickness at r = 0.
Adding observations of the position of the margin in the data assimilation system reduces the estimated standard deviations obtained with the ETKF for variables close to the margin.For example, the estimated standard deviation for the position of the margin is now 5.8 km instead of 7.1 km.Not surprisingly it has no influence on the standard deviation for variables close to the ice divide.The estimated correlation structure (not shown) is also not modified by adding observations of the position of the margin in the DA system.17 In this section we consider experiments using a more realistic configuration with a non-flat bedrock and an advanced surface mass balance, detailed in Appendix A2.

Experimental Design
We generate observations from a new reference run.We use a non flat fixed bedrock whose elevation is defined by the equation The reference run is generated from a realistic initial state obtained with the following steps: -Start with an ice sheet profile following eq.( 37) with h 0 = 2000 m, r max = 300 km and n r = 21 computational mesh points evenly spaced between r1 = 0 and rnr = 300 km.
-Run the numerical model with a fixed climate forcing where T clim = 4 o C until reaching the steady state (a 30, 000 yr run with a ∆t = 0.01 yr time step).
-From this steady state, run the numerical model with a linearly warming climate forcing from T clim = 4 o C with dT clim /dt = 0.02 o C.yr −1 for an extra 100 yr (∆t = 0.01 yr).The state obtained at the end of the run is the initial state (see Fig. 10).

Assimilating observations of surface elevation
We begin by studying the performance of the DA schemes where we assimilate only observations of surface elevations.
For 3D-Var the estimates are obtained using an initial background state defined as x b = 0.95 x ref (0) with a 5% smaller extent than the reference state.The flow-dependent background error covariance matrix B is defined as in Eq. ( 27).The matrix B h is defined as in Eq. ( 38 We first study the results obtained with the ETKF.At the end of the data assimilation window, t = 10 yr, the ice thickness profile is retrieved well everywhere by the mean of the ensemble and the reference profile is within the ensemble spread (see Fig. 11).We note that the estimate of the ice thickness at the ice divide is improved by the first analysis.After time t = 7 yr, however, the estimate is worsened by the analysis.This is because the ensemble spread is too small in that area.This can be fixed by taking a larger inflation parameter λ infla , but the estimates of other variables are then degraded.The estimated position (mean) of the margin at t = 10 yr is 1158.0km with an ensemble standard deviation of 3.1 km.In comparison to the reference value at that time, r = 1159.9km, we see that the ETKF with a large ensemble performs well.The quality of the estimates is also kept high during the forecast (from t = 10 yr to t = 20 yr).For example the absolute error on the position of the margin is kept below 2.5 km over this time window.
With respect to the covariance matrix, the estimates seem to show a similar behaviour to those of the experiment detailed in Sect.4.4 using the ETKF where observations of ice thickness are assimilated (see Fig. 12), but with a larger correlation length scale.The similarity can be explained by the similarity of the construction of the initial ensemble (the same structure for the background covariance matrix B used to sample the Gaussian noise added to the background state) and by the similarity of the observation operators for ice thickness and surface elevation.We now compare the ETKF with results obtained with 3D-Var.Absolute errors in the ice thickness at r = 0 and in the position of the margins are displayed for both cases in Fig 13 .As in previous experiments, the ETKF performs better than 3D-Var.For example, the absolute error for the ice thickness at the ice divide stays below 60 m after t = 1 yr for the ETKF.
By contrast, the absolute error for 3D-Var can be up to 125 m.The same statement remains valid for the absolute error in the position of the margin, which stays below 8 km for the ETKF after t = 2 yr, yet can be up to 20 km for 3D-Var.We remark that, since the background state is smaller than the reference state, 3D-Var does not assimilate all available data.Indeed the algorithm cannot incorporate observations outside the background domain because of the form of the observation operator (see Eq. ( 31)).This is not, however, the case for the ETKF, even if the ensemble mean has a smaller domain than the reference 20

Assimilating observations of surface velocity and position of the margin
We now consider assimilating observations of surface ice velocity and the position of the margin (if we only assimilate observations of surface ice velocity, the problem is undetermined).Again we want to compare the accuracy of 3D-Var and the ETKF using this new set of observations.We use the same background state, the same structure for B and the same initial ensemble as before.The observation operator for surface 21 Nonlin.Processes Geophys.Discuss., doi:10.5194/npg-2016Discuss., doi:10.5194/npg- -45, 2016 Manuscript under review for journal Nonlin.Processes Geophys.Published: 12 September 2016 c Author(s) 2016.CC-BY 3.0 License.velocities is nonlinear (see Eq. ( 34)).For that reason, even if the ensemble is large, inflation is mandatory for the ETKF.We take an inflation of λ infla = 1.10.If the inflation is taken any larger in this example, the ETKF analysis produces ensemble members with a non-ordered grid and the experiment cannot be pursued.
We first study the results obtained with the ETKF.At the end of the DA window, t = 10 yr, the ice thickness profile is retrieved well everywhere by the mean of the ensemble, except near the ice divide r = 0 (see Fig. 14).This is due to the relatively large uncertainty of surface velocity observations near the ice divide compared to the reference value at the same point (here σ o us = 30 m.yr −1 and the reference surface velocity near the ice divide is below 0.1 m.yr −1 ).The estimated (mean) position of the margin at t = 10 yr, is 1144.7 km with an ensemble standard deviation of 12.1 km.This is an absolute error of 15 km, so is worse than in the case where we observed the surface elevation, but assimilating these data still provides better estimates than those obtained with no assimilation.This comment remains valid for the forecasts obtained after t = 10 yr since estimates of the position of the margin are not degraded over the time window [10 yr, 20 yr].
Estimates of the standard deviations and covariances, as shown in Fig. 15, differ from those of the previous experiment (see Fig. 12 for comparison).We observe that the standard deviations for the node positions are smaller in the middle of the ice sheet than in the previous experiment, but near the margin these are larger.The reduction in the standard deviation for ice thickness variables close to the ice divide is less significant than in the previous experiment.This is due to the relatively large uncertainty of surface velocity observations near the ice divide compared to the reference value at the same point.We remark that assimilating observations of surface ice velocity together with the position of the margin leads to an increased correlation length scale for ice thickness variables and to a smaller correlation length scale for node positions compared to the previous experiment.Finally the cross-covariances have smaller anti-correlations and positive correlations appear between ice thickness variables in the interior of the ice sheet and between node positions close to the margin.These differ significantly from the case where we assimilate observations of surface elevation as a result of the difference in observation operators.surface ice velocities and the position of the margin in the advanced configuration.We compare the absolute errors obtained when we use 3D-Var with the correction of the grid-node positions and when we use an ETKF.The ETKF performs better than the 3D-Var with respect to the position of the margin, but 3D-Var seems to give better results for the ice thickness at r = 0.

Conclusion and Prospects
In this paper we have adapted standard data assimilation techniques (a 3D-Var scheme and an ETKF) to estimate the state of a 1-d ice sheet model using a moving point method.This is done by including both ice thickness variables and the location of mesh nodes in the state vector.The only requirement is to ensure that the update does not produce a non-ordered moving mesh.
This can be achieved either by using an appropriate flow-dependent background covariance matrix with large correlations between adjacent mesh points or by using an ensemble with the same properties.This combination has been validated with various twin experiments assimilating classical available observations for an ice sheet (ice thickness, surface elevation and surface ice velocity) and also observations of the position of the boundary.These twin experiments show in particular that: the form of the state vector allows the explicit tracking of boundary positions for moving boundary problems; this form also allows a straightforward and efficient assimilation of boundary positions (in this paper, the position of the margin);

Figure 1 .
Figure 1.Section of a grounded radially-symmetrical ice sheet.
time t k we decompose the background error covariance matrix B and the tangent linear matrix of the observation operator H 7 Nonlin.Processes Geophys.Discuss., doi:10.5194/npg-2016-45,2016 Manuscript under review for journal Nonlin.Processes Geophys.Published: 12 September 2016 c Author(s) 2016.CC-BY 3.0 License.

Figure 2 .
Figure 2. Ice thickness profile from the reference run in a simple case (flat bedrock, EISMINT surface mass balance from Eq. (A1)).The initial state follows the profile of Eq. (37) with h0 = 2000 m and rmax = 450 km.The reference run is obtained with an initial mesh of nr = 28 points evenly spaced between r1 = 0 and rnr = 450 km.From the reference run we generate observations of ice thickness and the position of the ice sheet margin at times t 1 = 500 yr and t 2 = 1500 yr.Observations of thickness are taken at each point except at the margin (so a total of 27 observations) with D h is simply set to σ b h 2 I nr−1 with σ b h = 100 m.The background error correlation structure follows a Second-Order AutoRegressive (SOAR) distribution with

Figure 3 .
Figure 3. Covariance matrices for ice thickness variables B h used by the 3D-Var at assimilation times t1 = 500 yr and t2 = 1500 yr.

Figure 4 .
Figure 4. Left: 3D-Var analysis at time t = 500 yr compared with the forecast and the reference when we update only ice thickness variables.The ice thickness profile is improved, especially between r = 100 km and r = 400 km.Centre: Evolution of the position of the margin with time.Even if the position of the margin is not directly updated, the trajectory of the margin is corrected as a result of the ice thickness update.Right: Evolution of the position of grid points with time.The trajectory of each grid node is corrected after each analysis, as is the margin.
and C r a correlation matrix.The matrix D r is set to to σ b r 2 I nr−1 with σ b r = 22.5 km and C r follows a SOAR distribution with

Figure 5 .
Figure5.Left: 3D-Var analysis at time t = 500 yr compared with the forecast and the reference when we update ice thickness variables and node locations.In contrast to the results shown in Fig.3, the ice thickness profile is substantially improved close to the margin.Centre:Evolution of the position of the margin with time.The estimates are of very good quality even if the margin is not observed directly.Right:Evolution of the position of mesh points with time.The trajectory of each node is corrected by each analysis, as is the margin.
2 and 4.3, -B h defined by Eq. (38) with the diagonal matrix D h = σ 2 h I nr−1 , σ b h = 100 m, C h defined by Eq. (39), L h = 100 km, -B r taken as D with C r defined by Eq. (40) with L h = 100 km and the diagonal matrix D r defined as [D r ] ii = min σ b r , α ri i = 1, . . ., n r − 1 (41) with σ b r = 22.5 km and α = 0.2, -B rh set to zero.Note that the definition of B is slightly different from the previous experiment as we choose different diagonal variances.The change is because of the high probability of generating useless initial meshes with negative radii using D r = σ b r 2 I nr−1 as the background standard deviation σ b r is larger than the background position of the first points (for example rb 2 = 17.5 km).

Figure 6 .
Figure 6.Left: ETKF analysis at time t = 500 yr compared with the forecast and the reference.The ice thickness profile is improved over the whole domain and the reference profile is within the ensemble spread.Centre: Evolution of the ice thickness at r = 0 with time.The estimates are of very good quality and the estimates seem to converge towards the reference value at the end of the study.Right: Evolution of the position of the margin with time.The ETKF provides consistent estimates and the reference value is always within the ensemble spread.

Figure 7 .
Figure 7. Standard deviations and correlation matrix Corr estimated from the ETKF analysis ensemble at time t = 500 yr when we use only observations of ice thickness.Auto-correlations between ice thicknesses are located in the top left corner of Corr, auto-correlations between node positions in the bottom right corner.The rest of the matrix depicts the cross-correlations.

Figure 8 .
Figure8.Evolution of the absolute error of the estimated ice thickness at r = 0 and the estimated position of the margin when we observe only the ice thickness.We compare the absolute errors obtained when we use 3D-Var without and with correction of the position of grid nodes and when we use an ETKF.

Figure 10 .
Figure 10.Initial state used to obtain a 20-year reference run under a warming climate as detailed in Sect.5.1 with nr = 21 grid points and a non-flat bed.The reference is obtained by running the model over 20 years from the initial state with a time step ∆ t = 0.01 yr and the same linearly warming climate forcing (with T clim = 6 o C at initial time t = 0 yr, and T clim = 6.4 o C at t = 20 yr).Over the 20-year run, the geometry of the ice sheet stays relatively similar to the geometry of the initial state due to the slow dynamics

Figure 11 .
Figure 11.ETKF results for the advanced configuration where observations of surface elevation are assimilated over the first 10 years and a forecast is made for 10 further years.Left: ETKF analysis at time t = 10 yr compared with the reference.Centre: Evolution of the ice thickness at r = 0 with time.Right: Evolution of the position of the margin with time.

Figure 12 .
Figure 12.Standard deviations and correlation matrix Corr estimated from the analysis ensemble at time t = 10 yr in the advanced configuration where we observe surface elevation.The matrix Corr has the same structure as B defined by Eq. (27).Both standard deviations and correlation structures are similar to Fig. 7.

Figure 13 .
Figure13.Evolution of the absolute error of the estimated ice thickness at r = 0 and the estimated position of the margin in the advanced configuration where we assimilate surface elevations over the first 10 years.We compare the absolute errors obtained when we use 3D-Var with the correction of the position of grid nodes and when we use an ETKF.The ETKF performs better than the 3D-Var for both variables.

Figure 14 .
Figure 14.ETKF results for the advanced configuration where observations of surface ice velocity and the position of the margin are assimilated over the first 10 years and a forecast is made for the following 10 years.Left: ETKF analysis at time t = 10 yr compared with the reference.Centre: Evolution of the ice thickness at r = 0 with time.Right: Evolution of the position of the margin with time.

Figure 15 .Figure 16 .
Figure 15.Standard deviations and correlation matrix Corr estimated from the analysis ensemble at time t = 10 yr in the advanced configuration where we observe surface ice velocity and the position of the margin.The matrix Corr has the same structure as B defined by Eq. (27).Both standard deviations and cross-correlation structures are different from those shown in Fig. 12.

Table 1 .
Parameters involved in the computation of the vertically averaged horizontal component of the ice velocity (Eq.4).

Table 2 .
List of parameter values used for the parameterised surface mass balance o C.m −1