Continuing dynamic assimilation of the inner region data in hydrodynamics modelling: Optimization approach

In meteorological and oceanological studies the classical approach for finding the numerical solution of the regional model consists in formulating and solving the Cauchy-Dirichlet problem. The related boundary conditions are obtained by linear interpolation of data available on a coarse grid (global data), to the boundary of regional model. Errors, in boundary conditions, appearing owing to linear interpolation may lead to increasing errors in numerical solution during integration. The methods developed to reduce these errors deal with continuous dynamic assimilation of known global data available inside the regional domain. Essentially, this assimilation procedure performs a nudging of large-scale component of regional model solution to large-scale global data component by introducing the relaxation forcing terms into the regional model equations. As a result, the obtained solution is not a valid numerical solution of the original regional model. In this work we propose the optimization approach which is free from the above-mentioned shortcoming. The formulation of the joint problem of finding the regional model solution and data assimilation, as a PDE-constrained optimization problem, gives the possibility to obtain the exact numerical solution of the regional model. Three simple model examples (ODE Burgers equation, Rossby-Oboukhov equation, Korteweg-de Vries equation) were considered in this paper. The result of performed numerical experiments indicates that the optimization approach can significantly improve the precision of the sought numerical solution, even in the cases in which the solution of Cauchy-Dirichlet problem is very sensitive to the errors in the boundary condition.

points of the grid are traditionally used in these methods. The problem is generally solved by using a proper numerical scheme. However, both initial and boundary values, which are obtained from measurements or other model outputs, contain errors, which can reach 30% from the true values. 25 For some problems, the solution can be very sensitive to these errors. At the same time some values of the sought solution at some points of the inner region, along with initial and boundary data, are often available. These additional data also have errors. The question arises: Is it possible to improve the accuracy of the solution of the Cauchy-Dirichlet problem by using these additional data (which may contain errors)? For some well-known equations of mathematical physics and dynamic 30 meteorology we will show here that the use of additional information on the solution values within the integration region can noticeably improve the accuracy of the sought solution.
An improvement of the solution accuracy of the Cauchy-Dirichlet problem is extremely important in meteorology because this is the problem of regional weather forecast. A distinction between two types of atmospheric forecast models needs to be done. These are global models, making forecast 35 for the whole earth, and regional models, which produce a forecast for a limited region. One of basic differences between these model types is the grid resolution. Global models use a low resolution space grid and regional models operate on a more dense mesh. The reason for the existence of these two model types is, basically, computational. Even nowadays it is not possible in acceptable time period to integrate global models with the detailed physics and with the space resolution of regional 40 models. It is worth noticing that global and regional models include different physical processes.
Global models describe large-scale slow-time varying processes, with the time period of more than 3 hours and the space scale larger than 60 km. Regional models can simulate the evolution of mesometeorological fast processes (small cyclones, storms, tornados), with the time period of less than 3 hours and the spatial extent between 2 and 60 km. 45 These models are described mathematically by systems of nonlinear partial differential equations. To solve the system corresponding to a global model we need only the initial condition and bottom and upper boundary conditions, as the domain is the whole sphere. To get the solution of a system corresponding to a regional model we also need lateral boundary conditions. Generally, the lateral boundary conditions for a regional model are obtained from the global model. Global model data 50 corresponding to the domain of a regional model lateral boundary are interpolated into a regional mesh and then are used for finding of the Cauchy-Dirichlet problem solution.
We should notice that the lateral boundary conditions obtained from the global model do not provide any information on the structures of the scales smaller than the size of the mesh of the global model. In fact, global models do not distinguish a local meteorological phenomenon of characteristic 55 length scale smaller than 30 km, because the space step is greater than this value. On the other hand, the parts of space and time spectrums of a regional model solution which correspond to largescale structures and long-period processes are in worse accordance with observations than those of a global model. This is related to the fact that a regional model does not have the information about the phenomena that occur outside its domain and therefore cannot describe with necessary precision 60 (because the lateral boundary data are in low resolution discrete space and time grid) the impact of these phenomena on the evolution of processes inside the integration domain.
For example, the weakly regular, low frequency oscillations of the sea surface temperature near the Peru coast, known as El Niño and La Niña events, have a large influence on the climate of Brazil. The circulation regimes over the Brazilian territory noticeably differ during these events. As 65 a result, the precipitation rate in the South of Brazil during the La Niña periods is two times smaller than during the El Niño periods. Regional models driven, for example, by the Reanalysis data do not reproduce with confidence this distinction between the El Niño and La Niña regimes. The reason of this inconsistency is, probably, poor space and time data resolution on the regional model boundaries.
Mathematically, it means that we are solving boundary value problems with significant errors on the 70 boundaries and the solution is sensitive to these errors.
The spectral nudging technique is one of methods proposed to use additional data from the inner domain in order to reduce boundary error influence and to improve a solution of initial-boundary value problem (Waldron et all., 1996;Storch et all., 2000;Kanamaru and Kanamitsu, 2005). This method supposes incorporation of the largest internal modes of some meteorological variables from 75 observations or from a driving model into the regional model solution. However the use of the spectral nudging technique requires inserting additional forcing terms into the evolution equations of a regional model. Hence the original model is corrupted and its new solution may not be close to the exact sought solution.
In this work we propose a method which is free from the above mentioned shortcoming. Firstly, 80 in Section 2 we give the formulation of the Cauchy-Dirichlet problem for a regional model and the general formulation of an optimization problem, and we show how the latter (with the accessibility of some additional conditions) can be applied for finding the solution of the former. In Section 3, for some simple equations, we show how the optimization approach can be used to obtain a solution of the initial boundary value problem. The equations considered here include the ordinary differential 85 equation of Burgers, the one-dimensional, linearized Rossby-Oboukhov partial differential equation, and the partial differential equation of Korteweg -de Vries. We also compared the sensitivity of the solutions, obtained by the traditional and the new approach with respect to the errors in the boundary conditions. In Section 4 we discuss the obtained results. Appendix A provides more detailed description of the optimization procedure. 2 Numerical solution of regional problems

The numerical forecast model
We formulate here the Cauchy-Dirichlet problem as it is formulated for the regional weather forecast.
To find a unique solution for the regional model equations we need the lateral boundary condition for entire time interval on which the solution is seeking. We obtain these data from the solution of an outer model which, as a rule, is a global model. We don't know an exact solution of the outer model (if we did, our problem already would have been solved), but we can find its approximate solution in the discrete form: (1) Here ∆ t is the evolution operator of the discrete model, Ψ is the vector of the prognostic functions, will be the solution of (1). We suppose that this solution is sufficiently close to the solution of hypothetical ideal model which exactly describes the real atmospheric processes.
The regional model is located in a closed area with boundary S and its discrete representation can be written in the same manner as for the global model where δ t is the evolution operator of the discrete regional model, G is the vector of the prognostic functions and F rd are discrete external forces, Y local is the initial condition, and G s is the boundary condition for the regional model.

100
Now, we can solve numerically the initial boundary value problem (2) using the solution Ψ sl of the global model (1) for getting the boundary conditions. The traditional approach to seek the solution of (2) consists in the interpolation of required data from Ψ sl on the regional grid for forming the initial and boundary conditions and then applying any numerical method to solve the model equations. But, as we have mentioned above, the boundary conditions contain errors which can strongly corrupt the 105 solution.
The use of the information obtained from the solution of the global model equations Ψ sl on the values inside the area of the regional model integration for all available time moments can help to overcome this difficulty.

Formulation of the optimization problem 110
We assume that the solution of regional model (2) cannot strongly differ from global model solution Ψ sl . Our objective is to find G sl on the fine (regional) mesh that satisfies the initial values and the equations of the regional model (2). The solution G sl also have to be as close as possible to the Ψ sl inside the regional model domain. We can formulate this aim as an optimization problem in the following manner (3) Here d(:, :) is the objective function, which represents the distance between G and Ψ sl vectors. In other words, we want to minimize the distance between the regional and the global model solutions under conditions that the regional model equations and the initial condition are satisfied. Note here that, when we talk about the solution G, we bear in mind the vector G = G(x, t) in a grid space of (x,t) -coordinates, where t corresponds to discrete time points of the regional model integration and 115 x to discrete mesh points in the regional model area.
There are different approaches to solve the optimization problem (3). For example, we can apply it to each time level, as it is done in finite-differences methods: For the given initial conditions (t 0 = 0), the solution is obtained at t 1 . Then, considering the solution at t 1 as the initial condition, the solution is found at t 2 and so on. On the other hand, we can consider the domain of definition 120 of the problem on the mesh including all time levels. In this case all available information from the global model for the period of integration will be taken into consideration simultaneously. We use here this approach.
3 Examples of application of the optimization method to the problems of small dimension 3.1 Burgers' equation 125 We shall demonstrate an application of the optimization method to problems of small dimension.
As a first example we consider the supersensitive boundary value problem for the Burgers' equation (Bohé, 1996): To get the analytical solution of this equation it is enough to integrate the left and right sides of the equation (4) over t: Then, after rewriting the foregoing formula as and integrating the left side over x and the right one over t, we can write the Burgers' equation solution as The graph of the analytical solution for equation (4) with the boundary condition x(0) = −1, x(1) = 1 and = 0.05 is presented in Figure 1. Numerical solution of this problem applying, for example, the shooting method coincides with great accuracy with the analytical one.
[Insert figure 1 here.] Let us choose several points from the analytical solution and slightly perturb them (till 5% from 130 its real value). This procedure models the boundary and inner domain data containing errors. Using these perturbed data for the boundary points x(0), x(1) we solve the boundary problem (4)  Here we consider the one-dimensional linear partial differential equation obtained during the linearization procedure of the Rossby-Oboukhov equation (Oboukhov, 1949): m is the Oboukhov scale, c 0 is the sound velocity, and U is the zonal wind, which is variated between 0 and 30 m/s.
For the periodic boundary condition where L is the size of integration area (we shall use L = 3 · 10 7 m), the solution of equation (5) can be written as where and A n and φ n are defined by the initial condition (Rossby, 1939).

155
For finding the numerical solution it is convenient to rewrite the equation in nondimensional form.
We choose the following scales S = 6 · 10 6 m, T = S/V = 6 · 10 5 s, V = 10 m/s. The dependent and independent nondimensional variables are defined as follows: , and equation (5) may be written in nondimensional form as where 1 b = S l0 = 2, β 0 = βS 2 V = 57.6 and U 0 = U V ∈ [0, 3]. For finite-difference discretization we use unconditionally stable scheme with truncation error Ø(∆x 2 , ∆t 2 ) given by At first, we generate a specific analytical solution (7) containing 85 modes. As a local model we consider the equation (10), with initial and boundary conditions, defined over [a, b] [0, L] (closed interval smaller than entire domain). For initial and boundary conditions we will use our specific form (see Fig. 4) of the global model solution (7). For simulating really encountered problems of atmospheric modelling we took the analytical solution in the points of the 165 the coarse grid with space step ∆x = 200 km and time step ∆t = 2 hour, and perturb its values randomly in such a way that a perturbation can reach till 30% of its exact value.
Primarily we find the solution of the forward Cauchy-Dirichlet problem for our local model. As In the next experiments the perturbed analytical solution on the coarse grid was used for formation of the boundary conditions. Figure 5 shows the exact solution and the solution with perturbed boundary conditions of the forward problem.

180
One can see that for the forward problem the numerical solution rather quickly diverges from the analytical solution due to the errors of the linear interpolation procedure in the border points. After 96 hours perturbed boundary solution has very faint resemblance with the analytical one.
In the second series of experiments we applied the optimization method to the local model. For this calculation we used all available global data (perturbed analytical solution on coarse grid) in the 185 inner local domain. Figure 6 shows the results of calculations. It is important to note, that, as in previous experiments, we chose the space ∆x = 100 km and the time ∆t = 3600 sec steps to be as large as possible.
[Insert figure 7 here.] In the optimization approach one can use larger steps than in the forward problem, because we 190 do not deal with a time-evolution problem, and there is no accumulation of numerical errors at each time step. In Figure 7, one can see that for space and time steps many times larger than those that were used in the forward problem (∆x = 100 kilometers and ∆t = 3600 seconds) we have far better agreement with the analytical solution. The decrease of time and/or space step in the optimization approach does not appreciably improve the solution, only the smallest details are 195 slighly better reproduced.
As the local mesh in the optimization approach is rather coarse, the average CPU time needing for solving the problem is smoller than for forward problem (0.5 seconds) in spite of greater computational complexity. Also we have to pay attention on a very weak sensitivity to the global data perturbations. The impact of every individual perturbation on the optimization solution is very small.

200
For example, increasing errors in randomly perturbed global data up to 60% does not strongly affect the solution as one can see in Figure 8. albeit seeking the solution on entire space-time grid, is not very numerically expensive, because the space, and, principally, the time steps can be chosen much larger than in the forward problem.

Korteweg-de Vries equation
Now we consider the Korteweg-de Vries equation: and find its numerical solution applying both the traditional forward method and the optimization method.
For the finite-difference discretization we use the following implicit numerical scheme (Furihata, 1999), which possesses the properties of total energy and mass conservation: As a numerical scheme is non-linear for obtaining of numerical solution of a forward problem we use the Newton method.
Following the steps of previos example, we choose as a reference model the analitical solution with cofficient b = a, γ = 2 and module of elliptic function m = 0.995. The Figure 9 represents 215 the solution on domain [−5, 15] at time t = 0.
[Insert figure 9 here.] As a local model we consider the equation (11)  [Insert figure 11 here.] In Figure 11 one can see that the forward problem with small errors in the boundary conditions can produce inacceptable numerical solution.

230
Now, we apply the optimization method to the local model using additional avaliable for inner points information from global model solution. Figure 12 shows the solution of the optimization problem. As in previous experiment we chose the space (∆x = 0.2) and the time (∆t = 0.002) steps as large as possible.

235
It can be clearly seen the advantage of the optimization approach in this case. The difference between numerical and exact solution is very small in comparison with the forward problem calculations.

Discussions and conclusions
The results of the numerical experiments presented here indicate that the optimization approach can

Formulation of optimization problem
We can express our problem as the following non-linear optimization problem with equality con- where V represents the global data on the regional mesh and h(u) = [h 1 (u) h 2 (u) . . . h m (u)] T is a vector of the discretizations of regional equations at each point of space-time mesh of the regional model.

255
A usual optimization technique to solve this kind of problem is to apply Newton's iteration method to the system of nonlinear equations arising from first-order necessary conditions, known as Karush-Kuhn-Tucker (KKT) conditions (Nocedal and Wright, 1999). For instance, the KKT conditions for the problem (A1) are where is the Jacobian matrix of the constraint function and λ represent the vector of Lagrange coefficients.
Each step of the Newton iteration associated with system (A2) is defined as follows: where J represents the Jacobian matrix of the system (A2) and ∇ 2 h i (u), i = 1, . . . , m, are Hessian matrices of the constraints.
The Jacobian matrix (A4) is a saddle point matrix and there are many methods that can be applied with assosiated linear system (Benzi, Golub and Liesen, 2005). But first, note that the derivative of the discretization vector of the constraints is described by a sparse matrix with block-diagonal structure, because this derivative is calculated at every point of time-space mesh. On the other hand, 260 the first part I+ i λ i ∇ 2 h i (u) of Jacobian matrix (A4) of the system (A2) includes the calculation of the Hessians of the constraints discretization vector, which is a computationally expensive procedure because calculations have to be made for every Newton iteration. The resulting matrix is generally dense. To accelerate the calculations and reduce the consuming of the computer memory we use the following Jacobian approximation: The use of the approximation B k for J(u k , λ k ) strongly simplifies the procedure of finding the solution of the linear system (A3) (Benzi, Golub and Liesen, 2005).
Acknowledgements. The authors gratefully acknowledge Prof. R. Laprise for constructive discussions with whom helped to make clear a number of diffcalties of studing problem and Dr. T.A. Tarasova for useful comments on the text that significantly improved the manuscript.