Introduction
An accurate estimation of subsurface geological properties like permeability
and porosity is essential for many fields, especially where such predictions
can have a large economic or environmental impact, for instance prediction of
oil or gas reservoir locations. When the geological parameters are known, a
so-called forward model is solved for the model state and a prediction can be
made. The subsurface reservoirs, however, are buried thousands of feet below
the Earth's surface and exhibit a highly heterogeneous structure, which makes
it difficult to obtain their geological parameters. Usually prior information
about the parameters is given, which still needs to be corrected by
observations of pressure and production rates. These observations are,
however, known only at well locations that are often hundreds of meters apart
and corrupted by errors. Instead of a well-posed forward problem, this gives
an ill-posed inverse problem of estimating uncertain parameters, since many
possible combinations of parameters can result in equally good matches to the
observations.
Different inverse problem approaches for groundwater and petroleum reservoir
modeling, generally termed history matching, have been developed over the
past years; e.g., implemented Markov chain Monte Carlo
methods with different perturbations and tested them on a 2-D reservoir
model; obtained reservoir parameter estimations using the
Gauss–Newton method; used the Levenberg–Marquardt method
to characterize reservoir pore pressure and permeability. A review of history
matching developments has been written by .
For reservoir models the terms “data assimilation” and “history matching”
are used interchangeably, as the goal of data assimilation is the same as
that of history matching, where observations are used to improve a solution
of a model. Ensemble data assimilation methods such as ensemble Kalman
filters were originally developed in meteorology and oceanography for state
estimation. They are now also frequently employed for parameter estimation in
subsurface flow models (e.g., ). A detailed review of ensemble Kalman filter
developments in reservoir engineering has been written by . An
ensemble Kalman filter efficiently approximates a true posterior distribution
if the distribution is not far from Gaussian, as it corrects only the mean
and the variance. For nonlinear models with multimodal distributions,
however, an ensemble Kalman filter fails to correctly estimate the posterior,
as shown by .
Importance sampling (IS) is quite promising for such models as it does not
have any assumptions of Gaussianity. It is also an ensemble-based method in
which the probability density function is represented by a number of samples.
One sample corresponds to one configuration of uncertain model parameters.
The forward model is solved for each sample and predicted data are computed.
Weights are assigned to the samples based on the observations of the true physical system and the predicted data.
The drawback of IS is that it does not update the uncertain parameters, but
only their weights; thus, an unaffordably large ensemble is required.
In order to decrease this cost,
a family of particle filters has been developed in which IS is supplemented
with resampling, and a sample is called a particle.
Significant work for parameter estimation using particle filtering has been
done in hydrology. used it to estimate model parameters and
state posterior distributions for a rainfall–runoff model.
compared an ensemble Kalman filter and a particle filter with different
resampling strategies for a rainfall–runoff forecast and found that as the
number of particles increases, the particle filter outperforms the ensemble
Kalman filter. employed particle filtering to correct the
soil moisture and to estimate hydraulic parameters.
The resampling in particle filtering is, however, stochastic. The ensemble
transform particle filter (ETPF) developed by is a particle
filtering method that deterministically resamples the particles based on
their weights and covariance maximization among the particles. ETPF has been
used for initial condition estimations and for parameter estimations in
chaotic dynamical systems with a small number of uncertain parameters
(Lorenz 63 model). It has not been applied, however, in subsurface reservoir
modeling for estimating a large number of uncertain parameters. In this paper
we employ it for estimating uncertain parameters in subsurface reservoir
modeling. ETPF provides the equations that are solved in the space defined by
the ensemble members. Therefore for comparison we employ the ensemble
transform Kalman filter (ETKF) developed by that also
transforms the state from the model space to the ensemble space, minimizes
the uncertainty in the ensemble space, and transforms the estimation back to
the model space.
In this paper we investigate the performance of ETPF and ETKF for parameter
estimation in nonlinear problems and compare them to IS with a large
ensemble. This paper is organized as follows: in Sect. we
describe IS, ETPF, and ETKF for parameter estimation. We apply these methods
in Sect. to a one-parameter nonlinear test case, where the
posterior can be computed analytically, and in Sect. to a
single-phase Darcy flow, where the number of parameters is 5 and 2500. In
Sect. we draw the conclusions.
Data assimilation methods
We implement an ensemble transform Kalman filter and an ensemble transform
particle filter for estimating parameters of subsurface flow. Both of these
methods are based on a Bayesian framework. Assume we have an ensemble of M
model parameters {um}m=1M; then, according to this framework,
the posterior distribution, which is the probability distribution
π(um|yobs) of the model parameters um
given a set of observations yobs, can be estimated by the
pointwise multiplication of the prior probability distribution
π(um) of the model parameters um and the conditional
probability distribution π(yobs|um) of the
observations given the model parameters, which is also referred to as the
likelihood function:
π(u_m | y^obs) = π(y^obs | u_m) π(u_m) / π(y^obs).
The denominator π(yobs) represents the marginal of
observations and can be expressed as
π(y^obs) = ∑_{m=1}^{M} π(y^obs, u_m) = ∑_{m=1}^{M} π(y^obs | u_m) π(u_m),
which shows that π(yobs) is just a normalization
factor.
Ensemble transform Kalman filter
Assume we have initially an ensemble of M model parameters
{umb}m=1M, where “b” refers to a background
(prior) ensemble, which are sampled from a chosen prior probability density
function; then the ensemble Kalman estimate (or analysis)
{uma}m=1M is given by
u_m^a = ∑_{l=1}^{M} diag(s_lm + q_l − 1/M) u_l^b,   m = 1, …, M,
where diag is a diagonal matrix, s_lm is the (l,m) entry of
a matrix S,
S = [I + (1/(M−1)) (A^b)^T R^{−1} A^b]^{−1/2},
and q_l is the lth entry of a column q:
q = (1/(M−1)) [1_M − S^2 (A^b)^T R^{−1} (ȳ^b − y^obs)].
Here I is an identity matrix of size M×M, 1_M is a vector of size M with all ones,
ȳ^b is the mean of the predicted data defined by
ȳ^b = (1/M) ∑_{m=1}^{M} y_m^b,
A^b is the matrix of background ensemble anomalies of the predicted data defined as
A^b = [(y_1^b − ȳ^b)  (y_2^b − ȳ^b)  …  (y_M^b − ȳ^b)],
and R is the measurement error covariance. To ensure that the analysis
anomalies remain zero-centered, we check whether
A^a 1_M = A^b S 1_M = 0,
which holds given S 1_M = 1_M and A^b 1_M = 0. The model
parameters u_m^b and the predicted data
y_m^b are related by y_m^b = h(u_m^b), where h is a nonlinear function, and here we
assume that the function h is known.
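For illustration, the ensemble-space update above can be written compactly in an equivalent mean-plus-anomaly form; the following Python sketch (using only NumPy) is purely illustrative, with assumed array names U, Y, y_obs, and R, and is not the implementation used for the experiments reported here.

import numpy as np

def etkf_update(U, Y, y_obs, R):
    # U: (n_par, M) background parameter ensemble; Y: (N_y, M) predicted data h(u_m^b)
    # y_obs: (N_y,) observations; R: (N_y, N_y) observation error covariance
    M = U.shape[1]
    u_mean = U.mean(axis=1, keepdims=True)
    y_mean = Y.mean(axis=1, keepdims=True)
    Au = U - u_mean                          # parameter anomalies
    Ay = Y - y_mean                          # predicted-data anomalies A^b
    Rinv = np.linalg.inv(R)
    # S = [I + (A^b)^T R^{-1} A^b / (M-1)]^{-1/2} via a symmetric eigendecomposition
    G = np.eye(M) + (Ay.T @ Rinv @ Ay) / (M - 1)
    vals, vecs = np.linalg.eigh(G)
    S = vecs @ np.diag(vals ** -0.5) @ vecs.T
    # mean-update weights in ensemble space
    w = S @ S @ Ay.T @ Rinv @ (y_obs - y_mean.ravel()) / (M - 1)
    # analysis mean plus transformed anomalies
    return u_mean + Au @ w[:, None] + Au @ S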
Ensemble transform particle filter
In particle filtering we represent the probability density function
using ensemble members (also called particles), as in the ensemble Kalman filter.
We start by assigning prior (background) weights
{wmb}m=1M to M particles and then compute new
(analysis) weights {wma}m=1M using the Bayes formula and
observations yobs:
w_m^a = π(y^obs | u_m^b) w_m^b / π(y^obs).
We assume that initially all particles have equal weight, i.e.,
wmb=1/M for m=1,…,M, and that the likelihood is
Gaussian with error covariance matrix R; then, from
Eq. () wma is given by
w_m^a = exp[−(1/2)(y_m^b − y^obs)^T R^{−1} (y_m^b − y^obs)] / ∑_{j=1}^{M} exp[−(1/2)(y_j^b − y^obs)^T R^{−1} (y_j^b − y^obs)],   m = 1, …, M.
In IS, which will be used in this paper as a “ground” truth, these weights
define the posterior pdf. The mean parameter for IS is then
ū^a = ∑_{m=1}^{M} u_m^b w_m^a.
It is important to note that IS does not change the parameters
u; it only modifies the weight of the particles (samples). Therefore
a resampling needs to be implemented for parameter estimation, which is
usually stochastic. Instead, particle filtering has been modified using a
deterministic coupling methodology, which resulted in the ensemble transform
particle filter of . ETPF looks for a coupling between two
discrete random variables B1 and B2 so as to convert the ensemble
members belonging to the random variable B2 with probability distribution
π(B2=umb)=wma to the random variable
B1 with uniform probability distribution π(B1=umb)=1/M. The coupling between these two random
variables is an M×M matrix T whose entries should satisfy
t_mj ≥ 0,   m, j = 1, …, M,
∑_{m=1}^{M} t_mj = 1/M,   j = 1, …, M,
∑_{j=1}^{M} t_mj = w_m^a,   m = 1, …, M.
An optimal coupling matrix T* with elements tmj* minimizes the squared Euclidean distance
J(t_mj) = ∑_{m,j=1}^{M} t_mj ‖u_m^b − u_j^b‖^2,
and the analysis model parameters are obtained by the linear transformation
u_j^a = M ∑_{m=1}^{M} t_mj* u_m^b,   j = 1, …, M.
Then the mean parameter for ETPF is
ū^a = (1/M) ∑_{m=1}^{M} u_m^a.
We use the FastEMD algorithm of to solve the linear transport
problem and get the optimal transport matrix.
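To make the coupling problem concrete, the following Python sketch computes the analysis weights for a Gaussian likelihood and solves the linear transport problem with a generic linear-programming solver (scipy.optimize.linprog) instead of FastEMD; this is much slower but makes the constraints on T explicit. All names are illustrative and this is not the implementation used here.

import numpy as np
from scipy.optimize import linprog

def etpf_update(U, Y, y_obs, R):
    # U: (n_par, M) background ensemble; Y: (N_y, M) predicted data; returns the analysis ensemble
    M = U.shape[1]
    Rinv = np.linalg.inv(R)
    d = Y - y_obs[:, None]
    logw = -0.5 * np.einsum('im,ij,jm->m', d, Rinv, d)
    w = np.exp(logw - logw.max())
    w /= w.sum()                               # analysis weights w_m^a
    # cost c_mj = ||u_m^b - u_j^b||^2; unknown transport plan t_mj stored row-major in m, j
    C = ((U[:, :, None] - U[:, None, :]) ** 2).sum(axis=0)
    A_eq = np.zeros((2 * M, M * M))
    for m in range(M):
        A_eq[m, m * M:(m + 1) * M] = 1.0       # sum_j t_mj = w_m^a
    for j in range(M):
        A_eq[M + j, j::M] = 1.0                # sum_m t_mj = 1/M
    b_eq = np.concatenate([w, np.full(M, 1.0 / M)])
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method='highs')
    T = res.x.reshape(M, M)
    return M * (U @ T)                         # u_j^a = M * sum_m t_mj u_m^b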
Remark. An important property of ETPF is preservation of imposed
interval bounds on ensemble members. Consider an ensemble of parameters
{umb}m=1M given by
u_m^b = (a_m^b, b_m^b, c_m^b)^T,   m = 1, …, M,
where we assume all the parameters {amb}m=1M,
{bmb}m=1M, and {cmb}m=1M are
bounded between 0 and 1. Therefore, the following inequalities hold:
0 < a_min ≤ a_m^b ≤ a_max < 1,   m = 1, …, M,
0 < b_min ≤ b_m^b ≤ b_max < 1,   m = 1, …, M,
0 < c_min ≤ c_m^b ≤ c_max < 1,   m = 1, …, M.
Now we assume two discrete random variables B1 and B2 have probability
distributions given by
π(B1 = u_m^b) = 1/M,   π(B2 = u_m^b) = w_m^a,
with wma≥0, m=1,…,M, and
∑m=1Mwma=1. As ETPF looks for a matrix
T* which defines coupling between these two probability
distributions, each entry of this coupling matrix satisfies the conditions
given by Eqs. ()–(). These conditions
ensure that each entry of the coupling matrix will be non-negative and less
than 1. Since the analysis given by Eq. () is
u_m^a = ( a_1^b (M t_1m*) + a_2^b (M t_2m*) + … + a_M^b (M t_Mm*),
          b_1^b (M t_1m*) + b_2^b (M t_2m*) + … + b_M^b (M t_Mm*),
          c_1^b (M t_1m*) + c_2^b (M t_2m*) + … + c_M^b (M t_Mm*) )^T,   m = 1, …, M,
these conditions lead to
0 < a_min ≤ a_m^a ≤ a_max < 1,   m = 1, …, M,
0 < b_min ≤ b_m^a ≤ b_max < 1,   m = 1, …, M,
0 < c_min ≤ c_m^a ≤ c_max < 1,   m = 1, …, M.
Thus the coupling matrix bounds the analysis ensemble members to be in the
desired range. This is not the case for ETKF, as the matrix S given
by Eq. () does not impose any of the inequality and equality
constraints, so it can result in values outside the bounds.
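The bound preservation follows from the analysis members being convex combinations of the background members: the columns of M T* are non-negative and sum to one. A small NumPy check (using the independent coupling t_mj = w_m / M purely for illustration) makes this explicit.

import numpy as np

rng = np.random.default_rng(0)
M = 5
U = rng.uniform(0.2, 0.8, size=(3, M))         # background members bounded componentwise
w = rng.dirichlet(np.ones(M))                  # some analysis weights
T = np.outer(w, np.full(M, 1.0 / M))           # a feasible coupling: sum_j t_mj = w_m, sum_m t_mj = 1/M
Ua = M * (U @ T)                               # each analysis member is a convex combination
assert np.all(Ua >= U.min(axis=1, keepdims=True))
assert np.all(Ua <= U.max(axis=1, keepdims=True))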
Localization
All variants of the ensemble Kalman filter and the particle filter are limited by
the ensemble size: even when the dimension of the problem is only a few
thousand, a large ensemble makes the assimilation computationally very expensive,
since each ensemble member requires a run of the forward model. Restricting the
ensemble to a small size, however, introduces sampling errors. To deal with this issue, localized ETKF (LETKF)
was introduced by and localized ETPF (LETPF)
by . More recent approaches to particle filter localization
include and .
For the local update of a model parameter um(Xi) at a grid point
X_i, we introduce a diagonal matrix C^i ∈ R^{N_y × N_y} in the observation space with an element
(C^i)_ll = ρ(‖X_i − r_l‖ / r_loc),
where i=1,…,n2, l=1,…,Ny, n2 is the number of model
parameters, Ny is the dimension of the observation space, rl denotes
the location of the observation, rloc is a localization radius,
and ρ(⋅) is a taper function, such as the Gaspari–Cohn function
by :
ρ(r) =
  1 − (5/3) r^2 + (5/8) r^3 + (1/2) r^4 − (1/4) r^5,   0 ≤ r ≤ 1,
  −(2/3) r^{−1} + 4 − 5 r + (5/3) r^2 + (5/8) r^3 − (1/2) r^4 + (1/12) r^5,   1 ≤ r ≤ 2,
  0,   2 ≤ r.
Then the estimated model parameter at the location Xi is
u_m^a(X_i) = ∑_{l=1}^{M} diag(s_lm(X_i) + q_l(X_i) − 1/M) u_l^b(X_i),   m = 1, …, M,
where diag is a diagonal matrix, s_lm(X_i) is the (l,m)
entry of the localized transformation matrix S(X_i),
S(X_i) = [I + (1/(M−1)) (A^b)^T (C^i R^{−1}) A^b]^{−1/2},
and q_l(X_i) is the lth entry of the localized column q(X_i),
q(X_i) = (1/(M−1)) [1_M − S(X_i)^2 (A^b)^T R^{−1} (ȳ^b − y^obs)].
LETPF modifies the likelihood and thus the weights given by
Eq. () are computed locally at each grid Xi:
w_m^a(X_i) = exp[−(1/2)(y_m^b − y^obs)^T (C^i R^{−1}) (y_m^b − y^obs)] / ∑_{j=1}^{M} exp[−(1/2)(y_j^b − y^obs)^T (C^i R^{−1}) (y_j^b − y^obs)],   m = 1, …, M,
where C^i is the diagonal matrix given by
Eq. (). Then the estimated model parameter uja(Xi) at the grid Xi is given by
u_j^a(X_i) = M ∑_{m=1}^{M} t_mj* u_m^b(X_i),   j = 1, …, M,
where tmj* is an element of an optimal coupling matrix T* which minimizes the squared Euclidean distance at the grid point
Xi,
J(t_mj) = ∑_{m,j=1}^{M} t_mj [u_m^b(X_i) − u_j^b(X_i)]^2,
which reduces LETPF to a univariate transport problem.
It should be noted that localization can be applied only for grid-dependent parameters.
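A possible NumPy sketch of the taper function and of the diagonal matrix C^i is given below; the function names gaspari_cohn and localization_matrix are illustrative and not part of the implementation used here.

import numpy as np

def gaspari_cohn(r):
    # Gaspari-Cohn fifth-order piecewise rational taper; r is distance over localization radius
    r = np.abs(np.asarray(r, dtype=float))
    rho = np.zeros_like(r)
    m1 = r <= 1.0
    m2 = (r > 1.0) & (r < 2.0)
    rho[m1] = 1 - 5/3*r[m1]**2 + 5/8*r[m1]**3 + 1/2*r[m1]**4 - 1/4*r[m1]**5
    rho[m2] = (4 - 5*r[m2] + 5/3*r[m2]**2 + 5/8*r[m2]**3
               - 1/2*r[m2]**4 + 1/12*r[m2]**5 - 2/3/r[m2])
    return rho

def localization_matrix(X_i, obs_locations, r_loc):
    # diagonal C^i with entries rho(||X_i - r_l|| / r_loc) for each observation location r_l
    dist = np.linalg.norm(obs_locations - X_i, axis=1)
    return np.diag(gaspari_cohn(dist / r_loc))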
Probability density functions for the one-parameter nonlinear
problem. Top: ETPF; bottom: ETKF. (a, d) Ensemble size 10^2;
(b, e) ensemble size 10^3; (c, f) ensemble size 10^4.
Prior is in red. The true pdf obtained by IS with ensemble size 10^5 is in
black.
Single-phase Darcy flow
We consider a steady-state single-phase Darcy flow model defined over an
aquifer of a 2-D physical domain D=[0,1]×[0,1], which is given by
−∇ ⋅ (k(x,y) ∇P(x,y)) = f(x,y),   (x,y) ∈ D,
P(x,y) = 0,   (x,y) ∈ ∂D,
where ∇ = (∂/∂x, ∂/∂y)^T; ⋅
denotes the dot product, P(x,y) the pressure, k(x,y) the permeability,
f(x,y) the source term, which we assume to be 2π^2 cos(πx) cos(πy), and ∂D the boundary of domain D. The forward
problem of this second-order elliptical equation is to find the solution of
pressure P(x,y) for a given f(x,y) and k(x,y). We, however, are
interested in finding permeability given noisy observations of pressure at a
few locations.
We perform numerical experiments with synthetic observations, where instead
of a measuring device a model is used to obtain observations. We implement a
cell-centered finite difference method to discretize the domain D into
n×n grid cells Xi of size Δx2 and solve the forward model
with the true parameters. Then the synthetic observations are obtained by
y^obs = L(P) + η,
with an element of L(P) being a linear functional of pressure, namely
L_l(P) = (1/(2πσ^2)) ∑_{i=1}^{n^2} exp(−‖X_i − r_l‖^2 / (2σ^2)) P_i Δx^2,   l ∈ {1, …, N_y},
where n=50, σ=0.01, rl denotes the location of the
observation, and Ny=16, which is the number of observations. The
observation locations are spread uniformly across the domain D and η
denotes the observation noise drawn from a normal distribution with zero mean
and a standard deviation of 0.09. This form of the observation functional
and parameterization of the uncertain parameters given below guarantee the
continuity of the forward map from the uncertain parameters to the
observations and thus the existence of the posterior distribution as shown
by .
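For illustration, once the pressure has been computed at the cell centers, the observation functional and the additive noise could be evaluated as in the following NumPy sketch; the function and argument names are illustrative assumptions, not the code used to generate the observations in this paper.

import numpy as np

def observe(P, cell_centers, obs_locations, dx, sigma=0.01, noise_std=0.09, rng=None):
    # P: (n*n,) pressure at cell centers; cell_centers: (n*n, 2); obs_locations: (N_y, 2)
    # Gaussian-weighted local averages of pressure, as in the definition of L_l(P)
    d2 = ((obs_locations[:, None, :] - cell_centers[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    y = weights @ P * dx ** 2
    if rng is not None:
        y = y + rng.normal(0.0, noise_std, size=y.shape)   # observation noise eta
    return y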
Five-parameter nonlinear problem
For our first numerical experiment with Darcy flow, we consider a
low-dimensional problem where the permeability field is defined by a mere
five parameters similarly to . We assume that the entire
domain D=[0,1]×[0,1] is divided into two subdomains D1 and D2
as shown in Fig. . Each subdomain of D represents a layer
and is assumed to have a permeability function k(X), where an
element of X is defined by Xi for i=1,…,n2.
Parameters a and b denote the thickness of the bottom layer on either
side, which correspondingly defines the slope of the interface. A parameter
c defines a vertical fault. The layer moves up or down depending on c<0
or c>0, respectively, and its location is assumed to be fixed at x=0.5.
Further, for this test case we assume piecewise constant permeability within
each of the subdomains; hence, k(X) is given by
k(X) = k1 δ_{D1}(X) + k2 δ_{D2}(X),
where k1 and k2 represent the permeability of the subdomains D1 and
D2, respectively, and δ_{D_j} denotes the indicator function of the subdomain D_j. Then the parameters
defining the permeability field for this configuration are
u = (a, b, c, log(k1), log(k2))^T.
We assume that the true parameters are atrue=0.6,
btrue=0.3, ctrue=-0.15, k1true=12, and k2true=5. These parameters are used to create
synthetic observations. Figure shows the true permeability,
with dots representing the observation locations. Next, we assume that the
five uncertain parameters are drawn from a uniform distribution over a
specified interval, namely a,b∼U[0,1], c∼U[-0.5,0.5], k1∼U[10,15], and k2∼U[4,7].
As was pointed out in Sect. , ETPF updates the parameters within the
original range of an initial ensemble, while ETKF does not. Therefore a
change in variables has to be performed for ETKF so that the updated
parameters are physically viable. In order to be consistent, we perform the
change in variables for ETPF as well. As the domain D is [0,1]×[0,1], the parameters a and b should lie within the interval [0,1]. To
enforce this constraint, we substitute a according to
a′ = log(a / (1 − a)),   a′ ∈ R,
and similarly b is substituted by b′. Thus the uncertain parameters
are now u′ = (a′, b′, c, log(k1), log(k2))^T.
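The substitution and its inverse are the standard logit and logistic transforms; a minimal NumPy sketch (with illustrative names) is

import numpy as np

def to_unbounded(a):
    # logit transform: a in (0, 1) -> a' in R, applied before the update
    return np.log(a / (1.0 - a))

def to_bounded(a_prime):
    # inverse (logistic) transform, applied to the updated parameters
    return 1.0 / (1.0 + np.exp(-a_prime))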
True permeability of the five-parameter nonlinear problem with dots
representing the observation locations.
In Fig. we plot probability density functions for parameters
a (panels a–d), c (panels e–h), and log(k2) (panels i–l), as the
parameters b and log(k1) show similar results. The posterior obtained
by IS with ensemble size 10^6 is plotted as a black line and the true value
of parameters is plotted as a black line with crosses. The posterior of ETPF
is shown at the top and the posterior of ETKF at the bottom. ETPF and ETKF
used 10^3 (odd columns) and 10^4 (even columns) ensemble members. In
order to perform an objective comparison between the probabilities, we
compute the Kullback–Leibler divergence of a posterior π obtained by
either ETPF or ETKF and the posterior πIS obtained by IS:
D_KL(π_IS ∥ π) = ∑_{i=1}^{N_b} π_IS(u_i) log[π_IS(u_i) / π(u_i)] (u_i − u_{i−1}),
where Nb=20 is the number of bins. The Kullback–Leibler
divergence for parameters a, c, and log(k2) is displayed in the
titles of Fig. , where we observe that ETKF outperforms ETPF.
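One plausible NumPy implementation of this binned Kullback–Leibler divergence for a scalar parameter is sketched below; it estimates both densities by histograms over N_b common bins (weighted for IS, equally weighted for ETPF/ETKF), which is an assumption on our part and not necessarily the exact procedure used for the figures.

import numpy as np

def kl_divergence(samples_is, weights_is, samples, n_bins=20, eps=1e-12):
    # D(pi_IS || pi): samples_is with IS weights versus an equally weighted analysis sample
    lo = min(samples_is.min(), samples.min())
    hi = max(samples_is.max(), samples.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    p_is, _ = np.histogram(samples_is, bins=edges, weights=weights_is, density=True)
    p, _ = np.histogram(samples, bins=edges, density=True)
    du = np.diff(edges)                        # bin widths u_i - u_{i-1}
    mask = p_is > 0
    return np.sum(p_is[mask] * np.log(p_is[mask] / np.maximum(p[mask], eps)) * du[mask])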
Probability density functions for the parameters a (a–d), c
(e–h), and log(k2) (i–l). The posterior obtained by IS with
ensemble size 10^6 is plotted as a black line and the true values of
parameters are plotted as black crosses. The posterior of ETPF is shown at
the top and the posterior of ETKF at the bottom. ETPF and ETKF used 10^3
(odd columns) and 10^4 (even columns) ensemble members.
u¯¯a and u¯¯a±u¯stda
w.r.t. ensemble size: (a) for the parameter a, (b) for b,
(c) for c, (d) for log(k1), and (e) for
log(k2). ETPF is shown in blue, ETKF in red, the true parameters in
black, and the mean of IS in magenta.
misfit^{a,r} − misfit^{b,r} (a) and
RE^{a,r} − RE^{b,r} (b) w.r.t. ensemble size. ETPF is shown in blue, ETKF in red, and the zero level in
black. One circle is for one simulation.
In order to check the sensitivity of the results to the initial parameter
ensemble, we perform 10 simulations based on a random draw of an initial
ensemble from the same prior distributions. We conduct the numerical
experiments for ensemble sizes varying from 10 to 10^3 with an increment
of 50. In Fig. we plot the true parameters, the mean
estimated by IS, the mean u¯¯a, and the
spread u¯¯a±u¯stda of estimated parameters averaged
over 10 simulations:
u̿_i^a = (1/10) ∑_{r=1}^{10} ū_i^{a,r},
ū_std^a = (1/10) ∑_{r=1}^{10} √[ (1/(M−1)) ∑_{m=1}^{M} (u_{i,m}^{a,r} − ū_i^{a,r})^2 ],
where ū_i^{a,r} = (1/M) ∑_{m=1}^{M} u_{i,m}^{a,r},   r = 1, …, 10,
M is ensemble size, i=1,…,5 is the parameter index, and the
superscript a is for analysis. We observe that all the methods including IS
have a bias in the estimations of geometrical parameters, which is due to a
small number of observations. ETPF and ETKF perform comparably in terms of
mean estimation, though some are better estimated by ETKF and others are
better estimated by ETPF. Comparing the error in pressure of the mean
parameters we observe that the methods are equivalent (and thus not shown),
which is a manifestation of the ill-posedness of the problem. In
Fig. we see that the spread from ETPF is smaller than from
ETKF for each parameter. Both methods are slightly underdispersive as the
spread-to-error ratio is below 1. For ensemble size 10^3 ETKF gives (0.95, 0.88, 0.88, 0.97, 0.98) and ETPF gives (0.92, 0.81, 0.84, 0.99, 0.86) for
(a, b, c, log(k1), log(k2)). Thus ETKF gives a better ratio for all the
parameters but log(k1).
We compute an average of the relative error over all parameters
RE^{a,r} = (1/5) ∑_{i=1}^{5} |ū_i^{a,r} − u_i^true| / |u_i^true|,   r = 1, …, 10,
and the data misfit
misfit^{a,r} = (ȳ^{a,r} − y^obs)^T R^{−1} (ȳ^{a,r} − y^obs),   r = 1, …, 10,
after data assimilation. The same metrics are computed before data
assimilation and denoted by a superscript b. In Fig. a–b
we plot (misfit^{a,r} − misfit^{b,r}) and
(RE^{a,r} − RE^{b,r}), respectively, for each
simulation r as a function of ensemble size. ETPF is shown in blue and ETKF
in red. The black line is at the zero level. Positive values of the differences mean
an increase in either data misfit or relative error after data
assimilation. We observe a decrease in data misfit for both ETPF and ETKF except
at ensemble size 10. RE does not always decrease for ETPF: for some
simulations ETPF is at zero level or slightly above it, while for ETKF the
sole exception is at an ensemble size of 10.
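Both diagnostics follow directly from their definitions; a small NumPy sketch with illustrative names is

import numpy as np

def data_misfit(y_mean, y_obs, R):
    # (ybar - y_obs)^T R^{-1} (ybar - y_obs)
    d = y_mean - y_obs
    return float(d @ np.linalg.solve(R, d))

def relative_error(u_mean, u_true):
    # average over the parameters of |ubar_i - u_i^true| / |u_i^true|
    return float(np.mean(np.abs(u_mean - u_true) / np.abs(u_true)))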
High-dimensional nonlinear problem
Next, we consider a high-dimensional problem where the dimension of the
uncertain parameter is n2=2500. The domain D is now not divided into
subdomains. However, unlike in the previous test case, here we implement a
spatially varying permeability field. We assume the log permeability is
generated by a random draw from a Gaussian distribution
N(log(5),C). Here
5 is an n2 vector with all 5. C is
assumed to be an exponential correlation with an element of C
being
Ci,j=exp(-3(|hi,j|/v)),i,j=1,…,n2.
Here hi,j is the distance between two spatial locations and v is the
correlation range which is taken to be 0.5. For the log permeability we use
Karhunen–Loeve expansions of the form
log(k_j) = log(5) + ∑_{i=1}^{n^2} √λ_i ν_{i,j} Z_i,   for j = 1, …, n^2,
where λ_i and ν_i are the eigenvalues and eigenfunctions of C,
respectively, and the entries of the vector Z of dimension n^2 are i.i.d. samples from a
Gaussian distribution with zero mean and variance one. Provided the
eigenvalues are sorted in descending order, Z_i ∼ N(0,1) produces log(k) ∼ N(log(5), C). The uncertain
parameter is thus u = Z with the dimension n^2 = 2500.
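A NumPy sketch of this parameterization, building the exponential covariance, its eigendecomposition, and a (possibly truncated) Karhunen–Loeve draw, is given below; the function name and defaults are illustrative assumptions.

import numpy as np

def sample_log_permeability(cell_centers, v=0.5, n_modes=None, rng=None):
    # cell_centers: (n^2, 2) grid-cell centers; returns a log-permeability field and the modes Z
    rng = np.random.default_rng() if rng is None else rng
    h = np.linalg.norm(cell_centers[:, None, :] - cell_centers[None, :, :], axis=-1)
    C = np.exp(-3.0 * h / v)                       # exponential correlation matrix
    vals, vecs = np.linalg.eigh(C)
    order = np.argsort(vals)[::-1]                 # eigenvalues in descending order
    vals, vecs = vals[order], vecs[:, order]
    k = len(vals) if n_modes is None else n_modes  # n_modes = 3 gives the truncated expansion
    Z = rng.standard_normal(k)                     # uncertain parameters Z_i ~ N(0, 1)
    log_k = np.log(5.0) + vecs[:, :k] @ (np.sqrt(np.maximum(vals[:k], 0.0)) * Z)
    return log_k, Z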
Mean, minimum, and maximum over 10 simulations after data
assimilation for the data misfit (a), RMSE (b), and
variance (c). ETPF is shown in blue and ETKF in red.
misfit^{a,r} − misfit^{b,r} (a) and
RMSE^{a,r} − RMSE^{b,r} (b) w.r.t. ensemble size. ETPF is shown
in blue, ETKF in red and zero level in black. One circle is for one
simulation. For ETPF % of simulations that result in
(RMSEa-RMSEb)>0 and a linear fit as a function of
ensemble size are shown in (c).
We perform 10 different simulations based on a random draw of an initial
ensemble from the prior distribution. We conduct the numerical experiments
for ensemble sizes varying from 10 to 10^3 with an increment of 50. We
compute the root mean square error (RMSE) of the log permeability field:
RMSE^{r,a} = √[ (log(k̄^{a,r}) − log(k^true))^T (log(k̄^{a,r}) − log(k^true)) ],   r = 1, …, 10,
and variance
variance^{r,a} = (1/(M−1)) ∑_{m=1}^{M} (log(k_m^{a,r}) − log(k̄^{a,r}))^T (log(k_m^{a,r}) − log(k̄^{a,r})),   r = 1, …, 10.
We also compute the data misfit for each simulation after data assimilation
by Eq. (). In Fig. we plot mean, minimum, and
maximum over 10 simulations after data assimilation for the data misfit
(left), RMSE (center), and variance (right). ETPF is shown in blue and ETKF
in red. We observe that ETPF is underdispersive compared to ETKF, as particle
filters are more prone to degeneracy than Kalman filters. The misfit given by
ETPF is smaller than the one given by ETKF for almost all simulations at
ensemble sizes greater than 150. The RMSE by contrast is larger.
Log permeability field with dots representing the observation
locations. Truth is shown in (a) and mean obtained by IS with ensemble size
10^5 in (d).
Mean obtained with ensemble size 10^3 by ETPF shown in (b)–(e) and by ETKF in (c)–(f),
where (b)–(c) are at the smallest RMSE and (e)–(f) are at
the largest RMSE over simulations. The corresponding RMSE is given in
parentheses.
In Fig. a–b we plot (misfit^{a,r} − misfit^{b,r}) and
(RMSE^{a,r} − RMSE^{b,r}), respectively, as a
function of ensemble size for each simulation r = 1, …, 10. The superscript
“b” is for the metrics before data assimilation and the superscript “a”
is for the metrics after data assimilation. ETKF always provides a decrease
in both the data misfit and RMSE except at ensemble size 10. ETPF gives a
decrease in the data misfit though an increase in RMSE, which indicates that
ETPF overfits the data. However, as the ensemble size increases, this happens
less often, as can be seen in Fig. c, where we plot for
ETPF a percentage of simulations that result in
(RMSEa-RMSEb)>0 and a linear fit as a
function of ensemble size.
In Fig. we plot log permeability fields. In
Fig. a the true permeability is shown with dots representing
the observation locations, and in Fig. d the mean permeability
field obtained by IS with ensemble size 10^5. The RMSE provided by IS is
32.62. In Fig. b–e and c–f we display mean permeability
fields obtained with ensemble size 10^3 by ETPF and ETKF, respectively. In
Fig. b–c we plot the mean log permeabilities for the smallest
RMSE over simulations, which is 30.51 for ETPF and 32.48 for ETKF. In
Fig. d–e we plot the mean log permeabilities for the largest
RMSE over simulations, which is 39.2 for ETPF and 33.87 for ETKF. We observe
that ETKF as well as IS provide smooth mean permeability fields that have
smaller absolute values than the true permeability. ETPF gives higher
variations of the mean permeability field and is in excellent agreement with
the true permeability for a good initial ensemble shown in
Fig. b. This indicates that the sensitivity of ETPF to the initial sample,
as well as its spatial variability, is a result
of sampling error. It should be noted that IS with ensemble size 10^3 and
this good initial ensemble gives the RMSE 30.51 and the same mean log
permeability field as ETPF shown in Fig. b. However, IS does
not change the parameters, only their weights, while ETPF does change the
parameters. Therefore ETPF retains the advantage of IS of representing the correct
posterior without its disadvantage of lacking resampling. In
Fig. we plot the variance of the permeability fields obtained
with ensemble size 10^5 by IS (Fig. d), with ensemble size
10^3 by ETPF (Fig. b–e) and ETKF (Fig. c–f).
Figure b–c are for the smallest RMSE and
Fig. e–f are for the largest RMSE. ETKF provides smoother
variance than ETPF due to smaller sampling errors.
Variance of log permeability fields: obtained with ensemble size
10^5 by IS (d), with ensemble size 10^3 by ETPF (b–e), and ETKF (c–f).
Variance at the smallest RMSE (b–c) and at the largest RMSE (e–f) over
simulations.
Squared error between the true and mean estimated modes for
Z1 (a), Z2 (b), and
Z3 (c) w.r.t. ensemble size. ETPF is shown in blue and
ETKF in red, with solid lines for median and shaded area for the 25th and
75th percentiles over 10 simulations. IS with ensemble size 10^5 is in
black.
The posterior probability density function of parameters
Z1 (a, d), Z2 (b, e), and
Z3 (c, f). The posterior obtained by IS with ensemble
size 10^6 is plotted as a black line and the true parameter as a black
cross. The posterior of ETPF is shown at the top and the posterior of ETKF at
the bottom. Both ETPF and ETKF used 10^4 ensemble members. The
Kullback–Leibler divergence is in parentheses.
In Fig. we show the squared error
(Z̄^a − Z^true)^2 in blue
for ETPF and in red for ETKF for three leading modes Z1
(panel a), Z2 (panel b), and Z3 (panel c), where
solid line is for the median and shaded area is for the 25th and 75th
percentiles over 10 simulations. We observe that in terms of the estimation
of the three leading modes, ETPF outperforms ETKF. In Fig. we
plot the posterior of Z1 (left), Z2 (center), and
Z3 (right) obtained by IS with ensemble size 10^6 and by ETPF
(top) and ETKF (bottom) with ensemble size 10^4. The posterior of these
modes is roughly approximated by ETPF as shown in Fig. a–c.
ETKF provides a skewed posterior of the modes shown in
Fig. d–f, which was also observed in the one-parameter
nonlinear problem; see Fig. f. In order to perform an
objective comparison between the probabilities, we compute the
Kullback–Leibler divergence of a posterior π obtained by either ETPF or
ETKF and the posterior πIS obtained by IS according to
Eq. (). ETPF gives the Kullback–Leibler divergence 0.21, 0.42, and
0.6, and ETKF 0.16, 0.07, and 0.5 for the modes Z1,
Z2, and Z3, respectively. Thus ETKF gives a better
approximation of the true pdf.
Since the leading modes are well estimated by ETPF and the trailing modes are not (not
shown), we use only the three leading modes in the Karhunen–Loeve expansion
given by Eq. () when computing the estimated log permeability,
keeping the number of uncertain parameters the same, namely 2500. In
Fig. a we observe that ETPF outperforms ETKF for large
ensemble sizes independent of an initial sample. Moreover, ETPF does not
overfit the data anymore since RMSE always decreases after data assimilation
except at small ensemble sizes shown in Fig. b. In
Fig. we show the mean fields for the best and worst initial
samples of size 10^4. ETPF gives an RMSE at the best sample of 31.1 and at
the worst sample of 32.98. By comparing it to 30.51 and 39.2 obtained using
the full Karhunen–Loeve expansions, we observe that the maximum RMSE over
simulations decreased substantially, while the minimum RMSE only slightly
increased. ETKF gives RMSE at the best sample 32.27 and the worst sample
33.23. (Compare to 32.48 and 33.9 using the full Karhunen–Loeve expansions.)
Thus ETKF slightly decreases both maximum and minimum RMSE over simulations.
ETPF is more affected by sampling noise at small scales, so using a truncated
representation of the fields significantly improves the results for ETPF.
ETKF filters out the small scales that are not observed and thus is less
affected by the truncation.
Using only three leading modes in the KL expansion. (a) RMSE
after data assimilation w.r.t. ensemble size with mean, minimum and maximum
over 10 simulations for ETPF shown in blue and ETKF in red. (b) %
of simulations that result in (RMSEa-RMSEb)>0 for
ETPF.
Same as Fig. but using only three leading modes in
the KL expansion.
Next we apply LETPF and LETKF. The optimal localization radius, searched between 0.2
and 1.2, was selected in terms of the smallest RMSE and is shown in
Table . It should be noted that a smaller localization radius
for LETPF than for LETKF was also observed by , and it is
probably related to a noisier approximation of the posterior by LETPF than
by LETKF. In Fig. we plot misfit, RMSE, and variance.
Optimal localization radius for LETPF and LETKF at different ensemble sizes M.

M       10    110   210   …    910
LETPF   0.2   0.6   0.6   …    0.6
LETKF   0.2   1.2   1.2   …    1.2
At small ensemble sizes both LETKF and LETPF give smaller misfit, smaller
RMSE but larger variance than ETKF and ETPF. For large ensembles LETKF
performs worse than ETKF, which is due to the imposed range on localization
radius, meaning that 1.2 is not optimal. Comparing the performance of LETPF
to (L)ETKF we observe that at small ensemble sizes LETKF still outperforms
ETPF, but at large ensemble sizes LETPF performs now comparably to ETKF.
Moreover, LETPF overfits the data less often than ETPF: 40 % against
90 % of simulations for ensemble size 10, and 0 % against a non-zero percentage for
ensemble sizes greater than 150 (not shown).
In Figs. – we plot mean and variance of the
log permeability field at ensemble size 10^3 for ETPF (panels b–e) and ETKF
(panels c–f) with localization at the smallest RMSE (panels b–c) and largest RMSE
(panels e–f) over simulations, which are 32.29 and 34.08 for ETPF and 32.92 and
34.09 for ETKF, respectively. We observe that localization decreases the
sampling noise, and the spatial variability of the mean field obtained by ETPF
at ensemble size 10^3 now resembles that of IS at ensemble size 10^5. The variance
obtained by ETPF with localization shown in Fig. b–e has
also improved.
The posterior estimation of the leading mode Z1, however,
degraded, while that of Z2 and Z3 improved. The
Kullback–Leibler divergence for the leading mode is 0.73 (compared to 0.21
without localization), and for the second and third modes it is 0.2 and 0.18,
respectively (compared to 0.42 and 0.6 without localization). Variance of the
posteriors is larger when localization is applied for both methods. The
localized weights given by Eq. () vary less than the
non-localized weights given by Eq. (). Therefore the localized
pdf is less noisy than the non-localized pdf. However, localization applied
in the form of the Karhunen–Loeve expansion given by Eq. () does
not retain the imposed bounds on the modes Z, as we need to
invert a matrix product of eigenvalue and eigenvector matrices to obtain the
modes. Moreover, unlike ETKF, LETPF does not converge to ETPF as the
localization radius goes to infinity due to the transport problem being
univariate for LETPF and multivariate for ETPF.
Mean over 10 simulations after data assimilation for the data misfit (a), RMSE (b), and variance (c).
LETPF is shown in solid blue and LETKF in solid red.
ETPF is shown in dashed blue and ETKF in dashed red.
Same as Fig. but with localization.
Same as Fig. but with localization.