Particle filtering is a generic weighted ensemble data assimilation method based on sequential importance sampling, suited for nonlinear and non-Gaussian filtering problems. Unless the number of ensemble members scales exponentially with the problem size, particle filter (PF) algorithms experience weight degeneracy. This phenomenon is a manifestation of the curse of dimensionality that prevents the use of PF methods for high-dimensional data assimilation. The use of local analyses to counteract the curse of dimensionality was suggested early in the development of PF algorithms. However, implementing localisation in the PF is a challenge, because there is no simple and yet consistent way of gluing together locally updated particles across domains.

In this article, we review the ideas related to localisation and the PF in the geosciences. We introduce a generic and theoretical classification of local particle filter (LPF) algorithms, with an emphasis on the advantages and drawbacks of each category. Alongside the classification, we suggest practical solutions to the difficulties of local particle filtering, which lead to new implementations and improvements in the design of LPF algorithms.

The LPF algorithms are systematically tested and compared using twin experiments with the one-dimensional Lorenz

The ensemble Kalman filter

The EnKF can be viewed as a subclass of sequential Monte Carlo (MC) methods
whose analysis step relies on Gaussian distributions. However, observations
can have non-Gaussian error distributions, an example being the case of
bounded variables, which are frequent in ocean and land surface modelling or in atmospheric chemistry. Most geophysical dynamical models are nonlinear, yielding non-Gaussian error distributions

When the Gaussian assumption is not fulfilled, Kalman filtering is
suboptimal. Iterative ensemble Kalman filter and smoother methods have been
developed to overcome these limitations, mainly by including variational
analysis in the algorithms

Unfortunately, the PF has not yet been successfully applied to a truly
high-dimensional DA problem. Unless the number of ensemble members scales
exponentially with the problem size, PF methods experience weight degeneracy
and yield poor estimates of the model state. This phenomenon is a symptom of
the curse of dimensionality and is the main obstacle to the application of PF
algorithms to most DA problems

Importance sampling is at the heart of PF methods: the goal is to construct a
sample of the posterior (conditional) density from particles of the prior
density by means of importance weights. The use of a proposal transition
density is a way to reduce the variance of the importance weights, hence
allowing the use of fewer particles. However, importance sampling with a
proposal density can lead to more costly algorithms that are not necessarily
free from the curse of dimensionality
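As a minimal sketch of this weighting step, assuming a Gaussian observation likelihood with an identity observation operator and a scalar error standard deviation (all names below are illustrative, not those of the article):

```python
import numpy as np

def importance_weights(ensemble, y, sigma):
    """Normalised importance weights for a Gaussian likelihood.

    ensemble: (N, d) array of particles, y: (d,) observation vector,
    sigma: observation-error standard deviation (identity H assumed).
    """
    # Log-likelihood of each particle, up to an additive constant.
    logw = -0.5 * np.sum((ensemble - y) ** 2, axis=1) / sigma ** 2
    logw -= logw.max()          # subtract the max for numerical stability
    w = np.exp(logw)
    return w / w.sum()          # normalise so the weights sum to one

rng = np.random.default_rng(0)
particles = rng.normal(size=(50, 3))
obs = np.zeros(3)
w = importance_weights(particles, obs, sigma=1.0)
```

Working in log space before exponentiating avoids the underflow that plain likelihood products would cause in higher dimensions.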

Resampling is the first improvement that was suggested in the bootstrap
algorithm

Hybridising PFs with EnKFs seems a promising approach for the application of
PF methods to high-dimensional DA, in which one can hope to take the best of
both worlds: the robustness of the EnKF and the Bayesian analysis of the PF. The
balance between the EnKF and the PF analysis must be chosen carefully.
Hybridisation especially suits the case where the number of significantly
nonlinear degrees of freedom is small compared to the total number of degrees of freedom. Hybrid filters
have been applied, for example, to geophysical low-order models

In most geophysical systems, distant regions have an (almost) independent
evolution over short timescales. This idea was used in the EnKF to implement
localisation in the analysis

Section

We follow a state vector

The model can alternatively be described by

The components of the state vector

Let

The prediction operator

In this article, we consider the DA filtering problem that consists in
estimating

The PF is a class of sequential MC methods that produces, from the
realisations of

Inserting the particle representation
Eq. (

Applying Bayes' theorem to

Finally, an optional resampling step

Under reasonable assumptions on the prediction and correction operators and
on the sampling and resampling algorithms, it is possible to show that, in the limit

Ultimately, the focus of this article is on the analysis step, that is, the
correction and the resampling. Hence,

Without resampling, PF methods are subject to weight degeneracy: after a few
assimilation cycles, one particle gets almost all the weight. The goal of
resampling is to reduce the variance of the weights by reinitialising the
ensemble. After this step, the ensemble is made of

In most resampling algorithms, highly probable particles are selected and
duplicated, while particles with low probability are discarded. It is
desirable that the selection of particles has a low impact on the empirical
density

Resampling introduces sampling noise. On the other hand, not resampling means
devoting computational time to highly improbable particles that make a very
low contribution to the empirical analysis density. Therefore, the choice of
the resampling frequency is critical in the design of PF algorithms. Common
criteria to decide if a resampling step is needed are based on measures of
the degeneracy, for example the maximum of the weights or the effective
ensemble size defined by
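The effective ensemble size is commonly defined as N_eff = 1 / Σ_i w_i² for normalised weights; a minimal sketch of both limiting behaviours:

```python
import numpy as np

def effective_ensemble_size(w):
    """Effective ensemble size N_eff = 1 / sum(w_i^2) for normalised weights."""
    w = np.asarray(w, dtype=float)
    return 1.0 / np.sum(w ** 2)

uniform = np.full(10, 0.1)                        # all particles equally weighted
degenerate = np.zeros(10)
degenerate[0] = 1.0                               # one particle has all the weight
n_eff_uniform = effective_ensemble_size(uniform)       # -> 10.0
n_eff_degenerate = effective_ensemble_size(degenerate) # -> 1.0
```

N_eff interpolates between N (healthy ensemble) and 1 (fully degenerate ensemble), which is what makes it a convenient resampling criterion.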

The correction and resampling steps of PF methods can be combined and
embedded into the so-called

In the “select and duplicate” resampling schemes, the coefficients of

If the coefficients of

Let

The SIR algorithm is recovered with the

Although the optimal importance proposal has appealing properties, its
computation is non-trivial. For the generic model with Gaussian additive
noise described in Appendix

The PF has been successfully applied to low-dimensional DA problems

Similar results are produced when applying one importance sampling step to
the Gaussian linear model described in Appendix

This phenomenon, well known in the PF literature, is often referred to as

At first sight, it might seem surprising that, although MC methods have a
convergence rate independent of the dimension, the curse of dimensionality
applies to PF methods. Yet the correction step

Empirical statistics of the maximum of the weights for one
importance sampling step applied to the Gaussian linear model of
Appendix

A quantitative description of the behaviour of weights for large values of

This result means that, in order to avoid the collapse of a PF method, the
number of particles
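This scaling can be illustrated numerically. The toy experiment below is our own construction (standard Gaussian particles and a standard Gaussian synthetic observation), not one of the article's experiments; it shows the maximum normalised weight growing towards 1 as the dimension increases:

```python
import numpy as np

def max_weight(dim, n_particles, rng):
    """Maximum normalised importance weight for a standard Gaussian toy problem."""
    particles = rng.normal(size=(n_particles, dim))
    y = rng.normal(size=dim)                       # synthetic observation
    logw = -0.5 * np.sum((particles - y) ** 2, axis=1)
    w = np.exp(logw - logw.max())
    return (w / w.sum()).max()

rng = np.random.default_rng(42)
maxw_low = max_weight(dim=1, n_particles=100, rng=rng)
maxw_high = max_weight(dim=200, n_particles=100, rng=rng)
# In high dimension the largest weight typically dominates: weight degeneracy.
```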

One objective of using proposal densities in PF methods is to reduce the
variance of the importance weights as discussed in
Sect.

Yet the OISIR algorithm still collapses even for low-order models, such as
the L96 model with

In a generic Gaussian linear model, the equivalent state dimension

By considering the definition of

This would not be the case in a PF algorithm able to perform local analyses,
that is, one in which the influence of each observation is restricted
to a spatial neighbourhood of its location. The equivalent state dimension

In the EnKF literature, this idea is known as

The first issue is that the variation of the weights across local domains
irredeemably breaks the structure of the global particles. There is no
trivial way of recovering this global structure, i.e. gluing together the
locally updated particles. Global particles are required for the prediction
and sampling step

Second, if not carefully constructed, this gluing together could lead to
balance problems and sharp gradients in the fields

From now on, we will assume that our DA problem has a well-defined spatial
structure:

Each state variable is attached to a location, the

Each observation is attached to a location, the

There is a distance function between locations.

In the following sections, we discuss algorithms that address the two issues of local particle filtering (gluing and imbalance) and lead to implementations of domain localisation in PF methods. We divide the solutions into two categories.

In the first approach, independent analyses are performed at each grid point
by using only the observation sites that influence this grid point. This
leads to algorithms that are easy to define, to implement, and to parallelise.
However, there is no obvious relationship between state variables, which
could be problematic with respect to the imbalance issue. This approach is
used for example by

In the second approach, an analysis is performed at each observation site.
When assimilating the observation at a site, we partition the state space:
nearby grid points are updated, while distant grid points remain unchanged. In
this formalism, observations need to be assimilated sequentially, which makes
the algorithms harder to define and to parallelise but may mitigate the
imbalance issue. This approach is used, for example, by

From now on, the time subscript

Localisation is generally introduced in PF methods by allowing the analysis
weights to depend on the spatial position. In the (global) PF, the marginal
of the analysis density for each state variable is

With local analysis weights, the marginals of the analysis density are
uncoupled. This is the reason why localisation was introduced in the first
place, but as a drawback, the full analysis density is not known. The
simplest fix is to approximate the full density as the product of its
marginals:

In summary, in LPF methods, we keep the generic MC structure described in
Sect.

The principle of localisation in the PF, in particular
Eq. (

In the block particle filter algorithm of

To summarise, LPF algorithms using the SBD localisation formalism, hereafter
called LPF

The

the geometry of the blocks over which the weights are constant;

the local domain of each block, which gathers all observation sites used to compute the local weight;

the local resampling algorithm.
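For a one-dimensional periodic grid, the first two ingredients can be sketched as follows (block size, localisation radius, and observation layout are arbitrary illustrative choices):

```python
import numpy as np

def make_blocks(n_grid, block_size):
    """Partition grid indices 0..n_grid-1 into contiguous blocks."""
    assert n_grid % block_size == 0
    return np.arange(n_grid).reshape(n_grid // block_size, block_size)

def local_domain(block, obs_sites, radius, n_grid):
    """Observation sites within `radius` of the block centre (periodic distance)."""
    centre = block.mean()
    d = np.abs(np.asarray(obs_sites) - centre)
    d = np.minimum(d, n_grid - d)          # periodic (circular) distance
    return [s for s, dist in zip(obs_sites, d) if dist <= radius]

blocks = make_blocks(n_grid=40, block_size=4)
obs_sites = list(range(0, 40, 2))          # one observation every other grid point
domain = local_domain(blocks[0], obs_sites, radius=5, n_grid=40)
```

Note that the local domain of a block may contain observation sites lying outside the block itself, as in the figure described below.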

Most LPFs

Using parallelepipedal blocks is a standard geometric choice

In the clustered particle filter algorithms of

The general idea of domain localisation in the EnKF is that the analysis at
one grid point is computed using only the observation sites that lie within a
local region around this grid point, hereafter called the

The terminology adopted here (disk, radius, etc.) fits two-dimensional spatial spaces. Yet most geophysical models have a three-dimensional spatial structure, with typical uneven vertical scales that are usually much shorter than horizontal scales. For these models, the geometry of the local domains should be adapted accordingly.

Increasing the localisation radius allows one to take more observation sites into account, hence reducing the bias in the local analysis. It is also a means to reduce the spatial inhomogeneity by making the weights smoother in space.

The smoothness of the local weights is an important property. Indeed, spatial
discontinuities in the weights can lead to spatial discontinuities in the
updated particles. Again borrowing ideas from the local EnKF methods, the
smoothness of the weights can be improved by tapering the influence of an
observation site with respect to its distance to the block centre as follows.
For the (global) PF, assuming that the observation sites are independent, the
unnormalised weights are computed according to
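A sketch of such tapered local weights, using a Gaussian taper of the distance to the block centre (the taper shape, its radius, and the Gaussian likelihood are illustrative assumptions, not necessarily the article's exact choices):

```python
import numpy as np

def gaussian_taper(dist, radius):
    """Smooth taper coefficient in (0, 1], equal to 1 at zero distance."""
    return np.exp(-0.5 * (dist / radius) ** 2)

def local_weights(ensemble, obs, obs_sites, centre, radius, sigma):
    """Local analysis weights for one block.

    ensemble: (N, d) particles, obs: observed values at obs_sites,
    centre: block-centre coordinate, sigma: observation-error std.
    """
    logw = np.zeros(ensemble.shape[0])
    for y, site in zip(obs, obs_sites):
        rho = gaussian_taper(abs(site - centre), radius)
        # Taper the log-likelihood: distant sites contribute less.
        logw += rho * (-0.5 * (ensemble[:, site] - y) ** 2 / sigma ** 2)
    w = np.exp(logw - logw.max())
    return w / w.sum()

rng = np.random.default_rng(1)
ens = rng.normal(size=(20, 10))
w = local_weights(ens, obs=[0.0, 0.0], obs_sites=[2, 7],
                  centre=2.0, radius=3.0, sigma=1.0)
```

Tapering the log-likelihood rather than the likelihood itself keeps the weights positive and makes the influence of each site decay smoothly to zero.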

Algorithm 1 describes the analysis step for a generic LPF

In this algorithm, and in the rest of this article, the ensemble matrix

An illustration of the definition of blocks and local domains is displayed in
Fig.

The feasibility of PF methods using SBD localisation is discussed by

Example of geometry in the SBD localisation formalism for a two-dimensional space. The focus is on the block in the middle, which gathers 12 grid points. The local domain is circumscribed by a circle around the block centre, with potential observation sites outside the block.

The main mathematical result is that, under reasonable hypotheses, the error
on the analysis density for this algorithm can be bounded by the sum of a
bias and a variance term. The bias term is related to the block boundaries
and decreases exponentially with the diameter of the blocks, in number of
grid points. It is due to the fact that the correction is not Bayesian anymore, since only a subset of observations is used to update each block. The
exponential decrease is a demonstration of the

Resampling from the analysis density given by
Eq. (

On the other hand, blind assembling is likely to lead to unphysical
discontinuities in the updated particles, regardless of the spatial
smoothness of the analysis weights. More precisely, one builds

In order to mitigate the unphysical discontinuities, the analysis weights
must be spatially smooth, as mentioned in
Sect.

A first solution is to smooth out potential unphysical discontinuities by
averaging in space the locally resampled ensemble as follows. This method was
introduced by
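One plausible sketch of such a spatial averaging is a kernel-weighted moving average of the locally resampled ensemble, blended with a tunable strength; this is our own schematic reading, and the exact averaging weights of the method generalised in this article may differ:

```python
import numpy as np

def smooth_resampled_ensemble(ens_resampled, radius, strength):
    """Blend each grid point with a Gaussian spatial average of its neighbours.

    ens_resampled: (N, n_grid) locally resampled ensemble.
    radius: e-folding scale of the spatial kernel (in grid points).
    strength: 0 -> no smoothing, 1 -> fully smoothed.
    """
    n_grid = ens_resampled.shape[1]
    idx = np.arange(n_grid)
    smoothed = np.empty_like(ens_resampled)
    for k in idx:
        d = np.minimum(np.abs(idx - k), n_grid - np.abs(idx - k))  # periodic distance
        kern = np.exp(-0.5 * (d / radius) ** 2)
        smoothed[:, k] = ens_resampled @ (kern / kern.sum())
    return (1.0 - strength) * ens_resampled + strength * smoothed

rng = np.random.default_rng(2)
ens = rng.normal(size=(8, 16))
out = smooth_resampled_ensemble(ens, radius=2.0, strength=0.5)
```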

Example of one-dimensional concatenation of particle

Let

If the resampling is performed using a “select and duplicate” algorithm (see
Sect.

Finally, the ensemble is updated as

Algorithm 2 describes the analysis step for a generic LPF

Blocks have size

The local weights are computed using
Eq. (

The function

The resampling method is the SU sampling algorithm.

The smoothing radius

The smoothing strength

Note that when the resampling method is the SU sampling algorithm, the
matrices

The smoothing-by-weights step is an ad hoc fix to reduce potential unphysical discontinuities after they have been introduced in the local resampling step. Its necessity suggests that there is room for improvement in the design of the local resampling algorithms.

In this section, we study several properties of the local resampling algorithm that might help deal with the discontinuity issue: balance, adjustment, and random numbers.

A “select and duplicate” sampling algorithm is said to be

A “select and duplicate” sampling algorithm is said to be

While performing the resampling independently for each block, one can use the same random number(s) in the local resampling of each block.

Using the same random number(s) for the resampling of all blocks avoids a
stochastic source of unphysical discontinuity. Choosing balanced and
adjustment-minimising resampling algorithms is an attempt to include some
kind of continuity in the map
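Stochastic universal (SU) sampling is driven by a single uniform draw, which makes sharing the random number across blocks straightforward; a sketch for one block (the block loop is left implicit):

```python
import numpy as np

def su_sampling(w, u):
    """Stochastic universal sampling driven by a single uniform u in [0, 1).

    Returns the indices of the selected particles. Reusing the same u for
    every block removes one stochastic source of discontinuity between blocks.
    """
    n = len(w)
    cumulative = np.cumsum(w)
    pointers = (u + np.arange(n)) / n      # n evenly spaced pointers
    return np.searchsorted(cumulative, pointers)

w = np.array([0.5, 0.3, 0.2])
idx = su_sampling(w, u=0.4)
counts = np.bincount(idx, minlength=len(w))
```

A useful property of SU sampling is that each particle is duplicated either floor(N w_i) or ceil(N w_i) times, which already limits the adjustment between the prior and the resampled ensembles.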

As mentioned in Sect.

Applying optimal ensemble coupling to the SBD localisation frameworks results
in a local LET resampling method, whose local transformation at each block

To summarise, Algorithm 3 describes the analysis step for a generic
LPF

On each block, the linear transformation establishes a strong connection between the prior and the updated ensembles. Moreover, there is no stochastic variation of the coupling at each block. This means that the spatial coherence can be (at least partially) transferred from the prior to the updated ensemble.
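The coupling itself can be approximated without an exact linear-programming solver by entropic regularisation (Sinkhorn iterations). The sketch below is such an approximation, not the exact minimisation used in the ETPF-like scheme; the regularisation strength and iteration count are illustrative:

```python
import numpy as np

def coupling_transform(ensemble, w, reg=0.5, n_iter=2000):
    """Entropy-regularised approximation of the optimal ensemble coupling.

    ensemble: (N, d) prior particles; w: normalised analysis weights.
    Returns the equally weighted transformed ensemble, whose j-th member is
    x_j^a = N * sum_i T_ij x_i, where T approximately solves the transport
    problem between the weighted and the uniform ensembles.
    """
    n = len(w)
    diff = ensemble[:, None, :] - ensemble[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)               # pairwise squared distances
    K = np.exp(-cost / (reg * cost.mean()))         # Gibbs kernel
    u = np.ones(n)
    v = np.ones(n)
    b = np.full(n, 1.0 / n)                         # uniform target weights
    for _ in range(n_iter):                         # Sinkhorn fixed-point iterations
        u = w / (K @ v)
        v = b / (K.T @ u)
    T = u[:, None] * K * v[None, :]
    return n * (T.T @ ensemble), T

rng = np.random.default_rng(3)
ens = rng.normal(size=(10, 4))
w = rng.random(10)
w /= w.sum()
ens_a, T = coupling_transform(ens, w)
```

By construction, the transformed ensemble mean matches the weighted prior mean up to the convergence tolerance of the iterations.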

In Sect.

For each state variable

In one dimension, this transport map is also known to be the

The computation of

According to the KDE theory, when the underlying distribution is Gaussian,
the optimal shape for
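A one-dimensional sketch of the anamorphosis map, sending each particle x through the prior kernel CDF and back through the inverse of the weighted posterior kernel CDF. Logistic kernels are used here purely so that the kernel CDF has a closed form; this is a convenience choice for illustration and differs from a Gaussian-kernel KDE with an optimal bandwidth:

```python
import numpy as np

def kernel_cdf(t, centres, weights, h):
    """Weighted kernel-density CDF with logistic kernels of bandwidth h."""
    z = (t[:, None] - centres[None, :]) / h
    return (weights[None, :] / (1.0 + np.exp(-z))).sum(axis=1)

def anamorphosis(x, w, h=0.3, n_grid=2001):
    """Map prior particles x onto the weighted posterior via 1-D transport.

    x: (N,) prior particles; w: normalised analysis weights.
    Returns the increasing-rearrangement map applied to each particle.
    """
    n = len(x)
    grid = np.linspace(x.min() - 5 * h, x.max() + 5 * h, n_grid)
    cdf_prior = kernel_cdf(grid, x, np.full(n, 1.0 / n), h)
    cdf_post = kernel_cdf(grid, x, w, h)
    # Evaluate the prior CDF at the particles, then invert the posterior CDF.
    u = np.interp(x, grid, cdf_prior)
    return np.interp(u, cdf_post, grid)

rng = np.random.default_rng(4)
x = rng.normal(size=30)
w_uniform = np.full(30, 1.0 / 30)
x_id = anamorphosis(x, w_uniform)   # near-identity map for uniform weights
```

Because both CDFs are monotone, the map preserves the ordering of the particles, which is what carries spatial coherence from the prior to the updated ensemble.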

To summarise, Algorithm 4 describes the analysis step for a generic
LPF

The local resampling algorithm using anamorphosis, like the one using
optimal ensemble coupling, is a deterministic transformation.
This means that unphysical discontinuities due to different random
realisations over the grid points are avoided. As explained by

When defining the prior and the corrected densities with
Eqs. (

The refinements of the resampling algorithms suggested in
Sect.

Due to computational considerations, the optimisation problem
Eq. (

The design of the resampling algorithm based on anamorphosis has been
inspired from the kernel density distribution mapping (KDDM) step of the LPF
algorithm of

In this section, we have constructed a generic SBD localisation framework,
which we have used to define the LPF

A smoothing-by-weights step can be applied after the local resampling step in order to reduce potential unphysical discontinuities.
Our method is a generalisation of the original smoothing designed by

Simple properties of the local resampling algorithms can be used in order to minimise the occurrences of unphysical
discontinuity as shown by

Using the principles of discrete OT, we have proposed a resampling algorithm based on a local version of the ETPF
of

By combining the continuous OT problem with the KDE theory, we have derived a new local resampling algorithm based on anamorphosis. We have shown how it helps mitigate the unphysical discontinuities.

In Sect.

We define the auxiliary quantities

The complexity of the LPF

When using the multinomial resampling of the SU sampling algorithm for the
local resampling, the total complexity of the resampling step is

When using optimal ensemble coupling, the resampling step is computationally
more expensive, because it requires solving one optimisation problem for each
block. The minimisation coefficients
Eq. (

When using optimal transport in state space, every one-dimensional
anamorphosis is computed with complexity

When using the smoothing-by-weights step with the multinomial resampling or
the SU sampling algorithm, the smoothed ensemble
Eq. (

For comparison, the most costly operation in the local analysis of a local
EnKF algorithm is to compute the singular value decomposition of a

In this complexity analysis, the influence of the parameters

The localisation radius

For a local EnKF algorithm, gathering grid points into blocks is an
approximation that reduces the numerical cost of the analysis steps by
reducing the number of local analyses to perform. For an LPF

More discussion regarding the choice of the localisation radius

An essential property of PF algorithms is that they are asymptotically
Bayesian: as stated in Sect.

In the limit of very large localisation radius,

When using independent multinomial resampling or SU sampling for the local resampling, if one uses the
same random number for all blocks (this property is always
true if

When using the smoothing-by-weights step with the multinomial resampling or the SU sampling, if one uses the same random number for all
blocks,
then the smoothed ensemble Eq. (

When using optimal ensemble coupling for the local resampling, in the limit

When using independent multinomial resampling or SU sampling for the local resampling with
different random numbers for the blocks, then the updated particles
are distributed according to the product of the marginal analysis density Eq. (

For the same reason, when using anamorphosis for the local resampling, we could not find proof that the LPF

When using the smoothing-by-weights step with the multinomial resampling or the SU sampling, in the limit

In this section, we illustrate the performance of LPF

The distance between the truth and the analysis is measured with the average
analysis root mean square error, hereafter simply called the RMSE. To ensure
the convergence of the statistical indicators, the runs are at least

For the localisation, we assume that the grid points are positioned on an
axis with a regular spacing of

This filtering problem has been widely used to assess the performance of DA
algorithms. In this configuration, nonlinearities in the model are rather
mild and representative of synoptic scale meteorology, and the error
distributions are close to Gaussian. As a reference, the evolution of the
RMSE as a function of the ensemble size

The application of PF algorithms to this chaotic model without error leads to
a fast collapse. Even with stochastic models that account for some model
error, PF algorithms experience weight degeneracy when the model noise is too
low. Therefore, PF practitioners commonly include some additional jitter to
mitigate the collapse

First, the prediction and sampling step Eq. (

RMSE as a function of the ensemble size

Second, a regularisation step can be added after a full analysis cycle:

Both regularisation steps have numerical complexity

The exact LPF is recovered in the limit

With optimally tuned jitter for the standard L96 model, the bootstrap PF
algorithm requires about

We have proven in this case that the RMSE,
when computed between the observations

We define the standard S(IR)

Grid points are gathered into

Jitter is added after the integration using Eq. (

The local weights are computed using the Gaussian tapering of observation influence given by Eq. (

The local resampling is performed independently for each block with the adjustment-minimising SU sampling algorithm.

Jitter is added at the end of each assimilation cycle using Eq. (

We first check that, in this standard configuration, localisation is working
by testing the S(IR)

Nomenclature conventions for the
S(

List of all LPF

To evaluate the efficiency of the jitter, we experiment with the
S(IR)

RMSE as a function of the localisation radius

From these results, we can identify two regimes:

With low regularisation jitter (

With low integration jitter (

RMSE as a function of the integration jitter

RMSE as a function of the regularisation jitter

In the rest of this section, we take zero integration jitter (

To illustrate the influence of the size of the blocks, we compare the RMSEs
obtained by the S(IR)

From now on, unless specified otherwise, we systematically test our
algorithms with

RMSE as a function of the ensemble size

To illustrate the influence of the definition of the local weights, we
compare the RMSEs of the S(IR)

Figure

In this section, we test the refinements of the sampling algorithms proposed
in Sect.

the S(IR

the S(IR

RMSE as a function of the ensemble size

Figure

However, using the same random number for the resampling of each block does not produce significantly lower RMSEs. This method is insufficient to reduce the number of unphysical discontinuities introduced when assembling the locally updated particles. This is probably a consequence of the fact that the SU sampling algorithm only uses one random number to compute the resampling map. It also suggests that the specific realisation of this random number has a weak influence on long-term statistical properties.

RMSE as a function of the ensemble size

In the following, when using the SU sampling algorithm, we always choose its adjustment-minimising form, but we do not enforce the same random numbers over different blocks.

According to Eqs. (

It is hence a common procedure in ensemble DA to scale the regularisation
jitter with statistical properties of the ensemble. In a (global) PF context,
practitioners often “colourise” the Gaussian regularisation jitter with the
empirical covariances of the ensemble as described by

More precisely, the regularisation jitter has zero mean and
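A sketch of such ensemble-coloured jitter, in which white noise is coloured by the ensemble anomalies so that its covariance is proportional to the empirical ensemble covariance (the amplitude h and all names are ours):

```python
import numpy as np

def add_coloured_jitter(ensemble, h, rng):
    """Add Gaussian jitter whose covariance is h^2 times the ensemble covariance.

    ensemble: (N, d). The anomalies A = X - mean(X) colour the white noise,
    so the perturbations live in the subspace spanned by the anomalies.
    """
    n = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    xi = rng.normal(size=(n, n))                   # white noise in ensemble space
    return ensemble + h * (xi @ anomalies) / np.sqrt(n - 1)

rng = np.random.default_rng(5)
ens = rng.normal(size=(6, 12))
ens_jittered = add_coloured_jitter(ens, h=0.1, rng=rng)
perturbation = ens_jittered - ens
```

Because the perturbations are linear combinations of the anomalies, the jitter inherits the spatial correlations of the ensemble instead of injecting spatially white (hence potentially discontinuous) noise.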

In practice, the

Colourisation could be added to the integration jitter as well. However, in
this case, scaling the noise with the ensemble is less justified than for the
regularisation jitter. Indeed, the integration noise is inherent to the
perturbed model that is used to evolve each ensemble member independently.
Hence, PF practitioners often take a time-independent Gaussian integration
noise whose covariance matrix does not depend on the ensemble but includes
some off-diagonal terms based on the distance between grid points

The

In the analysis step of LPF

A first approach could be to scale the regularisation with the locally
resampled ensemble, since in this case all weights are equal. This is the
approach followed by

In a second approach, the anomaly matrix

The coloured regularisation step has complexity

The exact LPF is recovered in the limit

We experiment with the S(IR)

Figure

RMSE as a function of the ensemble size

In this section, we look for the potential benefits of adding a smoothing-by-weights step as presented in Sect.

Alongside the smoothing-by-weights step come two additional tuning
parameters: the smoothing strength

At a fixed smoothing strength

RMSE as a function of the smoothing radius

Based on extensive tests of the S(IR)

In general

Optimal values for

Optimal values for

In the following, when using the smoothing-by-weights method, we take

From these results, we conclude that the smoothing-by-weights step is an efficient way of mitigating the unphysical discontinuities that were introduced when assembling the locally updated particles, especially when combined with the coloured noise regularisation jitter method.

RMSE as a function of the ensemble size

In this section, we evaluate the efficiency of using the optimal transport in
ensemble space as a way to mitigate the unphysical discontinuities of the
local resampling step by experimenting with the
S(IT

For each block, the local linear transformation is computed by solving the
minimisation problem Eq. (

Optimal values for the distance radius

In the following, when using the optimal ensemble coupling algorithm, we take

We have also performed extensive tests with

The fact that neither the use of larger blocks nor the smoothing-by-weights step significantly improves the RMSE score when using optimal ensemble coupling indicates that this local resampling method is indeed an efficient way of mitigating the unphysical discontinuities inherent to assembling the locally updated particles.

In this section, we test the efficiency of using the optimal transport in
state space as a way to mitigate the unphysical discontinuities of the local
resampling step by experimenting with the S(IT

RMSE as a function of the ensemble size

As mentioned in Sect.

In the following, when using the anamorphosis resampling algorithm, we take
the standard value

We have also performed extensive tests with

RMSE as a function of the ensemble size

These latter remarks, alongside significantly lower RMSEs for the
S(IT

To summarise, Fig.

In this standard, mildly nonlinear configuration where error distributions
are close to Gaussian, the EnKF performs very well, and the LPF

RMSE as a function of the ensemble size

Note that our objective is not to design LPF algorithms that beat the EnKF in
all situations, but rather to incrementally improve the PF. However, specific
configurations in which the EnKF fails and the PF succeeds can easily be
conceived by increasing nonlinearities. Such a configuration is studied in
Appendix

As a complement to this RMSE test series, rank histograms for several LPFs
are computed, reported, and discussed in
Appendix

In this section, we illustrate the performance of LPF

As with the L96 model, the distance between the truth and the analysis is
measured with the average analysis RMSE. The runs are

For the localisation, we use the underlying physical space with the Euclidean
distance. The geometry of the blocks and domain are constructed as described
by Fig.

As a reference, we first compute the RMSEs of the EnKF with this model.
Figure

The ETKF requires at least

With

In this section, we test the LPF

RMSE as a function of the ensemble size

For each ensemble size

We take zero integration jitter (

The localisation radius

The regularisation jitter

For the algorithms using the SU sampling algorithm (i.e. the S(IR)

For the algorithms using optimal ensemble coupling or anamorphosis (i.e. the S(IT

When using the smoothing-by-weights method, we take the smoothing strength

When using the optimal ensemble coupling for the local resampling step, the distance radius

When using the anamorphosis for the local resampling step, we take the regularisation bandwidth

Figure

With such a large model, we expected the coloured noise regularisation jitter
method to be much more effective than the white noise method, because the
colourisation reduces potential spatial discontinuities in the jitter. We
observe indeed that the S(IR)

RMSE as a function of the ensemble size

Due to relatively high computational times, we restricted our study to
reasonable ensemble sizes,

Finally, it should be noted that for the S(IT

In the SBD localisation formalism, each block of grid points is updated using
the local domain of observation sites that may influence these grid points.
In the sequential–observation (SO) localisation formalism, we use a
different approach. Observations are assimilated sequentially, and
assimilating the observation at a site should only update nearby grid points.
LPF algorithms using the SO localisation formalism will be called
LPF

The

In this section, we set

These algorithms are designed to assimilate one observation at a time. Hence,
a full assimilation cycle requires

Following

The first region

The second region

The third region

The meaning of “correlated” is to be understood as a prior hypothesis, where
we define a valid tapering matrix

The

For any region

Example of the

Without loss of generality, the conditional density is decomposed into

With the

So far, the SO formalism looks elegant. The resulting assimilation schemes avoid the discontinuity issue inherent to the SBD formalism by using conditional updates of the ensemble.

However, this kind of update seems hopeless in a PF context. Indeed, the
factors

Similar principles were used to design the multivariate rank histogram filter
(MRHF) of

In the MRHF analysis, the state variables are updated sequentially according
to the conditional density

The MRHF could be used as a potential implementation of the SO localisation
formalism. However, assimilating one observation requires the computation of

In the following sections, we introduce two different algorithms that
implement the SO formalism (with the

The first algorithm that we introduce to implement the SO formalism using the
“importance, resampling, propagation” scheme is the LPF of

Using the global unnormalised importance weights
Eq. (

The resampling map

At the observation site,

The formulas for the

The second algorithm that we introduce to implement the SO formalism using
the “importance, resampling, propagation” scheme is based on the ensemble
Kalman particle filter (EnKPF), a Gaussian mixture hybrid ensemble filter
designed by

Since the update is propagated using second-order moments, one first needs to
compute the covariance matrix of the prior ensemble:
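The propagation principle can be sketched as a linear regression of the full state on the observed variable, which is consistent with second-order moments (this illustrates the principle only, not the exact EnKPF update formulas):

```python
import numpy as np

def propagate_increment(ensemble, site, delta):
    """Spread an increment at one observed variable to the whole state,
    using the ensemble covariance (linear regression on the observed variable).

    ensemble: (N, d); site: index of the observed state variable;
    delta: increment applied at `site`.
    """
    anomalies = ensemble - ensemble.mean(axis=0)
    cov = anomalies.T @ anomalies / (ensemble.shape[0] - 1)  # prior covariance
    regression = cov[:, site] / cov[site, site]              # regression coefficients
    return regression * delta                                # full-state increment

rng = np.random.default_rng(6)
ens = rng.normal(size=(20, 8))
dx = propagate_increment(ens, site=3, delta=0.5)
```

By construction, the increment at the observed variable itself is exactly delta, and it decays with the sampled covariance between the observed variable and the others.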

Using the global unnormalised importance weights
Eq. (

An adjustment-minimising resampling algorithm can be used to minimise the number of updates

The resampling algorithms based on OT in ensemble space or in state space, as derived in Sect.

For each particle the update on

A single analysis step according to this second-order algorithm is summarised by Algorithm 8 in a generic context, with any resampling algorithm.

In this section, we have introduced a generic SO localisation framework,
which we have used to define the LPF

The first algorithm is the LPF of

The second algorithm was inspired by the EnKPF of

Let

With LPF

By definition of the

The LPF

The unnormalised importance weights Eq. (

Any distance that needs to be computed relative to the site of observation

In the algorithm based on the second-order propagation scheme, the

Gathering observation sites into blocks reduces the number of sequential
assimilations from

In this section, we illustrate the performance of the LPF

For the same reasons as with LPF

Nomenclature conventions for the
S(I

List of all LPF

With the regularisation method described in
Sect.

the ensemble size

the localisation radius

the standard deviation

As mentioned in Sect.

In the S(IRP

With the regularisation method described in
Sect.

the ensemble size

the localisation radius

the regularisation jitter

When using optimal ensemble coupling for the local resampling
(step 4 of
Algorithm 8), the local minimisation coefficients are computed using
Eq. (

RMSE as a function of the ensemble size

The evolution of the RMSE as a function of the ensemble size

The evolution of the RMSE as a function of the ensemble size

In this section, we illustrate the performance of a selection of
LPF

RMSE as a function of the ensemble size

Characteristics of the algorithms tested with the BV model in the HR
configuration (Fig.

As with the CR configuration, all geometrical considerations (blocks and
domains,

Instantaneous analysis RMSE for the selection of algorithms detailed
in Table

For this test series, the selection of algorithms is listed in
Table

Figure

Thanks to the uniformly distributed observation network, the posterior
probability density functions are close to Gaussian. Therefore, the LETKF algorithm can efficiently reconstruct a
good approximation of the true state. As expected with this high-dimensional
DA problem, the algorithms using a second-order truncation (the LETKF and the
S(I

For the S(IR)

Without parallelisation, we observe that the

The difference between the LPF

The curse of dimensionality is a rather well-understood phenomenon in the statistical literature, and it is the reason why PF methods fail when applied to high-dimensional DA problems. We have recalled the main results related to the weight degeneracy of PFs, and explained why localisation can be used as a remedy. Yet implementing localisation in the PF analysis raises two major issues: the gluing of locally updated particles and the potential physical imbalance in the updated particles. Adequate solutions to these issues are not obvious, as witnessed by the few but dissimilar LPF algorithms developed in the geophysical literature. In this article we have proposed a theoretical classification of LPF algorithms into two categories. For each category, we have presented the challenges of local particle filtering and have reviewed the ideas that lead to practical implementations of LPFs. Some of them, already in the literature, have been detailed and sometimes generalised, while others are new in this field and yield improvements in the design of LPF methods.

With the LPF

In the LPF

With localisation, a bias is introduced in the LPF analyses. We have shown that, depending on the localisation parameterisation, some methods can yield an analysis step equivalent to that of global PF methods, which are known to be asymptotically Bayesian.

We have implemented and systematically tested the LPF algorithms with twin
simulations of the L96 model and the BV model. A few observations could be
made from these experiments. With these models, implementing localisation is
simple and works as expected: the
LPFs yield acceptable RMSE scores, even with small ensembles, in regimes
where global PF algorithms are degenerate. In terms of RMSEs, there is no
clear advantage of using Poterjoy's propagation method (designed to avoid
unphysical discontinuities) over the (simpler) LPF

The successful application of the LPFs to DA problems with a perfect model is largely due to the use of regularisation jitter. Regularisation jitter introduces an additional bias in the analysis, alongside an extra tuning parameter. For our numerical experiments, we have introduced two jittering methods: either adding regularisation noise with fixed statistical properties (white noise) or scaling the noise with the ensemble anomalies (coloured noise). We have discussed the relative performance of each method and concluded that there is room for improvement in the design of regularisation jitter methods for PFs.
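A minimal sketch of the two jittering variants, assuming a hypothetical (Ne, Nx) ensemble array; the names and scalings are illustrative, not the tuned settings used in the experiments.

```python
import numpy as np

def jitter_white(E, sigma, rng):
    """Additive regularisation noise with fixed statistics (white noise)."""
    return E + sigma * rng.standard_normal(E.shape)

def jitter_coloured(E, scale, rng):
    """Regularisation noise built from the ensemble anomalies (coloured
    noise): random combinations of the anomalies, so that the jitter
    inherits the covariance structure of the ensemble."""
    Ne = E.shape[0]
    A = E - E.mean(axis=0)                           # anomalies
    C = rng.standard_normal((Ne, Ne)) / np.sqrt(Ne - 1)
    return E + scale * C @ A
```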

In conclusion, introducing localisation in the particle filter is a relatively young topic that can benefit from more theoretical and practical developments.

First, the resampling step is the main ingredient in the success, or failure, of an LPF algorithm. The approaches based on optimal transport offer an elegant and quite efficient framework to deal with the discontinuity issue inherent to local resampling. However, the algorithms derived in this article could be improved. For example, it would be desirable to avoid the systematic reduction to one-dimensional problems when using optimal transport in state space. Besides this, other frameworks for local resampling based on other theories could be conceived.

Second, the design of the regularisation jitter methods can be largely improved. Regularisation jitter is mandatory when the model is perfect. Even with stochastic models, it can be beneficial, for example, when the magnitude of the model noise is too small for the LPFs to perform well. Ideally, the regularisation jitter methods should be adaptive and built concurrently with the localisation method.

Third, with the localisation framework presented in this article, one cannot directly assimilate non-local observations. The ability to assimilate non-local observations becomes increasingly important with the prominence of satellite observations.

Finally, our numerical illustration with the BV model in the HR configuration is successful and shows that the LPF algorithms have the potential to work with high-dimensional systems. Nevertheless, further research is needed to see whether the LPFs can be used with realistic models. Such an application would require an adequate definition of the model noise and of the observation error covariance matrix. Even though the local resampling methods have been designed to minimise unphysical discontinuities, this will have to be checked carefully, because it is a critical point for the success of the LPFs. Last, the regularisation jitter method has to be chosen and tuned in accordance with the model noise. In particular, the magnitude of the jitter will almost certainly depend on the state variable.

No data sets were used in this article.

The Gaussian linear model is the simplest model with size

The Gaussian linear model can be generalised to include nonlinearity in the
model

The Lorenz 1996 model

In the standard configuration,
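The L96 dynamics can be sketched with a standard fourth-order Runge–Kutta integrator; the usual choices of 40 variables and forcing F = 8 are assumed in this illustration.

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Lorenz 1996 tendencies: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F,
    with periodic boundary conditions implemented via np.roll."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F=8.0):
    """One fourth-order Runge--Kutta integration step."""
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```

Note that the uniform state x_i = F is a fixed point of the dynamics; chaotic trajectories are usually obtained by slightly perturbing it.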

The barotropic vorticity model describes the evolution of the vorticity field
of a two-dimensional incompressible homogeneous fluid in the

In these equations,

The equations are solved with

At time

At time

The advection of

Integrate

The coarse-resolution configuration is based on the following set of physical
parameters:

The initial true vorticity field for the DA twin experiments is the vorticity
obtained after a run of

We have checked that the vorticity flow remains stationary over the total
simulation time of our DA twin experiments chosen to be

For the high-resolution configuration, the physical parameters are

Following

In

As a complement to the mildly nonlinear test series of
Sects.

Similarly to the mildly nonlinear test series, the distance between the truth
and the analysis is measured with the average analysis RMSE. The runs are
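As a sketch, the time-averaged analysis RMSE used for these comparisons can be computed as follows; the array layout and the optional spin-up argument are illustrative assumptions.

```python
import numpy as np

def average_analysis_rmse(Xa, Xt, spin_up=0):
    """Time-averaged analysis RMSE between analysis means Xa and true
    states Xt, both (T, Nx) arrays; the first `spin_up` steps are
    discarded (illustrative layout)."""
    err = Xa[spin_up:] - Xt[spin_up:]
    return np.sqrt(np.mean(err ** 2, axis=1)).mean()
```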

As expected in this strongly nonlinear test series, the EnKF fails to
accurately reconstruct the true state. By contrast, all LPFs yield, at some
point, an RMSE under

RMSE as a function of the ensemble size

As a complement to the RMSE test series, we compute rank histograms of the
ensembles
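A minimal sketch of how such rank histograms can be computed, assuming ensembles stored as a hypothetical (T, Ne, Nx) array over T analysis times:

```python
import numpy as np

def rank_histogram(ensembles, truths):
    """Counts of the rank of the truth within each sorted ensemble.

    ensembles: (T, Ne, Nx) array, truths: (T, Nx) array (illustrative
    shapes). Returns Ne + 1 counts; a flat histogram indicates a
    statistically consistent ensemble."""
    T, Ne, Nx = ensembles.shape
    # rank = number of members below the truth, per time and variable
    ranks = (ensembles < truths[:, None, :]).sum(axis=1)   # (T, Nx)
    return np.bincount(ranks.ravel(), minlength=Ne + 1)
```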

Rank histograms computed with the L96 model in the standard
configuration (see Appendix

Rank histograms for the selection of algorithms detailed in
Table

Several algorithms are selected with characteristics detailed in
Table

The histogram of the EnKF is quite flat in the middle, and its edges reflect a small overdispersion. The histogram of
the tuned S(IR)

In summary, the rank histograms of the LPFs are in general rather flat. The ensembles are more or less overdispersive; this is a consequence of the use of regularisation jitter, necessary to avoid filter divergence. Like most PF methods, the LPFs yield a poor representation of the distribution tails.

We describe here the multinomial and the SU sampling algorithms, which are
the most common resampling schemes. In these algorithms, highly probable
particles are selected and duplicated, while particles with low probability
are discarded. Algorithms 9 and 10 describe how to construct the resampling
map

Both algorithms only require the cumulative weights
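As a sketch, both resampling maps can be built in a few lines from the cumulative weights; `np.searchsorted` performs the selection (variable names are illustrative).

```python
import numpy as np

def multinomial_resampling(w, rng):
    """Multinomial resampling: draw Ne indices i.i.d. with probabilities w."""
    Ne = len(w)
    cdf = np.cumsum(w)
    cdf[-1] = 1.0                    # guard against floating-point round-off
    return np.searchsorted(cdf, rng.random(Ne))

def su_resampling(w, rng):
    """Stochastic universal (systematic) sampling: one uniform draw, then
    Ne evenly spaced pointers through the cumulative weights, which
    reduces the variance of the selection."""
    Ne = len(w)
    cdf = np.cumsum(w)
    cdf[-1] = 1.0
    u = (rng.random() + np.arange(Ne)) / Ne
    return np.searchsorted(cdf, u)
```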

AF and MB have made an equally substantial, direct, and intellectual contribution to all three parts of the work: overview of the literature, algorithm development, and numerical experiments. Both authors have prepared the manuscript and approved it for publication.

The authors declare that they have no conflict of interest.

This article is part of the special issue “Numerical modeling, predictability and data assimilation in weather, ocean and climate: A special issue honoring the legacy of Anna Trevisan (1946–2016)”. It is not associated with a conference.

The authors thank the editor, Olivier Talagrand, and three reviewers, Stephen G. Penny and two anonymous reviewers, for their useful comments, suggestions and thorough reading of the manuscript. The authors are grateful to Patrick Raanes for enlightening debates and to Sebastian Reich for suggestions. CEREA is a member of Institut Pierre–Simon Laplace (IPSL). Edited by: Olivier Talagrand Reviewed by: Stephen G. Penny and two anonymous referees