Introduction
Terrestrial ecosystem carbon exchange is a fundamental part of the global
carbon cycle, linked to biosphere processes. Atmospheric CO2
measurements indicate the presence of a global land C sink, i.e.
uptake by the terrestrial biosphere exceeds losses. However, relative to all
major terms in the global carbon budget, the global land sink exhibits both
the largest inter-annual variability and the largest uncertainty.
The terrestrial carbon budget uncertainty stems largely
from unknowns in the size, spatial distribution and temporal dynamics of the
major terrestrial carbon pools. As a result, there is little agreement among
modelled land sink projections for the 21st century, reflecting uncertainty in our knowledge of the current state of the
terrestrial C cycle and its dynamics.
In recent years a growing volume of data from flux towers, satellites and
plant trait databases has been used to constrain some of the key components
of the terrestrial carbon cycle. In particular, a range of ecosystem carbon models and data sets
have been brought together in model–data fusion (MDF) frameworks to produce
an enhanced analysis of ecosystem carbon cycling. Where multiple data
streams are available, MDF approaches can provide an extensive insight into
carbon pool dynamics, turnover rates and carbon allocation fractions.
However, even at research-intensive
sites, MDF studies can produce a wide range of acceptable model parameter
sets, due to underdetermination of the carbon budget with available data.
Some of these optimized parameter sets, even though they generate realistic
fluxes over short timescales, are associated with major changes to larger
carbon pools (soil, wood) that are nonsensical. For regional-
and global-scale model implementation, the lack of in situ measurements
amplifies this problem, sometimes referred to as equifinality.
Ultimately, we need to overcome data limitations and
underdetermination by integrating models and ecosystem knowledge in a common
framework. This framework must ensure ecologically realistic outcomes, while
still encompassing (i.e. effectively quantifying) the uncertainty associated
with parameter estimation given observation errors.
Although a range of process-based models have been used to represent the
dynamics of the terrestrial carbon cycle and land–atmosphere CO2
exchange, there are advantages in
using simpler models to estimate ecosystem carbon state variables. Firstly,
there is a trade-off between model complexity, such as the number of model
parameters, and a model's ability to reproduce observations:
therefore a low-complexity model is preferable
when it can reproduce ecosystem observations with comparable skill. Secondly,
complex models are often computationally expensive, and this is an inhibiting
factor when using iterative methods (such as Monte Carlo approaches) to
estimate model parameters and their uncertainty. Ideally, the key terms of
ecosystem carbon dynamics can be constrained by combining ecosystem
observations with a model of appropriate complexity in a computationally
efficient MDF framework.
Previous MDF studies have invariably relied on net ecosystem exchange (NEE)
measurements (real and synthetic), along with other site-level observations.
In a global context, the FLUXNET flux-tower network
consists of hundreds of flux tower sites where
hectare-scale NEE measurements have been made over the past two decades. In
addition to NEE, complementary site-level biometric data can help resolve
model parameters and state variables in an MDF context, alleviating the problem of underdetermination.
However, the terrestrial biosphere will inevitably remain poorly sampled by
FLUXNET. Alternative estimates of NEE from atmospheric CO2
measurements are only produced at
continental-scale resolutions.
Therefore, given the limited span of the FLUXNET flux-tower network, are spatially resolved global
carbon cycle analyses limited by the sparsity of eddy flux and biometric
data?
NEE, the difference between photosynthesis and ecosystem respiration, is
a function of the dynamics of all carbon pools over a range of timescales. In
the absence of NEE observations, model NEE estimates depend on a knowledge of
carbon pool sizes and model parameter values. In reality, carbon pools and
model parameters (especially those related to plant allocation fractions and
pool turnover rates) are poorly constrained, and therefore NEE estimates are
subject to a comparably large uncertainty. Nonetheless, fundamental knowledge
on ecosystem behaviour can potentially be used to overcome the lack of
location-specific data or parameter values.
For example, while parameters
related to phenology, C allocation and turnover may vary across
multiple orders of magnitude, these parameters
are strongly correlated, and the range of possible
parameter configurations is therefore limited. Such examples include
correlations between leaf lifespan and leaf mass per area,
leaf area index and total foliar N, and
between foliar and root biomass. These correlations can
confine parameter searches to a smaller hyper-volume.
Equally, while
ecosystems exhibit a large range of non-steady-state dynamic behaviours,
strong inter-relationships are expected between inputs, outputs, carbon pool
magnitudes and turnover rates.
The concept of reality constraints (or internal model constraints) on carbon pool dynamics has previously been introduced within a carbon cycle MDF analysis: such constraints on the model state can potentially
be used to improve estimates of model parameters.
Here we propose that a broad range of model parameter
combinations can be discarded when phenology, carbon allocation, turnover rates and pool dynamics are considered
ecologically “nonsensical”. We seek to address the following question:
can we improve ecosystem model parameter and NEE estimates by incorporating
ecological “common sense” into carbon cycle MDF analyses?
In this paper we propose a series of ecological and dynamic constraints
(EDCs) on model parameters: these include turnover and allocation parameter
inter-relations, carbon pool dynamics and steady-state proximity conditions
(Sect. 2). We quantify the added value of imposing EDCs in synthetic and real-data MDF contexts using a simple ecosystem carbon model, by measuring bias
and confidence interval reductions of carbon cycle analyses relative to
independent data (Sect. 3). Finally, we discuss the prospects and limitations
of our approach, as well as the implications of a wider EDC implementation in
terrestrial carbon cycle MDF methods (Sect. 4).
Table 1. DALEC2 model parameters, descriptions, and minimum–maximum parameter values: the corresponding DALEC2 equations are fully described in Appendix A.

| Parameter | Description | Range |
|---|---|---|
| fauto | autotrophic respiration fraction | 0.3–0.7 |
| flab | fraction of GPP allocated to labile C pool | 0.01–0.5 |
| ffol | fraction of GPP allocated to foliage | 0.01–0.5 |
| froo | fraction of GPP allocated to fine roots | 0.01–0.5 |
| fwoo¹ | fraction of GPP allocated to wood | 0.01–0.5 |
| θwoo | woody C turnover rate | 2.5 × 10⁻⁵–10⁻³ d⁻¹ |
| θroo | fine root C turnover rate | 10⁻⁴–10⁻² d⁻¹ |
| θlit | litter C turnover rate | 10⁻⁴–10⁻² d⁻¹ |
| θsom | soil organic C turnover rate | 10⁻⁷–10⁻³ d⁻¹ |
| θmin | litter mineralization rate | 10⁻⁵–10⁻² d⁻¹ |
| Θ | temperature dependence exponent factor | 0.018–0.08 |
| donset | leaf onset day | 1–365 |
| dfall | leaf fall day | 1–365 |
| ceff | canopy efficiency parameter | 10–100 |
| clma | leaf mass per area | 10–400 g C m⁻² |
| clf | annual leaf loss fraction | 1/8–1 |
| cronset | labile C release period | 10–100 days |
| crfall | leaf-fall period | 20–150 days |
| Clabᵗ | labile C pool at time t | 20–2000 g C m⁻² |
| Cfolᵗ | foliar C pool at time t | 20–2000 g C m⁻² |
| Crooᵗ | fine root C pool at time t | 20–2000 g C m⁻² |
| Cwooᵗ | above- and below-ground woody C pool at time t | 100–10⁵ g C m⁻² |
| Clitᵗ | litter C pool at time t | 20–2000 g C m⁻² |
| Csomᵗ | soil organic C pool at time t | 100–2 × 10⁵ g C m⁻² |

¹ fwoo is equivalent to 1 – fauto – ffol – flab.
Methods
Here we present a series of EDCs for
a daily box budget terrestrial C cycle model, the Data Assimilation
Linked Ecosystem Carbon model version two (DALEC2). Within an MDF context, we
test the added value of implementing EDCs. Our aims are (1) to quantify our
ability to estimate DALEC2 parameters and NEE within a synthetic framework,
and (2) to validate our ability to estimate NEE at three temperate forest
AmeriFlux sites. We use simulated and real observations of (a)
satellite-derived leaf area index (LAI) and (b) soil organic carbon from the
Harmonized World Soil Database (HWSD, Hiederer and Köchy, 2012) in our
MDF analyses. The choice of these two data sets serves as an analogue for the
limited ecosystem carbon data sets available on a global scale.
DALEC2
DALEC has been extensively used in MDF frameworks. In particular, a range of
MDF approaches were used in the REFLEX project, where ecosystem observations
were assimilated into DALEC to produce carbon state analyses.
Here we use the DALEC2 ecosystem carbon balance model, which
combines components of DALEC evergreen and DALEC deciduous
into a single model.
Gross primary production (GPP) in DALEC2 is determined from the aggregated canopy model,
and is allocated to the biomass pools (foliar, labile,
wood, and fine roots) and to autotrophic respiration (Ra); degraded carbon
from biomass pools goes to two dead organic matter pools with temperature-dependent losses (heterotrophic respiration, Rh). The net
ecosystem exchange is summarized as NEE = Ra + Rh - GPP. The C flow in DALEC2 is determined as
a function of 23 parameters (including six initial carbon pool states,
Table 1). We henceforth refer to the 23 parameters required to initiate
DALEC2 as a parameter vector x. DALEC2 C pools and fluxes are
iteratively calculated at a daily time step: the DALEC2 model equations are
fully described in Appendix A.
We henceforth refer to the ensemble of all
model state variables (such as daily NEE, GPP, respiration terms and carbon
pool trajectories) as DALEC2(x).
Ecological and dynamic constraints
In previous work, DALEC MDF approaches did not explicitly impose any conditions on the
inter-relationships between model parameters; parameter prior
information therefore consisted only of prescribed parameter ranges. In reality,
broader ecological knowledge can be informative in terms of the
inter-relationships between parameter values. For example, long-term leaf
turnover rate must be faster than woody biomass turnover:
such a relationship can provide a relative
constraint on model parameter values, without imposing any further
constraints to the prior parameter ranges (Table 1).
Here we propose a sequence of ecological and dynamic constraints (EDCs) on
DALEC2 parameters and pool dynamics. For any given DALEC2 parameter vector
x, all EDCs presented in this section (henceforth EDC 1,
EDC 2, etc.) are implemented. The probability of parameters
(henceforth PEDC(DALEC2(x))) is 1 if all EDCs are
met, otherwise PEDC(DALEC2(x))=0.
All DALEC2 parameters (allocation fractions fauto, flab, ffol,
froo, fwoo; turnover rate parameters θwoo, θroo,
θlit, θsom, θmin, Θ; canopy parameters
donset, dfall, ceff, clma, clf,
cronset, crfall; carbon pools at time t Clabᵗ, Cfolᵗ,
Crooᵗ, Cwooᵗ, Clitᵗ, Csomᵗ)
are described in Table 1.
Turnover constraints
We impose the following constraints on the relative sizes of turnover rates:
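$$
\begin{aligned}
\text{EDC 1:}\quad & \theta_{\mathrm{som}} < \theta_{\mathrm{lit}},\\
\text{EDC 2:}\quad & \theta_{\mathrm{som}} < \theta_{\mathrm{min}},\\
\text{EDC 3:}\quad & c_{\mathrm{lf}} > 1 - (1-\theta_{\mathrm{woo}})^{365.25},\\
\text{EDC 4:}\quad & (1-\theta_{\mathrm{roo}})^{N} < \prod_{i=1}^{N}\left(1-\theta_{\mathrm{som}}\,e^{\Theta T_i}\right),
\end{aligned}
$$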
where Ti are daily temperature values during an N-day time
window (e.g. 3 years). These constraints ensure the turnover rate ratios
are consistent with knowledge of the carbon pool relative residence times.
In particular, we
expect a faster litter turnover in contrast to soil organic matter (SOM)
turnover (EDC 1), a faster conversion rate of litter to SOM relative
to SOM turnover (EDC 2), an annual leaf loss fraction greater
than the annual woody biomass loss fraction (EDC 3), and a faster
fine root turnover in contrast to SOM turnover (EDC 4).
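The turnover constraints can be sketched as a set of boolean checks. This is a minimal illustration, not the DALEC2 implementation: the function and argument names are hypothetical, and EDC 4 is coded following the description above, i.e. fine root turnover is faster than SOM turnover, so the fraction of fine root carbon surviving the N-day window is smaller than the surviving SOM fraction.

```python
import math


def turnover_edcs(theta_som, theta_lit, theta_min, theta_woo, theta_roo,
                  c_lf, temp_exponent, daily_temps):
    """Return {EDC number: True if met} for the turnover constraints (EDCs 1-4).

    All names are illustrative; rates are in d^-1 and temperatures in deg C.
    """
    n_days = len(daily_temps)
    # Fraction of fine root carbon surviving n_days at a constant daily rate.
    root_survival = (1.0 - theta_roo) ** n_days
    # Fraction of SOM surviving the same window with temperature-dependent loss.
    som_survival = math.prod(
        1.0 - theta_som * math.exp(temp_exponent * t) for t in daily_temps
    )
    return {
        1: theta_som < theta_lit,
        2: theta_som < theta_min,
        # Annual leaf loss fraction versus annual woody biomass loss fraction.
        3: c_lf > 1.0 - (1.0 - theta_woo) ** 365.25,
        # Faster fine root turnover means less root carbon survives than SOM.
        4: root_survival < som_survival,
    }


# Example with arbitrary values inside the Table 1 ranges (3 years at 10 deg C).
checks = turnover_edcs(theta_som=1e-5, theta_lit=1e-3, theta_min=1e-3,
                       theta_woo=1e-4, theta_roo=1e-3, c_lf=0.5,
                       temp_exponent=0.05, daily_temps=[10.0] * 1095)
all_turnover_edcs_met = all(checks.values())
```

Returning one flag per constraint, rather than a single boolean, makes it easy to see which EDC rejects a given parameter vector.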
Root–Foliar C allocation constraints
Strong correlations are expected between foliar and fine root carbon pools.
We constrain the C allocation
and dynamics of the root and foliar pools:
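$$
\begin{aligned}
\text{EDC 5:}\quad & 0.2\,f_{\mathrm{roo}} < f_{\mathrm{fol}} + f_{\mathrm{lab}} < 5\,f_{\mathrm{roo}},\\
\text{EDC 6:}\quad & 0.2\,\overline{C}_{\mathrm{fol}} < \overline{C}_{\mathrm{roo}} < 5\,\overline{C}_{\mathrm{fol}},
\end{aligned}
$$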
where $\overline{C}_{\mathrm{fol}}$ and $\overline{C}_{\mathrm{roo}}$ are the
mean foliar and fine root carbon pool sizes over the model run period.
EDC 5 ensures that the fractions of GPP allocated to Croo
and Cfol (directly or via the labile C pool) are within
a factor of 5 of each other. EDC 6 ensures that the mean fine root
and foliar pool sizes are within a factor of 5 of each other.
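A minimal sketch of the two root–foliar checks follows. The function name and the use of plain lists for the pool trajectories are assumptions made for illustration only.

```python
def root_foliar_edcs(f_roo, f_fol, f_lab, c_fol_series, c_roo_series):
    """Return (EDC 5 met, EDC 6 met) for hypothetical inputs.

    f_* are GPP allocation fractions; c_*_series are modelled pool sizes
    over the run period (any consistent unit, e.g. g C m^-2).
    """
    mean_fol = sum(c_fol_series) / len(c_fol_series)
    mean_roo = sum(c_roo_series) / len(c_roo_series)
    # EDC 5: allocation to foliage (direct or via labile C) within a factor of 5
    # of the allocation to fine roots.
    edc5 = 0.2 * f_roo < f_fol + f_lab < 5.0 * f_roo
    # EDC 6: mean fine root pool within a factor of 5 of the mean foliar pool.
    edc6 = 0.2 * mean_fol < mean_roo < 5.0 * mean_fol
    return edc5, edc6


balanced = root_foliar_edcs(0.2, 0.1, 0.1, [100.0, 120.0], [80.0, 90.0])
root_starved = root_foliar_edcs(0.01, 0.3, 0.2, [100.0, 120.0], [80.0, 90.0])
```

In the second call the foliar plus labile allocation (0.5) exceeds five times the root allocation (0.05), so EDC 5 fails while EDC 6 is unaffected.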
Carbon pool growth
While we expect pools to potentially grow through time, we assume no recent
disturbance and therefore limit the relative growth rate of pools. We
constrain pool growth as follows:
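$$
\text{EDC 7:}\quad \frac{\overline{C}_{\mathrm{pool}}^{\,\mathrm{year}=n}}{\overline{C}_{\mathrm{pool}}^{\,\mathrm{year}=1}} < 1 + G_{\max}\,\frac{n-1}{10},
$$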
where $\overline{C}_{\mathrm{pool}}^{\,\mathrm{year}=1}$ is the mean carbon pool
size in year 1, and $\overline{C}_{\mathrm{pool}}^{\,\mathrm{year}=n}$ is
the mean carbon pool size after n - 1 years. We choose a value
of Gmax=0.1, which is equivalent to a 10 % yearly growth rate
(or doubling of carbon over 10 yr) as the maximum growth rate for
each pool in EDC 7. This assumption is conservative, given data on
global forest biomass growth rates.
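The growth constraint can be sketched as a comparison of yearly mean pool sizes. This is an illustrative reading only: the bound is coded exactly as written in EDC 7, the 365-day year and all function names are assumptions, and year 1 is used purely as the reference (the comparison starts at n = 2).

```python
def yearly_means(daily_pool, days_per_year=365):
    """Mean pool size for each complete year of a daily time series."""
    n_years = len(daily_pool) // days_per_year
    return [
        sum(daily_pool[y * days_per_year:(y + 1) * days_per_year]) / days_per_year
        for y in range(n_years)
    ]


def passes_edc7(daily_pool, g_max=0.1, days_per_year=365):
    """True if no yearly mean exceeds the EDC 7 growth bound.

    Requires at least one complete year of daily values.
    """
    means = yearly_means(daily_pool, days_per_year)
    first_year_mean = means[0]
    # Year 1 is the reference, so the comparison starts at n = 2.
    for n, mean_n in enumerate(means[1:], start=2):
        if mean_n / first_year_mean >= 1.0 + g_max * (n - 1) / 10.0:
            return False
    return True


steady_pool = [100.0] * (3 * 365)
jumping_pool = [100.0] * 365 + [150.0] * 365
```

A shrinking pool always passes this check; rapid losses are instead the target of the exponential decay constraint (EDC 8).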
Carbon pool exponential decay trajectories
While carbon pools are expected to grow and contract through time, in the
absence of major and recent disturbance events carbon pool trajectories are
expected to exhibit gradual changes on inter-annual timescales.
Under these circumstances, rapid exponential
decay in modelled DALEC2 carbon pools can only occur as a result of an
ecologically inconsistent x. We examine the system response within
a 3-year period by imposing a constraint on exponential pool trajectories
(Fig. ): we numerically fit an exponential decay curve
$a + b\,e^{ct}$ to all carbon pools, where t is time in days, and a, b and
c are the fitted exponential decay parameters.
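DALEC2 pool trajectories are rejected if the half-life of carbon pool changes
is less than 3 years. With t in days, the half-life of the fitted curve is log(2)/|c|, so a trajectory fails EDC 8 when
$$
c < -\frac{\log(2)}{365.25 \times 3}.
$$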
We fully describe the numerical derivation of c in Appendix B.
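As a rough illustration of the exponential decay test, the sketch below fits a + b·exp(c·t) by scanning c over a fixed grid and solving for a and b by linear least squares. This grid search is a stand-in chosen for simplicity and is not the numerical derivation of Appendix B; the grid limits, the `min_amplitude` tolerance and the restriction to decaying components (b > 0) are all assumptions.

```python
import math


def fit_exponential(times, values, c_grid):
    """Least-squares fit of a + b*exp(c*t): scan c, solve a and b linearly."""
    n = len(times)
    sum_y = sum(values)
    best = None
    for c in c_grid:
        basis = [math.exp(c * t) for t in times]
        sum_e = sum(basis)
        sum_ee = sum(e * e for e in basis)
        sum_ey = sum(e * y for e, y in zip(basis, values))
        det = n * sum_ee - sum_e * sum_e
        if det <= 1e-12:  # basis is (numerically) constant, skip this c
            continue
        b = (n * sum_ey - sum_e * sum_y) / det
        a = (sum_y - b * sum_e) / n
        sse = sum((a + b * e - y) ** 2 for e, y in zip(basis, values))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1], best[2], best[3]


def passes_edc8(times, values, c_grid=None, min_amplitude=1e-6):
    """False if the fitted decay has a half-life shorter than 3 years."""
    if c_grid is None:
        # Hypothetical grid: decay rates from 1e-5 to 1e-1 d^-1, log-spaced.
        c_grid = [-(10.0 ** (k / 20.0)) for k in range(-100, -19)]
    a, b, c = fit_exponential(times, values, c_grid)
    threshold = -math.log(2.0) / (365.25 * 3.0)
    # Ignore fits whose exponential component is negligible or not decaying.
    decaying = b > min_amplitude * max(abs(v) for v in values)
    return not (decaying and c < threshold)


days = list(range(0, 1096, 8))
fast_decay = [0.2 + 0.8 * math.exp(-0.01 * t) for t in days]   # half-life ~69 d
slow_decay = [0.2 + 0.8 * math.exp(-1e-4 * t) for t in days]   # half-life ~19 yr
```

Here `fast_decay` is rejected and `slow_decay` is accepted, mirroring the rejected and accepted example trajectories described in the figure caption that follows.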
Exponential decay test (EDC 8) performed on nine example normalized
Cpool trajectories over a 3 yr time span. The Cpool
trajectories are normalized such that Cpool=1 at t=0. Examples 1–5 were
accepted (EDC 8=1) and examples 6–9 were rejected (EDC 8=0). The exponential decay fit (dashed line) is shown for pool trajectories where EDC 8=0.
Steady state proximity
For ecosystems with no recent disturbance events, we propose that each pool
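is within an order of magnitude of its steady-state attractor. We use mean
gross primary production ($\overline{F}_{\mathrm{gpp}}$) as a proxy for
long-term GPP to estimate the steady-state attractors,
$C_{\mathrm{pool}}^{\infty}$, of four carbon pools (SOM, litter, wood and
root). The steady-state attractors for Csom, Clit,
Cwoo and Croo are analytically derived as follows:
$$
\begin{aligned}
C_{\mathrm{som}}^{\infty} &= \frac{\left(f_{\mathrm{woo}} + (f_{\mathrm{fol}} + f_{\mathrm{roo}} + f_{\mathrm{lab}})\,\theta_{\mathrm{min}}\right)\overline{F}_{\mathrm{gpp}}}{(\theta_{\mathrm{min}} + \theta_{\mathrm{lit}})\,\theta_{\mathrm{som}}\,e^{\Theta\overline{T}}},\\
C_{\mathrm{lit}}^{\infty} &= \frac{(f_{\mathrm{fol}} + f_{\mathrm{roo}} + f_{\mathrm{lab}})\,\overline{F}_{\mathrm{gpp}}}{\theta_{\mathrm{lit}}\,e^{\Theta\overline{T}}},\\
C_{\mathrm{woo}}^{\infty} &= \frac{f_{\mathrm{woo}}\,\overline{F}_{\mathrm{gpp}}}{\theta_{\mathrm{woo}}},\\
C_{\mathrm{roo}}^{\infty} &= \frac{f_{\mathrm{roo}}\,\overline{F}_{\mathrm{gpp}}}{\theta_{\mathrm{roo}}},
\end{aligned}
$$
where $\overline{T}$ is the mean annual temperature (°C). For each
pool, we impose an order-of-magnitude constraint on the proximity of
$C_{\mathrm{pool}}^{\infty}$ from the initial $C_{\mathrm{pool}}$ value:
$$
\text{EDCs 9–12:}\quad \frac{C_{\mathrm{pool}}^{0}}{10} < C_{\mathrm{pool}}^{\infty} < 10\,C_{\mathrm{pool}}^{0},
$$
where $C_{\mathrm{pool}}^{0}$ is the initial Csom, Clit,
Cwoo and Croo value for EDCs 9, 10, 11 and 12
respectively.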
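The order-of-magnitude test can be sketched for the two simplest cases, wood and fine roots, where the attractor is the mean input divided by a constant turnover rate. The function names and numerical values are hypothetical, and the fine root example assumes that the fine root turnover rate is the relevant loss term; the SOM and litter attractors follow the same pattern using the expressions given above.

```python
def steady_state_attractor(allocation_fraction, mean_gpp, turnover_rate):
    """Pool size at which mean input (g C m^-2 d^-1) balances first-order loss."""
    return allocation_fraction * mean_gpp / turnover_rate


def within_order_of_magnitude(c_infinity, c_initial):
    """EDCs 9-12 style check: attractor within a factor of 10 of the initial pool."""
    return c_initial / 10.0 < c_infinity < 10.0 * c_initial


# Illustrative values only: mean GPP of 5 g C m^-2 d^-1.
c_woo_inf = steady_state_attractor(0.3, 5.0, 1e-4)   # about 15000 g C m^-2
c_roo_inf = steady_state_attractor(0.2, 5.0, 5e-3)   # about 200 g C m^-2

wood_ok = within_order_of_magnitude(c_woo_inf, c_initial=10000.0)
wood_too_small = within_order_of_magnitude(c_woo_inf, c_initial=1000.0)
roots_ok = within_order_of_magnitude(c_roo_inf, c_initial=150.0)
```

With an initial wood pool of 1000 g C m⁻² the attractor lies more than a factor of 10 above the initial state, so that parameter vector would be rejected.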
The 12 presented EDCs are what we believe to be the most ecologically
suitable constraints on DALEC2 parameters and state variables, and are based
on broader ecological knowledge of carbon dynamics. We discuss the advantages
and the limitations of the proposed EDCs in Sect. 4 of this paper.
Model–data fusion
Given LAI observations, soil organic carbon estimates, prior parameter ranges
(Table 1) and EDCs (Sect. 2.2), our aim for each experiment is to estimate
the probability distribution of parameters x. We assume no prior knowledge, other
than the parameter ranges shown in Table 1: we therefore prescribe a uniform
(i.e. non-informative) prior probability distribution onto all parameters. Within
a Bayesian framework, we combine the
above-mentioned information to derive the posterior probability density
function of x, P(x|O), where
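$$
P(\mathbf{x}|\mathbf{O}) \propto P(\mathbf{O}|\mathbf{x}) \cdot P_{\mathrm{range}}(\mathbf{x}) \cdot P_{\mathrm{EDC}}(\mathrm{DALEC2}(\mathbf{x})).
$$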
P(O|x) is the probability of the observations given
x, Prange(x)=1 if all parameters are within the
ranges prescribed in Table 1 (otherwise Prange(x)=0), and
PEDC(DALEC2(x))=1 if all EDCs are met (otherwise
PEDC(DALEC2(x))=0). For N observations, we
derive the observation probability given x, P(O|x),
as follows:
$$
P(\mathbf{O}|\mathbf{x}) = \exp\left(-\frac{1}{2}\sum_{n=1}^{N}\frac{(M_n - O_n)^2}{\sigma_n^2}\right),
$$
where On is the nth observation, Mn is the corresponding state
variable, and σn² is the error variance of the nth observation:
here we assume no error covariance between
observation errors.
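The posterior above can be sketched in log space, where the two binary terms simply return minus infinity when violated. This is a minimal illustration under stated assumptions: `run_model` and `edcs_met` are hypothetical callables standing in for DALEC2 and the EDC checks, and the model output is assumed to be already matched (and, where relevant, log-transformed) to the observations.

```python
import math


def log_likelihood(modelled, observed, sigma):
    """log P(O|x) for independent Gaussian errors (no error covariance)."""
    return -0.5 * sum(
        ((m - o) / s) ** 2 for m, o, s in zip(modelled, observed, sigma)
    )


def log_posterior(x, ranges, run_model, edcs_met, observed, sigma):
    """Unnormalised log P(x|O) with binary range and EDC terms.

    ranges    : list of (min, max) per parameter (P_range)
    run_model : hypothetical callable, x -> modelled values matching `observed`
    edcs_met  : hypothetical callable, (x, model output) -> bool (P_EDC)
    """
    if any(not (lo <= value <= hi) for value, (lo, hi) in zip(x, ranges)):
        return -math.inf            # P_range(x) = 0
    modelled = run_model(x)
    if not edcs_met(x, modelled):
        return -math.inf            # P_EDC(DALEC2(x)) = 0
    return log_likelihood(modelled, observed, sigma)


# Toy usage: one parameter, three log-transformed observations, sigma = log(2).
toy_ranges = [(1.0, 10.0)]
toy_model = lambda x: [math.log(x[0])] * 3
always_met = lambda x, modelled: True
toy_obs = [math.log(2.0)] * 3
toy_sigma = [math.log(2.0)] * 3
lp_best = log_posterior([2.0], toy_ranges, toy_model, always_met, toy_obs, toy_sigma)
```

Working with log probabilities avoids numerical underflow when many observations are combined.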
We employ an adaptive Metropolis–Hastings Markov Chain Monte Carlo (MHMCMC)
approach to draw 5 × 10⁶ samples from P(x|O). This
approach has been widely used to estimate the probability density function of
ecosystem model parameters and is well suited to exploring
parameter space without a need to define normal prior distributions for each
parameter. We repeat the MHMCMC algorithm four
times (i.e. four chains), to ensure convergence between P(x|O) distributions from each chain. To minimize sample correlations we
use 500 x samples from the latter half of the accepted parameter
vectors. We describe the details of our MHMCMC approach in Appendix C.
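For orientation, a plain random-walk Metropolis sampler is sketched below. It is deliberately simpler than the adaptive MHMCMC of Appendix C: the fixed Gaussian proposal, the function names and the thinning helper are all assumptions, and the starting vector is assumed to have a finite log posterior.

```python
import math
import random


def metropolis(log_post, x0, step_sizes, n_iterations, seed=0):
    """Random-walk Metropolis sampler (fixed step sizes, not adaptive)."""
    rng = random.Random(seed)
    x = list(x0)
    lp = log_post(x)
    samples = []
    for _ in range(n_iterations):
        proposal = [v + rng.gauss(0.0, s) for v, s in zip(x, step_sizes)]
        lp_new = log_post(proposal)
        # Accept with probability min(1, exp(lp_new - lp));
        # 1 - random() lies in (0, 1], which keeps the logarithm finite.
        if lp_new - lp >= math.log(1.0 - rng.random()):
            x, lp = proposal, lp_new
        samples.append(list(x))
    return samples


def thin_second_half(samples, n_keep):
    """Keep n_keep evenly spaced samples from the latter half of a chain."""
    second_half = samples[len(samples) // 2:]
    stride = max(1, len(second_half) // n_keep)
    return second_half[::stride][:n_keep]
```

Because proposals that leave the prior ranges or violate a binary constraint have a log posterior of minus infinity, they are always rejected, which is how range and EDC terms act within the sampler.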
Synthetic truth – DALEC2 analyses
To quantify our ability to estimate synthetic DALEC2 ecosystem states, we
perform the MDF approach over a 3-year period using LAI and SOM
observations created from a synthetic DALEC2 truth, based on known DALEC2 parameters.
Our choice of synthetic DALEC2 states represents globally spanning data sets of satellite LAI retrievals and
soil carbon map data.
Based on 40 DALEC2 parameter combinations, we create 40 synthetic data sets representing typical
temperate forest carbon dynamics, with 3 years of semi-continuous LAI
data and one simulated soil organic carbon estimate.
We use the 3-year
meteorology drivers (temperate climate) from the REFLEX synthetic experiments.
We select 40 synthetic parameter combinations
by randomly sampling parameter vectors
x within the DALEC2 parameter space (Table 1), where (i)
PEDC(DALEC2(x))=1, and (ii) x values are
relevant to temperate forest ecosystems (see Appendix D). We remove
approximately 95 % of daily LAI points to create an 8-day
resolution semi-continuous LAI time-series. We add noise to the remaining
3 yr synthetic DALEC2 LAI: each LAI value is multiplied by a random
error factor of $2^{N(0,1)}$, where N(0,1) is a random number derived from
a normal distribution with a mean of zero and a standard deviation of 1. For
each synthetic soil carbon observation, we multiply Csom at t=0 by a random error factor of $2^{N(0,1)}$. We fully explain the
derivation of the synthetic experiment parameter vectors (henceforth
s) in Appendix D.
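The observation noise model can be illustrated as follows. This is a sketch only: the function names are hypothetical, the regular 8-day subsampling does not attempt to reproduce the exact gap pattern of the synthetic data sets, and the random seed is arbitrary.

```python
import random


def error_factor(rng):
    """Multiplicative error 2**N(0,1): a one-sigma error is a factor of 2."""
    return 2.0 ** rng.gauss(0.0, 1.0)


def synthetic_lai_observations(daily_lai, interval=8, seed=0):
    """Subsample a daily LAI series every `interval` days and add noise.

    Returns a list of (day index, noisy LAI) pairs.
    """
    rng = random.Random(seed)
    return [
        (day, daily_lai[day] * error_factor(rng))
        for day in range(0, len(daily_lai), interval)
    ]


def synthetic_soil_carbon_observation(c_som_initial, seed=0):
    """Single noisy soil organic carbon estimate from the initial SOM pool."""
    return c_som_initial * error_factor(random.Random(seed))


lai_obs = synthetic_lai_observations([3.0] * 1096)
som_obs = synthetic_soil_carbon_observation(12000.0)
```

A multiplicative log-normal error keeps all synthetic observations positive, which is consistent with comparing log-transformed observations and model states.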
We perform the MHMCMC with and without EDCs and label the posterior parameter ensembles (4 × 500 × 40 x samples) as xsSTA (standard
synthetic MDF) and xsEDC (synthetic MDF with EDCs). We
assign an uncertainty factor of 2 to all synthetic observations, hence On
and Mn are log-transformed observations and state variables, and σn = log(2).
For each posterior DALEC2 x, we determine the log-normalized
parameter-space error ϵ(x) by comparing x with its
corresponding synthetic truth vector s:
$$
\epsilon(\mathbf{x}) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(\frac{\log(x_{(n)}) - \log(s_{(n)})}{\log(x_{(n)}^{\max}) - \log(x_{(n)}^{\min})}\right)^{2}},
$$
where x(n) and s(n) represent the nth parameters of x and
s, N is the number of parameters in x, and
$x_{(n)}^{\min}$, $x_{(n)}^{\max}$ are the minimum and maximum
parameter values (see Table 1). To assess the parameter estimation capability
for each experiment, we derive the ϵ (x) for each parameter
vector in (a) xsSTA (b) xsEDC and (c) for
uniformly random samples where Prange(x)=1 (henceforth
xsRAN). We refer to the ensemble of ϵ(x)
values for xsSTA, xsEDC and
xsRAN as E(xsSTA),
E(xsEDC) and E(xsRAN). We
quantify the overall EDC associated error reduction (IEDC)
as follows:
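$$
I_{\mathrm{EDC}} = \left(\frac{\tilde{E}(\mathbf{x}_{s}^{\mathrm{RAN}}) - \tilde{E}(\mathbf{x}_{s}^{\mathrm{EDC}})}{\tilde{E}(\mathbf{x}_{s}^{\mathrm{RAN}}) - \tilde{E}(\mathbf{x}_{s}^{\mathrm{STA}})} - 1\right) \times 100\,\%,
$$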
where Ẽ represents the median of E for each posterior
parameter ensemble. This allows us to assess the relative improvement of
xsEDC over xsSTA parameter estimates
against the xsRAN “zero-knowledge” case. In addition, we
determine the IEDC for two parameter subgroups: (a) directly
constrained parameters, and (b) indirectly constrained parameters. We assign
clf, cronset, crfall, donset,
dfall and Csom0 to parameter group A: these
parameters can be directly inferred from the LAI and soil organic carbon
observations. We assign the remaining parameters to parameter group B: these
can only be inferred from the DALEC2 model structure and – potentially –
EDCs.
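The two error metrics can be sketched as follows, assuming the parameter-space error is the root mean square of the range-normalized log differences and that IEDC is computed from ensemble medians. Function names and the example numbers are illustrative.

```python
import math
import statistics


def parameter_space_error(x, s, x_min, x_max):
    """Log-normalized parameter-space error between x and the truth s."""
    terms = [
        ((math.log(xi) - math.log(si)) / (math.log(hi) - math.log(lo))) ** 2
        for xi, si, lo, hi in zip(x, s, x_min, x_max)
    ]
    return math.sqrt(sum(terms) / len(terms))


def edc_error_reduction(errors_ran, errors_sta, errors_edc):
    """I_EDC in percent, from the medians of the three error ensembles."""
    med_ran = statistics.median(errors_ran)
    med_sta = statistics.median(errors_sta)
    med_edc = statistics.median(errors_edc)
    return ((med_ran - med_edc) / (med_ran - med_sta) - 1.0) * 100.0


# A vector equal to the truth has zero error; a vector at the opposite end
# of the prior range from the truth has an error of 1.
zero_error = parameter_space_error([1.0, 10.0], [1.0, 10.0], [0.1, 1.0], [10.0, 100.0])
unit_error = parameter_space_error([10.0], [0.1], [0.1], [10.0])
```

A positive IEDC means the EDC ensemble lies further from the "zero-knowledge" random case than the standard ensemble does; a negative value means the standard ensemble is closer to the truth.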
Finally we compare NEE from DALEC2(xsEDC) and
DALEC2(xsSTA) against the NEE synthetic “truths” –
DALEC2(s).
AmeriFlux – DALEC2 analyses
For the flux-tower experiments, we constrain DALEC2 parameters using (a)
MODIS-derived leaf area index (LAI), and (b) total soil carbon from the
Harmonized World Soil Database (HWSD). We perform
daily resolution 3-year DALEC2 analyses for three forest categories:
evergreen needleleaf (ENF), deciduous broadleaf (DBF) and mixed forest
(MF). We chose one AmeriFlux site from each forest type. To establish
a suitable site for our method we chose sites with NEE data spanning across
3 years between 2001 and 2010.
Our selected sites for each forest type are Howland Forest
(US-Ho1, evergreen needleleaf forest, 45.2041° N,
68.7402° W), Morgan Monroe State Forest
(US-MMS, deciduous broadleaf forest, 39.3231° N,
86.4131° W) and Sylvania Wilderness
(US-Syv, mixed forest, 46.2420° N,
89.3476° W).
We chose temperate sites with little expected water-stress,
and with ≤ 3 months of recorded
below-freezing soil temperatures.
These criteria reflect the current capabilities
of DALEC2, as hydrological processes are not explicitly portrayed in the model.
For each AmeriFlux site, we extract the corresponding MODIS LAI retrievals
from the MOD15A2 LAI 8-day version 005 1 km resolution
product (downloaded from the Land Processes Distributed Active Archive Centre
http://lpdaac.usgs.gov/): we only keep maximum quality flag data.
Standard deviations are provided for 1 km MODIS LAI retrievals,
however these (a) do not reflect the magnitude variability in uncertainty,
(b) often imply the existence of negative LAI observations
(σLAI>LAI) and (c) are occasionally missing. While
various MODIS LAI evaluations have been performed, large-scale spatiotemporal LAI retrieval errors remain poorly
quantified. For the sake of simplicity, we assign a factor of 2 uncertainty
(i.e. log(LAI)±log(2)) for each MODIS LAI observation. To
minimize spatial discrepancies between MODIS and AmeriFlux sites, each LAI
observation is the arithmetic mean of all available LAI retrievals within a 9-pixel 3 km × 3 km area (centred on each AmeriFlux
site).
Overall, we use 95, 120 and 119 LAI values
at US-Syv, US-Ho1 and US-MMS (5th–95th percentile ranges for LAI values are
0.4–5.8, 1.0–5.6 and 0.4–5.5 respectively).
For each site we extract total soil carbon density from the nearest
Harmonized World Soil Database 30 arc seconds resolution total soil carbon
content (approx. 1 km at the equator): the
authors have performed multiple comparisons of the global HWSD against other
products, however no pixel-scale uncertainties are provided. We chose to
assign an uncertainty factor of 2 on each site-scale HWSD soil carbon estimate. The
HWSD soil carbon values are 2.3 × 10⁴, 2.3 × 10⁴ and 5.2 × 10³ g C m⁻².
To limit our study to the use of globally spanning data sets, we extract
DALEC2 drivers from 0.125° × 0.125° ERA
interim meteorology (see Appendix A for details). The DALEC2 analyses for
each site are therefore completely independent from all site-level
measurements (we note, however, that extensive meteorological and biometric
data are meticulously recorded across the AmeriFlux site network). Therefore,
we produce a fully independent ecosystem carbon cycle analysis, which can be
evaluated against measured NEE at each flux-tower site.
As done for the synthetic experiments, we perform the MHMCMC approach at each
site – with and without EDCs – and label the posterior parameter ensembles
(4 chains × 500 x samples) as xaSTA
(standard AmeriFlux MDF) and xaEDC (AmeriFlux MDF +
EDCs). We compare the DALEC2 NEE analyses, DALEC2(xaEDC)
and DALEC2(xaSTA) against NEE measurements at each
AmeriFlux site.
EDC sensitivity test
To determine the sensitivity of our results to EDCs 1–12, we repeat MDF estimates of
xsEDC and xaEDC by imposing only one EDC at a time (henceforth xsEDC(n) and xaEDC(n), where n is the nth EDC). For the synthetic experiments, we determine the relative contribution of the nth EDC by quantifying the overall EDC associated error reduction (IEDC(n), see Eq. ) for each estimate of xsEDC(n).
Given the large computational cost of estimating xsEDC(n)
for each EDC (40 synthetic experiments × 12 EDCs × 4 chains),
we limit our sensitivity analysis to IEDC estimates based on 4 (out of 40) synthetic experiments.
We compare 3 yr integrated DALEC2 NEE estimates and AmeriFlux NEE measurements at all three sites (AmeriFlux NEE measurement temporal gaps have been consistently excluded from DALEC2 3 yr NEE estimates).
We determine the DALEC2 3 yr
NEE 50% confidence range (50% CR: 25th–75th percentile interval) reduction
as follows:
$$
\left(1 - \frac{R_{\mathrm{NEE,EDC}(n)}}{R_{\mathrm{NEE,STA}}}\right) \times 100\,\%,
$$
where RNEE,EDC(n) and RNEE,STA are the 50% CR of DALEC2(xaEDC(n)) and DALEC2(xaSTA) 3 yr NEE estimates. Similarly, we calculate the 3 yr NEE bias reduction (relative to AmeriFlux NEE measurements) as follows:
$$
\left(1 - \frac{|B_{\mathrm{NEE,EDC}(n)}|}{|B_{\mathrm{NEE,STA}}|}\right) \times 100\,\%,
$$
where BNEE,EDC(n) and BNEE,STA are the median biases of DALEC2(xaEDC(n)) and DALEC2(xaSTA) 3 yr NEE estimates.
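Both reduction metrics share the same form and can be sketched with the standard library. The function names are hypothetical, and the choice of the "inclusive" quantile method for the 25th and 75th percentiles is an assumption; the ensembles are lists of 3 yr integrated NEE values.

```python
import statistics


def confidence_range_50(values):
    """50 % CR: width of the 25th-75th percentile interval."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def percent_reduction(edc_value, sta_value):
    """(1 - EDC / STA) x 100 %, positive when the EDC value is smaller."""
    return (1.0 - edc_value / sta_value) * 100.0


def nee_cr_reduction(nee_edc, nee_sta):
    """Reduction in the 3 yr NEE 50 % CR of the EDC ensemble relative to STA."""
    return percent_reduction(confidence_range_50(nee_edc),
                             confidence_range_50(nee_sta))


def nee_bias_reduction(nee_edc, nee_sta, nee_measured):
    """Reduction in the absolute median bias relative to measured NEE."""
    bias_edc = statistics.median(nee_edc) - nee_measured
    bias_sta = statistics.median(nee_sta) - nee_measured
    return percent_reduction(abs(bias_edc), abs(bias_sta))


cr_example = nee_cr_reduction([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 4.0, 6.0, 8.0])
bias_example = nee_bias_reduction([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], nee_measured=1.0)
```

Negative values indicate that the EDC-constrained ensemble has a wider confidence range, or a larger absolute bias, than the standard ensemble.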
Results
Synthetic experiments
Table 2. Synthetic experiment parameter error reduction, and AmeriFlux experiment 3 yr NEE 50 % CR and bias reduction for MDF estimates using individual EDCs, relative to the standard MDF estimates. For the AmeriFlux experiments, each entry gives the NEE 50 % CR reduction followed by the bias reduction in parentheses.

| EDC | Synthetic experiment: parameter error reduction (IEDC(n))ᵃ | AmeriFlux US-Syvᵇ | AmeriFlux US-Ho1ᵇ | AmeriFlux US-MMSᵇ |
|---|---|---|---|---|
| 1 | -0% | 27% (-11%) | 19% (-13%) | 3% (-11%) |
| 2 | -1% | 39% (-26%) | 29% (-25%) | 14% (-19%) |
| 3 | 0% | 13% (-0%) | 1% (3%) | 0% (-7%) |
| 4 | -1% | 30% (-14%) | 22% (-14%) | 9% (-17%) |
| 5 | 8% | 3% (-3%) | 0% (-4%) | 1% (-11%) |
| 6 | 2% | 10% (3%) | -2% (6%) | -1% (-3%) |
| 7 | -13% | -15% (52%) | -28% (76%) | -25% (95%) |
| 8 | 3% | 34% (-36%) | 37% (-9%) | 16% (-66%) |
| 9 | 1% | -39% (89%) | -50% (57%) | -31% (100%) |
| 10 | 2% | 10% (19%) | 6% (25%) | 5% (18%) |
| 11 | -1% | 10% (-0%) | 1% (11%) | 3% (1%) |
| 12 | 2% | 8% (-1%) | 2% (0%) | 3% (-6%) |
| ALL EDCs | 34% | 43% (69%) | 48% (93%) | 32% (93%) |

ᵃ The parameter error reduction metric, IEDC(n), is described in Sect. 2.4.
ᵇ The derivations of 3 yr NEE 50 % CR and bias reductions are described in Sect. 2.6.
Aggregated parameter estimates xsSTA (standard
sampling, blue) and xsEDC (EDC sampling, red) from
deciduous and evergreen synthetic LAI and soil organic carbon observations –
these are compared against observation and EDC independent parameter samples
xsRAN (light grey). Normalized parameter space error (ϵ) probability density functions
for (a) Group A (directly inferable) parameters, (b) Group B (indirectly
inferable) parameters and (c) all DALEC2 parameters. ϵ values for
each parameter group were derived using Eq. (15). In panels (d) and (e) the
probability density functions of live biomass (foliar, labile, wood and roots) and dead
biomass (litter and soil carbon) biases against the synthetic truth
parameters s are shown for xsRAN,
xsSTA and xsEDC parameter estimates.
The inclusion of EDCs resulted in substantial error reductions in posterior
DALEC2 parameter and state variable estimates. We found an overall reduction
in the posterior MHMCMC EDC parameter vector errors
E(xsEDC), relative to both the standard MHMCMC
errors E(xsSTA) and the randomly sampled parameter
vector errors E(xsRAN): we found an improvement of
IEDC=34% associated with using EDCs
(Fig. c). For the directly constrained parameters
(parameter group A) we found similar distributions for both
E(xsSTA) and E(xsEDC)
errors relative to E(xsRAN) errors
(Fig. a), and similarly lower xsSTA and
xsEDC error values relative to xsRAN
errors (Ẽ(xsSTA)=0.19,
Ẽ(xsEDC)=0.21,
Ẽ(xsRAN)=0.42, group A: IEDC=-6%). For the indirectly constrained parameters (group B), we found
significantly smaller xsEDC errors relative to
xsSTA and xsRAN
(Ẽ(xsEDC)=0.29,
Ẽ(xsSTA)=0.34, and
Ẽ(xsRAN) = 0.38), and hence improved
estimates of s when we implemented EDCs (group B: IEDC=88%, Fig. b).
We found that EDCs 5 and 8 accounted for the largest error
reduction in DALEC2 parameter estimates (IEDC(5,8)≥3%, Table 2),
followed by EDCs 6, 10 and 12 (IEDC(6,10,12)=2%). EDC 7 led to
an overall parameter error increase (IEDC(7)=-13%). The remaining EDCs
accounted for small or negative error reductions.
Three-year mean DALEC2 net ecosystem exchange (NEE) biases (relative to synthetic truth)
aggregated across 40 synthetic experiments at 0.5 g C m⁻² d⁻¹ intervals.
The bias frequencies are shown for DALEC2(xsSTA) (standard sampling, blue)
and DALEC2(xsEDC) (EDC sampling, red) relative to the synthetic truth DALEC2(s) (black dashed line).
We compared total xsEDC, xsSTA and
xsRAN live biomass (Croo+Cfol+Clab+Cwoo) and dead biomass (Csom+Clit) pool biases relative to their corresponding synthetic truths
(Fig. d–e). For dead biomass, both
xsEDC and xsSTA perform comparably better
than xsRAN (Fig. e), as dead biomass is
mostly accounted for by the synthetic Csom observations: the
xsEDC and xsSTA median bias factors (1.1,
0.91) are close to 1 (i.e. a bias of zero) relative to
xsRAN median bias factor (0.04). For live biomass pools,
xsEDC live biomass bias estimates are smaller than
xsSTA (Fig. d): the
xsEDC bias distribution (median = 1.20) is closer to 1
relative to the xsSTA bias distribution (0.48), with
respect to xsRAN median bias (0.20). For total biomass
estimates, we found similar bias distributions relative to
xsRAN (xsEDC median bias factor = 1.22,
xsSTA bias factor = 0.98): both bias factors are closer to
1 relative to xsRAN (bias factor = 0.16).
We found that incorporating EDCs resulted in a reduced mode and 90 %
confidence range (90 % CR: 5th–95th percentile interval) for 3-year NEE
biases (Fig. ). We found a 65 % reduction in the
DALEC2(xsEDC) 3-year NEE bias 90 % CR
(9.0 g C m⁻² d⁻¹), relative to the
DALEC2(xsSTA) 3-year NEE bias 90 % CR
(26.9 g C m⁻² d⁻¹). The 3-year NEE bias modes for
DALEC2(xsEDC) and DALEC2(xsSTA) are
0.0 g C m⁻² d⁻¹ and -0.5 g C m⁻² d⁻¹ (at
0.5 g C m⁻² d⁻¹ intervals).
AmeriFlux results
The DALEC2(xaEDC) analyses outperformed the standard
DALEC2(xaSTA) analyses at the AmeriFlux tower sites. The
inclusion of EDCs in DALEC2 analyses amounted to overall NEE bias reductions
at all sites (US-Syv, US-Ho1, US-MMS, we henceforth present all site results
in this order). The aggregated DALEC2(xaEDC) median daily
NEE biases (-0.02, 0.13, -0.03 g C m⁻² d⁻¹) are closer
to the AmeriFlux measured NEE by roughly one order of magnitude in contrast
to DALEC2(xaSTA) median NEE biases (-0.52, -0.86,
-1.15 g C m⁻² d⁻¹). The aggregated daily
DALEC2(xaEDC) NEE 90 % confidence ranges at each site
(10.9, 10.1, 8.3 g C m⁻² d⁻¹) were all smaller
(53–87 % of the size) than the corresponding DALEC2(xaSTA) NEE
bias 90 % CR (20.3, 18.3, 9.5 g C m⁻² d⁻¹). The
reductions in bias are consistent across the 3-year comparison period at
each site (Fig. ).
DALEC2 daily NEE ensemble estimates at three AmeriFlux sites: Sylvania Wilderness (US-Syv, mixed
forest, top two rows), Howland Forest (US-Ho1, evergreen needleleaf, middle two rows), and Morgan Monroe
State Forest (US-MMS, deciduous broadleaf, bottom two rows). For each site the DALEC2(xaEDC)
and the DALEC2(xaSTA) ensemble confidence intervals are denoted as EDC and STA, respectively.
The DALEC2 analyses – based on MODIS LAI retrievals, HWSD soil organic carbon estimates and ERA interim meteorological drivers – are completely independent from all AmeriFlux site measurements.
Cumulative AmeriFlux NEE observations are compared against corresponding
DALEC2(xaSTA) and DALEC2(xaEDC) NEE
estimates (Fig. ); AmeriFlux NEE temporal gaps have
been omitted from both DALEC2 and AmeriFlux derived cumulative NEE time
series. DALEC2(xaEDC) integrated NEE estimates outperformed
DALEC2(xaSTA) NEE estimates at all three sites.
DALEC2(xaEDC) median NEE biases over the 3-year
period (-0.26, 0.07, 0.08 kg C m-2) are smaller than the
equivalent DALEC2(xaSTA) biases (-0.84, -1.09,
-1.18 kg C m-2), with relative EDC bias reductions of 69 %,
93 % and 93 %. The inclusion of EDCs also resulted in a reduction in
NEE confidence intervals: DALEC2(xaEDC) 50 % CR (1.17,
1.57, 1.16 kg C m-2) are 32–48 % smaller than the
corresponding DALEC2(xaSTA) 50 % CR (2.04, 3.00, 1.70
kg C m-2).
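The relative reductions quoted above can be approximately re-derived from the rounded site values in the text. The sketch below assumes a reduction is defined as one minus the ratio of absolute magnitudes; the helper name is hypothetical. Because the inputs are rounded to two decimals, the results may differ from the reported percentages by up to about one percentage point.

```python
def relative_reduction(reference, constrained):
    """Fractional reduction in magnitude of `constrained` relative to `reference`."""
    return 1.0 - abs(constrained) / abs(reference)


sites = ["US-Syv", "US-Ho1", "US-MMS"]

# Rounded 3-year values quoted in the text (kg C m-2).
bias_sta = [-0.84, -1.09, -1.18]
bias_edc = [-0.26, 0.07, 0.08]
cr50_sta = [2.04, 3.00, 1.70]
cr50_edc = [1.17, 1.57, 1.16]

bias_reduction = [relative_reduction(s, e) for s, e in zip(bias_sta, bias_edc)]
cr50_reduction = [relative_reduction(s, e) for s, e in zip(cr50_sta, cr50_edc)]

for site, b, c in zip(sites, bias_reduction, cr50_reduction):
    print(f"{site}: bias reduction {100 * b:.1f} %, 50 % CR reduction {100 * c:.1f} %")
```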
Based on DALEC2(xaEDC(n)) 3-year NEE estimates, EDC 10 resulted
in a ≥ 18 % bias reduction and a ≥ 5 % reduction in the 50 % CR at all three sites, relative to DALEC2(xaSTA) (Table 2).
EDCs 2 and 8 resulted in a > 10 % reduction in the 3-year NEE 50 % CR but an increase in 3-year NEE bias at all three sites (NEE bias reduction ≤ -22 %).
EDCs 7 and 9 resulted in a ≥ 50 % 3-year NEE bias reduction
but an increase in the 3-year NEE 50 % CR at all three sites (NEE 50 % CR reduction ≤ -15 %).
Three-year mean DALEC2 cumulative NEE (kg C m-2)
compared against cumulative measured NEE at three AmeriFlux sites: Sylvania
Wilderness (US-Syv, mixed forest, left), Howland Forest (US-Ho1, evergreen
needleleaf, middle) and Morgan Monroe State Forest (US-MMS, deciduous
broadleaf, right). The standard analysis median and 50 % confidence
ranges (CR) are shown in blue, and the corresponding analyses with EDCs are
shown in red. AmeriFlux NEE measurements are denoted as a black line. The
DALEC2 analyses – based on MODIS LAI retrievals, HWSD soil organic carbon
estimates and ERA-Interim meteorological drivers – are completely
independent of all AmeriFlux site measurements.
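The cumulative comparison omits AmeriFlux temporal gaps from both the measured and the modelled series. A minimal sketch of that masking step is given here, assuming gaps are marked as None in the observed record; the function name and the daily values are hypothetical and only illustrate the bookkeeping.

```python
def cumulative_with_gaps_omitted(observed, modelled):
    """Accumulate both series over the days on which an observation exists.

    `observed` uses None to mark a gap. Gap days are dropped from both
    series so that the two cumulative curves cover identical days.
    """
    if len(observed) != len(modelled):
        raise ValueError("series must have the same length")
    cum_obs, cum_mod = [], []
    total_obs = total_mod = 0.0
    for obs, mod in zip(observed, modelled):
        if obs is None:
            continue  # skip gap days in both series
        total_obs += obs
        total_mod += mod
        cum_obs.append(total_obs)
        cum_mod.append(total_mod)
    return cum_obs, cum_mod


# Hypothetical daily NEE values (g C m-2 d-1), for illustration only.
obs_nee = [-1.0, None, -2.0, 0.5, None]
mod_nee = [-0.5, -3.0, -1.5, 1.0, 2.0]
cum_obs, cum_mod = cumulative_with_gaps_omitted(obs_nee, mod_nee)
```

Dropping the same days from both series keeps the comparison like for like: a modelled flux on a day without a measurement would otherwise shift the modelled cumulative curve with no counterpart in the observations.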
Discussion
With the use of a simple model and globally available data, i.e. leaf area
dynamics and soil carbon observations, we have demonstrated that the EDC
approach provides an improved ability to infer the magnitude of carbon
fluxes, live carbon pools and model parameters, in comparison to a standard
parameter optimization approach (STA).
For ecologically relevant synthetic truths, EDCs provide improved estimates
of the DALEC2 parameters and state variables. The EDC approach resulted in
(a) parameter estimation error reductions, (b) NEE bias and confidence range
reductions, and (c) improved estimates of the live biomass C pools,
relative to the STA parameter, flux and C pool estimates. While
estimated parameter errors for directly inferable (Group A) parameters
differ little between the EDC and STA approaches, using EDCs led to a marked
reduction in estimated parameter error for indirectly inferable (Group B)
parameters. The indirectly inferred parameters include allocation fractions,
subsurface pools and turnover rates, which are typically difficult to observe
at field sites and virtually impossible to observe remotely (i.e. at regional
scales).
By comparing DALEC2 analyses against independent AmeriFlux NEE measurements
over real ecosystems, we further validated the advantages of using EDCs. At
each AmeriFlux site, we found that EDCs led to increased confidence and
a substantially reduced NEE bias; our DALEC2 model analyses suggest that the use
of EDCs regionally and globally could significantly enhance our ability to
estimate ecosystem state variables in the absence of direct observational
constraints. In light of the large differences between Earth system models,
we anticipate that EDCs may help
constrain ecosystem carbon terms on global scales, where carbon pools and
their residence times are typically difficult or impossible to measure.
Together, EDCs 1–12 lead to overall improvements
in parameter estimates and AmeriFlux site NEE confidence range/bias (Table 2);
however, with the exception of EDC 10,
individual EDCs did not lead to comprehensive improvements when tested on their own.
For example, EDC 8 alone (no rapid exponential pool decay) resulted in
large AmeriFlux site NEE confidence range reductions, as well as improved synthetic parameter
estimates; however, EDC 8 resulted in higher AmeriFlux site NEE biases.
Conversely, EDC 9 (steady-state proximity of the soil carbon pool) resulted in
the largest AmeriFlux site bias reductions, while NEE confidence was lower.
EDC 5 (comparable fine root and foliar/labile allocation) led to the largest parameter improvements;
however, the associated changes in AmeriFlux site NEE estimates were relatively small.
Our findings demonstrate that robust improvements in carbon cycling parameter and state variable estimates only arise when EDCs are used collectively.
Here we developed a group of EDCs suitable for ecosystems with no recent major
disturbance. However, we note that our EDCs can be adapted for a wider range
of ecosystem dynamics. For example, recently disturbed ecosystems may be (a)
rapidly recovering and (b) growing towards a steady state where carbon pools
are more than one order of magnitude larger than the initial carbon pools.
Therefore a subset of our EDCs (EDCs 7–12) can be adapted to better
represent ecological “common sense” in recovering ecosystems.
Ultimately, EDCs can be adapted to best represent ecological knowledge in
a variety of ecosystem carbon model MDF applications, where the ecosystem
observations are insufficient to constrain all model state variables (e.g.
Fox et al., 2009). For example, on regional and global spatial scales, there
is often no explicit knowledge on various model parameter values and their
associated uncertainty. In such cases, our EDC approach imposes
inter-parameter constraints while simultaneously allowing a global parameter
exploration across several orders of magnitude (see Table 1). Hence EDCs
allow us to incorporate ecologically consistent relationships between
parameters (i.e. allocation ratios, turnover ratios), without the need to
constrain otherwise unknown parameters and state variables. Moreover, as an
alternative to imposing plant-functional-type priors, which risk being
subjective and over-rigid, ecosystem trait inter-relationships derived from
plant trait data could be incorporated
as additional EDCs.
Given quantitative knowledge of parameter inter-relationships,
we also note that a prior parameter variance–covariance structure
can be used as an alternative or complement to EDCs to constrain the model state and parameters.
Finally, we note that our choice of EDCs is open to
adaptation and adjustment: we maintained relatively broad constraints (e.g.
EDC 6 permissible root:foliar C range > one order of magnitude),
which can likely be refined through further study.
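To illustrate how inter-parameter constraints of this kind might be expressed, the sketch below treats EDCs as a binary accept/reject prior over a parameter set. This is an assumption made for illustration: the parameter names, the two checks and the factor-of-5 threshold are hypothetical and are not the definitions of EDCs 1–12 used in this study.

```python
def passes_edcs(params, max_ratio=5.0):
    """Return True if a parameter set satisfies two illustrative constraints.

    `params` is a dict with hypothetical keys:
      f_fol, f_roo  - allocation fractions to foliage and fine roots
      t_fol, t_som  - turnover rates (per day) of foliage and soil organic matter
    """
    # Allocation to foliage and fine roots should be comparable.
    ratio = params["f_fol"] / params["f_roo"]
    if not (1.0 / max_ratio <= ratio <= max_ratio):
        return False
    # Soil organic matter should turn over more slowly than foliage.
    if params["t_som"] >= params["t_fol"]:
        return False
    return True


def edc_prior(params):
    """Binary prior: 1 for ecologically consistent parameter sets, otherwise 0."""
    return 1.0 if passes_edcs(params) else 0.0


# Hypothetical parameter sets, for illustration only.
consistent = {"f_fol": 0.3, "f_roo": 0.2, "t_fol": 1e-2, "t_som": 1e-4}
inconsistent = {"f_fol": 0.5, "f_roo": 0.01, "t_fol": 1e-2, "t_som": 1e-4}
```

Each check constrains a ratio or an ordering rather than an absolute value, so individual parameters remain free to vary over several orders of magnitude while implausible combinations are rejected.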
In this study we limited our observational constraints to globally spanning
MODIS LAI retrievals and the HWSD soil map.
Given these two data sets, we have demonstrated that EDCs lead to improved model
parameter estimates and reduced NEE bias and confidence ranges. Nonetheless,
based on the posterior NEE probability density function,
we are unable to determine whether sites are net carbon sinks or sources on
annual timescales.
However, an increasing number of
continental- and global-scale biospheric data sets are becoming available:
these include a global canopy height map, pan-tropical
biomass maps and a pan-boreal carbon
density map. These products can potentially be used in
conjunction with MODIS LAI, HWSD data and our EDC approach in an MDF framework
to better constrain terrestrial carbon cycle dynamics.