Evaluation of wetland methane emissions across North America using atmospheric data and inverse modeling

Existing estimates of methane (CH 4) fluxes from North American wetlands vary widely in both magnitude and distribution. In light of these differences, this study uses atmospheric CH 4 observations from the US and Canada to analyze seven different bottom-up, wetland CH 4 estimates reported in a recent model comparison project. We first use synthetic data to explore whether wetland CH 4 fluxes are detectable at atmospheric observation sites. We find that the observation network can detect aggregate wetland fluxes from both eastern and western Canada but generally not from the US. Based upon these results, we then use real data and inverse modeling results to analyze the magnitude, seasonality, and spatial distribution of each model estimate. The magnitude of Canadian fluxes in many models is larger than indicated by atmospheric observations. Many models predict a seasonality that is narrower than implied by inverse modeling results, possibly indicating an over-sensitivity to air or soil temperatures. The LPJ-Bern and SDGVM models have a geographic distribution that is most consistent with atmospheric observations, depending upon the region and season. These models utilize land cover maps or dynamic modeling to estimate wetland coverage while most other models rely primarily on remote sensing inundation data.


Introduction
CH 4 fluxes from wetlands play a critical role in global climate change. CH 4 is the second most important long-lived greenhouse gas, and the radiative forcing of the current atmospheric burden is approximately 26 % of that of carbon dioxide (Butler, 2014). Wetlands are possibly the largest single source of this gas to the atmosphere and account for roughly 30 % of global emissions (Kirschke et al., 2013).
Despite the important role of wetland CH 4 fluxes in climate change, existing estimates of this source differ on the magnitude, seasonality, and spatial distribution of fluxes, from regional to global scales. In fact, a recent global model comparison project named WETCHIMP (Wetland and Wetland CH 4 Intercomparison of Models Project) found large differences among existing CH 4 wetland models (Fig. 1; Melton et al., 2013; Wania et al., 2013). For example, existing estimates of maximum global wetland coverage differ by about a factor of 6, from 4.1 × 10^6 to 26.9 × 10^6 km^2. Furthermore, estimates of global natural wetland fluxes range from 92 to 264 Tg CH 4 yr^-1. The relative magnitude of these uncertainties increases at sub-global spatial scales; CH 4 estimates for Canada's Hudson Bay Lowlands (HBL) range from 0.2 to 11.3 Tg CH 4 yr^-1. These disagreements in current CH 4 estimates do not bode well for scientists' abilities to accurately predict future changes in wetland fluxes due to climate change (e.g., Melton et al., 2013).
A number of studies have used chamber measurements of CH 4 to parameterize or evaluate biogeochemical CH 4 models (e.g., Livingston and Hutchinson, 2009). These measurements usually encompass fluxes from a relatively small area, and fluxes can often vary greatly with landscape heterogeneity at these spatial scales (Waddington and Roulet, 1996; Hendriks et al., 2010). CH 4 data collected in the atmosphere observe the cumulative effect of CH 4 fluxes across a broader region (e.g., Winderlich et al., 2010; Pickett-Heaps et al., 2011; Bruhwiler et al., 2014; Miller et al., 2014). Hence, atmospheric data can provide a unique tool for evaluating existing CH 4 flux estimates across different countries or continents.
The present study compares the WETCHIMP CH 4 flux estimates against atmospheric CH 4 data and inverse modeling results from 2007-2008 through two sets of analyses. First, we construct a set of synthetic data experiments to understand whether the atmospheric CH 4 observation network can detect CH 4 fluxes from wetlands. We also explore the factors that might prevent the network from detecting wetland fluxes. To answer these questions, we utilize a model selection procedure based upon the Bayesian information criterion (BIC; Sect. 2.2; Shiga et al., 2014; Fang et al., 2014; Fang and Michalak, 2015). This procedure determines whether wetland fluxes from different regions and seasons are necessary to describe variability in synthetic atmospheric CH 4 observations. Based on these synthetic experiments, we conduct a second set of analyses using real atmospheric data and inverse modeling results. We use these data to analyze the magnitude, seasonal cycle, and spatial distribution of each WETCHIMP CH 4 estimate. We investigate these questions over the US and Canada, using CH 4 data collected from towers and regular aircraft flights operated by NOAA and its partners and from towers operated by Environment Canada.

Methods
This section first describes the atmospheric CH 4 data and the atmospheric model that allows direct comparison between the data and various flux estimates. Subsequent sections describe both the synthetic and real data experiments outlined in the introduction (Sect. 1).

Data and atmospheric model
The present study utilizes atmospheric CH 4 observations from aircraft and tower platforms across the US and Canada, a total of 14 703 observations from 2007-2008. These observation sites include 4 towers operated by Environment Canada and 10 towers in the US operated by NOAA and its partners. Observations at the NOAA towers consist of daily (occasionally weekly) flasks, and observations at the Environment Canada sites are continuous measurements. As in Miller et al. (2014), we use afternoon averages of these continuous data. In addition to these towers, we utilize observations from 17 regular NOAA aircraft monitoring locations in the US and Canada (Fig. 2). We incorporate aircraft data up to 2500 m altitude, as was done in Miller et al. (2013). Observations above that height are usually representative of the free troposphere with limited sensitivity to surface fluxes. These observations and the associated model runs (described below) are the same as those used in Miller et al. (2013) and Miller et al. (2014).
We then employ an atmospheric transport model to relate CH 4 fluxes at the Earth's surface to atmospheric concentrations at the observation sites. The modeling approach here combines the Weather Research and Forecasting (WRF) meteorological model and a particle-following model known as STILT, the Stochastic Time-Inverted Lagrangian Transport model (e.g., Lin et al., 2003; Nehrkorn et al., 2010; Hegarty et al., 2013). WRF-STILT generates a set of footprints; these footprints quantitatively estimate the sensitivity of each observation to fluxes at each surface location (with units of ppb per unit surface flux). We multiply the footprints by a flux model and add this product to an estimate of the "background" concentration, i.e., the CH 4 concentration of air entering the North American regional domain. We estimate this background concentration using CH 4 observations collected near or over the Pacific Ocean and in the high Arctic, a setup described in detail by Miller et al. (2013) and Miller et al. (2014). The resulting modeled concentrations can be compared directly against atmospheric CH 4 observations. The observations, WRF-STILT runs, background concentrations, and uncertainties in the modeling framework are described in greater detail in the Supplement, Miller et al. (2013), and Miller et al. (2014).
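The footprint calculation described above amounts to a matrix-vector product plus a background term. The following is a minimal sketch with toy numbers; the function name `modeled_concentration` and all values are illustrative assumptions, not part of the WRF-STILT software or the study's data.

```python
import numpy as np

def modeled_concentration(footprints, fluxes, background):
    """Return modeled CH4 (ppb): background plus footprint-weighted fluxes.

    footprints : (n_obs, n_cells) sensitivities, ppb per unit surface flux
    fluxes     : (n_cells,) surface flux for each grid cell / time step
    background : (n_obs,) concentration of air entering the domain, ppb
    """
    footprints = np.asarray(footprints, dtype=float)
    fluxes = np.asarray(fluxes, dtype=float)
    background = np.asarray(background, dtype=float)
    # Each observation sees the sum of (sensitivity x flux) over all cells.
    return background + footprints @ fluxes

# Toy inputs (hypothetical, for illustration only).
H = np.array([[0.5, 0.0, 1.0],
              [0.0, 2.0, 0.5]])
flux = np.array([10.0, 4.0, 2.0])
bg = np.array([1850.0, 1860.0])
conc = modeled_concentration(H, flux, bg)
```

The output has one value per observation and can be compared directly against measured concentrations.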
We can then estimate atmospheric concentrations using fluxes from the WETCHIMP project (Fig. 1) and compare those estimates against atmospheric observations. The WETCHIMP project was designed to compare simulated wetland distributions and modeled CH 4 fluxes at multi-year, continental scales (Melton et al., 2013; Wania et al., 2013). The project entailed several sets of model runs, but Melton et al. (2013) primarily reported on one set of runs: runs for 1901-2009 that used the same observed climate and CO 2 concentration data sets to force all models. Each CH 4 model utilized its own parameterization for wetland area and distribution. We use the outputs from this set of model runs in the present study. Of the WETCHIMP models, seven provide a flux estimate on a suitable time step for boreal North America and six provide an estimate for temperate North America. These models include CLM4Me (Riley et al., 2011), DLEM (Tian et al., 2010), LPJ-Bern (Spahni et al., 2011), LPJ-WHyMe (Wania et al., 2010), LPJ-WSL (Hodson et al., 2011), ORCHIDEE (Ringeval et al., 2010), and SDGVM (Singarayer et al., 2011). All flux model outputs used from the WETCHIMP study have a temporal resolution of 1 month, and we regrid all outputs to a spatial resolution of 1° lat. by 1° long. (the resolution of the WRF-STILT footprints). These models are described in Melton et al. (2013), Wania et al. (2013), and the Supplement.
[Figure 2 caption, partially recovered: ... (Pan et al., 2010). Larger dots indicate tower and aircraft sites with regular observations over the 2-year period (Andrews et al., 2014). The grey background delineates the four regions used in the synthetic data experiments (Sect. 2.2).]

Synthetic data experiments
We assess the ability of the CH 4 observation network to detect wetland fluxes using a model selection framework adapted from the BIC. A model selection framework can sort through a large number of potential explanatory variables and will choose the smallest set of variables that best describe the data set of interest (e.g., Ramsey and Schafer, 2012). We use a form of the BIC that has been adapted for use within a geostatistical inverse modeling framework. This setup has previously been used to select either bottom-up models or environmental drivers of CO 2 and CH 4 fluxes (e.g., Mueller et al., 2010; Yadav et al., 2010; Gourdji et al., 2012; Miller et al., 2013, 2014; Shiga et al., 2014; Fang et al., 2014; Fang and Michalak, 2015). The implementation here mirrors that of Fang et al. (2014), Shiga et al. (2014), and Fang and Michalak (2015):
\[ \mathrm{BIC} = \ln|\Psi| + (z - HX\beta)^{T} \Psi^{-1} (z - HX\beta) + p \ln(n) \tag{1} \]
The first two terms in Eq. (1) are the negative log-likelihood, a measure of how well the model fits the data. The last term penalizes a particular model based upon the number of explanatory variables (p). The best combination or candidate model has the lowest BIC score.
The variable z (n × 1) represents the observations minus background concentrations and H (n × m) the footprints (where m refers to the total number of flux or emissions grid boxes in both space and time). These variables are based upon two existing inverse modeling studies by Miller et al. (2013, 2014) (refer to the Supplement). The matrix X (m × p) contains p explanatory variables. In the current setup, X can include a wetland flux estimate and/or individual emissions sources from an anthropogenic inventory. β (p × 1) is a set of coefficients that scale the variables in X. We set these coefficients to 1 in the synthetic data experiments. As a result, the model selection framework cannot reproduce wetland fluxes by simply upscaling anthropogenic emissions sources that might have a similar distribution to wetlands. Lastly, Ψ (n × n) is a covariance matrix derived from an atmospheric inversion framework. This covariance matrix represents errors in atmospheric transport and in the measurements, collectively referred to as model-data mismatch. This matrix also represents uncertainties in the prior flux estimate. In a geostatistical inverse model, this prior flux model is given by Xβ (refer to the Supplement for more detail).
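As a rough illustration of how the BIC score could be evaluated from the quantities defined above, the sketch below assumes that Eq. (1) has the form ln|Ψ| + (z − HXβ)^T Ψ^-1 (z − HXβ) + p ln(n), with Ψ naming the n × n covariance matrix. That form and the function name `bic_score` are assumptions made for this example, not a transcription of the authors' code.

```python
import numpy as np

def bic_score(z, H, X, beta, Psi):
    """Evaluate an assumed BIC: log-determinant + weighted misfit + penalty.

    z    : (n,)   observations minus background
    H    : (n, m) footprints
    X    : (m, p) explanatory variables (flux maps)
    beta : (p,)   scaling coefficients
    Psi  : (n, n) symmetric positive-definite covariance matrix
    """
    n = z.shape[0]
    p = X.shape[1]
    resid = z - H @ (X @ beta)
    sign, logdet = np.linalg.slogdet(Psi)
    if sign <= 0:
        raise ValueError("Psi must be positive definite")
    # Solve a linear system instead of forming the explicit inverse.
    misfit = resid @ np.linalg.solve(Psi, resid)
    return logdet + misfit + p * np.log(n)
```

With a perfect fit and an identity covariance, the first two terms vanish and the score reduces to the penalty p ln(n).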
The first experiments described here use synthetic atmospheric CH 4 data. We generate the synthetic data using one of the WETCHIMP models and the anthropogenic emissions estimates from Miller et al. (2013, 2014). We then multiply these fluxes by the footprints (H) and add error that is randomly generated from the covariance matrix (Ψ).
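One common way to draw errors with a prescribed covariance is through a Cholesky factor. The sketch below generates synthetic observations under that assumption; the function name `make_synthetic_obs`, the symbol `Psi` for the covariance matrix, and the Gaussian error model are illustrative choices rather than details confirmed by the text.

```python
import numpy as np

def make_synthetic_obs(H, X, beta, Psi, rng):
    """Return synthetic observations: H X beta plus correlated random error.

    rng is a numpy.random.Generator, so repeated calls give new realizations.
    """
    mean = H @ (X @ beta)
    # If Psi = L L^T, then L @ (standard normal draws) has covariance Psi.
    L = np.linalg.cholesky(Psi)
    noise = L @ rng.standard_normal(mean.shape[0])
    return mean + noise
```

Passing the same generator to successive calls yields independent realizations, which is what a repeated-trial experiment needs.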
Before generating the synthetic data, we scale the annual HBL CH 4 budget in each WETCHIMP model to match the overall magnitude estimated by several top-down studies (Pickett-Heaps et al., 2011; Miller et al., 2014; Wecht et al., 2014). If we did not downscale the magnitude of the WETCHIMP models, the wetland fluxes would be a much larger source relative to anthropogenic emissions and to modeling and/or measurement errors. The synthetic data experiments would then identify wetlands too easily, would understate the relative role of model and/or measurement errors, and would not be representative of the real atmospheric CH 4 observations.
We divide the WETCHIMP wetland fluxes into four regions (Fig. 2) and four seasons (DJF, MAM, JJA, and SON). The model selection framework then chooses variables that are necessary to reproduce the synthetic data, variables that include EDGAR and the 16 wetland flux maps. The penalty term in Eq. (1) increases as we add wetland flux maps or add EDGAR to the X matrix. Each variable added to X will increase the penalty term by ln(n); an additional variable must improve the log-likelihood by more than this penalty term to be chosen by model selection.
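The selection step can be illustrated with an exhaustive search over candidate subsets, keeping the subset with the lowest score. This is a toy sketch under several assumptions: coefficients are fixed to 1, the covariance matrix (called `Psi` here) is held fixed across candidates, the score has the form ln|Ψ| + misfit + p ln(n), and the search is brute force, which may differ from the search strategy used in the study. The name `select_model` is hypothetical.

```python
import itertools
import numpy as np

def select_model(z, H, candidates, Psi):
    """Return (best_subset, best_score) over all subsets of candidate maps.

    candidates : (m, k) matrix whose columns are candidate flux maps.
    Coefficients are fixed to 1, so a subset's prediction is the sum of
    its columns transported through the footprints H.
    """
    n = z.shape[0]
    k = candidates.shape[1]
    _, logdet = np.linalg.slogdet(Psi)
    Psi_inv = np.linalg.inv(Psi)
    HX = H @ candidates  # contribution of each map at the observation sites
    best_subset, best_score = None, np.inf
    for p in range(k + 1):
        for subset in itertools.combinations(range(k), p):
            resid = z - HX[:, list(subset)].sum(axis=1)
            # Each added variable costs ln(n) in the penalty term.
            score = logdet + resid @ Psi_inv @ resid + p * np.log(n)
            if score < best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score

# Toy case (hypothetical): maps 0 and 2 generate the data, map 1 does not.
rng = np.random.default_rng(0)
H_toy = rng.uniform(0.5, 1.5, size=(30, 3))
maps = 5.0 * np.eye(3)
z_toy = H_toy @ maps[:, [0, 2]].sum(axis=1)
chosen, score = select_model(z_toy, H_toy, maps, np.eye(30))
```

In this toy case a map is retained only if it reduces the weighted misfit by more than ln(n), mirroring the trade-off described above.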
We then run this framework 1000 times, generating new synthetic data each time, and calculate the percentage of all trials in which the model selection chooses a wetland model. The 1000 repeats are needed due to the random or stochastic nature of the synthetic data experiment; the results of the model selection can vary slightly, depending on the particular random errors that we generate based upon the covariance matrix (Ψ). This procedure ensures that the model selection results are not the output of a single realization. We then report on how frequently each of the 16 wetland flux maps is chosen by the BIC-based model selection. If a wetland flux map is chosen with high frequency, then that map is necessary to describe variability in the synthetic CH 4 observations, and the synthetic observation network can detect aggregate wetland CH 4 fluxes from the given region and season. This setup mirrors that of Shiga et al. (2014), who employed a model selection framework to explore the detectability of anthropogenic CO 2 emissions.
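The repeated-trial logic can be sketched for a single wetland map as follows. This simplified example assumes Gaussian errors drawn from a fixed covariance matrix (called `Psi` here), treats the non-wetland source as always included, and counts a trial as a detection when adding the wetland map lowers the weighted misfit by more than ln(n). The function name `detection_frequency` and these simplifications are assumptions for illustration.

```python
import numpy as np

def detection_frequency(H, wetland, other, Psi, n_trials=1000, seed=0):
    """Fraction of synthetic trials in which the wetland map is selected."""
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    L = np.linalg.cholesky(Psi)
    Psi_inv = np.linalg.inv(Psi)
    signal_wet = H @ wetland
    signal_other = H @ other
    hits = 0
    for _ in range(n_trials):
        # New random error realization for every trial.
        z = signal_wet + signal_other + L @ rng.standard_normal(n)
        r_without = z - signal_other
        r_with = r_without - signal_wet
        gain = r_without @ Psi_inv @ r_without - r_with @ Psi_inv @ r_with
        if gain > np.log(n):  # improvement must exceed the added penalty
            hits += 1
    return hits / n_trials
```

A frequency near 1 suggests the signal stands well above the simulated errors, while a low frequency suggests the signal is obscured by them.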
We also explore why the synthetic CH 4 observations may not be able to detect wetland fluxes. We run a series of case studies and in each case remove a different confounding factor that might prevent the network from detecting wetland CH 4 fluxes. In one case, we remove anthropogenic emissions. In subsequent cases, we remove model-data mismatch errors and/or prior flux errors. In each case, we rerun the model selection experiment and examine whether the results improve when each of these confounding factors is removed.

Real data experiments
This paper subsequently compares the spatial distribution, magnitude, and seasonality of each WETCHIMP estimate against real atmospheric CH 4 observations, using the synthetic experiments to guide the analysis.
We first explore the spatial distribution of the WETCHIMP flux estimates. We modify the model selection setup in Sect. 2.2 to focus on the spatial distribution of each estimate using a procedure developed by Fang et al. (2014) and Fang and Michalak (2015). Instead of fixing the coefficients (β) to 1, we estimate the coefficients using real atmospheric CH 4 observations. We also include an intercept term that can vary by month; the intercept for each month is represented by a vector of ones in the matrix X, and this intercept is included as part of each candidate model for X. We then run model selection using real observations. As a result of this setup, a wetland model is not necessary to reproduce either the magnitude or seasonality of the atmospheric CH 4 data; the model selection framework can simply scale the intercept term or scale EDGAR to reproduce the magnitude or seasonality of the observations. The spatial distribution of wetland fluxes, however, can only come from a wetland model. The model selection procedure will only choose a wetland model if the spatial distribution of that model describes sufficient additional variability in the observations (e.g., Fang et al., 2014).
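To make the role of the monthly intercept concrete, the sketch below builds an X matrix with one indicator column per month plus one wetland column, then estimates β by generalized least squares. The estimator shown, β = (A^T Ψ^-1 A)^-1 A^T Ψ^-1 z with A = HX, is a standard choice assumed here for illustration, as are the names `build_X`, `estimate_beta`, and `Psi`.

```python
import numpy as np

def build_X(month_of_cell, n_months, wetland_map):
    """Stack monthly intercept columns and one wetland flux column.

    month_of_cell : (m,) integer month index of each space-time grid box
    wetland_map   : (m,) wetland flux estimate for each grid box
    """
    month_of_cell = np.asarray(month_of_cell)
    # One column of ones per month, zero elsewhere.
    intercepts = (month_of_cell[:, None] == np.arange(n_months)[None, :])
    return np.column_stack([intercepts.astype(float),
                            np.asarray(wetland_map, dtype=float)])

def estimate_beta(z, H, X, Psi):
    """Generalized least-squares estimate of the scaling coefficients."""
    A = H @ X
    Psi_inv_A = np.linalg.solve(Psi, A)
    # Psi is symmetric, so (Psi^-1 A)^T z equals A^T Psi^-1 z.
    return np.linalg.solve(A.T @ Psi_inv_A, Psi_inv_A.T @ z)
```

Because the intercept columns are spatially uniform within a month, only the wetland column can contribute spatial structure, which is the property the selection step exploits.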
Model selection can therefore indicate which WETCHIMP models have the best spatial distribution relative to the atmospheric observations; any WETCHIMP model chosen by model selection has a spatial distribution that improves model-data fit, and the model improves that fit more than the penalty term in Eq. (1). A negative result does not necessarily indicate that a WETCHIMP model has a poor spatial distribution. In that case, the observations may not be very sensitive to the spatial distribution of fluxes for the given region or given season. Similarly, the spatial distribution in a WETCHIMP model may improve model-data fit but not by more than the penalty term in Eq. (1). By contrast, a positive result indicates that a WETCHIMP model likely has a particularly good spatial distribution. As in Sect. 2.2, we divide the wetland fluxes into four sub-continental regions and four seasons. The Supplement describes this setup in greater detail.
We then analyze the magnitude and seasonality of the WETCHIMP fluxes using a number of model-data time series. We model CH 4 concentrations at a number of US and Canadian observation sites using the WRF-STILT model, the WETCHIMP estimates, and the EDGAR v4.2FT2010 emissions inventory (Olivier and Janssens-Maenhout, 2012; European Commission, Joint Research Centre, JRC). We average the observations and model output at the monthly scale and then compare the magnitude of these model estimates for each month against the averaged observations.
We further compare the seasonality of existing bottom-up models against the seasonality of a recent inverse modeling estimate by Miller et al. (2014). We plot the monthly budgets for both the bottom-up models and the inversion estimate, and we plot the monthly CH 4 budget as a fraction of the annual total.
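Expressing each monthly budget as a fraction of the annual total removes differences in overall magnitude, so that only the shape of the seasonal cycle is compared. A minimal sketch with made-up monthly values (not results from any model in the study) is:

```python
import numpy as np

def monthly_fraction(monthly_budget):
    """Return each month's share of the annual total."""
    monthly_budget = np.asarray(monthly_budget, dtype=float)
    total = monthly_budget.sum()
    if total == 0:
        raise ValueError("annual total is zero; fractions are undefined")
    return monthly_budget / total

# Hypothetical monthly budgets (January to December), arbitrary units.
toy_budget = [0.0, 0.0, 0.0, 0.1, 0.3, 0.5, 0.6, 0.5, 0.3, 0.1, 0.0, 0.0]
fractions = monthly_fraction(toy_budget)
```

Two estimates with very different annual totals can then be placed on the same axis to compare the width of their seasonal cycles.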
Note that inter-annual variability in existing CH 4 flux models is small relative to the differences among these models; as a result, conclusions from the 2-year study period (2007-2008) likely hold for other years. For example, the inter-annual variability in the total US and Canadian budget is ±7.3-9.7 % (standard deviation), depending upon the model in question (note that LPJ-Bern has even larger inter-annual variation due to an issue with model spinup described in Wania et al., 2013).

Results and discussion: synthetic experiments
The synthetic experiments presented here explore the limits of existing atmospheric data for constraining wetland fluxes. If atmospheric observations are to constrain wetland CH 4 fluxes, those observations must be able to detect wetland CH 4 fluxes above errors in the transport model and above other emissions sources such as fossil fuels and agriculture.
The four columns in Fig. 3a display the results from an individual season in each of four geographic regions. In this experiment, the synthetic CH 4 observations can detect aggregate wetland CH 4 fluxes from eastern Canadian wetlands in greater than 75 % of all trials for the summer and fall seasons. In the eastern US, the model selection framework chooses a wetland model in 25-50 % of all trials in multiple seasons. By contrast, the synthetic CH 4 data are least sensitive to wetland fluxes in the western US, and the model selection framework chooses wetland fluxes from that region in fewer than 25 % of all trials irrespective of the season. That result may be due, in part, to the scant wetlands and sparse atmospheric observations in much of the west.
The results also vary by season. Of any region, the atmospheric CH 4 network is best able to constrain fluxes across multiple seasons in eastern Canada. The largest wetland fluxes in the WETCHIMP models are in Ontario and Quebec (Fig. 1). It is therefore unsurprising that the network is best able to detect wetland fluxes in that region, even though there are relatively few observation sites in the area. In other regions, the atmospheric CH 4 network is less sensitive to wetlands during the winter, fall, and spring shoulder seasons.
We run several additional model selection experiments to explore why the synthetic observations may not always be able to detect wetland CH 4 fluxes (Fig. 3b-e). We remove anthropogenic emissions from the synthetic data set for the experiment in Fig. 3b. We remove all model-data mismatch errors in Fig. 3c; model-data mismatch encompasses errors in atmospheric transport and in the measurements. Subsequently, we remove all errors due to the prior flux estimate in Fig. 3d. In Fig. 3e, we remove both types of errors. In each case, we rerun the model selection experiment to see if the sensitivity of the atmospheric CH 4 network to wetland fluxes improves. Anthropogenic emissions have only a modest effect on the results in specific regions and seasons. In experiment b (Fig. 3b) without anthropogenic emissions, the results improve by ∼ 25-50 % in the fall and spring shoulder seasons for several geographic regions.
By contrast, the model-data mismatch and prior flux errors have a much larger effect on the model selection results. The results improve incrementally across many regions and seasons when we remove model-data mismatch errors in experiment c. The results improve across the spring, summer, and fall seasons and improve across all four geographic regions. However, the magnitude of this improvement is never more than 25 %. Model-data mismatch errors are likely dominated by errors in modeled atmospheric transport. These results imply that transport errors play an incremental yet pervasive role in the utility of the atmospheric observations.
The prior flux errors have the largest effect on the results, particularly during the warmest seasons. In experiment d, the results show great improvement during fall, spring, and summer and show little improvement during winter or in the western US. In the setup here, the prior flux uncertainties scale with the seasonal magnitude of the fluxes. When we remove the prior flux errors, the results concomitantly show the greatest improvement in seasons that have larger overall CH 4 fluxes. These results indicate that the prior estimate greatly impacts the utility of the atmospheric CH 4 observations. A geostatistical inverse model can leverage any combination of land surface maps, meteorological maps, and/or anthropogenic inventory estimates in the inversion prior. These maps or estimates are incorporated into the X matrix in Eq. (1). If accurate maps or estimates are not available, then the prior flux errors will be large, and the model selection framework will be less likely to choose any particular variable. If these maps or estimates have high explanatory power, then the prior flux errors will be small, and the model selection framework will be more likely to choose any one variable. As a result, the detectability of wetland CH 4 fluxes partly depends on the availability of land surface or meteorological data that match those fluxes. The atmospheric network can differentiate wetland CH 4 fluxes from other CH 4 sources better when accurate prior information can guide that identification.
Experiment e (no model-data mismatch errors and no errors in the prior flux estimate) shows large, ubiquitous improvements; the model selection chooses a wetland model 100 % of the time in almost all regions and seasons. The results for eastern Canada during winter are the exception. In winter, the wetland model cannot always explain enough variability in the synthetic observations to overcome the BIC penalty term in Eq. (1).
The density of the atmospheric CH 4 network may also play a role in these results. Wetlands in the eastern US are sparse relative to eastern Canada, but the higher density of observations in the eastern US may contribute to a moderate success rate (25-50 %) for that region. Recent and planned network expansions in the eastern US and in Canada could play a key role in future efforts to constrain wetland fluxes across these regions.
Overall, the synthetic experiment results indicate that the observation network cannot detect wetland fluxes from the US (i.e., model selection has a success rate < 50 %).Across Canada, the results are more promising (i.e., near 100 % success rate in some regions and/or seasons), despite the relative sparsity of the observation network there.

Spatial distribution of the fluxes
We compare the spatial distribution of the WETCHIMP flux estimates against CH 4 data from the atmospheric observation network. To this end, we use a version of the model selection framework that chooses wetland models based upon their spatial distribution (Fang et al., 2014; Fang and Michalak, 2015). WETCHIMP models that are chosen by the framework [...]
The results of this model selection analysis are displayed in Table 1. This table lists the regions and seasons that had a success rate > 50 % in the synthetic data experiment; the atmospheric CH 4 network is most sensitive to wetland CH 4 fluxes in those regions and seasons. Two of the WETCHIMP models were chosen by the model selection framework: LPJ-Bern (in eastern Canada) and SDGVM (in eastern and western Canada). The spatial distribution of each of these models improves the model-data fit by more than the penalty term in Eq. (1).
The LPJ-Bern and SDGVM models have several unique spatial characteristics that could explain these results. Over eastern Canada, LPJ-Bern and SDGVM concentrate the large fluxes in the HBL. Other models, by contrast, often distribute the fluxes more broadly across Ontario and Quebec or put the largest fluxes in Ontario outside of the HBL. In western Canada, SDGVM distributes fluxes across northern, boreal Saskatchewan and Alberta.
The LPJ-Bern and SDGVM models share another common characteristic: both model wetland area independently instead of relying solely on remote sensing inundation data sets. LPJ-WSL, ORCHIDEE, DLEM, and CLM4Me use remote sensing inundation data sets like GIEMS (Global Inundation Extent from Multi-Satellites, Prigent et al., 2007) to construct a wetland map. Other models, like LPJ-Bern and LPJ-WHyMe, also use land cover maps and/or land surveys to estimate wetland (or at least CH 4 -producing) area. SDGVM estimates this area dynamically as a function of soil moisture (Melton et al., 2013; Wania et al., 2013). Wetland maps generated using these different approaches show substantial differences. Remote sensing data sets estimate relatively high levels of inundation in regions of Canada that are not forested or have many small lakes (see further discussion in Melton et al., 2013; Bohn et al., 2015). By contrast, modeling approaches that dynamically generate wetland area or use land cover maps assign more wetlands over regions with high water tables but little surface water as seen by remote-sensing-based inundation data sets. As a result of these differences, models like LPJ-Bern assign more wetlands and CH 4 fluxes in the HBL relative to other regions of eastern Canada.
Of note, LPJ-Bern and LPJ-WHyMe have many structural model similarities but predict relatively different spatial distributions of CH 4 fluxes. The latter estimates fluxes that are more broadly distributed across Quebec and Labrador. LPJ-WHyMe only simulates fluxes from high-latitude peatlands and uses an estimated peatland distribution from Tarnocai et al. (2009); this distribution extends across Quebec and Labrador. LPJ-Bern, by contrast, includes fluxes from non-peatland regions and applies a smaller scaling factor to peatland fluxes relative to LPJ-WHyMe (Wania et al., 2013). As a result, the fluxes in LPJ-Bern have a spatial distribution that is different from the peatland map and also different from LPJ-WHyMe.

Flux magnitude
Next, we compare the magnitude of predicted concentrations using the WETCHIMP models against atmospheric observations at individual locations. Unlike previous sections that utilized model selection, this section employs several model-data time series, displayed in Fig. 4. The model estimates in Fig. 4 consist of several components: the background (in green) is the estimated background concentration of CH 4 in clean air before entering the model domain as in Miller et al. (2013, 2014). The estimated contribution of anthropogenic emissions from EDGAR v4.2FT2010 is added to this background (in red). The contribution of wetland fluxes from the WETCHIMP models is then added to the previous inputs, and the sum of all components (blue lines) can be compared directly against measured concentrations.
The various WETCHIMP flux estimates produce very different modeled concentrations at the atmospheric observation sites (Fig. 4). Overall, modeled concentrations with the WETCHIMP fluxes usually exceed the CH 4 measurements during summer. At Chibougamau, Fraserdale, and Park Falls in early summer, all seven WETCHIMP models predict CH 4 concentrations that equal or exceed the observations. The ORCHIDEE, LPJ-WHyMe, and LPJ-Bern models always exceed the measurements during summer while DLEM and SDGVM match the observations better at these sites. Notably, a number of previous studies report that the EDGAR inventory may underestimate US anthropogenic CH 4 emissions (e.g., Kort et al., 2008; Miller et al., 2013; Wecht et al., 2014; Turner et al., 2015). If EDGAR underestimates emissions, then the WETCHIMP models would be an even larger overestimate relative to the atmospheric data.
Many models appear to overestimate the magnitude of fluxes across boreal North America, but this result does not necessarily imply that these models have underestimated fluxes elsewhere in the world. CH 4 models that estimate the largest fluxes across boreal North America do not always compensate with smaller fluxes in other regions of the globe. For example, the ORCHIDEE model not only estimates large fluxes over North America but also estimates higher fluxes over the tropics than any other model (Melton et al., 2013).

Seasonal cycle
Bottom-up CH 4 flux estimates show variable features when compared to atmospheric observations, and the seasonal cycle of these estimates is no exception. Figure 5 compares the seasonal cycle of the existing estimates over Canada's HBL. Eastern Canada is one of the largest wetland regions in North America (Fig. 1), and nearby atmospheric observation sites see a much larger CH 4 enhancement from wetlands relative to other regions (Figs. 4 and S4).
In this region, the bottom-up estimates diverge on the seasonal cycle of fluxes. Most estimates predict peak fluxes in July or August, though two variations of the LPJ model, LPJ-WHyMe and LPJ-Bern, predict seasonal peaks in September and October, respectively. LPJ-WHyMe is a module inside of LPJ-Bern, a possible explanation for the similar seasonal cycle in these two models. Differences among models are also notable during the fall and spring seasons. For example, fluxes in June account for anywhere between 6 and 21 % of the annual CH 4 budget, depending upon the model. Fluxes in October account for between 1 and 23 % of the annual budget (Fig. 5b).
Figure 5 also displays the seasonality of an inverse modeling estimate from Miller et al. (2014). Differences between this inverse modeling estimate and the WETCHIMP models often exceed the 95 % confidence interval of the inverse model. The WETCHIMP estimates are often comparable to Miller et al. (2014) in magnitude during fall and spring months but exceed the inverse modeling estimate in summer months (Fig. 5a). On the whole, the WETCHIMP models have a narrower relative seasonal cycle than the inverse modeling estimate (Fig. 5b). That estimate assigns a greater portion of the annual budget to the fall and spring shoulder seasons.
Additional top-down studies exist for the HBL, but those studies use a seasonal cycle drawn from an existing bottom-up model and do not estimate the seasonal cycle independently from CH 4 observations (Pickett-Heaps et al., 2011; Wecht et al., 2014; Turner et al., 2015). By comparison, a recent inverse modeling study of the western Siberian lowlands found parallel results for that region: existing models also predict a seasonal cycle that is narrower than the seasonality implied by atmospheric observations (Winderlich, 2012; Bohn et al., 2015).
Numerous possible explanations could underly differences in the seasonal cycle of CH 4 fluxes.For example, the temperature threshold for CH 4 production may be too high in some models.Relative to summer months, the bottom-up models predict small fluxes during fall and/or spring months when air temperatures are near freezing but soils are still unfrozen (Fig. S3 in the Supplement).According to estimates from the North American Regional Reanalysis (NARR, Mesinger et al., 2006), surface soils in the HBL (0 and 10 cm depth) begin to thaw in April and are largely unfrozen in May (Fig. S3).In the fall, surface soils (0 cm depth) begin to freeze in November, but deeper soils (10 and 40 cm) remain largely unfrozen until December.Compared to the bottom-up models, the inverse modeling estimate predicts a wider seasonal window, a result that would be consistent with dates of deep soil freeze and thaw.

Conclusions
A recent model comparison study revealed wide differences among several estimates of wetland CH 4 fluxes. This study uses atmospheric data and inverse modeling to evaluate those differences across North America. In the first component of this study, we use a synthetic data experiment to understand whether the atmospheric observation network can detect wetland CH 4 fluxes. We find that the network can reliably identify aggregate wetland fluxes from both eastern and western Canada. The network can detect wetland fluxes from the eastern US in a smaller fraction of trials and rarely from the western US. This analysis also accounts for distracting signals in the atmosphere from anthropogenic sources or simulated atmospheric transport errors.
In a second component of the study, we analyze each bottom-up CH 4 model from the WETCHIMP study using real atmospheric data. We find that the LPJ-Bern and SDGVM models have spatial distributions that are most consistent with atmospheric observations, depending upon the region and season of interest. In addition, almost all models overestimate the magnitude of wetland CH 4 fluxes when compared against atmospheric data at individual observation sites. The model ensemble may also estimate a seasonal cycle for eastern Canada that is too narrow (i.e., place too much of the total annual flux in the summer relative to the fall and spring shoulder seasons).
The results of this paper suggest possible pathways to improve future top-down estimates of wetland CH 4 fluxes. The ability of the atmospheric observation network to detect wetland fluxes depends largely upon the prior flux model. In a geostatistical inverse model, this model can incorporate land surface maps: wetland maps, estimates of land surface processes, and maps of anthropogenic emissions sources. This information plays a large role in whether atmospheric observations can detect wetland fluxes; the observations can more adeptly identify wetland fluxes when accurate land surface maps are available to guide that identification. By contrast, atmospheric transport and measurement errors (i.e., model-data mismatch errors) have a ubiquitous but smaller effect on the utility of atmospheric CH 4 observations.
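For reference, the roles of the land surface maps and the model-data mismatch errors can be seen in the cost function that a geostatistical inverse model commonly minimizes. The notation below follows the general geostatistical inversion literature and is an assumption here, since this section does not define symbols; it is a sketch of the usual form, not necessarily the exact setup used in this study.

```latex
L_{\mathbf{s},\boldsymbol{\beta}} =
  \frac{1}{2}\,(\mathbf{z} - \mathbf{H}\mathbf{s})^{T}\,\mathbf{R}^{-1}\,(\mathbf{z} - \mathbf{H}\mathbf{s})
  + \frac{1}{2}\,(\mathbf{s} - \mathbf{X}\boldsymbol{\beta})^{T}\,\mathbf{Q}^{-1}\,(\mathbf{s} - \mathbf{X}\boldsymbol{\beta})
```

In this form, z holds the atmospheric observations, s the unknown fluxes, and H the atmospheric transport operator. The columns of X hold the land surface maps (for example, a wetland map or an anthropogenic emissions inventory), and the coefficients in β scale those maps. R is the covariance of the model-data mismatch errors, and Q is the covariance of the flux residuals that the maps do not explain.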
The results presented here also hold a number of suggestions for future bottom-up modeling efforts:
1. Spatial distribution: bottom-up estimates that use surface water inundation as the only proxy for wetland area do not perform as well relative to atmospheric observations. Bottom-up models that use satellite inundation data should incorporate additional tools like wetland mapping or dynamic modeling to capture wetlands covered by vegetation.
2. Magnitude: existing top-down studies that use a diverse array of in situ and satellite CH 4 observations show good agreement on the magnitude of CH 4 fluxes from the Hudson Bay Lowlands region (e.g., Pickett-Heaps et al., 2011; Miller et al., 2014; Wecht et al., 2014; Turner et al., 2015). These studies could be used to calibrate the magnitude of future bottom-up estimates, at least over the HBL where CH 4 observations provide a strong constraint on wetland fluxes.
3. Seasonal cycle: bottom-up models do not show consensus on the seasonal cycle of wetland fluxes across Canada. Few top-down studies estimate the seasonal cycle independently using atmospheric observations. Additional top-down studies would indicate the range of seasonal cycle estimates that are consistent with atmospheric observations, particularly studies that use a diverse set of atmospheric models and/or diverse observational data sets. These efforts could help reconcile differences in the seasonal cycle among bottom-up models and between bottom-up models and the few existing top-down studies.
These steps will hopefully lead to better convergence among wetland CH 4 estimates for North America.
The Supplement related to this article is available online at doi:10.5194/bg-13-1329-2016-supplement.

Figure 1. Mean of the annual methane fluxes estimated by the WETCHIMP models (a) and the range of fluxes estimated by the ensemble (b). Note that the range in estimates is larger than the mean. The fluxes shown above are the average flux per m 2 of land area, not per m 2 of wetland area.

Figure 3. This figure displays the results of the synthetic data experiments. These experiments examine whether the observation network can detect aggregate wetland CH 4 fluxes. The figure shows the percentage of trials that are successful. Darker shades indicate that the network is more sensitive to wetland fluxes in the given region and season. Panel (a) shows the results for the standard setup while panels (b-e) show the results of several test cases in which anthropogenic emissions or different errors are set to zero.

Figure 4. These time series compare atmospheric methane measurements at several observation sites against model estimates using the WETCHIMP ensemble and the EDGAR v4.2FT2010 anthropogenic emissions inventory. Refer to Fig. S4 for model-data time series at additional sites, particularly sites that are distant from large wetlands.

Figure 5. The seasonal cycle in methane fluxes estimated for the HBL (50-60° N, 75-96° W). We include both the WETCHIMP estimates and an inverse modeling estimate from Miller et al. (2014). Panel (a) displays the monthly budget from each estimate while (b) displays each month as a percentage of the annual budget estimated by a given model.

Table 1. Spatial flux patterns chosen by the model selection framework.