Data-based estimates of the ocean carbon sink variability – first results of the Surface Ocean p CO 2 Mapping intercomparison ( SOCOM )

Using measurements of the surface-ocean CO2 partial pressure (pCO2) and 14 different pCO2 mapping methods recently collated by the Surface Ocean pCO2 Mapping intercomparison (SOCOM) initiative, variations in regional and global sea–air CO2 fluxes are investigated. Though the available mapping methods use widely different approaches, we find relatively consistent estimates of regional pCO2 seasonality, in line with previous estimates. In terms of interannual variability (IAV), all mapping methods estimate the largest variations to occur in the eastern equatorial Pacific. Despite considerable spread in the detailed variations, mapping methods that fit the data more closely also tend to agree more closely with each other in regional averages. Encouragingly, this includes mapping methods belonging to complementary types – taking variability either directly from the pCO2 data or indirectly from driver data via regression. From a weighted ensemble average, we find an IAV amplitude of the global sea–air CO2 flux of 0.31 PgC yr (standard deviation over 1992–2009), which is larger than simulated by biogeochemical process models. From a decadal perspective, the global ocean CO2 uptake is estimated to have gradually increased since about 2000, with little decadal change prior to that. The weighted mean net global ocean CO2 sink estimated by the SOCOM ensemble is−1.75 PgC yr (1992–2009), consistent within uncertainties with estimates from ocean-interior carbon data or atmospheric oxygen trends. Published by Copernicus Publications on behalf of the European Geosciences Union. 7252 C. Rödenbeck et al.: An ensemble of pCO2-based sea–air CO2 flux estimates


Introduction
The global ocean acts as a major sink for anthropogenic carbon, and thereby helps to slow down the human-induced warming of the Earth's climate (Stocker et al., 2013).Presently, approximately 27 % of the annually emitted carbon is taken up by the ocean (Le Quéré et al., 2015); in total 30 % of the anthropogenic carbon emitted since the industrialization of our planet has been stored by the ocean (Sabine et al., 2004;Khatiwala et al., 2013).Thus, variations in the oceanic carbon sink, in particular a possible decline under climate change, co-determine the future climate trajectory.In addition to this direct relevance, present-day variations in the sea-air CO 2 exchange, when related to possible driving factors, can be employed to provide information on the underlying mechanisms of ocean biogeochemistry.
Until recently, estimates of the oceanic CO 2 uptake rate and its variability were largely based on (1) ocean biogeochemical process models (see e.g.Wanninkhof et al., 2013), (2) inverse estimates based on atmospheric CO 2 data (see Peylin et al., 2013), or (3) inverse estimates based on oceaninterior carbon data (Gloor et al., 2003, and subsequent refinements).However, while process models are useful tools to study the sensitivity of carbon fluxes to the physical and biogeochemical mechanisms that control them, they are not specifically designed for state estimation and thus have large uncertainties if used in this way (Wanninkhof et al., 2013).Likewise, while atmospheric CO 2 inversions are able to provide estimates of land-air CO 2 exchange on large scales, their sea-air CO 2 flux estimates suffer from large relative errors over most of the ocean due to the dominance of land variability in the atmospheric signals (Peylin et al., 2013).Finally, while ocean-interior inversions offer a strong databased constraint on the long-term flux in larger regions, they do not provide flux variability or finer spatial detail.
A more direct quantification of the sea-air CO 2 flux is possible using measurements of the oceanic and atmospheric partial pressures of CO 2 (pCO 2 ) in conjunction with a parameterization of the gas transfer across the sea-air interface.Through extensive concerted community efforts, more than 10 million surface ocean pCO 2 measurements were gathered and recently compiled into the SOCATv2 (Surface Ocean CO 2 Atlas version 2, Bakker et al., 2014) and LDEOv2013 (Lamont-Doherty Earth Observatory version 2013, Takahashi et al., 2014a) databases.
Although pCO 2 data have thus been available in nearly all ocean basins for several decades, observations from ships or fixed sensors can necessarily only cover a tiny fraction of the spatio-temporal pCO 2 field of the global surface ocean.Therefore, to obtain continuous sea-air CO 2 flux fields over larger areas or the entire ocean, interpolation (gap-filling) methods are needed to estimate values in all the periods and areas not directly observed.Various methods have been proposed to interpolate pCO 2 data in space and time (Appendix A).They span a wide range of approaches, in par- In well-constrained areas/periods, they closely follow the signals contained in the data, while in areas/periods far from neighbouring data points, they remain essentially unconstrained.Regression methods establish a quantitative relation between pCO 2 and a set of external variables assumed to capture the major modes of spatio-temporal variability.Adjustable degrees of freedom are constant in time and within certain spatial regions, such that data gaps can be filled according to the spatio-temporal structure in the external variables; however, variability not contained in any of the chosen external data sets cannot be reproduced.Non-linear regression methods (feed-forward neural networks, self-organizing maps) essentially do not impose any structure onto this relation between pCO 2 and the drivers.(Multi-)linear regression imposes a linear relationship, thereby restricting the type of responses but ensuring a unique and mathematically well-defined solution.Finally, knowledge of biogeochemical processes can be brought to bear by regression of pCO 2 against fields simulated by a biogeochemical process model, or by tuning initial conditions or parameters in such a model simulation to match the observations.However, this relies heavily on the structure of the process simulation to be correct.
ticular with respect to the information sources tapped and assumptions imposed.Due to that, some methods are able to reproduce the signals in the data more closely, while others are able to bridge the data-void areas/periods more effectively (Fig. 1).These complementary characteristics of the various approaches to some degree reflect differing targets of the individual studies.Correspondingly, their strengths and weaknesses can be expected to vary depending on the given purpose.However, this complementarity offers a great opportunity for robustness assessment, as the existence of common features in the results of mapping methods based on different principles gives strong support to the estimates.In periods or areas without data, this is the only available way to assess uncertainties.Furthermore, we can investigate the information content of the various data streams used by some methods and not used by others.It is the primary objective of the Sur-face Ocean pCO 2 Mapping intercomparison (SOCOM) initiative to foster such inter-method investigations.SOCOM is not meant to rank methods but to exploit the added value of their complementarity.Ultimately it aims to identify which features of the surface-ocean pCO 2 field (and consequently the sea-air CO 2 flux) can be robustly inferred from the available surface-ocean carbon data, and to provide quantitative estimates for these features, including an uncertainty assessment.These sea-air CO 2 flux estimates based on surfaceocean carbon data are then available to feed into comprehensive carbon cycle syntheses like the REgional Carbon Cycle Assessment and Processes (RECCAP) activity of the Global Carbon Project (http://www.globalcarbonproject.org/ reccap/), which until recently mainly had to rely on model simulations for variability.
This paper first introduces the ensemble of data-driven pCO 2 mapping methods currently available in the SOCOM initiative (Sect.2), and gives an overview of the estimated seasonality and interannual variability (IAV) in oceanic "biomes" (Sect.4.1).As some of these pCO 2 data-driven methods have been used to assess interannual variations of global sea-air CO 2 fluxes in recent carbon budgets by the Global Carbon Project (GCP; Le Quéré et al., 2015), we then specifically analyse the interannual variations in the sea-air CO 2 fluxes.Focus is put on the consistency between regressing and non-regressing methods, and on the amplitude of the interannual sea-air CO 2 flux variability (Sect.4.2).

Mapping methods
This section provides an overview of the principles of the various mapping approaches and the range of particular choices taken within each method class.
Details on the individual mapping methods (referenced by labels in italics) are given in Appendix A and the references cited there.Essential properties and technical parameters are summarized in Tables 1-3.In particular, Table 3 gives the spatial and temporal coherence scales of the adjustable degrees of freedom, determining the balance between the ability of a method to bridge data gaps and its ability to directly follow the observed signals (see table footnote e).Table 4 indicates which modes of pCO 2 variability are, by construction of the individual methods, estimated from the information contained in the pCO 2 data (rather than prescribed or determined in other ways).For a summary of method classes, see Fig. 1 and its caption.

Statistical interpolation
Statistical interpolation schemes fit the data to suitable auto-regressive models.The applied auto-correlation scales have either been determined from the pCO 2 data themselves (UEA-SI, OceanFlux-SI), chosen to reflect data density (Jena-MLS), or derived from empirical orthogonal func-tion (EOF) analysis of an ensemble of process model simulations (CU-SCSE).The interpolation is either done directly for the pCO 2 field (UEA-SI, OceanFlux-SI, CU-SCSE) or indirectly for the field of ocean-internal carbon sources and sinks determining the pCO 2 field (Jena-MLS).
In most statistical interpolation schemes, those pixels/time steps that are neither directly constrained by co-located data nor indirectly constrained by sufficiently close data (within the spatial or temporal correlation scales) fall back to some "background state" or "prior", namely, the estimated mean seasonality and estimated trend (UEA-SI), parametrized temperature-related variations (Jena-MLS), or a prescribed climatology plus a prescribed linear trend (CU-SCSE).The ordinary block kriging used in OceanFlux-SI does not use a priori data values and interpolates the data to any distance, though the estimation uncertainty increases with interpolation distance.

Linear regression
(Multi-)linear regression (AOML-EMP, UEx-MLR, JMA-MLR) expresses pCO 2 as a linear combination of a set of one or more driving variables (such as sea surface temperature (SST), sea surface salinity (SSS), mixed-layer depth (MLD), chlorophyll a, etc.), and adjusts their multipliers so as to best match the pCO 2 observations.The calculation is done separately for each of a set of spatio-temporal domains.Individual implementations differ in the set of chosen driver variables, as well as in the choice of spatio-temporal domains over which the same adjustable multipliers are used.

Non-linear regression
The forms of the non-linear regression technique currently applied to map the sea surface pCO 2 are self-organizing maps (SOM; NIES-SOM) and feed-forward networks (FFN; NIES-NN, CARBONES-NN), as well as combinations of SOM and FFN (ETH-SOMFFN) or SOM and linear regression (UNSW-SOMLO).
-Self-organizing maps (SOM) project (multidimensional) driver variables to a two-dimensional discrete space of clusters ("neuron cells").Observed pCO 2 values are then assigned to the clusters according to their associated driver variable values.With this information, spatio-temporal pCO 2 maps are created by finding neuron cells with similar driver variable values for any given location/time step, and using the associated pCO 2 value there.
-Feed-forward networks (FFN) establish a non-linear statistical relationship between a set of driver variables and the pCO 2 observations (training), and apply this relationship to continuous fields of the driver variables to create a continuous pCO 2 map (prediction).
a Even if designated "global", most methods exclude some coastal areas or the Arctic, or treat coastal areas as open ocean.
As for linear regression, the individual implementations differ in the set of chosen physical or biogeochemical driver variables (SST, SSS, MLD, Chl a, etc.).Different choices have also been made concerning spatialization: while some implementations use independent neural networks within predefined spatial or spatio-temporal regions, others use one global network but add spatial or temporal coordinate variables to the set of drivers.Non-linear regression methods have the advantage over linear regressions that they can flexibly represent a wide class of pCO 2 -driver relationships.On the other hand, FFNs involve the risk that the non-linear extrapolation into datasparse regions might become unstable and produce outliers.SOMs avoid this risk, though instead their discrete output may contain spatial discontinuities.

Model-based regression and tuning
Although biogeochemical simulation models can successfully be tuned to reproduce WOCE-era transient tracer inventories (Matsumoto et al., 2004), this does not ensure skill in simulating trends and interannual variability, as tuning itself can in some instances merely be compensating for improper process representation or insufficient parameterizations.Data assimilation or non-linear inverse modelling efforts such as ECCO have been demonstrated to improve the representation of the evolving physical state of the ocean (Wunsch et al., 2009).Although promising, the incorporation of biogeochemistry into a consistent assimilation or inversion framework is still in the early stages of development.Within the methods collated here, biogeochemical ocean process models have been used in the following ways.
-Modelled pCO 2 fields have been split into different timescales (seasonality, interannual variations) and scaled so as to optimally match the pCO 2 data (PU-MCMC).
-Boundary conditions and initial fields of dissolved inorganic carbon (DIC) are tuned during the model run itself so as to optimally match the pCO 2 data (NIES-OTTM).
3 Analysis methods

Ensemble collection
The pCO 2 fields estimated by the various methods were regridded by each provider to a resolution of 1 • latitude ×1 • longitude and monthly time steps, preferably by averaging (if the original resolution is higher) or sub-sampling (if the original resolution is lower).Also, a sea mask (map of covered ocean area, possibly fractional) was requested from each provider.All subsequent processing was done by common scripts.

Spatial gap filling
Most methods do not cover the entire ocean surface (see Fig. A6).In particular, coastal areas or the Arctic are excluded in many methods.Some methods depending on satellite-derived chlorophyll a input data exclude some highlatitude areas during the dark season.OceanFlux-SI misses all locations/months where the satellite-derived SST input data are invalid.UEx-MLR has occasional invalid pixels for numerical reasons.These invalid pixels would pose severe problems to the ensemble analysis because (1) spatial averages (Sect.3.3) would not extend over the same area, causing spurious differences between the methods, and (2) the calculated sea-air CO 2 fluxes (Sect.3.6) would miss parts of the ocean.Restricting the comparison to the common ocean surface would only partially solve (1) and not solve (2).
Therefore, we filled any pixels in the pCO 2 maps that are not covered by the considered mapping method (according to its sea mask or its value being outside 0 < p CO 2 < 10 6 µatm) but are ocean (according to bathymetry taken from the ETOPO surface elevation data; US Department Commerce, 2006) by a common standard pCO 2 field.This standard field is the sum of the monthly climatology by Takahashi et al. (2014b) plus the year-to-year atmospheric pCO 2 increase (the year-to-year atmospheric pCO 2 increase is derived from observed atmospheric CO 2 mixing ratios by the Jena CO 2 inversion s85_v3.5 (as in Rödenbeck et al., 2013); we use a 12-month running mean of the atmospheric pCO 2 minus its mean in 2005, the year of the  (2014b) climatology).The filled pixels do not change the results strongly compared to signal size.

Biome averages
In this overview of the ensemble of mapping methods, we consider time series of pCO 2 averaged over the 17 biomes of Fay and McKinley (2014, Fig. 2, Table 5).We use the timeindependent "mean biomes", such that no spurious common variability can be induced from changing averaging domains.These biomes were chosen as they were derived from coherence in sea surface temperature (SST), spring/summer chlorophyll a concentrations (Chl a), ice fraction, and maximum mixed-layer depth, and thus may reflect areas of rela-  5 for biome names.
To filter for interannual variations (IAV), we consider 12month running means.

Time periods
Results are plotted over the respective valid period of each method.Statistical analyses are restricted to the 1992-2009 period, when results of most mapping methods are available, and when the data coverage is relatively good (this refers in particular to the equatorial Pacific).

Mismatch time series
As a first-order performance diagnostic, we compare the mapping results to the monthly observed values in the SOCATv2 gridded product (Sabine et al., 2013;Bakker et al., 2014) (unweighted averages -variable FCO2_AVE_UNWTD of file SO-CAT_tracks_gridded_monthly_v2.nc).We look at mapminus-data differences averaged over biomes, or over biomes and years.These biome or biome/year averages are taken only over those pixels/months that are covered by data, and with at least 400 m water depth to avoid coastal data (these coastal data may otherwise dominate the diagnostics as the methods do not take the special environment along the coasts into account).Spatial averages are further restricted to the valid area of each method; this may slightly favour methods with less surface coverage, because fewer data pixels are then included in the mismatch.
In addition to averaged map-data differences, we also consider time series of corresponding selective averages of the pCO 2 maps themselves sampled at the data locations/times.

The relative IAV mismatch R iav
As an overall measure of the mismatch between a given mapping product and the data with respect to interannual variations in a given biome, we use the amplitude of the average difference between the map and the comparison data.(1) Averages of the map-data difference are taken over biomes and years, restricted to data-covered open-ocean pixels/months as described in Sect.3.5.1.(2) A mismatch amplitude M iav is calculated as the temporal standard deviation of these biome/yearly average differences over the 1992-2009 analysis period (if a method does not cover all of this analysis period, statistics are calculated for a correspondingly shorter period (Table 2), despite the slight inconsistency due to IAV).
(3) To be able to put these mismatch amplitudes M iav into perspective, we similarly determine the mismatch amplitude M iav benchmark of "benchmark" fields where any oceanic IAV has been removed.The benchmark maps have been created from the mean seasonal cycle of the respective original maps.As the missing pCO 2 increase would cause unduly large mismatches between the benchmark and the data, we added the year-to-year atmospheric pCO 2 increase, which is suitable as it has negligible interannual variations compared to oceanic pCO 2 ; we use the same atmospheric increase based on atmospheric CO 2 data as used to fill invalid pixels (Sect.3.2).( 4) We then obtain a relative IAV mismatch for the given method and biome as It states by how much an estimate fits the data better due to its interannual variations, compared to a state of "no knowledge" about IAV.Alternatively, Eq. ( 1) can be seen as a normalization of the IAV mismatch to signal size: as the benchmark fields do not contain any IAV, their mismatch amplitudes M iav benchmark reflect the IAV in the data (influences of variations in data density will affect M iav and M iav benchmark in similar ways).Calculating the benchmark from each product's own seasonal cycle ensures a criterion comparable between the mapping methods (though the seasonal cycles are quite similar for all methods anyway; see Sect.4.1.1below).
It is difficult to decide which R iav values can be regarded as sufficient for IAV to be represented in a given map.For this paper, we present all IAV results that manage to stay below 75 %.This is an ample threshold, but in the light of possible ambiguities in the R iav calculation we prefer it over a stricter selection.To nevertheless make the likely range visible, we de-weight results with higher R iav by smaller line thickness in all time series plots.
To verify that the selection criterion is not unduely biased by the fact that some methods use SOCAT data and others use LDEO data (Table 3), IAV mismatch diagnostics have also been calculated from the LDEOv2013 database (Takahashi et al., 2014a, monthly binned), which is used as a data source by some mapping methods.LDEOv2013 shares large parts of data points with SOCATv2.Mismatch values are slightly different depending on the database, but are qualitatively consistent.

The relative monthly mismatch R month
An overall measure of mismatch on the monthly timescale is calculated analogously to Sect.3.5.2,except that monthly mismatches are used rather than yearly averaged mismatches, and that the benchmark is the year-to-year atmospheric increase without any seasonality.Thus this measure is mainly sensitive to the seasonal cycle as the largest monthto-month feature.

Sea-air flux calculation
Sea-air CO 2 flux fields f have been calculated from the pCO 2 fields by with piston velocity k (employing the widely used quadratic dependence on wind speed as in Wanninkhof (1992) but scaled globally according to Naegler (2009), and reduced to 10 % over ice as in Takahashi et al., 2009), water density , CO 2 solubility L, and atmospheric CO 2 partial pressure pCO 2 atm .The values of these auxiliary fields have been calculated from various data sets (e.g.NCEP wind speeds from Kalnay et al., 1996, OAFlux SSTs and ice cover from Yu and Weller, 2007) as in Rödenbeck et al. (2013,  the uncertainties in the flux parameterization do not enter the comparison considered here. As for pCO 2 , we consider the flux averaged over biomes or the global ocean.Interannual flux variations are again calculated as 12-month running means.Their amplitude A iav is measured as a temporal standard deviation of the yearly flux over the 1992-2009 analysis period.From the amplitudes A iav i of the individual mapping methods, we calculate an ensemble mean inversely weighted by the relative IAV mismatches R iav i (for methods with R iav i < 75 %): (3) Methods not covering the full analysis period are discarded in this average as there would be significant spurious changes in the amplitude if any of the El Niño anomalies in 1992 or 1997 was not included.

Results and discussion
We first provide an overview of the estimated seasonal and interannual variations in oceanic biomes (Sect.4.1), and the ability to estimate them from pCO 2 data and available mapping methods.We then discuss interannual variations in the sea-air CO 2 flux in more detail (Sect.4.2).

Seasonality
As an introductory example, we first consider surface ocean pCO 2 averaged over the North Atlantic Subtropical Permanently Stratified biome, which belongs to the relatively wellobserved regions and shows a pronounced seasonal cycle in pCO 2 (Schuster et al., 2013).Figure 3a shows monthly pCO 2 time series from the whole ensemble.For clarity of details, three arbitrary years have been selected.The results of the mapping methods generally agree with each other in terms of the mean and the seasonal cycle to within about 10 µatm. Figure 3b compares the mapping results to the SOCATv2 monthly gridded observations.To this end, mapping results have been averaged only over those locations/times where SOCATv2 comparison data exist.As these are the locations/times where (most of) the estimates are directly constrained, the mapping results generally follow the data closely, and the ensemble spread is often smaller than in panel a.In some months (e.g.September 2003 or July 2004) these selective averages deviate considerably from the whole-biome average, likely reflecting spatial sampling biases in the presence of spatial pCO 2 gradients.In such months, the ensemble spread tends to be higher than in months less affected by sampling biases.To objectively compare our results to the in situ data, we calculate the average difference between the mapped pCO 2 (at the data location) and the SOCATv2 monthly gridded values (panel c).In general, differences of the monthly values lie within about ±10 µatm.NIES-OTTM deviates farther, likely because this approach is strongly determined by the modelled seasonal cycle and thus does not follow the data more closely.
Biogeosciences, 12, 7251-7278, 2015 www.biogeosciences.net/12/7251/2015/Time series for the complete set of biomes are given in the Appendix.In terms of seasonality, the mapping methods show similar phasing and amplitude in almost all extratropical biomes (Fig. A1), with few exceptions mainly in the North Atlantic Subpolar Seasonally Stratified biome and the Southern Ocean.The spread in the North Atlantic is somewhat surprising given the relatively good data coverage.Possibly this area has larger spatial heterogeneity not adequately represented by (some of) the methods.NIES-OTTM shows a seasonal cycle opposite to the other methods, a behaviour present in many biogeochemical process models at high latitudes (Valsala and Maksyutov, 2010;Schuster et al., 2013).Methods agree on smaller seasonal amplitude in the tropics, though substantial differences in amplitude and phase exist.

Interannual variability (East Pacific Equatorial biome)
Interannual variability is exemplified by the East Pacific Equatorial biome, which is also relatively well observed, and features large coherent interannual variations in pCO 2 associated with the ENSO cycle (e.g.Feely et al., 1999 show different finer-scale features.Despite this biome-wide difference, averages at data-constrained pixels only (Fig. 4b) mostly are much more consistent between methods.This is expected as this selective average excludes all the gap-filled pixels where values naturally depend much more on the applied mapping method.Most strikingly, in the data-poor periods up to 1988, regression and interpolation methods (as far as they cover these periods) strongly differ in the wholebiome average (panel a), while they more closely agree at the data-covered pixels (panel b).This illustrates that the statistical interpolation methods solely rely on the pCO 2 data constraint, while regression methods bridge data gaps as their variability originates from the driver data that are available throughout time.In the more data-rich periods (since about 1992 in this biome), interpolation and regression methods do agree in many features even in the whole-biome average (panel a).Due to the complementary origin of the variabil-ity in these method classes (Fig. 1), this agreement confirms that, at least in this biome, (1) sufficient interannual information is contained in the available pCO 2 observations (in the more densely sampled period), and (2) the signals provided through the driver data of the regression methods largely capture the essential modes of interannual pCO 2 variability.Note that the selective average over data-covered pixels (panel b) also leads to temporal features very different from the full average (e.g. the peak in 2001), revealing sampling biases that alias seasonal variations and spatial gradients into the yearly/spatial average due to sampling that is not fully representative.These sampling biases pose the most prominent challenge to all the mapping methods.
Panel c shows the biome/yearly average difference between the interpolated pCO 2 fields and the SOCATv2 monthly gridded data set (Sect.3.5), reflecting the mismatch of mean, trend, and interannual variations (the sampling biases mentioned before should largely cancel out in this difference).Most mapping methods have a temporal mean mismatch (bias) of less than a few µatm.The year-to-year mismatches are of different magnitudes for the individual mapping methods (note that the larger mismatches in 2009/10 occur in a period of very few data points and may not be representative).Though the estimated interannual features can only be trusted if the year-to-year mismatches are small (a necessary condition), small year-to-year mismatches are not yet a sufficient condition for correct interannual variations: even if the available data points are fitted well, the extrapolation to data-void areas can be wrong ("over-fitting"; see more discussion in Sect.4.2 below).Therefore, we stress that the mismatch amplitudes are not meant to represent a detailed ranking of quality of the methods.Nevertheless, we take it as an encouraging finding that mapping methods with smaller IAV mismatch (e.g.passing the more strict relative IAV mismatch criterion of R iav < 30 % [Jena-MLS, ETH-SOMFFN]) are also closer to each other in the whole-biome average (panel a).Even this stricter selection comprises methods regressing or not regressing pCO 2 against external drivers, i.e. complementary ways of extrapolating to data-void areas/periods (Fig. 1, Table 3 table footnote e).This reinforces conclusions (1) and ( 2) above and confirms that meaningful interannual estimates can be achieved from the available pCO 2 data and mapping methods in the equatorial Pacific.

Interannual variability (other biomes)
All mapping methods agree that the East Pacific Equatorial biome considered before (Sect.4.1.2) has the largest interannual variability of all biomes (Fig. A2).The other biomes have much less interannual variability, leaving the rising trend (similar to the atmospheric CO 2 increase) as the most prominent interannual feature.There is one mapping method (NIES-OTTM) without a trend, a feature not however supported by the data (see the large data mismatch with a systematic trend in Fig. A3).Except for the West and East Pacific Equatorial biomes, the small year-to-year variations around the rising trend are not generally consistent between the mapping methods (ensemble spread similar to or larger than the variations themselves).Overall, mean mismatches (biases) are in the order of 3-4 µatm in all biomes (Fig. A3).As the mismatches do not consistently rise or fall over time, they confirm the esti-mated pCO 2 trends (except for NIES-OTTM that does not have the rising trend in pCO 2 ).The year-to-year mismatches have amplitudes of 3-4 µatm in some methods, but also mismatches as large as or larger than the interannual variations for other methods (R iav > 75 %, dashed lines).Except for the North Atlantic Subtropical Seasonally Stratified biome, each ocean region has at least some mapping methods with relative IAV mismatch below 60 or even 30 %, including both interpolation methods as well as linear and non-linear regressions.Methods tying IAV to process model simulations (PU-MCMC, NIES-OTTM) often have large relative IAV mismatches, except for PU-MCMC in the Northern Pacific biomes.

Sea-air CO 2 flux variability
In order to link the estimated pCO 2 variability to variability of sea-air CO 2 exchange as considered for the Global Carbon Project (GCP; Le Quéré et al., 2015), we calculated sea-air CO 2 fluxes f , using the same gas exchange formulation for each mapping method (Sect.3.6).

The East Pacific Equatorial biome
We first consider again the East Pacific Equatorial biome identified above as the biome with the largest interannual variability.Fig. 5a provides its sea-air CO 2 fluxes estimated by eight selected mapping methods (having relative IAV mismatch R iav < 75 % for biome-averaged pCO 2 ).The year-toyear flux variations are mainly driven by the pCO 2 variability (compare to Fig. 4a).Again, interannual features are largely similar between the mapping methods in this biome, but differ in their amplitudes (Fig. 5b).There is some tendency that the mapping methods with smaller IAV mismatch show larger interannual amplitudes.Strikingly low interannual variability is found in UEA-SI, while fitting the data with R iav = 52 % better than various other methods.This method moves away from the estimated mean seasonality only in the close vicinity of the data points, as justified by the short auto-correlation lengths of near-simultaneous pCO 2 levels found in the pCO 2 data (Jones et al., 2012).It thus gives a lower bound of IAV secured by the data information (Jones et al., 2015).As interannual features can be assumed to be more spatially coherent than features on the timescale of ship cruises (especially in the equatorial Pacific), the low IAV amplitudes by UEA-SI are likely an underestimate.

The global ocean
Figure 5c provides global sea-air CO 2 fluxes estimated by 10 selected mapping methods (having relative IAV mismatch R iav < 75 % for global pCO 2 ).These mapping methods mostly agree in their decadal variations, with a pronounced decadal enhancement in ocean CO 2 uptake after the year 2000, preceded by a period of little decadal change or rather weakening uptake (see Fig. A7).This confirms a fea- 2) plotted against the relative IAV mismatch amplitude R iav i for each submission (cases not fully covering the analysis period have been omitted to avoid inconsistencies).The weighted mean A iav (Eq. 3) is given as a horizontal line.

Biogeosciences
ture also simulated by process models (see Fig. 7 of Le Quéré et al. (2015) and the discussion in Sect.3.6 of Rödenbeck et al., 2014).One of the areas contributing to this change in decadal trends is the Southern Ocean, where Landschützer et al. (2015) found consistency of decadal trends between ETH-SOMFFN and Jena-MLS, having relatively low R iav values there.
There is less agreement in the sub-decadal variations of the global sea-air CO 2 flux, despite the much closer mutual agreement of the same mapping methods in the wellconstrained East Pacific Equatorial biome (Fig. 5a).This lower agreement reflects the more uncertain flux contributions from the poorly data-constrained areas.For example, the larger sub-decadal variations by Jena-MLS to a large extent originate from the South Pacific Subtropical Permanently Stratified biome (Fig. A4, panel "Biome 7"), which is a data-poor region and therefore may receive spurious variability from the equatorial Pacific extrapolated too far south (indeed, the amplitude of the variations decreases with shorter latitudinal extrapolation radius [latitudinal a priori correlation length], Sect.3.3 of Rödenbeck et al., 2014), though according to the R iav criterion these larger variations match the data better than the smaller variations.Another contributor of sub-decadal Jena-MLS variability is the Pacific sector of Biome 16: In the Southern Ocean, essentially only two areas (south of New Zealand and south-west of Patagonia, respectively) are data-covered for multiple years, such that signals from there are extrapolated into their data-void surroundings.Due to this low data coverage, the Southern Ocean biomes 15 and 16 also contribute considerably to the ensemble spread in general (Fig. A4).Unfortunately, the absence of data also means that we cannot validate or falsify the different extrapolations.In summary, despite the success in constraining CO 2 fluxes in the equatorial Pacific from available data and mapping methods (Sect.4.2.1),estimates of year-to-year variations in the global sea-air CO 2 flux face larger uncertainties due to the undersampled regions.Despite these differences in the detailed variations, the amplitude of global flux IAV (Sect.3.6) is relatively consistent (panel d).The global weighted ensemble mean A iav (Eq. 3) is 0.31 PgC yr −1 (horizontal line in panel d).Many biogeochemical process models have less variability than that (mean of 0.20 PgC yr −1 in Le Quéré et al., 2015) and thus likely underestimate IAV in the ocean carbon sink (compare Séférian et al., 2014;Turi et al., 2014) IAV (Peylin et al., 2013), reflecting the fact that they can constrain land variability but less so ocean variability.
Though the primary strength of the pCO 2 constraint lies in its information on temporal variations and smaller-scale spatial variations, we also consider the long-term mean global sea-air CO 2 exchange.The total mean flux (comprising both uptake induced by anthropogenic atmospheric CO 2 rise and natural river-induced outgassing) estimated by the different methods ranges between −1.36 PgC yr −1 and −1.96 PgC yr −1 (for the 1992-2009 analysis period), with a weighted ensemble mean (analogous to Eq. 3 but using the inverse mean pCO 2 bias as weights) of −1.75 PgC yr −1 .This is consistent within uncertainties with the independent estimate from inverting ocean-interior carbon data of −1.7 PgC yr −1 (Gruber et al., 2009) nominally for 1995.Subtracting a river-carbon-induced outgassing flux of 0.45 PgC yr −1 (Jacobson et al., 2007), the ensemble mean corresponds to an anthropogenic CO 2 uptake of −2.2 PgC yr −1 .This is again consistent within uncertainties with the estimate from the globally integrative constraint by the atmospheric O 2 and CO 2 trends of −2.2 ± 0.6 PgC yr −1 given by Manning and Keeling (2006) for the slightly different 1993-2003 period.

Conclusions
Measurements of surface-ocean pCO 2 , mapped into continuous space-time fields, offer a much more direct way to quantify sea-air CO 2 fluxes and their variations than previously available approaches (model simulations, atmospheric inversions, ocean-interior inversions).Taking advantage of an ensemble of 14 partially complementary surface-ocean pCO 2 mapping methods recently collated by the SOCOM initiative, we analysed sea-air CO 2 flux variability globally and for a subdivision of the ocean into 17 biomes (Fay and McKinley, 2014).This study has found the following.
-Surface-ocean pCO 2 data together with mapping methods constrain the seasonality of regional pCO 2 essentially in all ocean biomes (mostly within 10 µatm).
-Interannual variations of regional pCO 2 are constrained at least in the more densely observed ocean regions (tropical Pacific, parts of the northern temperate Pacific and Atlantic).The tropical Pacific is consistently estimated as the biome with the largest interannual variations, with reduced CO 2 uptake during El Niño periods.The global ocean CO 2 uptake is estimated to have gradually increased since about 2000, with little decadal change prior to that.
-Interannual variations in the global sea-air CO 2 flux are estimated to have an amplitude of 0.31 PgC yr −1 (average across mapping methods weighted according to IAV mismatch).Therefore, most biogeochemical process models appear to significantly underestimate this variability (Le Quéré et al., 2015, quote a model-derived amplitude variation of 0.2 PgC yr −1 ).
-Though the primary strength of the pCO 2 constraint lies in its information on temporal variations and smallerscale spatial variations, the estimated net integrated global sea-air CO 2 flux of −1.75 PgC yr −1 (weighted ensemble mean) is consistent within uncertainties with the independent estimates based on inverting oceaninterior carbon data and on atmospheric O 2 and CO 2 trends.
For forthcoming analyses involving data-based sea-air CO 2 flux products, we recommend -if possible -using several interpolation products, or at least testing the robustness of the features under consideration by checking the consistency between several products.In particular, agreement between complementary mapping methods taking variability either from driver data or directly from pCO 2 data (Fig. 1), as found here for the interannual variations in the tropical Pacific, lends great support to the estimated features, as it shows consistency between different information sources.
However, the mapping products should be selected carefully and weighted according to suitable performance diagnostics, to ensure their suitability in a given purpose.The presented "relative IAV mismatch" criterion provides a necessary condition for IAV applications.Analogous "relative mismatch" criteria can also be defined and calculated for other timescales.However, as discussed in the paper, it would be even better to use sufficient conditions (e.g.derived by testing the power of the mapping methods to reconstruct modelled pCO 2 fields from pseudo data subsampled as the real data).Such sufficient conditions are not yet available for the SOCOM ensemble, but are planned in forthcoming studies.
SOCOM does not identify an "optimal" mapping method or method class.We also discourage any ensemble averaging (or medians, etc.) of full spatio-temporal fields or time series, as this would result in variations that are not selfconsistent any more and fit the data less well than individual products.Only for scalar statistical quantities of the spatiotemporal fields, such as amplitudes of variation, correlation coefficients, etc., may it make sense to summarize the ensemble into averages of these quantities, weighted according to the above-mentioned performance diagnostics.
Many of the pCO 2 mapping products are updated when new data sets become available, and the mapping methods are subject to further development.The SOCOM intercomparison may serve to stimulate such developments, though results should not be assessed in terms of their position in the ensemble, but only in terms of objective criteria.At the website http://www.bgc-jena.mpg.de/SOCOM/we aim to provide an updated list of products and ensemble analyses.SO-COM welcomes further members contributing estimates of the spatio-temporal pCO 2 field or the sea-air CO 2 flux based on surface-ocean carbon data.The basis of all mapping products considered here are extensive pCO 2 observations over many years.Even when external information is used to bridge data gaps (Fig. 1), a minimum amount of data in time or within areas of similar biogeochemical behaviour is indispensable.Missing data may not only lead to missing out on existing features, but even to creation of spurious features due to sampling biases.Though the exact limits to interpolation capacity can only be determined through targeted studies (e.g. by running interpolation schemes only on part of the data and then comparing to the other part), this study already shows that (1) with realistic sampling efforts (e.g. in the abovementioned well-constrained regions) and available mapping methods, constraining pCO 2 variability is possible (as in Fig. 5a, Sect.4.2.1), but (2) undersampled regions limit our current ability to determine the global total flux in its finer detail (Fig. 5c, Sect.4.2.2).This highlights the high priority that should be given to sustaining the ongoing sampling and to closing observational gaps.As many of the undersampled regions are not well accessible by ships, autonomous sampling devices, such as BioARGO floats (Claustre et al., 2010), seem indispensable as an additional observation component.In addition to the actual measurements, the use of pCO 2 observations in regional and global sea-air CO 2 flux products also depends on the continuation of all the efforts to quality-control the data and to provide them in a consistent and user-friendly form.Method description: the approach combines temporal interpolation through curve fitting (one to four seasonal harmonics and a linear trend - Masarie and Tans, 1995;Schuster et al., 2009) and spatial interpolation using the concept of spatial de-correlation lengths, or a "radius of influence", interpolating data based on the likely similarity between spatially separated points (Cressman, 1959;Levitus, 1982).In addition, cubic spline fitting is used to move away from the fitted mean seasonal cycle to incorporate interannual variations where data points exist.The de-correlation scales applied in the interpolation are determined from the autocorrelation characteristics of the pCO 2 data along ship tracks or in time (Jones et al., 2012).

Biogeosciences
Main intention/focus: to produce a pCO 2 data set for various uses.To quantify the impact of modes of climate variability on pCO 2 and air-sea fluxes.The chosen approach departs from other methods through its purely statistical approach; it does not use any data sources other than pCO 2 .Documentation: Jones et al. (2015).
Contact: Steve Jones.

A2 OceanFlux-SI (ESA STSE OceanFlux Greenhouse Gases)
Method description: the in situ pCO 2 data within SOCAT are first corrected to a common satellite-derived temperature data set using an isochemical temperature dependence.This creates an in situ data set with a common SST reference.Each in situ data point is then corrected to the year 2010 by assuming a trend of 1.5 µatm yr −1 .The data are then binned into a monthly 1 × 1 • format.These monthly binned data are kriged to produce a spatially complete data set (Goddijn-Murphy et al., 2015).We finally generate an interannual time series by (1) cyclically using this climatological data set over time, (2) adding a prescribed trend of 1.5 µatm yr −1 in pCO 2 , and (3) correcting the pCO 2 values according to the difference between the climatological SST and the actual satellitederived SST at each time and location (Shutler et al., 2015).
We use here the original data set (not filtered based on the uncertainty).
Main intention/focus: produce a spatially complete monthly climatology of pCO 2 data for 2010 that uses a consistent temperature data set which is valid at a consistent depth in the water.Documentation: Goddijn-Murphy et al. (2015, monthly climatology), Shutler et al. (2015, interannual variations).

A3 Jena-MLS (data-driven mixed-layer scheme)
Method description: the mixed-layer scheme is a data-driven interpolation scheme, primarily based on pCO 2 observa-tions but also compatible with the dynamics of mixedlayer carbon content.Firstly, the sea-air CO 2 fluxes and the pCO 2 field are linked to the spatio-temporal field of oceaninternal carbon sources/sinks through parametrizations of sea-air gas exchange, solubility, and carbonate chemistry, as well as a budget equation for mixed-layer dissolved inorganic carbon (DIC).Then, the ocean-internal carbon sources/sinks are adjusted to optimally fit the pCO 2 field to the pCO 2 observations (in the present version oc_v1.3:SOCATv3, Bakker et al., 2015).Spatio-temporal interpolation is achieved by Bayesian a priori smoothness constraints with prescribed spatial and temporal de-correlation scales; temporal interpolation also results from the inherent relaxation timescales of the mixed-layer carbon budget.Though the process parametrizations are driven by SST, wind speed, mixed-layer depth (MLD) climatology, alkalinity climatology, and some auxiliary variables, this external variability only determines features not constrained by the pCO 2 observations (e.g.day-to-day variations, or variability in datavoid areas/periods), while the estimated pCO 2 field in wellconstrained areas/periods is only determined by the observed signals (no regression against drivers).Main intention/focus: global CO 2 flux field product primarily based on observations only, with a focus on flux variability, also to be applied as an ocean prior in atmospheric CO 2 inversion (in particular the Jena inversion, Rödenbeck, 2005).The mixed-layer scheme has been chosen because it can be extended to link carbon variability to further observables (mixed-layer PO 4 , atmospheric O 2 ), to use these as additional independent data constraints.

A4 CU-SCSE (Surface Carbon State Estimate)
Method description: the Surface Carbon State Estimate (SCSE v1.0, Jacobson et al., 2015) is a Kalman filter interpolation scheme for mapping pCO 2 over the global ocean during the entire period for which SOCAT point observations are available.It is designed to provide a statistically wellcharacterized prior estimate to an atmospheric CO 2 analysis like CarbonTracker.SCSE tracks the time-varying magnitudes of a set of basis functions, determined as an optimal difference from a reference state composed of the Takahashi et al. ( 2009) pCO 2 climatology for the year 2000 plus a 1.5 µatm yr −1 global trend.Uncertainties are explicitly characterized by a full-rank posterior covariance matrix, which can then be used to produce realistic error estimates for arbitrary spatial domains.SCSE is a gridded estimation scheme that tracks pCO 2 for each 1 • × 1 • grid cell, but its effective spatial resolution is controlled by the number of basis functions used within each of 10 defined ocean basins.The number of basis functions used within each basin varies with Biogeosciences, 12, 7251-7278, 2015 www.biogeosciences.net/12/7251/2015/time and is determined by the number of available observations.This is intended to allow higher resolution at times and places where there are more pCO 2 measurements.The basis functions include empirical orthogonal functions (EOFs) of pCO 2 from a set of CMIP5 ocean carbon cycle simulations, intended to represent the within-and across-model variations of climatology, trends, and variability on interannual to decadal timescales.They are assigned widely different relaxation timescales (3 months-5 years) as determined by the EOF analysis.Documentation: Jacobson et al. (2015).

A5 AOML-EMP (diagnostic model using empirical relationships)
Method description: AOML-EMP uses empirical relationships between surface-ocean pCO 2 and SST, trained based on sub-annual variations in the Takahashi pCO 2 climatology and the associated climatological SST values.These relationships are then applied to interannually varying SST (using the NOAA optimal interpolation SST product -www.ncdc.noaa.gov/oisst).The original analysis of Park et al. (2010a) does not implicitly include the effect of rising atmospheric CO 2 levels.In the modified Park et al. analysis presented in Le Quéré et al. (2015), the effect of increasing atmospheric CO 2 on the surface ocean is simulated by applying the output of the "CO 2only" run of the NCAR CCSM-3 model (National Center for Atmospheric Research's Community Climate System Model Version 3) to each grid cell over the time period.The subdecadal variability is the same for each approach as they are based on the same pCO 2 mapping.The decadal trend of CO 2 flux calculated from the original AOML-EMP approach shows a slight decrease in uptake, while the modified approach shows an increase in uptake that is attributed to a negative feedback in CO 2 uptake due to ocean warming that is overwhelmed by increased anthropogenic CO 2 uptake.
Main intention/focus: a data-driven global CO 2 flux product.

A7 JMA-MLR
Method description: the global ocean was divided into 44 sub-regions based on the features of observed pCO 2 and SST/SSS/Chl a variability and then optimal equations for estimating pCO 2 in the sub-regions were derived from multiple regressions using SST, SSS and Chl a as independent variables.Rather than using time as an independent variable, secular trends of pCO 2 (for wider biomes than the sub-regions) were evaluated separately from multiple regressions, subtracted from the data, and re-added to the pCO 2 map.Observed pCO 2 , SST and SSS in SOCATv2 and satellite Chl a (SeaWiFS and MODIS/Aqua: http:// oceancolor.gsfc.nasa.gov;before 1997, the climatology of satellite chlorophyll a data are used) are used to derive equations and analytical SST (MGDSST: Kurihara et al., 2006) and SSS (MOVE/MRI.COM-G: Usui et al., 2006), and the same Chl a data mentioned above are used to reconstruct the pCO 2 fields.
Main intention/focus: to map the global pCO 2 and CO 2 flux fields based on surface observation data and evaluate the interannual variability and long-term trend of global ocean CO 2 uptake.The merits of using simple multiple regression analysis for estimating pCO 2 include its possibility to give oceanographic explanations for the pCO 2 variability.Documentation: Iida et al. (2015, method description and trend analysis).

A8 UNSW-SOMLO (self-organizing multiple-linear output)
Method description: in this approach we couple a neural network clustering algorithm with a multiple linear regression (MLR) to diagnose monthly ocean surface pCO 2 distributions from 1998 through to 2011.The algorithm first captures larger-scale ocean dynamics by a data-based clustering of the grid cells into "biogeochemical fingerprints" using a self-organizing map (SOM).The SOM approach utilizes the SOCATv2 gridded pCO 2 product along with co-located SST, SSS, Chl a, MLD, and geographical information (n vector) to iteratively cluster the data set into a set of 196 neurons (the spatial domains of which we refer to as biogeochemical fingerprints).Within each neuron, MLRs are then derived between pCO 2 and the optimal set of sea-surface temperature/salinity/Chl a, MLD, and atmospheric xCO 2 .Thus, each MLR can be thought of as a local-scale optimizer that follows the global non-linear optimization analysis performed by the SOM.To predict pCO 2 using any independent set of driver data, a similarity measure is first used to determine  Takahashi et al. (2009) sampled at the points where there are measurements).This step recreates a 2-D monthly climatology of pCO 2 that is similar to the one of Takahashi et al. (2009), but also different as the interpolation is based on the explanatory variables.A second step uses the raw pCO 2 data (LDEOv1.0,Takahashi et al., 2007) to adjust the interannual variability of pCO 2 over the period 1989 to 2009.A moving assimilation window is used.Input variables and pCO 2 data were previously gridded at monthly temporal and 2 • × 2 • spatial resolutions.Note that most of the coastal ocean pCO 2 data have been filtered out.
Main intention/focus: produce global CO 2 sea-air flux maps over the past decades to be coupled in the Carbon Cycle Data Assimilation System developed at LSCE within the CARBONES project.

A11 NIES-SOM
Method description: self-organizing map with linear increasing trend with time.

A12 NIES-NN (feed-forward neural network)
Method description: we first estimated the global trend of pCO 2 using the method of Zeng et al. (2014) and used this trend to normalize the pCO 2 data to the reference year 2000.
We then modelled the spatial and seasonal variations in the reference year using a feed-forward neural network (Zeng et al., 2015b).The driver variables include SST, SSS, Chl a, latitude, longitude, and month.For training, climatologies of the driver data are used.For prediction, we use time-variant SST (it would be ideal to use time-variant SSS and Chl a as well, but no such data are available in certain modelled periods).Due to the use of climatologies of the driver data and the normalized pCO 2 to train the neural network, the predicted pCO 2 does not yet contain a trend; therefore, the trend estimated in the first step is re-added to the network output.We use all data from SOCATv2 that fulfill the selection criteria elevation < −500m, ice cover < 50 %, SSS > 25, and SST > −10 Method description: the offline OTTM is run with physical data from GFDL coupled ocean-atmospheric re-analysis version-2 data for the period of 1980-2010 (Delworth et al., 2006;Gnanadesikan et al., 2006).The necessary input data used from the re-analysis are as follows: the time-dependent 3-D currents, hydrography and surface 2-D variables such as MLD, heat fluxes, water fluxes and sea surface height.
The physical part of OTTM calculates the evolution of tracers in the global ocean (Valsala et al., 2008).The biological model is adapted from McKinley et al. (2004).The export production in the surface euphotic zone (0-140 m) is calculated using prescribed monthly climatological phosphate and light, scaled by a spatially varying α parameter which accounts for maximum export rate and for those processes which are not accounted for by the phosphate and light limitation model.The surface ocean chemistry model is taken from the OCMIP-II abiotic model (Orr et al., 1999).The physical-biogeochemical model is used to simulate the surface ocean pCO 2 and air-sea CO 2 fluxes.The surface ocean pCO 2 in the model is constrained by a variational assimilation method in which a conservative adjoint of data-model misfit of pCO 2 (using the pCO 2 climatology and LDEOv1.0 point data; Takahashi et al., 2007) is tracked backward in time in the 3-D ocean over an iteration window of 2 months.At each iteration, the forward model corrects the initial and boundary condition of DIC (dissolved inorganic carbon) according to the weighted adjoints.The iterations are truncated when the mismatch falls below a minimum value of 10 % of its initial value (see Valsala and Maksyutov, 2010).Documentation: Valsala and Maksyutov (2010).2, panels roughly in geographical arrangement).Vertical scales span the same range for all biomes (100 µatm), but some vertical shift has been chosen according to the mean spatial pCO 2 pattern.Line styles indicate the relative monthly mismatch: R month < 30 % (thick), 30-60 % (medium), 60-75 % (thin), above 75 % (dashed); the legend reflects "global".In some biomes, lines of certain mapping methods with higher mismatches have been clipped (rather than enlarging the vertical scale), in order to maintain clarity.Biomes 1 (North Pacific Ice) and 8 (North Atlantic Ice) have been omitted due to extremely sparse data coverage.Line styles indicate the relative IAV mismatch: R iav < 30 % (thick), 30-60 % (medium), 60-75 % (thin), above 75 % (dashed); the legend reflects "global".Vertical scales span the same range for all biomes (100 µatm), but some vertical shift has been chosen according to the mean spatial pCO 2 pattern.In some biomes, lines of certain mapping methods with higher mismatches have been clipped (rather than enlarging the vertical scale), in order to maintain clarity.Biomes 1 (North Pacific Ice) and 8 (North Atlantic Ice) have been omitted due to extremely sparse data coverage.(Note that these values are only roughly indicative of the strength of data constraint, which not only depends on the number of data, but also strongly on their distribution within the biome.Also, the magnitudes cannot be compared between biomes, because they have differently many pixels and the pixel size depends on latitude.Further note that several methods use LDEO or other SOCAT versions, and thus may be constrained more strongly or more weakly in certain periods.i for each submission (cases not fully covering the two trend periods have been omitted to avoid inconsistencies).Error bars only reflect the uncertainty of the linear fit due to interannual variations (calculated assuming consecutive years to be statistically independent).Despite the very short periods, a more negative trend in the later period is a significant and consistent feature.The solid black horizontal lines give the weighted mean trends for the two periods, where submissions have been weighted both according to R iav and to the uncertainty of the linear fit.

Figure 1 .
Figure 1.Statistical interpolation methods essentially only use the pCO 2 data themselves, filling spatio-temporal gaps by assuming a statistical relation to neighbouring data points.In well-constrained areas/periods, they closely follow the signals contained in the data, while in areas/periods far from neighbouring data points, they remain essentially unconstrained.Regression methods establish a quantitative relation between pCO 2 and a set of external variables assumed to capture the major modes of spatio-temporal variability.Adjustable degrees of freedom are constant in time and within certain spatial regions, such that data gaps can be filled according to the spatio-temporal structure in the external variables; however, variability not contained in any of the chosen external data sets cannot be reproduced.Non-linear regression methods (feed-forward neural networks, self-organizing maps) essentially do not impose any structure onto this relation between pCO 2 and the drivers.(Multi-)linear regression imposes a linear relationship, thereby restricting the type of responses but ensuring a unique and mathematically well-defined solution.Finally, knowledge of biogeochemical processes can be brought to bear by regression of pCO 2 against fields simulated by a biogeochemical process model, or by tuning initial conditions or parameters in such a model simulation to match the observations.However, this relies heavily on the structure of the process simulation to be correct.

Figure 2 .
Figure 2. Map of biomes (Fay and McKinley, 2014) used for time series comparison.See Table5for biome names.

Figure 3 .
Figure 3. pCO 2 time series from all 14 presented mapping methods averaged over the North Atlantic Subtropical Permanently Stratified biome of Fay and McKinley (2014, illustrated by the little map).Line styles indicate the relative monthly mismatch: R month < 30 % (thick), 30-60 % (medium), 60-75 % (thin), above 75 % (dashed).(a) pCO 2 on monthly time steps for three selected years.(b) As (a), but averages only calculated over pixels with data in the SOCATv2 monthly gridded data set.(c) Mismatch: biome-average difference between the submitted pCO 2 fields and the co-located SOCATv2 monthly gridded values.

Figure 5 .
Figure 5. Interannual sea-air CO 2 flux variations in the East Pacific Equatorial biome (left) and the global ocean (right) from selected mapping methods (having relative IAV mismatch R iav < 75 % for pCO 2 averaged in the respective region).(a, c) Time series (yearly flux sum).Line styles indicate the relative IAV mismatch: R iav < 30 % (thick), 30-60 % (medium), 60-75 % (thin).The vertical dotted lines delimit the analysis period for the amplitude computation.(b, d) Amplitudes A iav i of interannual CO 2 flux variations (see Sect. 4.2) plotted against the relative IAV mismatch amplitude R iav i for each submission (cases not fully covering the analysis period have been omitted to avoid inconsistencies).The weighted mean A iav (Eq. 3) is given as a horizontal line.

Figure A1 .
Figure A1.Monthly pCO 2 variations over years 2003-2005 (arbitrarily selected) as estimated by all mapping methods, averaged over the biomes by Fay and McKinley (2014, see Fig.2, panels roughly in geographical arrangement).Vertical scales span the same range for all biomes (100 µatm), but some vertical shift has been chosen according to the mean spatial pCO 2 pattern.Line styles indicate the relative monthly mismatch: R month < 30 % (thick), 30-60 % (medium), 60-75 % (thin), above 75 % (dashed); the legend reflects "global".In some biomes, lines of certain mapping methods with higher mismatches have been clipped (rather than enlarging the vertical scale), in order to maintain clarity.Biomes 1 (North Pacific Ice) and 8 (North Atlantic Ice) have been omitted due to extremely sparse data coverage.

Figure A2 .
Figure A2.Interannual pCO 2 variations as estimated by all mapping methods, averaged over the biomes byFay and McKinley (2014, see  Fig. 2, panels roughly in geographical arrangement).Line styles indicate the relative IAV mismatch: R iav < 30 % (thick), 30-60 % (medium), 60-75 % (thin), above 75 % (dashed); the legend reflects "global".Vertical scales span the same range for all biomes (100 µatm), but some vertical shift has been chosen according to the mean spatial pCO 2 pattern.In some biomes, lines of certain mapping methods with higher mismatches have been clipped (rather than enlarging the vertical scale), in order to maintain clarity.Biomes 1 (North Pacific Ice) and 8 (North Atlantic Ice) have been omitted due to extremely sparse data coverage.

Figure A3 .Figure A4 .Figure A5 .
Figure A3.Mismatch between the interannual pCO 2 variations as estimated by all mapping methods and the SOCATv2 monthly gridded values (biome/yearly averages of the map-data difference sampled at the location/time of the comparison data, Sect.3.5.1).

Figure A6 .Figure A7 .
Figure A6.Valid domain for each mapping method (coloured area).The colour gives the number of valid months within the period of the method; a number less than the maximum (dark red) indicates either (1) a fractional sea mask along coasts (Jena-MLS), (2) seasonally invalid months due to unavailable chlorophyll a input data (JMA-MLR, ETH-SOMFFN, NIES-SOM), or (3) occasional invalid months due to missing SST input (OceanFlux-SI) or numerical reasons (UEx-MLR).

Table 1 .
General information on the mapping methods.

Table 2 .
Original domains and grid resolutions of the products.

Table 3 .
Specifications of the mapping methods with respect to the constraints and assumptions used (see Appendix A for details and references).

Table 4 .
Information content about various modes of variability as implemented by the individual mapping methods.Modes labelled as "EST."(estimated) are considered to reflect data-based information, as they either are directly estimated by time-dependent adjustments, or are regressed against drivers through multiple adjustable degrees of freedom.The trend is considered estimated if the interannual degrees of freedom allow a data-based trend to establish or if pCO 2 is explicitly regressed against a rising term (time or atmospheric CO 2 ).

www.biogeosciences.net/12/7251/2015/ Biogeosciences, 12, 7251-7278, 2015 which
neuron best represents the driver data values, then the pCO 2 value is predicted using the regression parameters established with training data of that neuron.We call this approach SOMLO: self-organizing multiple linear output.Main intention/focus: to diagnose monthly ocean surface pCO 2 distributions and air-sea CO 2 fluxes from 1998 through to 2011, and to advance our understanding of seasonal to interannual variability.

OTTM (ocean tracer transport model with variational assimilation of surface ocean pCO 2 )
(Takahashi et al., 2012)tion details of the model can be found inZeng et al. (2015a).soas to optimally fit the pCO 2 observations.The data product that is inverted is LDEOv2010(Takahashi et al., 2012).