Projected 21st century decrease in marine productivity: a multi-model analysis

. Changes in marine net primary productivity (PP) and export of particulate organic carbon (EP) are projected over the 21st century with four global coupled carbon cycle-climate models. These include representations of marine ecosystems and the carbon cycle of different structure and complexity. All four models show a decrease in global mean PP and EP between 2 and 20% by 2100 relative to preindustrial conditions, for the SRES A2 emission scenario. Two different regimes for productivity changes are consistently identiﬁed in all models. The ﬁrst chain of mechanisms is dominant in the low- and mid-latitude ocean and in the North Atlantic: reduced input of macro-nutrients into the euphotic zone related to enhanced stratiﬁcation, reduced mixed layer depth, and slowed circulation causes a decrease in macronutrient concentrations and in PP and EP. The second regime is projected for parts of the Southern Ocean: an alleviation of light and/or temperature limitation leads to an increase in PP and EP as productivity is fueled by a sustained nutrient input. A region of disagreement among the models is the Arctic, where three models project an increase in PP while one model projects a Projected changes in seasonal and interannual variability are in most regions. skill metrics are to multi-model mean ing observation-based estimates compared to a simple multi-model average. Model results are compared to recent productivity projections with three different algorithms, usually applied to infer net primary production from satellite observations.

Published by Copernicus Publications on behalf of the European Geosciences Union.
The goal of this study is to provide a multi-model estimate of long-term trends in net primary productivity (PP) and export of particulate organic material (EP) using global warming simulations from four fully coupled atmosphereocean general circulation models and to identify the mechanisms behind these changes. These are the IPSL-CM4-LOOP model from the Institut Pierre Simon Laplace (IPSL), the COSMOS Earth System Model from the Max-Planck Institute for Meteorology (MPIM), and two versions of the Community Climate System Model (CSM1.4-carbon and CCSM3-BEC) from the National Center for Atmospheric Research. In this paper these models are referred to as IPSL, MPIM, CSM1.4, and CCSM3, respectively. The focus of the analysis is on how decadal-to-century scale changes in physical factors and nutrient availability affect global and regional PP and EP. The motivation is to provide an account on the performance of current climate-ecosystem models under global warming and to derive a best estimate of changes in productivity using regional model skill metrics. Our interest is further fueled by the contradicting projections for global PP from some "mechanistic" models, as used here, and a recent statistical model approach (Sarmiento et al., 2004).
A general finding across the hierarchy of mechanistic models is that global EP decreases in 21st century global warming simulations (Klepper and De Haan, 1995;Maier-Reimer et al., 1996;Joos et al., 1999;Matear and Hirst, 1999;Plattner et al., 2001;Bopp et al., 2001;Fung et al., 2005;Frölicher et al., 2009). Increased stratification and a slowed thermohaline circulation in response to surface warming and freshening cause a decrease in the delivery of nutrients to the surface. As a consequence, global EP and in some models also PP is reduced. In these models, the marine biological cycle is closed in the sense that nutrient uptake by phytoplankton, export of organic material into the thermocline, remineralization of organic material and transport of inorganic nutrients by the circulation is represented. In the simpler models, EP (or some approximation of PP) is tied to the availability of nutrients (such as phosphate or iron), light and temperature without considering food web dynamics, whereas in the more complex models the growth of phyto-and zooplankton, nitrogen fixation, and food web interactions and floristic shifts are explicitly taken into account, albeit in a simplified way. Large scale biogeochemical models often lack an explicit representation of the microbial loop. The energy and nutrient flows initiated by bacterial consumption of dissolved organic matter and grazing by bacterivores (Azam et al., 1983) are represented by a decay function for dissolved organic matter. The decay of dissolved organic matter releases nutrients which are in turn available for plankton consumption. Globally, the change in nutrient supply is the dominant mechanism for EP and PP changes in 21st century global warming simulations, whereas other factors such as changes in light availability and the growing season length due to sea ice retreat, altered oceanic mixing conditions, and cloud characteristics, or the direct impact of elevated temperature on physiology considerably affect regional responses in productivity (Bopp et al., 2001). A decrease in global PP and new production by 5 to 8% is also projected in an off-line simulation with an ecosystem model (Moore et al., 2002) driven by the climate induced changes in ocean physics from an AOGCM simulation of the SRES A1 mid-range emission scenario (Boyd and Doney, 2002); the decrease is primarily attributed to the prescribed reduction in subsurface nutrients. In contrast, Sarmiento et al. (2004) projects an increase in global PP by 0.7 to 8.1% using an empirical model approach. We also note that  find a strong increase in PP in 21st century CO 2 scenarios albeit new production and EP decrease. The increase in PP in their study results from an exponential dependency of phytoplankton growth rates on temperature (Eppley, 1972). Schneider et al. (2008) present results for three (IPSL, MPIM, and CSM1.4) of the four Earth System models used in this study. They provide detailed information on the performance of these three models under current climate conditions and compare modeled physical (temperature, salinity, mixed layer depth, meridional overturning, ENSO variability) and biological (PP and EP, chlorophyll concentration) results with observation-based estimates. Of particular interest is the model performance with respect to seasonal and interannual variability as changes on these time scales may be linked to the century scale changes examined here. The models capture the general distribution of higher absolute PP and higher seasonal variability in the intermediate to high latitudes, though all models overestimate seasonal variability in intermediate southern latitudes. Interannual variability is largely controlled by the permanently stratified low-latitude ocean in all three models consistent with satellite data . However, the MPIM model strongly overestimates the amplitude and frequency of interannual PP variations, while the variability amplitude is slightly too low in the CSM1.4 model. Only the IPSL model is able to capture the correlation between observationbased PP, sea surface temperature and stratification in the low-latitude, stratified ocean. The MPIM model, and to a lesser degree, the CSM1.4 model, suffer from a too strong iron limitation compared to the real ocean. In the MPIM model, overall iron limitation is caused by the combination of low aeolian deposition and, more importantly, a high halfsaturation value for iron. In the CSM1.4 model, iron appears to be too strongly scavenged, especially in the subtropical Pacific, also resulting in too strong iron limitation (Schneider et al., 2008). It remains difficult for any model to represent the iron cycle with its intricate coupling between physical transport, spatial and temporal varying iron sources by dust deposition (e.g. Mahowald et al., 2006) and sediments (e.g. de Baar et al., 1995;Dulaiova et al., 2009), iron sinks by particle scavenging, complexation by organic ligands (e.g. Parekh et al., 2008) and ecosystem and remineralization processes (Boyd et al., 2007).
The skill of the ocean component of the CCSM3 model in simulating PP and related variables has been assessed by Doney et al. (2009a). In that study, the model was forced with physical climate forcing from atmospheric reanalysis and satellite data products and atmospheric dust deposition. The results were then evaluated using data-based skill metrics. It was found that the model surface chlorophyll tend to be too high in the subtropical gyres and too low in the subpolar gyres. This error pattern may result from a too weak grazing by zooplankton relative to PP in the picoplankton dominated subtropics and a too strong grazing in bloom environments. Further, their simulation shows excess surface macronutrients in the tropical Pacific, which is also true for the simulation in the present study. This is likely a result of a combination of physical circulation errors and too much iron scavenging. PP is found to be higher than observed in tropical and subtropical Pacific, suggesting that errors may also arise from other aspects of the biological cycling (e.g., export flux, subsurface remineralization; Doney et al., 2009a).
A challenge for any multi-model analysis is how to extract and distill the information contained in the individual models in a quantitative way. Ideally, the strengths of each individual model would be combined while weaknesses and failures would be removed to obtain an optimal multi-model mean.
Here, we use regional weights to compute multi-model mean fields in PP and EP changes.
In this paper we analyze centennial-scale changes in PP and EP under anthropogenic climate warming. Unlike earlier studies, we make use of four interactively coupled global carbon cycle-climate models that include iron cycling and representations of the marine biogeochemistry of different complexities. The use of a multi-model ensemble increases the robustness of the results and allows us to explore uncertainties. The models are forced with prescribed CO 2 emissions from reconstructions (1860-2000 AD) and a high emission scenario, SRES A2 (2000-2100 AD). In the next section, models and experimental setup are described. In the result section, we first present projections for marine PP. Then, we investigate underlying physical and biogeochemical mechanisms, quantify model sensitivities, and also address changes in the seasonal cycle. Regional model skill metrics are used to compute multi-model mean changes. In the discussion section, results of the mechanistic models are compared with those of Sarmiento et al. (2004) and discussed in the light of earlier studies. Throughout this paper, the variables PP and EP are used to represent net primary productivity and export of particulate organic carbon (POC), respectively.

Models
All models used in this study are fully coupled 3-D atmosphere-ocean climate models that contributed to the IPCC Fourth Assessment Report (Solomon et al., 2007;Meehl et al., 2007). In the case of the CCSM3 model a version without carbon cycle was used for the IPCC report. In this study, all model versions include carbon cycle modules for the terrestrial and oceanic components .

IPSL
The IPSL-CM4-LOOP (IPSL) model consists of the Laboratoire de Météorologie Dynamique atmospheric model (LMDZ-4) with a horizontal resolution of about 3 • × 3 • and 19 vertical levels (Hourdin et al., 2006), coupled to the OPA-8 ocean model with a horizontal resolution of 2 • × 2 • · cosφ and 31 vertical levels and the LIM sea ice model (Madec et al., 1998). The terrestrial biosphere is represented by the global vegetation model ORCHIDEE (Krinner et al., 2005) and the marine carbon cycle is simulated by the PISCES model . PISCES simulates the cycling of carbon, oxygen, and the major nutrients determining phytoplankton growth (PO 3− 4 , NO − 3 , NH + 4 , Si, Fe). The model has two phytoplankton size classes (small and large), representing nanophytoplankton and diatoms, as well as two zooplankton size classes (small and large), representing microzooplankton and mesozooplankton. Phytoplankton growth is limited by the availability of nutrients and light. The nanophytoplankton and diatom growth rates as well as the grazing rate of microzooplankton are temperature dependent and increase by a factor of 10 over the temperature range from −2 • C to 34 • C. The temperature sensitivity of the mesozooplankton grazing rate is slightly higher (Q 10 = 2.14). For all species the C:N:P ratios are assumed constant (122:16:1; Takahashi et al., 1985), while the internal ratios of Fe:C, Chl:C, and Si:C of phytoplankton are predicted by the model. Iron is supplied to the ocean by aeolian dust deposition and from a sediment iron source. Iron is also added at the surface if the iron concentration falls below a lower limit of 0.01 nM. Iron is taken up by the plankton cells and released during remineralization of organic matter. Scavenging of iron onto particles is the sink for iron to balance external input. There are three non-living components of organic carbon in the model: semi-labile dissolved organic carbon (DOC), with a lifetime ranging from a few days to several years as a function of bacterial biomass and activity, and large and small detrital particles, which are fueled by mortality, aggregation, fecal pellet production and grazing. Small detrital particles sink through the water column with a constant sinking speed of 3 m day −1 , while for large particles the sinking speed increases with depth from a value of 50 m day −1 at the depth of the mixed layer, increasing to a maximum sinking speed of 425 m day −1 at 5000 m depth. For a more detailed description of the PISCES model see Aumont and Bopp (2006) and Gehlen et al. (2006). Further details and results from the fully coupled model simulation of the IPSL-CM4-LOOP model are given in Friedlingstein et al. (2006).

MPIM
The Earth System Model employed at the Max-Planck-Institut für Meteorologie (MPIM) consists of the ECHAM5  atmospheric model of 31 vertical levels with the embedded JSBACH terrestrial biosphere model and the MPIOM physical ocean model, which includes a sea ice model (Marsland et al., 2003) and the HAMOCC5.1 marine biogeochemistry model (Maier-Reimer, 1993;Six and Maier-Reimer, 1996;Maier-Reimer et al., 2005). The coupling of the marine and atmospheric model components, and in particular the carbon cycles, is achieved by using the OA-SIS coupler.
HAMOCC5.1 is implemented into the MPIOM physical ocean model configuration using a curvilinear coordinate system with a 1.5 • nominal resolution where the North Pole is placed over Greenland, thus providing relatively high horizontal resolution in the Nordic Seas. The vertical resolution is 40 layers, with higher resolution in the upper part of the water column (10 m at the surface to 13 m at 90 m). A detailed description of HAMOCC5.1 can be found in Maier-Reimer et al. (2005), while here only the main features relevant for the described experiments and analyses will be outlined. The marine biogeochemical model HAMOCC5.1 is designed to address large-scale, long-term features of the marine carbon cycle, rather than to give a complete description of the marine ecosystem. Consequently, HAMOCC5.1 is a NPZD model with one phytoplankton group (implicitly divided into calcite (coccolithophorids) and opal (diatoms) producers and flagellates) and one zooplankton species and particulate and dissolved dead organic carbon pools. The carbonate chemistry is identical to the one described in Maier-Reimer (1993).
PP depends on the availability of light (I ) and nutrients. The local light supply is calculated from the temporally and spatially varying solar radiation at the sea surface, I (0,t), as provided by the OGCM. Below the surface, light intensity is reduced due to attenuation by sea water (k w ) and chlorophyll (k c ) using a constant conversion factor for C:Chl, R C:Chl : PP depends linearly on the availability of light, without saturation of growth rates for stronger irradiance (I ). The growth rate J (I (z,t)), is calculated as J (I ) = α PHY I (z,t), where α PHY is the slope of the production vs. light intensity curve. J is then multiplied by the nutrient limitation factor, which is calculated from a simple Monod function, limited by the least available nutrient (either phosphate, nitrate, or iron) to derive PP. Particulate organic matter (POM), also termed detritus, is formed from dead phytoplankton and zooplankton, and zooplankton fecal pellets. POM production is taken to be proportional to the phytoplankton and zooplankton concentrations through constant mortality rates of plankton, and to the zooplankton grazing rate. POM sinks out of the euphotic zone and is remineralized at depth. Furthermore, dissolved organic matter is produced by phytoplankton exudation and zooplankton excretion.
In the model version used in this study, all biological production rates (photosynthesis, mortality, grazing etc.) were temperature-independent, assuming that phytoplankton acclimate to local conditions. A constant climatology of global dust deposition fields from Stier et al. (2005) was used as source function of bioavailable iron. Removal of dissolved iron occurs through biological uptake and export, and by scavenging. Scavenging of iron is described as a relaxation to the deep-ocean iron concentration of 0.6 nM with a time scale of 200 years where the local concentration exceeds this value. The Fe:C ratio, also used to calculate the halfsaturation value for iron, was set to a value of 5×10 −6 (John son et al., 1997). With regard to the later discussions of results it should be noted here that the dust field of Stier et al. (2005) and the applied Fe:C ratio cause a too strong iron limitation of the model. Both using the Mahowald et al. (2006) dust fields or using an Fe:C ratio of 3×10 −6 , which would be at the low end of the Johnson et al. (1997) estimates, would have avoided this. Unfortunately, the simulations with the coupled model were so computationally expensive that they could not be repeated until today, and the issue was only discovered when evaluating the experiment.
Export of detritus is simulated using prescribed settling velocities for opal (30 m day −1 ), calcite shells (30 m day −1 ) and organic carbon (10 m day −1 ). Remineralization of organic matter depends on the availability of oxygen. In anoxic regions, remineralization occurs via denitrification. The model also includes cyanobacteria that take up nitrogen from the atmosphere if the local N:P ratio is below the Redfield ratio as a result of denitrification, and transform it directly into nitrate.
HAMOCC5.1 also includes an interactive sediment module. This component simulates pore water chemistry, the solid sediment fraction and interactions between the sediment and the oceanic bottom layer as well as between solid sediment and pore water constituents.

CSM1.4
The physical core of the NCAR CSM1.4 carbon climate model Fung et al., 2005) is a modified version of the NCAR CSM1.4 coupled physical model, consisting of ocean, atmosphere, land and sea ice components integrated via a flux coupler without flux adjustments (Boville et al., 2001;Boville and Gent, 1998). The atmospheric model CCM3 is run with a horizontal resolution of 3.75 • and 18 levels in the vertical (Kiehl et al., 1998). The ocean model is the NCAR CSM Ocean Model (NCOM) with 25 levels in the vertical and a resolution of 3.6 • in longitude and 0.8 • to 1.8 • in latitude   Randerson et al., 1997), and a derivate of the OCMIP-2 (Ocean Carbon-Cycle Model Intercomparison Project Phase 2) ocean biogeochemistry model (Najjar et al., 2007). In the ocean model, the biological source-sink term has been changed from a nutrient restoring formulation to a prognostic formulation inspired by Maier-Reimer (1993). Biological productivity is modulated by temperature (T ), surface solar irradiance (I ), mixed layer depth (MLD), and macro-and micro-nutrients (PO 3− 4 , and iron): where κ PO 4 = 0.05µmol/l, κ Fe = 0.03 nmol/l, κ I = 20 W/m 2 , r Fe:P = 5.85 × 10 −4 , τ = 15 days, and z c = 75 m. This empirical parameterization is intended to model the large-scale nutrient utilization by marine ecosystems. For example, the temperature function, together with iron limitation, forces a low productivity and nutrient utilization in water that is colder than about 2 • C. On the other hand productivity depends only weakly on temperature in warmer waters. The temperature factor increases by less than two for a temperature increase from 4 • C to 34 • C; this may be compared to an increase by a factor of seven in the IPSL model and no temperature-dependent growth rates in the MPIM model. Following the OCMIP-2 protocols (Najjar et al., 2007) total biological productivity is partitioned 1/3 into sinking POC flux, here taken to be equivalent to export productivity (EP), and 2/3 into the formation of dissolved or suspended organic matter, where much of the latter is remineralized within the model euphotic zone. Total productivity thus contains both new and regenerated production, though the regenerated contribution is probably lower than in the real ocean, as only the turnover of semi-labile dissolved organic matter (DOM) with a decay time scale of half a year is considered. CSM1.4 net primary productivity (PP) thus represents, rather, the carbon flux associated with net nutrient uptake and is not strictly equivalent to net primary production as measured by 14 C methods. It appears to be a reasonable proxy for the time and space variability of PP if somewhat underestimating the absolute magnitude (Schneider et al., 2008). For reasons of simplicity, net nutrient uptake times the C:P ratio of 117 (Anderson and Sarmiento, 1994) is considered here as PP, even though it is not exactly the same. The ocean biogeochemical model includes the main processes of the organic and inorganic carbon cycle within the ocean and air-sea CO 2 flux. A parametrization of the marine iron cycle  includes atmospheric dust deposition/iron dissolution, biological uptake, vertical particle transport and scavenging.
The CSM1.4-carbon source code is available online and described in detail in Doney et al. (2006).

CCSM3
The CCSM3 Biogeochemical Elemental Cycling (BEC) model includes several phytoplankton functional groups, one zooplankton group, semi-labile dissolved organic matter, and sinking particulates (Moore et al., 2004). Model-data skill metrics for the simulated marine ecosystem in uncoupled ocean experiments are reported in Doney et al. (2009a). The BEC includes explicit cycling of C, N, P, Fe, Si, O, and alkalinity. Iron has external sources from dust deposition and marine sediments, and the scavenging of iron onto particles balances these sources with 10% of scavenged iron presumed lost to the sediments (Moore and Braucher, 2008). Phytoplankton functional groups include diatoms, diazotrophs, picoplankton, and coccolithophores. The export ratio is largely a function of phytoplankton community composition with diatom production being exported more efficiently than production by small phytoplankton. Phytoplankton growth rates and zooplankton grazing rates are modified by a temperature function that includes a Q 10 factor of 2.0. Thus, maximum growth rate would change by a factor of 8 for a temperature increase from 4 • C to 34 • C. Phytoplankton growth rates are also a function of nutrient and light limitation and these factors are multiplicative (Moore et al., 2004). Phytoplankton Fe/C, Chl/C, and Si/C ratios adjust dynamically to ambient nutrient and light, while the C/N/P ratios are fixed within each group (Moore et al., 2004). The CCSM3 ocean circulation model is a coarse resolution version of the parallel ocean program (POP) model with longitudinal resolution of 3.6 degrees and a variable latitudinal resolution from 1-2 degrees. There are 25 vertical levels with eight levels in the upper 103 m (Smith and Gent, 2004;Yeager et al., 2006).

Experiments and satellite-based productivity estimates
The models are forced by anthropogenic CO 2 emissions due to fossil fuel burning and land-use changes as reconstructed for the industrial period and following the SRES A2 emission scenario after 2000 AD. The CSM1.4, CCSM3, and MPIM models also include CH 4 and CFCs. N 2 O, volcanic emissions, and changes in solar radiation are additionally taken into account by the CSM1.4 and CCSM3 models as described by Frölicher et al. (2009). Dust deposition fields were kept at a constant climatology in all experiments. All models were integrated for more than one thousand years for spin up (Schneider et al., 2008;Thornton et al., 2009). For analysis, all variables have been interpolated onto a common 1 • ×1 • grid using a Gaussian weighted average of the data points within a radius of 4 • with a mapping scale of 2 • . Control simulations in which CO 2 emissions are set to zero and other forcings are set to constant preindustrial levels are Table 1. Simulated global annual primary production (PP) and POC export (EP) for the four models IPSL, MPIM, CSM1.4, and CCSM3 under SRES A2. PP values are also given for weighted means of the four models derived from regional skill scores (PP S ), mean square errors (PP E ), and global skill scores (PP S glob ), as well as for the arithmetic average (PP ave ). Global skill scores (S glob ) and root mean square errors (RMSE) indicate the ability of the individual models and the multi-model means to reproduce the satellite-based estimates of PP (average 1998-2005, see main text for details). Values are averaged over the periods 1860-1869 (1865), 1985-2004 (2000), and 2090-2099 (2095 used to remove century-scale model drifts for each grid point and for each calendar month (Frölicher et al., 2009). Affected are the three-dimensional distribution of temperature, salinity, and nutrient concentrations in the IPSL and CSM1.4 models, as well as PP and EP in IPSL. For these variables, detrended values from the scenario simulations are used for analysis. We note that trends in surface values are small in the CSM1.4.
As a point of reference and following Schneider et al. (2008), we utilize throughout this study satellite-based estimates obtained with the Behrenfeld algorithm (VGPM; Behrenfeld and Falkowski, 1997b;Behrenfeld et al., 2006) for data-model comparison and to compute skill-score weighted multi-model averages. The satellite-derived estimates have uncertainties. For example, Carr et al. (2006) report that global PP estimates from twenty-four ocean-colorbased models range over a factor of two. On a more positive side, ocean-color-based models agree with respect to the spatial pattern of chlorophyll distributions and correlations among the resulting fields are typically high. Given these substantial uncertainties in satellite-based productivity data, the comparison of model results with one single satellitebased data set should be viewed as an illustrative example.

Projected annual mean net primary productivity and export production under SRES A2
We briefly discuss the magnitude and spatio-temporal patterns of net primary production (PP) in comparison with satellite-based estimates (see Schneider et al. (2008) for a more comprehensive analysis) and compare simulated eratios, i.e. the ratio of annual EP to PP, to field data compiled by Laws et al. (2000) before addressing long-term changes in PP. Global annual PP ranges between 24 Gt C yr −1 (MPIM) and 49 Gt C yr −1 (CCSM3) for modern conditions (Table 1, Fig. 1). Only the CCSM3 model lies within the satellitebased range of 35 to 70 Gt C yr −1 Carr et al., 2006). The range of the other three models is considerably lower. The very low PP in the MPIM model is likely linked to an overall too strong limitation of PP by iron (Schneider et al., 2008). This is supported by the fact that surface nitrate concentrations are largely overestimated by this model. PP in CSM1.4 represents carbon uptake associated with net nutrient uptake, rather than overall net primary productivity, and is thus underestimating real net primary production by design. There are also deficiencies in the regional representation of PP ( Fig. 2). High PP along continental margins is not adequately represented in coarse resolution models. The MPIM model underestimates PP outside the equatorial regions, and the CSM1.4 model has too low PP in the equatorial Pacific. These deficiencies are related to the iron cycle of the two models. IPSL and CCSM3 appear to underestimate PP in high northern latitudes. The CCSM3 model clearly overestimates PP in the eastern tropical Pacific. The skill of individual models to represent the satellitebased PP field is rather low with correlations between modeled and satellite-based fields of less than 0.6 ( Fig. 2b). The errors in the simulated PP fields reflect both deficiencies in the simulated physical fields and in the representation of ecosystem processes in the coupled AOGCM. Results from ocean only models with prescribed surface forcing compare typically better with observation-based estimates.We recall that the satellite-derived estimates have uncertainties ;Carr et al. (2006) report that global PP estimates from twentyfour ocean-color-based models range over a factor of two, but correlations among the resulting fields are typically high.
The e-ratio (EP:PP) is a measure of the contribution of regenerated production to total PP. The regenerated production is driven by nutrient recycling through the activities of heterotrophs, including bacteria. In the IPSL, MPIM, and CCSM3 models, the regenerational loop is described through the interaction of a limited number of different biomass pools and the production of dissolved organic matter (DOM) by plankton and the release of nutrients during DOM decay. Hence, the bacterial loop, which recycles nutrients to the food web by bacterial consumption of DOM and nutrient release by bacteria and grazing of bacteria by zooplankton, is implicitly described. In the CSM1.4 model, that features only one biomass pool, the e-ratio is fixed to 0.3. In the MPIM model, the annual-mean e-ratio shows a small spatial variability around a value of 0.2 (Fig. 3b,e). In the CCSM3 model the spatial variability is only slightly larger and the eratio ranges from 0.05 to 0.3 (Fig. 3c, f). In the IPSL, the e-ratio is low in mid-and low-latitude regions, intermediate in the Southern Ocean, North Pacific and North Atlantic and high in the Arctic (Fig. 3d). The spatial variability in the CCSM3 and IPSL models are driven mainly by phytoplankton community. Both models assume higher export for diatom production and with more diatom production at high latitudes the e-ratio increases. Following Laws et al. (2000), we compare the simulated e-ratio as a function of SST with field data from a few sites (Fig. 3a, b, c). The regression slope found for the IPSL model (−0.0097 • C −1 ) is somewhat lower than the slope of the field data (−0.0198 • C −1 ) but covers most of the data points, whereas the MPIM and CCSM3 models don't capture the observational range in the e-ratio. We note, however, that there are cold, iron-limited sites in the Southern Ocean that differ significantly from the regression line fitted to the data from the few sites selected by Laws et al. (2000). There is almost no correlation between the e-ratio and PP, consistent with observation-based estimates. Only the e-ratio in the CCSM3 model seems to be slightly biased towards higher values at locations with high PP.
Despite the deficiencies of individual models, the models as a class represent the pertinent features of the satellitebased observations such as a low PP in the oligotrophic gyres and the southern high latitudes (all models), high PP features in the North Atlantic (CSM1.4, IPSL, CCSM3), in the North Pacific (IPSL, CCSM3), around 30 • S to 50 • S (CSM1.4, IPSL, CCSM3), and in the equatorial and eastern boundary upwelling systems. Other reproduced features are the high seasonal variability in the North Atlantic and in southern intermediate latitudes (all), the low seasonal variability around the equator (CSM1.4) and in mid latitudes (all), and the correlation of temperature and stratification with PP on the interannual time scales for the low-latitude, permanently stratified ocean (IPSL) or the Nino3 region (IPSL, CSM1.4).  Behrenfeld et al., 2006;Behrenfeld and Falkowski, 1997b) and simulated by IPSL (c), MPIM (e), CSM1.4 (g), and CCSM3 (i) under preindustrial conditions (decadal mean 1860-1869). Dashed lines indicate the transects through the Atlantic and Pacific analyzed in this study. The Taylor diagram (b) shows the correspondence between model results and the satellite-based estimates (Taylor, 2001). In this diagram the polar coordinates represent the correlation coefficient R (polar angle) and the normalized standard deviation σ model /σ obs (radius). Panels d, f, h, and j show the projected changes by the end of the 21st century under SRES A2 for the four models. The changes are shown on an exponential scale and represent the difference between 2090-2099 and 1860-1869 (decadal means). This comparison with satellite-derived and in-situ estimates allows us to continue with some confidence as well as with caution to the discussion of 21st century projections.
All four models show a reduction in the globally integrated annual mean PP in the simulations from 1860 AD to 2100 AD under SRES A2 (Fig. 1, Table 1). The IPSL model shows the biggest changes. In that model PP declines by 4.6 GtC/yr by the end of this century, which is a reduction of the simulated preindustrial PP by 13%. The MPIM and CSM1.4 models show reductions of 10% (2.3 GtC/yr) and 7% (1.9 GtC/yr), respectively. The CCSM3 model, which yields the highest PP, projects the smallest reduction of 2% (1.0 GtC/yr). Despite these small changes on the global scale, the CCSM3 model shows local changes of the same order of magnitude as the other models, but these changes tend to cancel out to some extent. temperature is used to normalize PP changes with respect to climate change in order to account for the different climate sensitivities of the models (Fig. 1c). This yields a slope, i.e., the global PP decrease per • C warming, of 1.4 Gt C yr −1 • C −1 for the IPSL model, but only 0.6 Gt C yr −1 • C −1 for the MPIM and CSM1.4 models and 0.2 Gt C yr −1 • C −1 for CCSM3.
We identify a number of regions with large reductions (more than 50 mg C m −2 day −1 ) in PP (Fig. 2). These correspond to high PP areas. A large reduction in PP is found in the North Atlantic in the IPSL, CSM1.4, and CCSM3 models, around 35 • S in the Pacific in the IPSL and less pronounced in the CSM1.4, in the upwelling regions off Africa in all models and in the equatorial Pacific in the MPIM and IPSL model. These reductions are qualitatively consistent across three out of the four models with the obvious caveat that no major reductions can be expected in regions where an individual model fails to simulate a significant preindustrial PP (e.g. MPIM outside the equator, CSM1.4 in the equatorial Pacific). An exception is the moderate increase in PP simulated by the CCSM3 model in parts of the tropical Pacific. Consistent moderate increases in PP are simulated in the high latitude Southern Ocean (all models) and around Svalbard, indicating that the high PP zone in the North Atlantic is moving northward with climate warming and sea ice retreat. An increase in PP is simulated in the Pacific north of 40 • N in the IPSL, CSM1.4, and CCSM3 models. We note that sea ice extent is unrealistically high in this area in the CSM1.4 model Weatherly et al., 1998). In summary, the model results suggest that PP will be reduced in most equatorial and mid-latitude regions and in the North Atlantic, and moderately enhanced in polar regions.
Climate change might not only affect PP, but also EP, and the relative contribution of new (≈ e) and recycled (≈ 1 − e) production. In the MPIM and CCSM3 models, the e-ratio remains spatially relatively uniform and shows almost no change during the simulation (Fig. 3e, h, f, i), much like in the simpler CSM1.4 model, where the e-ratio is fixed at 0.3. Thus, the relative reduction in EP follows closely the reduction in PP in these three models. The e-ratio in the IPSL model shows distinct regional changes by the end of this century (Fig. 3e). On global average, the (PP-weighted) e-ratio declines. Correspondingly, the reduction in EP is larger than in PP; EP declines by 20% and PP by 13% over the simulation period. Turning to regional changes in the e-ratio, we find both positive and negative changes in the IPSL simulation (Fig. 3g). The attribution of simulated changes in the e-ratio is generally difficult as the set of equations describing PP and EP is complex and non-linear. Positive deviations in the e-ratio, e.g. as found in high-productive upwelling regions off South America and Africa, are small and difficult to ascribe to a forcing factor. The shifts to higher e-ratios south of Greenland and north of the Ross sea appear to be linked to the fact that these are the only two regions were a slight cooling is simulated in the IPSL model; cooler temperatures are often associated with higher nutrient concentrations which tend to favor a higher e-ratio, consistent with the observation-based slope of e-ratio versus temperature (Fig. 3a). Large negative changes in the e-ratio of up to 0.25 are simulated at the border of highly productive regions where PP decreases from moderate to low values (e.g. at the edge of the subtropical gyres, (Fig. 3d, g). The reasons for this decoupling of PP and EP in the IPSL model are a shift from diatoms and zooplankton to the smaller nanophytoplankton and the increased recycling of nutrients and carbon in the surface ocean .

Attribution of PP changes to individual drivers in the CSM1.4 model
In order to identify links between long term shifts in PP and climate change, we first focus on the NCAR CSM1.4-carbon results. This model features the simplest formulations for biological production among the four models. PP is determined by the product PP ∝ F N · F I · F T · B (Eq. 2), where the first three factors represent nutrient, light, and temperature limitation and B is a biomass proxy derived from phosphate and iron concentrations. The relative changes in these factors ( Fig. 4a-d) directly yield the relative changes in PP (Fig. 4e). Light availability is tied to the mixed layer depth and sea ice fraction in the CSM1.4 model. It increases when the mixed layer depth (MLD) exceeds 75 m. This unrealistic feature affects light limitation in the South Pacific (increased MLD/light availability) around 45 • S and in a number of grid cells in the North Atlantic. We recall that the biomass proxy corresponds to the phosphate or (scaled) iron concentration (which ever is smaller) and thus directly represents nutrient concentrations.
The biomass proxy decreases in most areas of the world ocean (Fig. 4d). This can be attributed to a more efficient utilization of nutrients under global warming as found in previous work (e.g. Plattner et al., 2001;Frölicher and Joos, 2010). Reduced nutrient concentrations in combination with reduced export are indicative of reduced nutrient input from the thermocline into the mixed layer. Such conditions prevail in the Atlantic between 20 • S and 65 • N, in the western part of the Indian Ocean, and around 30 • N and 35 • S in the Pacific between 160 • E and 140 • W. PP shows little or no response to climate change in the tropical and subtropical Pacific, where PP is low due to an unrealistically strong iron limitation. On the other hand, sea ice retreat and warming in the Arctic alleviate the strong limitations by light and temperature and enhance Arctic PP. Similarly, a reduction in temperature limitation boosts PP around Antarctica in the model.  In the North Atlantic, where the largest PP changes occur, the PP decrease is dominated by a decrease in the biomass proxy. Nutrients are used up more efficiently, and PP decreases likely in response to less surface-to-deep exchange, which is linked to a reduction in the North Atlantic thermohaline circulation (Frölicher et al., 2009) and a reduced deep wintertime convection. The model also simulates an increase in light limitation, mainly caused by the decrease in mixed layer depth, and a somewhat stronger limitation by iron in the east and by phosphate in the west. The slight increase in PP in some areas in the Indian Ocean, around Australia, and in the South Atlantic around 25 • S can mainly be attributed to an increased nutrient supply due to stronger upwelling.
In conclusion, PP changes in the CSM1.4 model are tightly linked to changes in nutrient input into the euphotic zone in combination with an alleviation of light and temperature limitations in high latitudes. A reduced nutrient input into the surface is expected in climate change scenarios as surface stratification tends to increase in response to warming and freshening. Next, we will investigate changes in physical www.biogeosciences.net/7/979/2010/ Biogeosciences, 7, 979-1005, 2010 factors such as stratification and upwelling as well as in nutrient availability and their link to PP for all four models.

Basin-scale changes in productivity, physical properties, and nutrient concentrations
There is a surprisingly good overall consistency in projected trends among the models on the basin-scale and for a range of variables. Figure 5 shows projected changes in selected large Biogeosciences, 7, 979-1005, 2010 www.biogeosciences.net/7/979/2010/ regions for PP, EP, related physical properties, and nutrient concentrations for all four models. This comparison between changes in PP and in potential drivers is indicative of underlying mechanisms, albeit it does not allow for a stringent attribution as done in the previous section for the CSM1.4 model. Overall, the results are qualitatively consistent across models and regions. PP, EP, MLD, and surface nutrient concentrations are projected to decrease in all models and in almost all regions, while sea surface temperature (SST) and stratification increase. Next, we will show that the mechanisms identified for the CSM1.4 model are also key for the productivity changes in the IPSL, MPIM, and CCSM3 models. Namely, we find that a reduced nutrient input related to enhanced stratification, reduced MLD, and a slowed circulation tends to decrease PP and EP under transient global warming not only in the CSM1.4, but also in the other three models.
All models exhibit pronounced changes in MLD and stratification in the North Atlantic, which transform to strong reductions in surface macro-nutrient concentrations. Consequently, PP and EP decrease in the IPSL and CSM1.4 models by about 40% and 30%, respectively. In the CCSM3 model, PP is reduced by 13% and EP by 23%. In the MPIM model, preindustrial PP in the North Atlantic is unrealistically small due to too strong iron limitation and the 21st century reduction in PP is thus small as well.
All models show an increase in stratification and a decrease in MLD and macro-nutrients in the stratified ocean (SST> 15 • C). We again link this tentatively to a reduced nutrient input into the euphotic zone under global warming. Productivity and export decrease accordingly in all models.
In the Southern Ocean (<45 • S), relative PP trends are smaller than in other regions and vary in sign between different regions within the Southern Ocean. Changes that favor production, such as increased SST and light, and changes that tend to reduce production, such as reduced nutrient input, balance to some extent on the regional average. In the IPSL, CSM1.4, and CCSM3 simulations, PP increases on average, while MPIM shows a decrease of about 5%, which is probably linked to the very strong decrease in MLD. The CCSM3 model projects a relatively large PP increase around 40 • S which results from the combination of a moderate to strong increase in SST and a reduction in nutrient limitation in some areas in that region. To some extent, this feature is also present in the CSM1.4 and IPSL simulations.
There are also some qualitative inconsistencies in projected trends between models. Most notable are the following three. (1) IPSL simulates a decrease in PP and EP in the Arctic Ocean, in contrast to MPIM, CSM1.4, and CCSM3 that project an increase (Fig. 5). (2) Surface iron concentration is projected to increase in all regions in IPSL and on global average in CCSM3, while MPIM and CSM1.4 project a decrease in most regions (Fig. 5). (3) CCSM3 projects an increase in PP in the central Pacific between 10 • S and 20 • N (Fig. 2), whereas the other models simulate a decrease.
In the Arctic Ocean, light availability in the surface ocean is strongly enhanced in all models due to sea ice retreat. The annual mean sea ice cover in the Arctic is reduced by 32% (IPSL), 25% (MPIM), 23% (CSM1.4) and 20% (CCSM3) with respect to preindustrial conditions. This leads, together with an increase in SST and MLD, to a strong increase in PP and EP in the MPIM (+130%), CSM1.4 (+215%), and CCSM3 (+150% for PP; +200% for EP) simulations, despite the strong (+90% in CSM1.4; +80% in CCSM3) and moderate (+20% in MPIM) increase in stratification and reduced surface nutrient concentrations. Although insolation and SST increase also strongly in IPSL, this model shows an opposite response in PP and EP. This can be explained with a strong increase in stratification of about 90% and the reduction in MLD and surface macro-nutrients of 50-70%.
The increase in surface iron concentration simulated by the IPSL model (20% in the global mean) is a consequence of the parametrization of the elemental ratio in phytoplankton. The ratio between carbon and nitrogen or phosphorus is kept constant. In contrast, the iron-to-carbon ratio of phytoplankton is assumed to decrease with increasing nutrient (and light) limitation. Consequently, lower macro-nutrient concentrations in the euphotic zone lead to a relatively lower uptake of iron compared to other nutrients by plankton and to a lower iron-to-carbon ratio in organic material. In turn, less iron is exported out of the euphotic zone and iron concentrations increase, while macro-nutrient concentrations decrease. In the IPSL model, surface iron concentrations are restored to a minimum value of 0.01 nM. This influences the interannual variability in PP (Schneider et al., 2008). However, this potential artificial iron source does not contribute significantly to the long-term trend in surface iron because, first, the number of grid cells and months where iron is restored is reduced during the simulation, and second, these regions do not correspond to the regions where large changes in surface iron are simulated. In the CCSM3 model, the iron-to-carbon ratio is also variable but this has only a small effect on surface iron concentrations. In the CSM1.4 and MPIM model, the iron-to-carbon and other elemental ratios are constant. Generally, surface iron tends to increase in regions with substantial aeolian iron input and increased stratification or reduced mixed layer depth, whereas it tends to decrease in parallel with macro-nutrient concentrations in the surface ocean in regions with little iron input. This leads to an increase in global mean surface iron of 4% in the CCSM3 model, while CSM1.4 and MPIM project a slight decrease of about 2%. In contrast to the IPSL model, these three models all project a decrease in surface iron in the Southern Ocean and in the Arctic.
Nutrient and light limitation factors are output variables of the CCSM3 model and therefore allow the direct attribution of changes in PP to changes in these factors.  almost no change in macronutrient limitation and a moderate increase in light limitation. In this region, PO 4 and NO 3 concentrations are reduced in response to large changes in stratification and MLD and increased export, but they remain relatively high and PP limitation by macronutrients remains small. In contrast, a similar reduction in macronutrients leads to a significant increase in nutrient limitation around 30 • N, and consequently to a pronounced reduction in PP. We note that the model overestimates present-day phosphate concentrations in the Pacific south of 40 • N and that PP is too high in the eastern tropical and subtropical Pacific. Thus, the simulated macronutrient limitation might be too weak and the projected PP too high in this region. We also note that despite the moderate PP increase in the central equatorial Pacific, PP decreases slightly by 0.4% when averaged over the Pacific between 30 • S and 30 • N.  Fig. 2c, e, g, i).

Local correlations between changes in PP and potential drivers
In this section, we address to which extent the features identified on the basin-scale are also evident on the local scale. We correlate simulated changes in annual mean PP with annual mean changes in SST, stratification, MLD, and shortwave radiation, as well as with phosphate and iron for each single grid cell (Fig. 6) and compare projected changes along two transects through the Atlantic (and Arctic), and the Pacific (Figs. 7 and 8). The transects, indicated in Fig. 2, are selected to cover major PP features in the two basins. The results tend to confirm the findings from the two previous sections, although the links between stratification, mixed layer depth and macro-nutrient concentrations are often somewhat obscured on the grid cell scale as evidenced by the small regression coefficient (R 2 ) found for many cells.
In the IPSL simulation, the PP decrease in the Pacific, North Atlantic and Indian Ocean correlates with enhanced stratification and decreased surface phosphate concentrations (Fig. 6). Changes in MLD correlate only weakly with PP trends; only in the North Atlantic and south-eastern Pacific are some relevant correlations found. Surface iron concentrations correlate positively with PP because surface iron increases almost everywhere in the IPSL simulation. Correlations for EP are similar (not shown).
The MPIM model shows generally weak correlations, which can be explained with the strong iron limitation in that model. Under present climate conditions, PP is ironlimited in all regions except the tropical Atlantic (Schneider et al., 2008). Because surface iron concentrations decrease only slightly in most regions, no significant correlations are found. Exceptions are the low and mid latitudes of the Pacific, where surface iron concentrations decrease by about 20% and correlations of PP changes are found with surface iron (mainly in the subtropical gyres). Also, the PP decrease in the western tropical Pacific correlates with increased stratification and reduced MLD.
In the CSM1.4 simulation, increased stratification correlates to some extent with reduced PP and EP in the tropical and southern Pacific, as well as in the North Atlantic. This model shows a stronger correlation between PP and MLD than the other three. The latter may be an artifact of the model formulation for light limitation. Significant positive correlations are found in the North Atlantic, North Pacific, and in the Southern Ocean. Reduced surface nutrient concentrations mainly correlate where the respective nutrient is limiting; PO 4 in the low-and mid-latitude Atlantic and in the northern Indian Ocean, iron in the Pacific and southern Indian.
In the CCSM3 simulation, increased SST, enhanced stratification and reduced surface phosphate correlate with reduced PP and EP in the tropical and North Atlantic, in the Pacific around 30 • N, and in the northern part of the Indian Ocean. There is almost no significant correlation between changes in MLD and PP. Enhanced PP at high latitudes and to some extent also in the tropical Indian and Atlantic correlates with increased light availability.
In conclusion the multi-model analysis confirms important conclusions obtained by attributing changes in PP and EP to individual drivers in the CSM1.4 model. We identify two different regimes for PP and EP changes in all models. First, a decrease in the concentrations of the limiting nutrient in combination with a decrease in EP is indicative of reduced nutrient input from the thermocline into the mixed layer. This first regime is dominant in the low-and mid-latitude ocean and in the North Atlantic in all four models and in the Arctic for the IPSL model. This regime is for example indicated by the positive slope between productivity (PP and EP) and limiting nutrient (yellow and red color in the panels for PO 4 and Fe in Fig. 6) and the negative slope between PP and stratification (blue color in the STRAT panel of Fig. 6) in areas where productivity is decreasing. For the second regime, an alleviation of light and temperature limitation leads to an increase in PP and EP, while PP and EP is fueled by a sustained or even increased nutrient input into the euphotic zone. This second regime is found in the Arctic in the CSM1.4 and MPIM model, to some extent in the tropical Pacific in the CCSM3 model, and in parts of the Southern Ocean in all four models. Globally, the first regime is most important and global PP and EP decreases in our 21st century global warming simulations.

A weighted multi-model mean of projected PP changes
In the previous sections, it is shown that the models as a class represent most of the pertinent features also seen in the satellite-based PP estimates and that the underlying mechanisms for changes in PP are broadly consistent across the range of models. However, individual models clearly fail to represent certain regional features. The challenge is to combine the information from several models into a quantitative projection. In the assessments of the Intergovernmental Panel on Climate Change this has been achieved by averaging the results from individual models (Meehl et al., 2007). In this way, each model, whether skillful or not, is given equal weight. Obviously, such an approach is less than ideal as unrealistic features of a particular model influence the multi-model mean. For example, if one of the models simulates rainfall in a desert region, the multi-model mean will also show rainfall in the desert. An alternative would be to rely on the model with the best skill score with respect to suitable observations. However, this seems also less than ideal as each model has certain weaknesses and useful information from the other models is lost. Here, we suggest the use of regional skill scores as weights to compute a "best" or "optimal" estimate of projected changes. The goal is to take advantage of the skill of individual models in simulating regional features and to exclude or minimize the influence of regional results where a model is in conflict with observational evidence.
Technically, the multi-model mean is computed following the skill score metric developed by Taylor (2001). For each model m and grid cell at coordinates (i,j ) a skill score is calculated (Taylor, 2001), where R i,j is the distanceweighted correlation coefficient between the satellite-based estimates (PP obs ) and the simulated PP (PP m ; average 1998-2005) and σ i,j is the corresponding standard deviation normalized by the standard deviation of the observations. This metric penalizes models that have normalized standard deviations either greater than or less than one by reducing the skill score. The weights are calculated using a twodimensional Gaussian function A(x,y) x,y A(x,y) , where x i,j and y i,j are the longitude and latitude of the grid cell (i,j ), A(x,y) is the area of the grid cell at coordinates (x,y), and ρ = 10 • characterizes the width of the distribution (the distance at which the weight has decreased from one to 1/ √ e). We note that the results are not sensitive to the exact choice of ρ. The multi-model mean then is calculated in proportion to these regional skill scores (Fig. 9a-c): Where no observation-based data is available to calculate a skill score (e.g. in the Arctic) the model results are averaged using equal weights. The above skill score metric emphasizes pattern similarities, but does not penalize offsets between the mean of the fields. Therefore, we also investigate an alternative metric, E, based on mean square errors: x,y w(x,y) i,j (PP obs (x,y) − PP m (x,y)) 2 (6) . The multi-model means have been computed by using the regional skill scores shown in Fig. 9 as weights. The dotted areas indicate that none of the regional skill scores is higher than 0.5. Where no observationbased data is available to calculate skill scores (e.g. in the Arctic) the arithmetic mean of the model results is shown.
The weights w(x,y) i,j used here are the same as given above. The multi-model mean with this second metric is calculated as In addition, we have computed the arithmetic mean from all models (PP ave ) as well as the mean obtained by weighting individual models with their global (ρ = ∞) skill score (PP S glob ). Next, global skill scores (S glob ) and global root mean square errors (RMSE) are computed for the individual model results and for the multi-model fields obtained by the four different averaging methods (Table 1). The global skill score for the first field (PP S ) is considerably higher than for the others. All averaging methods result in a lower global skill score than that of the two best models (IPSL and CCSM3). However, the RMSE is lower for the PP S field than for each individual model and for the other multi-model fields. In the following, we discuss results from this metric only. We note that differences in the results obtained by the first two metrics (PP S and PP E ) are generally small. This skill score method accounts for the different skills of the models at reproducing regional features of the satellite based estimates, while not degrading the overall skill in representing the satellite-based field compared to the best individual model. For example, the CSM1.4 model reproduces the high PP tongue around 40 • N in the North Atlantic. The IPSL model captures most of the high PP features along the coasts of South America and Africa. The MPIM model has a high skill in the central Pacific and the most realistic latitudinal extension of the equatorial PP belt, while the CCSM3 captures best the magnitude and pattern of PP around 40 • S. Therefore these models dominate the mean in those regions (Fig. 9d), and all these features are present in the multi-model mean (Fig. 10a). There remain weaknesses. All models underestimate PP in the Arabian Sea and off the west coast of North America. Consequently, the multi-model mean also misses these features. Overall, this method improves the multi-model mean significantly compared to simpler averaging methods (Table 1).
Regional skill scores are applied to calculate the multimodel mean of preindustrial PP and of the projected changes by the end of the 21st century (Fig. 10) and as a function of the global mean surface air temperature (SAT glob , Fig. 11d). The globally integrated annual mean PP decreases from 37.1 GtC yr −1 (preindustrial) to 33.0 Gt C yr −1 by 2100 AD (−2.9 Gt C yr −1 ; −8%) for the multi-model mean (Fig. 1, Table 1). Large decreases in PP are projected for the North Atlantic, off the coast of Africa in the South Atlantic, in the Pacific around the equator and around 30 • , and in the northern part of the Indian Ocean; a slight increase in PP is found in the Southern Ocean and in the Arctic (Fig. 10b). Calculating the mean by 2100 has the disadvantage that PP changes are merged that correspond to different temperature changes as the models have different climate sensitivities. One way to avoid this is to calculate the regression slope PP/ SAT glob for each grid cell (Fig. 11a-c) as done for the global PP in Fig. 1c. The patterns of the resulting PP change per centigrade SAT increase are broadly consistent with the patterns of the projected PP change by 2100. Biogeosciences, 7, 979-1005Biogeosciences, 7, 979- , 2010 www.biogeosciences.net/7/979/2010/

Changes in the seasonal cycle
One aspect of the simulations to explore is how the seasonal cycle and interannual variability are modified under global warming. Here, we compare the simulated maximum seasonal PP amplitudes (annual maximum minus annual minimum) and their interannual variations for the decades 1860-1869 and 2090-2099 along the two sections in the Atlantic and the Pacific shown in Fig. 2 and for the global zonal mean (Fig. 12).
In the global zonal mean, the seasonal amplitude is projected to decrease everywhere in the IPSL simulation.   (not shown). Not only the seasonal amplitude, but also the interannual variability in PP is projected to decrease for most latitudes.
The zonally averaged seasonal PP amplitude in the MPIM simulation is also reduced between 70 • N and 60 • S. Largest reductions of about 200 mg C m −2 day −1 are located in the Southern Ocean and around the equator. South of 60 • S and north of 70 • N the seasonal amplitude increases, consistent with an increase in PP in these areas. The MPIM model exhibits a larger interannual variability than the other two models, and at most latitudes the projected changes are within the range of preindustrial interannual variability. Maximum changes in PP occur from December to February in the Southern Ocean and during July/August in the Arctic Ocean.
In the CSM1.4 model the zonally averaged seasonal PP amplitude is reduced by up to 300 mg C m −2 day −1 between 40 • N and 60 • N. An increase is found north of 60 • N, in the Southern Ocean (40 • S-60 • S), and in the Arctic Ocean. Changes are small in other regions. The changes in the north are dominated by the Atlantic where PP is strongly reduced between 40 • N and 60 • N (March-June) and enhanced between 60 • N and 70 • N (April-June).
In the global zonal mean, the seasonal amplitude projected by the CCSM3 model decreases between 30 • N and 70 • N by about 200 mg C m −2 day −1 , broadly consistent with the results of the other models. In the Arctic Ocean the amplitude increases by about the same amount. A slight increase is also found around 45 • N. The reduction around 60 • N can be attributed to changes in the North Atlantic, while the changes around 30 • N are dominated by the Pacific. In the topical Atlantic the amplitude tends to be reduced slightly.
In summary, changes in seasonal cycle amplitude are relatively small, though there are exceptions. The seasonal amplitude tends to become smaller when overall PP decreases. Interannual variability in the seasonal amplitude is substantial and projected to decrease in two of the four models (IPSL and CSM1.4).

Discussion and conclusions
The trends in ocean productivity in response to anthropogenic climate change have been analyzed with four coupled carbon cycle-climate models that incorporate marine biogeochemical-ecosystem models of different complexity. The decreasing trend in global net primary production (PP) Biogeosciences, 7, 979-1005Biogeosciences, 7, 979- , 2010 www.biogeosciences.net/7/979/2010/ and particulate organic carbon export (EP) is a robust result, but relative and absolute magnitudes differ among models and regions. The underlying mechanisms of change are qualitatively consistent across the models, except in the Arctic and in parts of the tropical Pacific. All four models show a consistent change in physical drivers, surface concentrations of macro-nutrients, and PP when considering regional averages (Fig. 5). Namely, the models project an increase in sea surface temperature and stratification in all regions and an increase in available light in the Arctic in response to sea ice retreat. Macro-nutrient concentrations in the euphotic zone are projected to decrease in all regions and for all models. Two different regimes for change in PP and EP are identified, that were already discussed previously in the literature (Bopp et al., 2001;Sarmiento et al., 1998). First, all models indicate a decrease in PP and EP in the low-and mid-latitude ocean and in the North Atlantic in response to reduced nutrient delivery to the surface ocean linked to enhanced stratification, reduced mixed-layer depth and slowed ocean circulation. This is broadly consistent with earlier projections using box models, Earth System Models of Intermediate Complexity or general circulation models (Klepper and De Haan, 1995;Maier-Reimer et al., 1996;Joos et al., 1999;Matear and Hirst, 1999;Plattner et al., 2001;Bopp et al., 2001;Fung et al., 2005;Frölicher et al., 2009). Second, light and temperature limitation is reduced in the high-latitude ocean, whereas nutrient supply remains sufficient to support an increase in PP and EP. This second regime is found in the Arctic in the CSM1.4 and MPIM model and in parts of the Southern Ocean in all four models. A qualitative difference among models is found in the Arctic, where IPSL projects a decrease in PP and EP related to a reduced supply of macronutrients, whereas CSM1.4, MPIM, and CCSM3 project a PP and EP increase due to reduced light and temperature limitation. In any case, absolute changes in PP in the Arctic and the Southern Ocean are relatively small in the IPSL, MPIM, and CSM1.4 models. An exception is the CCSM3 model where the PP increase in the Arctic and Southern Ocean is of the same order of magnitude as the decrease in the tropics and in the North Atlantic. This explains the relatively small decrease in PP on the global scale projected by that model.
The models project also a different evolution of iron. The MPIM and CSM1.4 models use constant elemental ratios in their production algorithms and consequently surface iron concentration are decreasing in parallel with macro-nutrient concentrations in regions without substantial aeolian iron input. In the IPSL model, the iron-to-carbon ratio of assimilated material is reduced under nutrient stress. As a consequence, iron concentration increases in the euphotic zone as less iron is exported to depth in the form of organic matter. The CCSM3 model also features a variable iron-to-carbon ratio but the effect on surface iron concentrations is rather small and changes are mainly driven by physical processes such as increased stratification.
The cycling of iron is quantitatively not well understood and difficult to represent in ocean models. It involves a temporally and spatially variable aeolian dust source, sediment sources, as well as complex physical and chemical processes such as complexation to organic ligands and scavenging by particles (e.g. Parekh et al., 2004). All models in this study have atmospheric iron deposition and the IPSL and CCSM3 models also have a sediment source. Sensitivity studies with the ocean only model of MPIM showed a decrease in EP of 0.4 GtC in response to a 30% decrease of dust deposition in a 2×CO 2 climate as predicted by Mahowald et al. (2006). Tagliabue et al. (2008) found a NPP reduction of about 3% in response to a 60% reduction in iron input from dust in a 21st century simulation with the IPSL model. Variations in the dust input of iron significantly impact nitrogen fixation, export production, and air-sea CO 2 exchange in the CCSM3 model Moore and Braucher, 2008;Doney, et al., 2009b). This suggests that there may be a significant even though not first order impact on atmospheric CO 2 . The predicted change and even the sign of change in dust deposition, however, are still highly uncertain and, therefore, dust climatologies were kept constant in this study.
None of the models used here explicitly represents bacterial pools. The microbial loop describing the energy and nutrient flow initiated by bacterial consumption of dissolved organic matter and grazing by bacterivores is implicitly represented in the models. Dissolved organic matter is assumed to decay and released nutrients are then available to fuel productivity. Three of the models show a lower global PP than observation-based estimates and one might be tempted to link to the low productivity to the missing explicit representation of bacteria. However, the PP of 49 Gt C yr −1 simulated by the CCSM3 model falls well within the satellite-based range of 35 to 70 Gt C yr −1 and the PP of the IPSL model is with 34 Gt C yr −1 only slightly lower than the satellite-based range. As already discussed, PP in the MPIM model is too strongly limited by iron and the simple empirical formulation of productivity in the CSM1.4 is biased low by design. Taken together, this suggests that the missing explicit representation of bacteria does not necessarily cause an underestimation of PP.
The formulation of the temperature dependency of growth rates and other processes vary qualitatively and quantitatively among the four models. An exponential temperature dependency is used for growth rates, microzooplankton grazing, and POC remineralization in the IPSL and CCSM3 models with Q 10 values of 1.9 and 2.0, respectively. This is comparable to the temperature dependency for growth rates proposed by Eppley (1972) and applied by . In contrast, growth rates are temperatureindependent in the MPIM model, thereby assuming that phytoplankton acclimate to local temperature. The temperature limitation for productivity in CSM1.4 corresponds formally to a Michaelis-Menton type formulation (Eq. 2) and has a concave shape in contrast to the exponential-type formulations. In other words, the nominal sensitivity of PP to a temperature change is highest at low temperatures in CSM1.4, in contrast to CCSM3 and IPSL where the sensitivity of growth rates is highest in the warm ocean. Interestingly, global PP is decreasing in all four models under global warming, in contrast to the model of . The IPSL model with the highest temperature sensitivity for growth rates and a realistic relationship between the export ratio and temperature (Fig. 3) yields the largest decrease in PP per nominal change in surface temperature (Fig. 1c). Apparently, increasing nutrient limitation is more important in regulating PP on the global scale than the direct temperature effect on growth rates in the IPSL model and in the other models applied here.
Quantitatively, the four models show large differences in regional responses. These are often linked to differences in the simulation of the mean PP fields. For example, iron limitation is too strong in the MPIM in the low and midlatitude ocean and in the CSM1.4 model in the equatorial Pacific. Consequently, PP in these regions is very low for these models and the projected decrease is also small by necessity. Other differences are related to the climate sensitivity of the models. The CSM1.4 model has the smallest climate sensitivity and shows a smaller surface warming and smaller changes in low-latitude stratification than the IPSL and MPIM model. The comparison between observationbased PP estimates and simulated PP ( Fig. 2; Schneider et al., 2008) suggests that it is not advisable to simply average the results from the four models as obvious shortcomings of the models would unfavorably influence the multi-model mean projection.
We have applied regional model skill metrics as weights in the computation of multi-model means. Here, we have used the satellite-based PP estimates (average of annual mean PP for the period 1998 to 2005) of Behrenfeld et al. (2006) as an example target against which the performance of individual models is assessed; in the future it might be useful to compare models to the ensemble of satellite-based reconstructions of PP and chlorophyll given their uncertainties. Other metrics, such as how well the models reproduce current surface nutrient distributions, could be used as additional targets (Doney et al., 2009a). A scale length is introduced for the regional skill score calculation that can be adjusted for the problem considered. Here, the scale length has been selected to be representative for the spatial scale of marine biogeographical provinces (≈10 • ); the exact choice of the numerical value is not crucial for our application. The multi-model mean PP changes are expressed as PP change per a nominal increase in global mean surface air temperature of 1 • C to account for the different climate sensitivities of the models. The use of regional metrics has advantages. It results in an improved skill in representing the satellite-based PP field compared to a conventional, IPCC-type multi-model average where each model is given equal weight. Most weight is at-tached to the model that represents an individual regional feature best, whereas little weight is attached to the models that fail to reproduce the regional feature. The regional metrics quantify the regional performance of each model (Fig. 9). Features that all models fail to represent as evidenced by low skills can be flagged in the multi-model average. Disadvantages are that suitable target fields have to be defined and scale lengths to be determined. The choice of an annual mean climatological field as a target is debatable. Additional targets including seasonal or interannual variability (Santer et al., 2009) may be applied. Most preferable would be observation-based data that include decadal scale trends when evaluating projections of the 21st century. Further, our approach, as any weighting scheme, is based on the assumption that the relative skill of the model remains about the same over time. A more fundamental caveat is worth mentioning. Each individual model provides an internally consistent representation of heat and mass fluxes, nutrient cycling, and ecosystem dynamics taking fully into account first order principles such as mass and energy conservation. By using regional weights, regional features from different models are combined to a new global mean field which may lack internal consistency. We believe that our regional weight approach is preferable compared to the conventional "one model, one vote" approach to generate a multi-model mean projection of PP. However, we caution that this might not be the case for other applications.
Our results are contradictory to the results of Sarmiento et al. (2004) on the global scale and in most regions (Fig. 13). Sarmiento et al. (2004) project an increase in global PP by 0.7 to 8.1% and not a decrease. These authors rely on an empirical model approach in combination with output for physical variables from AOGCM global warming simulations. The cycling of nutrients and nutrient concentrations are not explicitly considered. Seven physics-based diagnostics (surface temperature, salinity and density, upwelling and vertical density gradient in the top layers, mixed layer depth, and ice cover) are used to define 33 biogeographical provinces. An empirical chlorophyll model, describing chlorophyll as an exponential function of temperature, salinity, mixed layer depth and growing season length, is fitted to the SeaWiFS chlorophyll data for each province and used to project 21st century changes in chlorophyll from the AOGCM output. Finally, PP is estimated from the chlorophyll concentration for three different productivity algorithms (Behrenfeld and Falkowski, 1997a;Carr, 2002;Marra et al., 2003). This chain of models yield an increase in PP almost in the entire ocean for the Marra et al. algorithm and, to a lesser extent for the Carr algorithm, whereas the Behrenfeld and Falkowski algorithm yields a decrease in PP in low and mid latitudes and an increase in high-latitudes. Only the projected decrease in low and mid latitudes with the Behrenfeld and Falkowski algorithm is consistent with this and an earlier analysis with the IPSL model (Bopp et al., 2001). On the global scale, the increase in PP projected by Sarmiento et al. (2004) Fig. 11 in Sarmiento et al., 2004) and simulated with the mechanistic models IPSL, MPIM, CSM1.4, and CCSM3 (right). In the left panel the productivity is calculated with the three different primary production algorithms of Behrenfeld and Falkowski (1997a, B&F), Carr (2002), and Marra et al. (2003). The multi-model mean shown in the right panel (cyan) has been calculated using regional skill scores.
range of about 0.2% to 2% per nominal change in surface air temperature of 1 • C for the three algorithms. The corresponding range is −4% to −0.5% for the four models used here, whereas the model of  shows a more than five times larger sensitivity than inferred by the empirical approach. What are the reasons for the discrepancies between results from the empirical approach and those from processbased climate-biogeochemical-ecosystem models used in this study? A fundamental conceptual difference is that the cycling of nutrients and nutrient availability is explicitly considered in the process-based models, whereas nutrient limitation is only implicitly included in the empirical approach of Sarmiento et al. (2004) and the satellite productivity algorithms. As nutrients are a key factor for phytoplankton growth and PP, it appears necessary to take the decadal-to-century scale evolution of nutrient cycling into account as done in the process-based models. As discussed by Sarmiento et al. (2004), projected changes in chlorophyll are small for their empirical approach, and their changes in PP depend critically on the applied satellite algorithm. Sarmiento et al. (2004) highlight the importance of the assumed relationship between temperature and PP for a given chlorophyll concentration. This temperature sensitivity of PP is very different among the satellite algorithms. For example, PP increases with temperature by a factor of about two between 18 • C and 30 • C for the Marra et al. algorithm, but decreases by a factor of two over the same temperature range for the Behrenfeld and Falkowski algorithm. Consequently, PP is projected to decrease in low and mid latitudes with the Behrenfeld and Falkowski algorithm and to increase with the Marra et al. algorithm in transient warming scenarios. These discrepancies between algorithms may reflect the difficulties to separate light and nutrient effects on PP . We note that observation-based changes in global chlorophyll and inferred global PP by Behrenfeld et al. (2006) evolve in parallel. An implicit assumption in the empirical approach is that the spatial relationship between PP and physical forcing found for the modern ocean can be applied to temporal changes into the future. However, Schneider et al. (2008) find that the relationship between PP and temperature in the low-latitude ocean is different for interannual variations of the last decades and the century-scale trends in transient warming simulations.
Process-based models are far from perfect (Schneider et al., 2008) and their results must be interpreted with some caution. However, it appears evident from our analysis that the cycling of nutrients and changes in the supply to the surface and in the concentration of nutrients must be realistically represented to project changes in PP and EP with some realism. What is required for further progress is to combine satellite, field, and laboratory observations, empirical approaches and process-based models to further improve our quantitative understanding. Novel metrics such as (multivariate) regional skill scores may prove useful to synthesize results from models and observational studies in a quantitative and transparent way. As far as modeling is concerned, factorial experiments dedicated to quantify the link between www.biogeosciences.net/7/979/2010/ Biogeosciences, 7, 979-1005, 2010 PP and individual parameters will be helpful to improve the understanding of model behavior and to compare model results with experimental data. Improved parametrizations of ecosystem processes that take into account emerging results from field and laboratory studies are required to close gaps in understanding.