Measurement depth effects on the apparent temperature sensitivity of soil respiration in field studies

CO 2 efflux at the soil surface is the result of respiration in different depths that are subjected to variable temperatures at the same time. Therefore, the temperature measurement depth affects the apparent temperature sensitivity of field-measured soil respiration. We summarize existing literature evidence on the importance of this effect, and describe a simple model to understand and estimate the magnitude of this potential error source for heterotrophic respiration. The model is tested against field measurements. We discuss the influence of climate (annual and daily temperature amplitude), soil properties (vertical distribution of CO 2 sources, thermal and gas diffusivity), and measurement schedule (frequency, study duration, and time averaging). Q 10 as a commonly used parameter describing the temperature sensitivity of soil respiration is taken as an example and computed for different combinations of the above conditions. We define conditions and data acquisition and analysis strategies that lead to lower errors in field-based Q 10 determination. It was found that commonly used temperature measurement depths are likely to result in an underestimation of temperature sensitivity in field experiments. Our results also apply to activation energy as an alternative


Introduction
Soil respiration is increasingly recognized as a major factor in the global carbon cycle. Due to a rising interest in the feedback between soils and climate change, numerous studies have provided relations between temperature and soil respiration obtained either in the laboratory or in the field. Typically, the temperature sensitivity of soil respiration is expressed as the Q 10 value, i.e. the factor by which respiration is enhanced at a temperature rise of 10 K (Appendix A).
Correspondence to: A. Graf (a.graf@fz-juelich.de)
Several restrictions to the significance of the Q 10 concept, especially if mistaken as a means to extrapolate soil CO 2 losses into a warmer future, have been brought up (Tuomi et al., 2008). Here, we examine an additional restriction which has received remarkably little attention in the literature. In most field studies, column-integrated soil respiration and its sensitivity are quantified by a single temperature measurement, while the total flux is a sum of source terms from various depths, which are exposed to different temperature regimes. Because of the attenuation and phase shift of temperature fluctuations with increasing depth, the apparent Q 10 will depend on the temperature measurement depth. This possibility was mentioned first by Lloyd and Taylor (1994), but without quantification. Davidson et al. (1998) predicted that Q 10 values would increase with temperature measurement depth, and recognized that this complicates comparisons between studies. Recently, several field studies with multiple temperature measurement depths have been published (Xu and Qi, 2001; Hirano et al., 2003; Tang et al., 2003; Gaumont-Guay et al., 2006; Khomik et al., 2006; Shi et al., 2006; Wang et al., 2006; Pavelka et al., 2007). All of them show an increase of apparent Q 10 with depth. The same effect has also been identified in model simulations by Hashimoto et al. (2006), and demonstrated by way of example with synthetic data in a recent overview paper by Reichstein and Beer (2008). In a laboratory incubation, Reichstein et al. (2005a) found strongly differing temperature time series between two probe locations within the soil core, and used a multiple regression to consider both locations as sources.
To our knowledge, no explanations of the strongly varying shape of these relationships have been provided so far. In addition, it is unclear which Q 10 value, if any, is most appropriate when temperature measurements at multiple depths are available. Tang et al. (2003), Perrin et al. (2004) and Shi et al. (2006) use the temperature measurement depth yielding the highest R 2 . Gaumont-Guay et al. (2006) suggest that the temperature-efflux curve with the lowest hysteresis indicates the most appropriate temperature measurement depth. Pavelka et al. (2007) also use the maximum R 2 method, but additionally performed a cross-correlation analysis to align each depth's temperature time series with the efflux. Since most studies use a single, more or less arbitrary, temperature measurement depth, the effect of varying temperature measurement depth is often not considered. The aim of this study is to quantify the error in Q 10 determination caused by different temperature measurement depths as a function of soil properties, climate, and measurement schedule. To this end, we present a simple model and validate it against field measurements of heterotrophic respiration. We consider this model as a tool that helps with the design of field studies with meaningful temperature measurement depths, and with a more appropriate interpretation of existing datasets.
Table 1 (excerpt). Columns: study; flux method; land use; measurement frequency; study period; climate and elevation.
Khomik et al. (2006); chamber; forest (mixedwood); 1 month⁻¹ (b), not in winter; Jul 2003-Jul 2005; boreal, ca. 360 m
7 Shi et al. (2006); chamber; farmland (irrigated winter wheat); ≥1 month⁻¹, ≥2 day⁻¹ (c); Sep 1999-Aug 2001; continental temperate, 3688 m
8 Wang et al. (2006); chamber; forest (six different types); 2 week⁻¹; Apr 2004-Oct 2005; continental monsoon, ca. 300 m
9 Pavelka et al. (2007); chamber; grassland (9a), forest (9b); 80 min⁻¹; 3-9 Aug 2004 (9a), 19-24 May 2002 (9b); temperate, 850 m (9a), 890 m (9b)
(a) results given separately for two sites
(b) morning and afternoon of the measurement day in summer, once per day in transition months
(c) on two days per month in summer, 8 times at some days

Literature review
We found nine studies where multiple temperature measurement depths were used to derive apparent Q 10 depth profiles. An overview of the flux methods, site characteristics, and time schedules is given in Table 1.
Two of these studies use continuous CO 2 concentration profile measurements in the soil to calculate half-hourly surface CO 2 effluxes validated against chamber measurements. All other studies directly use a closed chamber system to measure CO 2 efflux. Many studies use a nested approach with one or more measurement days each month, and two to ten measurements per such day (Table 1). Some studies cover a period of less than a year, whilst others leave out the winter months for operational reasons.

Model
The model is based on the concept of thermal diffusion and is implemented in Fortran95. An overview of the model architecture is given in Fig. 1 and the theory behind the model is described in the Appendix. In brief, a simplified infinite near-surface temperature time series is generated using several distinct sine waves. The annual and diurnal cycles each have a phase shift to correctly reproduce the times of maxima and minima, assuming that t=0 is new year's midnight. A further cycle with a period of 12 h, a phase shift of 1 h, and an amplitude A=A diurnal /4 was used to mimic the skewness of the daily temperature cycle due to slow cooling during the night.
Variations of the diurnal amplitude and day length were not considered. The average temperature was set to the global average (15 °C) in the numerical experiments, and equalled the average measured temperature (12.7 °C) in the model validation. Input amplitudes are determined for the uppermost temperature sensor (0.5 cm) in the model validation. In the numerical experiments, amplitudes were provided for a reference depth of 5 cm. The reason is that amplitudes at this depth are more similar to those of air temperature than those of the soil surface temperature are. Air temperature amplitudes are globally available and provide a more common reference than surface temperature.
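The construction of the near-surface temperature series described above can be sketched in a few lines. This is a minimal illustration, not the authors' Fortran95 code: the function name and the two lag parameters are hypothetical, the default lags of zero are placeholders (the text does not state the phase shifts of the annual and diurnal cycles), and only the 12 h harmonic with a 1 h shift and a quarter of the diurnal amplitude follows the description directly.

```python
import math

HOUR = 3600.0
DAY = 24 * HOUR
YEAR = 365 * DAY


def near_surface_temperature(t_s, mean=15.0, amp_annual=10.0, amp_diurnal=5.0,
                             lag_annual_s=0.0, lag_diurnal_s=0.0):
    """Near-surface temperature (deg C) at time t_s (seconds after t = 0).

    lag_annual_s and lag_diurnal_s are assumed placeholders for the phase
    shifts of the annual and diurnal cycles, which are not given in the text.
    """
    annual = amp_annual * math.sin(2 * math.pi * (t_s - lag_annual_s) / YEAR)
    diurnal = amp_diurnal * math.sin(2 * math.pi * (t_s - lag_diurnal_s) / DAY)
    # 12 h harmonic, shifted by 1 h, with a quarter of the diurnal amplitude:
    # it skews the daily cycle.
    half_day = (amp_diurnal / 4) * math.sin(2 * math.pi * (t_s - HOUR) / (DAY / 2))
    return mean + annual + diurnal + half_day


# One year at the model time step of 1 h.
hourly = [near_surface_temperature(h * HOUR) for h in range(365 * 24)]
```

Because all three sine waves complete whole periods within one year, the annual mean of the hourly series equals the prescribed average temperature.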
The generated near-surface temperature time series is transferred to other soil depths using an analytical solution of the thermal diffusion equation (Appendix B). This solution does not consider time-variant thermal diffusivity. Instead, we use an effective thermal diffusivity representing the time averaged effect of soil moisture at each depth. On the other hand, time average effective thermal diffusivity may vary strongly with depth due to differences in soil properties and water content. To account for this, the analytic solution was applied in discrete depth steps of 1 cm, using the amplitudes and phase shifts in each layer to calculate those of the next deeper layer (Appendix B). The model is run with a time step of 1 h. Soil respiration is calculated from temperature using the Q 10 concept and, as an alternative, also using the Arrhenius concept (see Appendix A). The source strength of respiration at the average temperature is also given as a depth-dependent value. Here, only a relative vertical distribution is required because absolute values have no effect on the resulting apparent Q 10 profile.
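The layer-by-layer transfer of a temperature wave can be illustrated with the standard solution of the heat diffusion equation for a sinusoidal boundary condition, in which the amplitude decays as exp(-z/d) and the phase lags by z/d, with damping depth d = sqrt(2 D_T / ω). The sketch below applies this per 1 cm layer so that diffusivity may differ between layers. It is an assumption that this matches the formulation in Appendix B, and the function name is hypothetical.

```python
import math


def propagate_wave(amp_top, period_s, diffusivities_m2s, dz_m=0.01):
    """Amplitude (K) and phase lag (rad) of one sine wave at each layer boundary.

    diffusivities_m2s holds one effective thermal diffusivity per layer,
    ordered from the top layer downwards.
    """
    omega = 2 * math.pi / period_s
    amps, lags = [amp_top], [0.0]
    for d_t in diffusivities_m2s:
        damping_depth = math.sqrt(2 * d_t / omega)  # e-folding depth in m
        amps.append(amps[-1] * math.exp(-dz_m / damping_depth))
        lags.append(lags[-1] + dz_m / damping_depth)
    return amps, lags


# Diurnal wave of 5 K through ten 1 cm layers with 0.5 mm2 s-1 (5e-7 m2 s-1).
amps, lags = propagate_wave(5.0, 86400.0, [5e-7] * 10)
```

For a depth-invariant diffusivity the stepwise result is identical to evaluating the analytical solution once over the full depth.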
If CO 2 diffusion time from each depth to the soil surface is assumed to be insignificant, the efflux can simply be calculated by integration of the respiration over all depths. However, in analogy to the impact of thermal diffusion on the apparent Q 10 discussed above, slow gas diffusion could also affect the apparent Q 10 . To test this hypothesis, we also included CO 2 diffusion in several model runs. As already proposed for heat diffusion, we use an effective diffusivity D_CO2 θ_a⁻¹ (Appendix C) invariant in time but vertically distributed. Because the concentration profiles are a result of the vertical source distribution and the nonlinear temperature dependence, CO 2 diffusion cannot be solved analytically. Therefore, we implemented a numerical solution (Appendix C). The CO 2 flux between two adjacent layers is now the product of diffusivity and the concentration gradient. We assume no vertical exchange between the lowest layer and the underground. At the surface, a constant atmospheric CO 2 concentration of 16.5×10³ µmol m⁻³ is maintained. The model considering diffusion requires initialization of the concentration profile. Therefore, the model uses a spin-up period. The length of the spin-up period is considered adequate when the difference in cumulative efflux between runs with and without diffusion is less than 1%.
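A numerical treatment of this kind could look like the explicit finite-difference step below. It is a generic scheme written for illustration and is not taken from Appendix C: the function name, the cell-centred grid, and the half-cell distance to the surface are assumptions, and concentrations, sources and fluxes are all expressed per unit of soil air so that a single effective diffusivity suffices.

```python
def step_co2(conc, source, d_eff, dz, dt, c_atm=16.5e3):
    """Advance a CO2 profile (umol m-3, cell-centred, top cell first) by dt.

    source is the production per cell (umol m-3 s-1), d_eff the effective
    diffusivity (m2 s-1). Returns the new profile and the surface efflux
    (positive upward). The step is stable for dt * d_eff / dz**2 <= 1/3.
    """
    n = len(conc)
    flux = [0.0] * (n + 1)  # downward flux at the n + 1 cell interfaces
    # Surface interface: constant atmospheric concentration half a cell above.
    flux[0] = -d_eff * (conc[0] - c_atm) / (dz / 2)
    for i in range(1, n):
        flux[i] = -d_eff * (conc[i] - conc[i - 1]) / dz
    # flux[n] stays zero: no exchange between the lowest layer and below.
    new_conc = [conc[i] + dt * ((flux[i] - flux[i + 1]) / dz + source[i])
                for i in range(n)]
    return new_conc, -flux[0]
```

With a constant source, repeated steps approach a steady state in which the surface efflux equals the depth-integrated production. The same comparison of efflux with and without diffusion underlies the spin-up criterion described above.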
Finally, the modelled time series of efflux at the surface and temperature in each depth are used to simulate the current practice of field-based Q 10 determination. For each depth, regression of log-transformed efflux against temperature T is used to compute Q 10 . To also test fitting of the Arrhenius relation, the inverse of the temperature is plotted against log-transformed respiration. In this case, the resulting activation energy is converted into a Q 10 at the study's average temperature for comparison (cf. Sanderman et al., 2003).
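The two fitting procedures can be written compactly. In the sketch below, all function names are hypothetical; the Q 10 follows from the slope b of ln(efflux) against temperature as exp(10 b), and the activation energy from the slope of ln(efflux) against 1/T. The conversion of activation energy to Q 10 uses the ratio of Arrhenius rates at T+10 K and T, which is one common convention and is assumed here; the text does not state the exact form used.

```python
import math

R_GAS = 8.314  # universal gas constant, J mol-1 K-1


def _slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def q10_from_regression(temp_c, efflux):
    """Apparent Q10 from regressing log-transformed efflux on temperature."""
    return math.exp(10.0 * _slope(temp_c, [math.log(f) for f in efflux]))


def activation_energy(temp_c, efflux):
    """Apparent activation energy (J mol-1) from an Arrhenius fit."""
    inv_t = [1.0 / (t + 273.15) for t in temp_c]
    return -R_GAS * _slope(inv_t, [math.log(f) for f in efflux])


def q10_from_activation_energy(ea, mean_temp_c):
    """Q10 equivalent to ea at the mean temperature (assumed T to T+10 K)."""
    t = mean_temp_c + 273.15
    return math.exp(10.0 * ea / (R_GAS * t * (t + 10.0)))
```

Applied to noise-free synthetic data generated with a known Q 10 or activation energy, both fits return the input value, which makes them a convenient check before they are used on modelled or measured series.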

Field measurements
An automated soil CO 2 flux chamber system (Li-8100, Li-Cor Inc., Lincoln, Nebraska, USA) was operated with four type T thermocouple thermometers at the FLOWatch project test site Selhausen of the Forschungszentrum Jülich. The test site is located in the river Rur catchment (50°52′09″ N, 06°27′01″ E, 104.5 m above sea level). The climate is warm temperate, the soil is an Orthic Luvisol and the texture is silt loam according to the USDA classification. A detailed description of the test site is given by Weihermüller et al. (2007). Organic carbon content was determined in vertical steps of 15 cm. In September 2006, the soil was tilled to a depth of 15 cm and power harrowed. Bare field conditions were maintained by a repetition of this treatment in April 2007, several applications of glyphosate, and manual weed control at the efflux measurement plot. Historically, the field was annually ploughed to a depth of 30 cm, and the crop rotation was sugar beet-winter wheat. From 15 October 2006 to 24 April 2007 only one CO 2 flux system was used (closing interval every 30 min). From 24 April to 14 October 2007, four identical chambers with a separation of 20 cm were operated with the Li-8100 multiplexer system (closing interval 15 min for each chamber). The soil flux chambers were placed on soil collars of 20 cm in diameter and a height of 7 cm, which were inserted 5 cm into the soil. The system was closed for two minutes for each flux measurement. CO 2 and water vapour concentration as well as chamber headspace temperature were measured every second, and the CO 2 concentration was corrected for changes in air density and water vapour dilution. The soil respiration was calculated by fitting a linear regression to the corrected CO 2 concentrations from 30 s after closing until reopening.
The thermocouples used to measure soil temperature have 1 mm thick unshielded joints to ensure a quick response, and were installed horizontally at 0.5, 3, 5, and 10 cm depth, 20 cm away from the chamber system. Temperature data were logged every second while the chamber was closed, and averaged. To vertically extend the empirical apparent Q 10 profiles, we also use temperature data of pF-meters (Ecotech, Bonn, Germany) in 15, 30, 45, 60, 90 and 120 cm depth, which were logged independently in 1 h intervals.
To obtain a uniform dataset, the efflux and temperature measurements were reduced to median hourly CO 2 flux and average hourly soil temperature at each measurement depth. In the case of CO 2 flux, the median was used because it is less sensitive to outliers and non-normal distributions. In the final data set, only those hours were considered where all flux and temperature measurements were available. Because more than 50% of the hours in December and January could not be considered due to power supply problems, these two months were completely excluded from the dataset.
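The reduction to a uniform hourly dataset can be sketched as follows. The record layout (tuples keyed by an hour label) and the function name are assumptions made for illustration, and the exclusion of months with more than 50% missing hours is not included.

```python
from collections import defaultdict
from statistics import mean, median


def hourly_dataset(flux_records, temp_records, depths):
    """Reduce raw records to one median flux and mean temperatures per hour.

    flux_records: iterable of (hour, flux)
    temp_records: iterable of (hour, depth, temperature)
    Hours lacking a flux value or a temperature at any depth are dropped.
    """
    flux_by_hour = defaultdict(list)
    for hour, flux in flux_records:
        flux_by_hour[hour].append(flux)

    temp_by_hour = defaultdict(lambda: defaultdict(list))
    for hour, depth, temp in temp_records:
        temp_by_hour[hour][depth].append(temp)

    dataset = {}
    for hour, fluxes in flux_by_hour.items():
        temps = temp_by_hour.get(hour, {})
        if all(depth in temps for depth in depths):
            # Median flux is robust against outliers; temperatures use the mean.
            dataset[hour] = (median(fluxes), {d: mean(temps[d]) for d in depths})
    return dataset
```

Keeping only complete hours ensures that the regression at every depth is based on exactly the same set of efflux values.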
To determine the effective soil thermal diffusivity, we derived the annual amplitude in each depth from average daily temperature, and applied the phase equation (e.g. Verhoef et al., 1996) to each pair of successive temperature measurement depths. Linear regression provided effective D T values for each depth increment (cf. Appendix B).
Fig. 2. Empirical apparent Q 10 as a function of temperature measurement depth z. Numbers refer to the study bibliography given in Table 1, single depth references are listed in the methods section. Depths >0 denote air temperature (height not to scale).
Figure 2 shows apparent Q 10 values as a function of depth from this and other studies. An increase of apparent Q 10 with depth can be seen in all studies, but with a strongly variable slope. The highest apparent value (Gaumont-Guay et al., 2006, Q 10 =150 in a temperature measurement depth of 50 cm) is not shown for scaling reasons. This profile is based on measurements taken during two winter months. The second highest value was found by Khomik et al. (2006), also at 50 cm, in long-term measurements excluding winter months, but including snow cover situations in spring, and capturing the diurnal cycle in summer (Table 1). Of the remaining profiles, our own measurements and those by Shi et al. (2006), both from farmland and capturing the diurnal cycle, increase most strongly with depth. The remaining profiles exhibit comparatively low, but still substantial apparent Q 10 increases with depth. In the study by Perrin et al. (2004), the air temperature 9 m above ground level is included and yields a considerably lower value than the three soil temperature series, which are close to each other both in measurement depth and in apparent Q 10 . The study by Pavelka et al. (2007), which used the shortest dataset, shows an increase only up to a depth of 5 (grassland) or 10 (forest) cm, followed by a decrease for greater depths. Note that Pavelka et al. (2007) also provide Q 10 values based on a synchronization of each depth's temperature time series with efflux by cross-correlation. In this case, the apparent Q 10 increases exponentially with depth, reaching an extremely high Q 10 value of 799 in 30 cm depth (grassland). The values from studies using a single temperature measurement depth also show Q 10 values increasing with depth. No single-depth study was found with a temperature measurement depth deeper than 10 cm.

Model validation
Figure 2 also shows the best model fit (RMSE of 0.16) obtained by fitting a depth-invariant input Q 10 , while assuming a model domain of 50 cm, a homogeneous carbon source distribution within the plough layer (0 to 30 cm depth) and a carbon-free subsoil, and neglecting CO 2 diffusion. The depth-invariant input Q 10 yielding this optimum fit was 5.9. We did not consider depth-dependent values of the input Q 10 in order to avoid over-fitting. It should be noted that the results were not substantially different when using an Arrhenius relationship instead of the Q 10 concept (not shown). This also applies to all results shown below. The model fit was poorer when using the measured, linearly interpolated C org profile as a proxy of the source strength distribution. Increasing the length of the model domain to 120 cm also decreased model quality (Table 2). The optimal input Q 10 values found for these different conditions vary from 5.3 to 6.2, and would have been directly measured in depths between 10 cm and 20 cm. Considering CO 2 diffusion either led to negligible differences or higher errors, depending on diffusivity (also see next sections).

Numerical experiments
The validated model was used to study the effect of several factors on the apparent Q 10 profile. Figure 3 shows apparent Q 10 values as a function of both temperature measurement depth and each factor considered in this study. The depth where the R 2 between soil respiration and temperature is highest is indicated with R 2 max . The input Q 10 used to generate all plots is 2.5.
In the case of a homogeneous respiring A-horizon of varying thickness above a non-respiring subsoil (Fig. 3a), the input Q 10 is obtained at about half the depth of the respiring layer. The highest R 2 , however, is found at a shallower depth. The difference between the optimal measurement depth and the depth with the highest correlation increases with the thickness of the respiring layer (up to 10 cm for a 50 cm thick respiring layer). The apparent Q 10 at the depth of highest R 2 , however, does not differ by more than 5% from the input value. Typical measurement depths used in field studies (0 to 10 cm) result in errors ranging from −30 to +10% depending on the depth of the respiring layer. The apparent Q 10 values shown in Fig. 3a vary from less than 1.8 to more than 3, which is about the range of most reported values (Raich and Schlesinger, 1992), although the input Q 10 was constant at 2.5. In all other plots (Fig. 3b to f), we assumed a respiring layer thickness of 30 cm.
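A strongly reduced version of this numerical experiment can be reproduced with a few functions. The sketch below is an illustration under stated assumptions and not the authors' model: amplitudes (10 K annual, 5 K diurnal) are prescribed at the surface rather than at the 5 cm reference depth, the 12 h harmonic and all phase shifts at the surface are omitted, CO 2 diffusion is neglected, and all names are hypothetical. It uses the standard settings named in the text (input Q 10 of 2.5, thermal diffusivity of 0.5 mm 2 s −1 , 30 cm respiring layer, one year at hourly resolution).

```python
import math

YEAR, DAY = 365 * 86400.0, 86400.0
MEAN_T = 15.0    # average temperature, deg C
D_T = 5e-7       # thermal diffusivity, m2 s-1 (0.5 mm2 s-1)
WAVES = ((10.0, YEAR), (5.0, DAY))  # (surface amplitude in K, period in s)


def temperature(t_s, z_m):
    """Soil temperature at time t_s (s) and depth z_m (m)."""
    temp = MEAN_T
    for amp, period in WAVES:
        omega = 2 * math.pi / period
        damping = math.sqrt(2 * D_T / omega)
        temp += amp * math.exp(-z_m / damping) * math.sin(omega * t_s - z_m / damping)
    return temp


def apparent_q10(efflux, temps):
    """Q10 from the regression of log-transformed efflux on temperature."""
    n = len(temps)
    log_f = [math.log(f) for f in efflux]
    mt, mf = sum(temps) / n, sum(log_f) / n
    sxy = sum((t - mt) * (f - mf) for t, f in zip(temps, log_f))
    sxx = sum((t - mt) ** 2 for t in temps)
    return math.exp(10.0 * sxy / sxx)


def q10_profile(sensor_depths_m, input_q10=2.5, layer_cm=30):
    """Apparent Q10 for each sensor depth, homogeneous respiring layer."""
    times = [h * 3600.0 for h in range(365 * 24)]
    efflux = [0.0] * len(times)
    for cm in range(layer_cm):  # sum the sources of all 1 cm layers
        for i, t in enumerate(times):
            efflux[i] += input_q10 ** ((temperature(t, cm / 100) - MEAN_T) / 10.0)
    return {z: apparent_q10(efflux, [temperature(t, z) for t in times])
            for z in sensor_depths_m}


if __name__ == "__main__":
    for depth, q10 in q10_profile([0.0, 0.15, 0.30]).items():
        print(f"sensor at {depth:.2f} m: apparent Q10 = {q10:.2f}")
```

In this idealized setting, a sensor at the surface yields an apparent Q 10 below the input value, a sensor at the bottom of the respiring layer yields one above it, and a sensor near the centre of the layer comes close to the input.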
The impact of the length of the measurement period is illustrated in Fig. 3b. For short periods (less than about 180 days), the apparent Q 10 is highly irregular. For measurement periods longer than a year, the apparent Q 10 is stable throughout the first 20 cm depth. It should be noted that we assumed that inter-annual variations in average temperature can be neglected here. All other plots are based on a 1 year measurement period.
Changing the thermal diffusivity of the soil (one value for all depths, Fig. 3c) yields an irregular behaviour for values less than 0.1 mm² s⁻¹. Above this threshold, possible apparent Q 10 errors, as well as the distance between the Q 10 obtained from the highest R 2 and the input Q 10 , decrease with increasing diffusivity. We used a thermal diffusivity of 0.5 mm² s⁻¹ in all other plots.
The influence of CO 2 transport is neglected in all simulations except for those presented in Fig. 3d. Considering gas diffusion leads to an offset in apparent Q 10 in the first 20 cm compared to cases where diffusion is not considered, but the extent of this offset is less than 2% for effective diffusivities greater than 0.5 mm² s⁻¹. Below 0.5 mm² s⁻¹, this offset increases sharply and the depth of the highest R 2 can be found below rather than above the depth regaining the input Q 10 .
In Fig. 3e, the annual temperature amplitude was varied from 0 to 20 K (twice the value used in the other model runs). For annual amplitudes below the diurnal amplitude of 5 K, the resulting profile is highly irregular with a local maximum. In addition, the temperature sensitivity is underestimated throughout most of the modelling domain. Figure 3f shows the effect of varying diurnal amplitudes. High diurnal amplitudes increase the errors made within the first 20 cm, and lead to an underestimation of temperature sensitivity when using shallow temperature sensors. Zero diurnal temperature amplitudes yield an almost linear apparent Q 10 profile and a close proximity of the depth with the highest R 2 to the depth regaining the input Q 10 . Note that in our numerical experiments, this behaviour could be reproduced using daily averages of temperature and CO 2 efflux. Averaging efflux before or after log-transformation only resulted in negligible differences (ΔQ 10 <0.01). Simulating only one measurement per day at a fixed time also yields similar results, but with a small vertical offset of about 3 cm depending on the time of day of the measurement.
All experiments shown so far used a depth-invariant input sensitivity. Figure 4 shows the apparent Q 10 profiles resulting from a linear change of input Q 10 between the surface and the bottom of the respiring layer. As example minimum and maximum values, we use 2.5 (as in the previous experiments), and 4.6. These values have been identified in a study by Boone et al. (1998) for heterotrophic respiration excluding the rhizosphere and root-related respiration, respectively. They may thus represent a vertical gradient between 0 and 100% root contribution to total respiration, or a change in the quality of organic carbon pools. The experiment was performed using the same standard settings as Fig. 3, and was repeated for a thicker respiring layer of 120 cm. Descending and ascending profiles yield almost identical results, differing mainly in a Q 10 offset of up to 0.21 in the upper 50 cm. The same is true for a depth-invariant Q 10 that is the arithmetic (higher value) or geometric average of the above gradient in discrete 1 cm steps. Figure 4 also shows the general effect of deep carbon (here, 120 cm) contributing the same reference temperature respiration as shallow horizons. In agreement with the trend in Fig. 3a, the depth regaining the input Q 10 moves further downwards. All shown measurement depths now underestimate temperature sensitivity. A combination of this situation with a short measurement period or low annual amplitude (not shown) aggravates this underestimation, making local Q 10 minima of less than 1 more probable. As a further example of combined effects, a low thermal diffusivity of 0.1 mm² s⁻¹ was combined with a varying measurement period. In this case, both very high Q 10 above 10 and values of less than 1 can be found in greater measurement depths, depending on the actual measurement period.
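The two depth-invariant substitutes for the linear gradient can be computed directly. The sketch below assumes that the gradient is discretized into 1 cm layers with the end values 2.5 and 4.6 assigned to the top and bottom layer; this discretization and the function names are illustrative assumptions.

```python
import math


def q10_gradient(top, bottom, n_layers):
    """Linearly interpolated input Q10, one value per 1 cm layer."""
    return [top + (bottom - top) * i / (n_layers - 1) for i in range(n_layers)]


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


gradient = q10_gradient(2.5, 4.6, 30)  # 30 cm respiring layer
```

The geometric mean is always the lower of the two, consistent with the remark that the arithmetic average is the higher value.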

Literature and own field measurements
The variability of the Q 10 dependence on temperature measurement depth underlines the need for a methodology that allows comparison of temperature sensitivities determined in field experiments. Various explanations for the variability of apparent Q 10 profiles can be deduced from our modelling exercise. The highest reported apparent Q 10 (Gaumont-Guay et al., 2006) is based on those authors' deepest temperature measurements and a short study period of two months. The amplitude of the diurnal temperature is strongly attenuated at that depth, and the amplitude of the annual cycle is not fully sampled because of the short measurement period. Therefore, CO 2 efflux was correlated to temperature values with small amplitude and high phase shift, which can result in very high or very low apparent Q 10 values. The even shorter dataset by Pavelka et al. (2007) gives an example of such very low apparent sensitivities at great depths. At the same time, it yields very high values if the synchronization procedure suggested by the authors is applied. This procedure eliminates any phase shift, whether caused by gas diffusion or by an inadequate temperature measurement depth. The second highest Q 10 increase with depth (Khomik et al., 2006) originates from a study capturing the daily temperature cycle in summer, with additional less frequent measurements in spring and autumn, and no measurements in winter. The steep profiles found by Shi et al. (2006) and by ourselves were obtained for agricultural soils. A high and dense vegetation canopy, which is absent in these sites, attenuates the diurnal cycle more than the annual one. The diurnal cycle is also attenuated more strongly with depth than the annual one. Therefore in agricultural soils, with a higher diurnal amplitude at the surface, larger changes of temperature with depth are detectable. The lowest increase of Q 10 with depth was found in a study where measurements of the diurnal cycle of CO 2 efflux were avoided (Wang et al., 2006). The air temperature in proximity to the forest canopy included by Perrin et al. (2004) is supposed to have a higher diurnal amplitude than forest soil temperatures and consequently yields a lower apparent Q 10 .
Vegetation affects not only the temperature regime of the soil, but also respiration itself. All studies discussed here except for our own bare soil measurements include both heterotrophic and root respiration. Hanson et al. (2000) review various studies on the contribution of root to total soil respiration. Depending on ecosystem, they find that 10 to 90% of total respiration stems from roots with an average contribution of about 50%. Root respiration is related not only to those environmental variables that are known to influence heterotrophic respiration, but also to aboveground plant productivity and thus to radiation (Tang et al., 2005). This correlation is subject to a lag between several hours and several days (Moyano et al., 2008), due to the time taken by phloem transport from leaves to roots. The similarity between this lagged response to radiation and soil temperature at a certain depth, which may also be considered a lagged response to radiation, could cause confusion. In the interpretation of mixed soil respiration, too much of the temporal variability might be attributed to either soil temperature or aboveground radiation, depending on the normalisation procedure and the available temperature measurement depth. The possibility that conclusions about the lagged response of root respiration to radiation might be erroneous due to the temperature measurement depth effect was recently discussed by Bahn et al. (2008).
The model application to the field data demonstrates that the model is able to describe the temperature sensitivity variation with depth. The remaining uncertainty of about ±10% occurs when considering deeper layers, and their carbon content (Table 2). We attribute this to two main causes. First, temperature measurement errors become increasingly significant deeper in the soil, where amplitudes are smaller. Such errors are not simulated by the model. However, temperature sensitivity of soil respiration is rarely determined from temperature sensors installed in large depths. Second, there is considerable uncertainty in the source strength distribution. Organic carbon content includes accumulated stable carbon pools, the fraction of which can be depth-dependent itself. The field data were best described when neglecting the organic carbon content found below the A-horizon. This seems to indicate that deeper carbon is less involved in respiration activity, which is in good agreement with the general assumption that carbon pools in deeper horizons are more stable (cf. Fierer et al., 2003). The increasing uncertainty with depth also implies that field measurements of CO 2 efflux at the soil surface are not suited to derive the temperature sensitivity of deep buried carbon, which has been associated with higher temperature sensitivities by some (Knorr et al., 2005). Recently, an additional sensitivity of deep carbon decomposition to fresh carbon supply has been suggested (Fontaine et al., 2007). Our study shows that although a true increase of Q 10 with depth may be present, it should not be confused with the temperature measurement depth dependence of the apparent Q 10 (also see Fig. 4).
It was not necessary to consider CO 2 diffusion to model the apparent Q 10 variation with depth for our field experiment. This fits well with the results of the numerical experiments discussed in the next section, which showed that for most diffusivities observed in the field the impact should be low (Fig. 3d; Tang et al., 2003; Werner et al., 2004). Nevertheless, a general recommendation to neglect CO 2 transport should not be made based on the results of a single field study.
It is noteworthy that the measurement depths that would have yielded a Q 10 value in the range of the optimal input Q 10 of the model, are below 10 cm, while all single measurement depths found in our literature study are above that depth. When modelling a whole year, the apparent Q 10 differs less than 7% in the upper 30 cm and up to 16% in 50 cm depth. Given the ability of the model to describe the data measured during 10 months correctly, we assume that a full one-year dataset of hourly respiration would have shown the same deviation.
Finally, it should be mentioned that the model only considers the pure confounding factor temperature measurement depth. Depending on the site characteristics, other confounding effects, such as correlation of temperature with soil moisture (Davidson et al., 1998), may cause errors of similar magnitude in field-based Q 10 determination. In most climates, this correlation is negative, resulting in a further underestimation if CO 2 production is moisture-limited. However, cases have also been demonstrated where the availability of other substrates may lead to an overestimation, e.g. oxygen influenced by moisture. Such other confounding factors may be responsible for the high Q 10 we found after correcting for measurement depth errors. As already stated, the model does not consider root respiration. Therefore, temperature measurement depth errors in soils with a considerable contribution of roots can only be described correctly if other factors controlling root-related respiration do not covary with temperature. This problem was already discussed in the previous section. Also, roots may contribute to the variety of temperature sensitivities found in a single site or even depth. Boone et al. (1998) found strongly differing temperature responses between heterotrophic and root-related (including exudation-driven heterotrophic) respiration (cf. next section).

Numerical experiments
When the vertical source strength distribution consists of a homogeneous respiring layer above a non-respiring subsoil, the best depth to place a single temperature sensor is the centre of the respiring layer (Fig. 3a). Although such a distribution is not unrealistic for our field reference dataset, it may not be fulfilled in non-agricultural soils, especially in the presence of litter layers. As an alternative method to determine the most appropriate depth, Tang et al. (2003), Perrin et al. (2004), Shi et al. (2006) and Pavelka et al. (2007) suggested the maximum R 2 criterion. Although our numerical experiments show that this is not exactly correct, it is a good approximation in most conditions. However, both the R 2 criterion and the centre placement fail in extreme conditions, as illustrated in Fig. 3b to e.
The difference between the depth of highest R 2 and the depth regaining the input Q 10 is a result of the combined effect of amplitude attenuation and phase shift of temperature waves. For an infinitely thin respiring layer, the R 2 is highest for a temperature measurement within this layer. This measurement will also provide the correct Q 10 . At other depths, the R 2 is lower due to phase shifts in the temperature time series. For thicker respiring layers, efflux at the surface integrates over CO 2 production time series with different delays and amplitudes. If the delay is considered in isolation, the highest R 2 would occur in the middle of the respiring layer. However, the apparent Q 10 would underestimate the temperature sensitivity for all depths because the averaging of several phase-shifted temperature waves results in a smaller range of temperature values. When amplitude attenuation and phase shifts are both considered, deeper parts of the respiring layer show a smaller variance in both, temperature and their contribution to column respiration. Therefore, the depth of highest R 2 is shifted upwards. At the same time, the lower temperature amplitudes in these depths counteract the underestimation of the apparent Q 10 . Strictly spoken, the temperature measurement depth regaining the input Q 10 is not a "correct" depth, but a depth where positive and negative errors are balanced. The depth that regains the input Q 10 will not always be within the respiring layer, as illustrated by Fig. 3b. In this figure, the length of the measurement period was varied. The model qualitatively confirms that extremely high apparent temperature sensitivities for greater measurement depths, such as those found by Gaumont-Guay et al. (2006) and Khomik et al. (2006), can be caused by incomplete representation of the annual cycle. 
A more quantitative assessment is not possible because too little is known, especially about the varying thickness and thermal properties of the snow cover, which was an important feature in both studies. Organic topsoils, which may have had a very low thermal diffusivity, were reported in both studies. According to our model, this can lead to highly irregular apparent Q 10 profiles. The fact that measurement periods of less than half a year can result in high Q 10 errors is also relevant to studies separating the study period into seasons to capture plant phenological effects on temperature sensitivity (e.g. Xu and Qi, 2001;Yuste et al., 2004;deForest et al., 2006). The model also demonstrates that for even shorter measurement periods, such as the one analyzed by Pavelka et al. (2007), great measurement depths can yield very low apparent sensitivities.
Variation of the soil thermal diffusivity (Fig. 3c) confirms the expectation that accurate field-based Q 10 measurements are more likely when temperature waves propagate rapidly into the ground. According to Zmarsly et al. (2002), most soils have thermal diffusivities ranging between 0.1 (dry organic) and 0.75 mm 2 s −1 (wet sand). Therefore, the irregular behaviour of the apparent Q 10 for very low diffusivities is not relevant in most ecosystems.
Effective CO 2 diffusivities can cover a much larger range. A compilation by Werner et al. (2004) based on 81 studies shows that D CO 2 /θ a can range from 0.09 to more than 12 mm 2 s −1 . Despite this large range, our numerical experiment shows that the influence of diffusion on apparent Q 10 would be negligible for all but the three lowest values summarized by Werner et al. (2004). It is interesting that for such small diffusivities, the depth of highest R 2 can drop below the depth regaining the input Q 10 . We attribute this to the fact that the time series of surface efflux is now delayed compared to the temperature time series at those depths where most of the CO 2 is produced. Consequently, efflux correlates better with deeper temperature time series. This is no indication of a causal relationship, as the CO 2 produced at these depths is delayed even more before reaching the surface.
An evaluation of the effect of annual temperature amplitude (Fig. 3e) is relevant to avoid systematic errors when temperature sensitivities from different climatic zones are compared. Close to the equator where the annual amplitude is low, field-based determination of accurate Q 10 values is difficult. Typically, the temperature sensitivity will be underestimated. Continental and boreal climates with high annual amplitudes potentially allow an accurate determination of the Q 10 when the measurement period is long and continuous. This may be difficult in case of harsh winter conditions, or be complicated by the thermal properties of a snow cover (see above).
The numerical experiment on diurnal amplitude (Fig. 3f) is of particular interest because the positive effects of low diurnal amplitudes can be approximated by daily averaging of efflux and temperature time series. A similar reduction in daily amplitude can be obtained by measurements at a fixed time of day, but it remains to be examined whether this alternative is more susceptible to varying day lengths and amplitudes throughout the year.
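Daily averaging before fitting can be sketched as follows. The function name `daily_means` and the assumption of a regular sampling interval with a whole number of samples per day are ours; gaps and irregular sampling would need additional handling.

```python
def daily_means(values, samples_per_day):
    """Average a regularly sampled series into daily means.

    Incomplete days at the end of the series are dropped.
    """
    n_days = len(values) // samples_per_day
    return [
        sum(values[d * samples_per_day:(d + 1) * samples_per_day]) / samples_per_day
        for d in range(n_days)
    ]


# Example with three samples per day over two days (illustrative numbers only)
example = daily_means([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3)
```

The same function would be applied to both the efflux and the temperature series, so that the temperature sensitivity function is fitted to pairs of daily means.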
The experiment on depth-variant input Q 10 confirms what has been discussed during the model validation: Surface efflux measurements are poorly suited to assess the vertical variability of temperature sensitivity. The Q 10 derived from such a field study, even if the correct measurement depth was chosen, only represents an effective mean of the potentially different sensitivities of soil horizons. It remains to be tested whether additional CO 2 concentration measurements in various depths, or varying vertical profiles of soil moisture, can solve this ambiguity problem.
In general, our analyses indicate that a temperature measurement depth within the upper 10 cm, as commonly used in field studies, is likely to result in an underestimation of temperature sensitivity, at least in the absence of a litter layer. According to the latest IPCC report (Solomon et al., 2007), most models used to estimate the biochemical feedback of land surfaces to climate change assume a soil respiration Q 10 close to 2. It is noteworthy that this assumption is based on averaging not only laboratory but also field studies (Solomon et al., 2007), e.g. those compiled by Raich and Schlesinger (1992). These models predict a global effective sensitivity of heterotrophic respiration of 6.2% per K warming. However, a larger Q 10 of 2.5 would be well within the uncertainty range identified in this study. This would increase global sensitivity by about one third in each model, which is the same order of magnitude as the standard deviation among the models. The models give an average absolute sensitivity of land surfaces to climate change of −79 Gt sequestered carbon per K warming, although this rate is highly variable between the models (±45 GtC K −1 ). An additional uncertainty of one third due to an unknown primary temperature sensitivity of respiration, divided by the time span over which such a 1 K increase is assumed (40 to 50 years depending on scenario), would be equal to 7 to 9% of the current annual emissions from fossil fuel burning and cement production.
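The arithmetic behind "about one third" and the resulting annual rate can be retraced with a short sketch. It assumes, as a simplification of ours, that the relative sensitivity per K of an exponential temperature response scales with ln(Q 10 ), and it uses only the figures quoted above (79 GtC per K as a magnitude, 40 to 50 years per 1 K). The conversion to a percentage of fossil fuel emissions is not repeated here because the emission figure is not given in this section.

```python
import math

Q10_ASSUMED = 2.0        # value assumed by most models, as quoted above
Q10_ALTERNATIVE = 2.5    # larger value within the uncertainty range

# Relative change in sensitivity, assuming it scales with ln(Q10)
relative_increase = math.log(Q10_ALTERNATIVE) / math.log(Q10_ASSUMED) - 1.0

LAND_SENSITIVITY = 79.0  # magnitude of land carbon sensitivity (GtC per K)
extra_uncertainty = LAND_SENSITIVITY / 3.0  # "one third", in GtC per K

# Spread over the assumed time span for 1 K of warming (GtC per year)
rate_per_year = {years: extra_uncertainty / years for years in (40, 50)}

print(f"relative increase: {relative_increase:.2f}")
for years, rate in rate_per_year.items():
    print(f"{years} years per K: {rate:.2f} GtC per year")
```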

Conclusions
We described the development, validation, and application of a simple model to explain and estimate the errors in temperature sensitivity determination related to the temperature measurement depth. We chose the widely used Q 10 concept as an example, but the alternative activation energy concept provides almost identical results.
Depending on study conditions, the vertical profile of the apparent Q 10 may range from fairly regular to highly irregular. The latter case can include local minima and maxima, decoupling of the depth of correct Q 10 from the depth of highest R 2 , and cases where the obtained Q 10 is incorrect for all conventional temperature measurement depths. In these cases, only laboratory incubation experiments can directly yield correct temperature sensitivity relations, although these experiments are not free of errors and assumptions either. An alternative possibility would be to inversely estimate the Q 10 using numerical models of CO 2 production, CO 2 transport and heat transport applied to field data. This approach has recently been used to estimate soil physical properties and CO 2 source strength Novak, 2007;Weihermüller et al., 2008) and could be extended to Q 10 estimation in the future.
In many field studies, however, the detailed input data required to drive mechanistic CO 2 models are not available. In such cases, the model presented here, together with some basic climate and soil data, may help to reduce errors in temperature sensitivity analysis. Nevertheless, validation has shown that an uncertainty remains due to the choice of input parameters. Also, analyses of additional field data sets would be desirable to test whether the simplifications made within the model are justified. Ideally, a model to assess the effect of temperature measurement depth, as of other confounding factors, would accompany each field study. However, careful interpretation of the results presented here may provide some general conclusions about which conditions are favourable for reducing measurement depth errors. These are:
- a thin and easily distinguished horizon of respiration activity,
- a high thermal and CO 2 diffusivity of the soil,
- a high annual temperature amplitude,
- a measurement period of one year or more,
- daily averaging of measurements before fitting the temperature sensitivity function.
Note that the last two conditions may be in conflict with other confounding factors that require short measurement periods, such as moisture or phenology-dependent Q 10 measurements.
In the conditions identified above, the bias introduced by the maximum R 2 depth method used by some authors will be small. In some cases, the aim of determining a temperature sensitivity is empirical modelling, e.g. for gap-filling, rather than inter-site comparison or process-based modelling. In this case, error minimization by choosing the depth of maximum R 2 may be advantageous.

Temperature sensitivity functions
Two methods are most commonly used to relate temperature and respiration. The first is an empirical exponential relationship suggested by van't Hoff (e.g. Yuste et al., 2004):

$$SR = SR_{T_{ref}} \, Q_{10}^{(T - T_{ref})/10} \qquad \text{(A1)}$$

where SR is soil respiration (µmol m −2 s −1 ), T is temperature (K) and T ref is an arbitrary reference temperature with a known respiration rate SR T ref . Q 10 is the factor by which respiration changes with a temperature change of 10 K. The Q 10 is a commonly used parameter to report the temperature sensitivity of soil respiration. The second relationship is more physically based and uses activation energy considerations introduced by Arrhenius (e.g. Lloyd and Taylor, 1994):

$$SR = SR_{T_{ref}} \, \exp\left[\frac{E_a}{R}\left(\frac{1}{T_{ref}} - \frac{1}{T}\right)\right] \qquad \text{(A2)}$$

Here, E a is the activation energy (J mol −1 ), and R=8.314 J mol −1 K −1 is the universal gas constant. Further temperature sensitivity functions are summarised by Kätterer et al. (1998) and Tuomi et al. (2008). The temperature sensitivity coefficients of these methods (Q 10 and E a ) are not equivalent. For typical temperature and respiration ranges, a Q 10 value derived from Eq. (A2) based on E a decreases slowly with increasing temperature, whereas Q 10 is a constant in Eq. (A1). A slow Q 10 decrease with increasing temperature has been reported in a range of field and laboratory studies (e.g. Kirschbaum, 2006;Shi et al., 2006). Large differences between both relations only occur in the case of extrapolation, especially into warmer conditions. However, it has been questioned whether extrapolation can be used for future feedback prediction. One reason for this is that different soil carbon pools may have different temperature sensitivities. A long-term temperature change would then change the pool ratios and, consequently, the effective temperature sensitivity of the soil. It is still under debate whether these effects are of a measurable and relevant magnitude or not (Fang et al., 2005;Knorr et al., 2005;Reichstein et al., 2005b;Conen et al., 2006;Larinova et al., 2007).
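The relation between the two coefficients can be illustrated with a short sketch. It assumes the van't Hoff form SR = SR_Tref · Q10^((T − T_ref)/10) and the Arrhenius form SR = SR_Tref · exp[(E_a/R)(1/T_ref − 1/T)], and defines an equivalent Q 10 as the ratio of Arrhenius rates at T + 10 K and T. The function names and the activation energy of 50 kJ mol −1 are illustrative assumptions, not values from this study.

```python
import math

R_GAS = 8.314  # universal gas constant (J mol^-1 K^-1)


def respiration_q10(t_kelvin, sr_ref, t_ref, q10):
    """Exponential (van't Hoff type) temperature response."""
    return sr_ref * q10 ** ((t_kelvin - t_ref) / 10.0)


def respiration_arrhenius(t_kelvin, sr_ref, t_ref, e_a):
    """Arrhenius-type temperature response referenced to t_ref."""
    return sr_ref * math.exp(e_a / R_GAS * (1.0 / t_ref - 1.0 / t_kelvin))


def equivalent_q10(t_kelvin, e_a):
    """Ratio of Arrhenius rates at T + 10 K and T for activation energy e_a."""
    return math.exp(e_a / R_GAS * (1.0 / t_kelvin - 1.0 / (t_kelvin + 10.0)))


E_A_EXAMPLE = 50000.0  # J mol^-1, hypothetical value for illustration
for celsius in (5.0, 15.0, 25.0):
    kelvin = celsius + 273.15
    print(f"{celsius:4.1f} degC: equivalent Q10 = "
          f"{equivalent_q10(kelvin, E_A_EXAMPLE):.3f}")
```

For a fixed activation energy, the equivalent Q 10 decreases slowly as temperature rises, which is the behaviour described above.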
Appendix B

Theory of soil temperature profiles
Soil surface temperature changes are mainly induced by the radiation balance at the soil surface and exchange of sensible and latent heat between the soil and the atmosphere. The variation in soil surface temperature propagates into deeper layers. In the absence of transport of sensible and latent heat in the soil gas phase (Weber et al., 2007), this process is controlled by the soil thermal diffusivity D T (m 2 s −1 ):

$$\frac{\partial T}{\partial t} = D_T \frac{\partial^2 T}{\partial z^2}, \qquad D_T = \frac{\lambda}{\rho c} \qquad \text{(B1)}$$

where t is time (s) and z is depth (m). Thermal diffusivity is a function of thermal conductivity λ (W m −1 K −1 ), heat capacity c (J kg −1 K −1 ), and bulk density ρ (kg m −3 ). The typical order of magnitude of soil thermal diffusivity is 10 −7 to 10 −6 m 2 s −1 (Zmarsly et al., 2002). To transfer a soil temperature time series to another depth, it is often represented by a series of sine waves (van Wijk, 1963;Verhoef et al., 1996;Heusinkveld et al., 2004;Graf et al., 2008):

$$T(t) = \bar{T} + \sum_i A_i \sin\left(\frac{2\pi (t + t_i)}{\tau_i}\right) \qquad \text{(B2)}$$

where T̄ denotes the average temperature (K), A i is the temperature amplitude (K), τ i is the period length (s), and t i the phase shift (here in units of time and therefore included in the bracketed term) of the sine wave indexed i. When thermal diffusivity is constant with depth and time, there is an analytical solution to Eqs. (B1) and (B2) (van Wijk, 1963) that predicts temperature at any other depth (Heusinkveld et al., 2004;Graf et al., 2008):

$$T(t, \Delta z) = \bar{T} + \sum_i A_i \exp\left(-\Delta z \sqrt{\frac{\pi}{D_T \tau_i}}\right) \sin\left(\frac{2\pi (t + t_i)}{\tau_i} - \Delta z \sqrt{\frac{\pi}{D_T \tau_i}}\right) \qquad \text{(B3)}$$

where Δz is the difference between the actual and the reference depth.
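The transfer of a sine-wave temperature series to another depth can be sketched as follows. The sketch assumes the standard damped-wave solution for constant thermal diffusivity, in which each wave is attenuated by exp(−Δz/d_i) and delayed by Δz/d_i radians, with damping depth d_i = sqrt(D_T τ_i / π), and a phase shift t_i that is added to t. The function names and the example values are illustrative assumptions.

```python
import math


def damping_depth(d_t, period):
    """Depth (m) at which a wave's amplitude has dropped to 1/e."""
    return math.sqrt(d_t * period / math.pi)


def temperature_at_depth(t, dz, mean_temp, waves, d_t):
    """Temperature (K) at time t (s), dz metres below the reference depth.

    waves: iterable of (amplitude in K, period in s, phase shift in s)
    d_t:   thermal diffusivity (m^2 s^-1), constant with depth and time
    """
    temp = mean_temp
    for amplitude, period, shift in waves:
        damping = dz / damping_depth(d_t, period)
        temp += amplitude * math.exp(-damping) * math.sin(
            2.0 * math.pi * (t + shift) / period - damping)
    return temp


# Hypothetical example: annual and daily wave, D_T = 5e-7 m^2 s^-1
D_T_EXAMPLE = 5e-7
YEAR, DAY = 365 * 86400.0, 86400.0
EXAMPLE_WAVES = [(10.0, YEAR, 0.0), (5.0, DAY, 0.0)]
print(f"annual damping depth: {damping_depth(D_T_EXAMPLE, YEAR):.2f} m")
print(f"daily damping depth:  {damping_depth(D_T_EXAMPLE, DAY):.3f} m")
```

With these example values the daily wave is damped within roughly a decimetre while the annual wave reaches metres, which is why shallow sensors see far more short-term temperature variation than most of the respiring soil volume.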
Stepwise application of Eq. (B3) allows thermal diffusivities that change along a vertical profile to be treated (cf. methods section). However, it should be noted that for such an effective thermal diffusivity in soils with a vertical change of thermal properties, the simple relation between λ, c and D T given in Eq. (B1) is no longer valid. Nassar and Horton (1989) describe a method yielding an effective diffusivity for numerical forward modelling. If no temperature time series from different depths in the field are available, but λ and c as determined in the laboratory or estimated from literature vary significantly with depth, neither approach works. In this case, either a numerical model treating storage and diffusion separately has to be used, or a more complex analytical model considering the vertical profile of both λ and c.
Such models for specific, regular vertical profiles have been summarized and tested by Massman (1993). An even more general approach, which allows for any profile of thermal properties to be resolved in discrete steps of e.g. 1 cm and is therefore well compatible with our model, has been suggested by Karam (2000).

Theory of gas diffusion
The dynamics of CO 2 in soil air is described by:

$$\frac{\partial c}{\partial t} = \frac{\partial}{\partial z}\left(D_a \, \theta_a \, \tau \, \frac{\partial c_a}{\partial z}\right) + SR \qquad \text{(C1)}$$

where c is the total volumetric concentration of CO 2 , c a is the concentration in soil air, D a is the diffusivity of CO 2 in air (m 2 s −1 ), θ a (dimensionless) is the soil air content, and τ is a dimensionless tortuosity factor. D a , the soil air content, tortuosity and other factors such as transport through soil water and pressure turbulence can be combined into an effective diffusivity (Simunek and Suarez, 1993;Hirano et al., 2003;Tang et al., 2003;Takle et al., 2004). In this study, we use a wide range of field-determined effective diffusivities reviewed by Werner et al. (2004). To solve Eq. (C1), we use an explicit time discretization:

$$c(t + \Delta t, z) = c(t, z) + \Delta t \left( SR(t, z) + D_{CO_2}(z - \tfrac{1}{2}\Delta z) \, \frac{c(t, z - \Delta z) - c(t, z)}{\theta_a \, \Delta z^2} - D_{CO_2}(z + \tfrac{1}{2}\Delta z) \, \frac{c(t, z) - c(t, z + \Delta z)}{\theta_a \, \Delta z^2} \right) \qquad \text{(C2)}$$

By defining D CO 2 in planes 0.5 Δz above and below all other depth-dependent input data, we achieve mass-consistency. The maximum value of the time-step for a stable solution is $\Delta t < 0.5 \, \Delta z^2 \, D_{CO_2}^{-1} \, \theta_a$.
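One explicit time step of Eq. (C2) can be sketched as follows. The boundary conditions are our assumptions, since they are not stated in this appendix: a fixed concentration `c_surface` one grid step above the uppermost node, and no flux below the lowest node. Function and variable names are hypothetical, and θ a is taken as constant with depth.

```python
def max_stable_time_step(dz, d_co2_max, theta_a):
    """Upper limit of the time step (s) for a stable explicit solution."""
    return 0.5 * dz ** 2 * theta_a / d_co2_max


def step_concentration(c, source, d_planes, theta_a, dz, dt, c_surface=0.0):
    """Advance the CO2 concentration profile by one explicit time step.

    c:        concentrations at nodes 0..n-1, node 0 nearest the surface
    source:   CO2 production at the same nodes (concentration units per s)
    d_planes: effective diffusivities at the n+1 planes between nodes;
              d_planes[k] lies 0.5*dz above node k
    """
    n = len(c)
    new_c = []
    for k in range(n):
        above = c[k - 1] if k > 0 else c_surface  # assumed fixed upper boundary
        gain_from_above = d_planes[k] * (above - c[k]) / (theta_a * dz ** 2)
        if k < n - 1:
            loss_to_below = d_planes[k + 1] * (c[k] - c[k + 1]) / (theta_a * dz ** 2)
        else:
            loss_to_below = 0.0  # assumed no-flux lower boundary
        new_c.append(c[k] + dt * (source[k] + gain_from_above - loss_to_below))
    return new_c


def surface_efflux(c, d_planes, theta_a, dz, c_surface=0.0):
    """Flux density leaving the profile through the uppermost plane."""
    return d_planes[0] * (c[0] - c_surface) / (theta_a * dz)
```

Because each interior plane appears once as a loss for the node above it and once as a gain for the node below it, the scheme conserves mass: at steady state the surface efflux equals the depth-integrated production.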