<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE article SYSTEM "http://www.biogeosciences.net/inc/bg/copernicus.dtd">
<article language="en">
	<journal>
		<journal_title>Biogeosciences</journal_title>
		<journal_url>www.biogeosciences.net</journal_url>
		<issn>1726-4170</issn>
		<eissn>1726-4189</eissn>
		<volume_number>6</volume_number>
		<issue_number>10</issue_number>
		<publication_year>2009</publication_year>
	</journal>
	<doi>10.5194/bg-6-2001-2009</doi>
	<article_url>http://www.biogeosciences.net/6/2001/2009/</article_url>
	<abstract_html>http://www.biogeosciences.net/6/2001/2009/bg-6-2001-2009.html</abstract_html>
	<fulltext_pdf>http://www.biogeosciences.net/6/2001/2009/bg-6-2001-2009.pdf</fulltext_pdf>
	<start_page>2001</start_page>
	<end_page>2013</end_page>
	<publication_date>2009-10-06</publication_date>
	<article_title content_type="html">Towards global empirical upscaling of FLUXNET eddy covariance observations: validation of a model tree ensemble approach using a biosphere model</article_title>
	<authors>
		<author numeration="1" affiliations="1">
			<name>M. Jung</name>
			<email>mjung@bgc-jena.mpg.de</email>
		</author>
		<author numeration="2" affiliations="1">
			<name>M. Reichstein</name>
		</author>
		<author numeration="3" affiliations="2">
			<name>A. Bondeau</name>
		</author>
	</authors>
	<affiliations>
		<affiliation numeration="1" content_type="html">Max Planck Institute for Biogeochemistry, Jena, Germany</affiliation>
		<affiliation numeration="2" content_type="html">Potsdam Institute for Climate Impact Research (PIK), Potsdam, Germany</affiliation>
	</affiliations>
	<abstract content_type="html">Global, spatially and temporally explicit estimates of carbon and water
fluxes derived from empirical upscaling of eddy covariance measurements would
constitute a new and possibly powerful data stream to study the variability
of the global terrestrial carbon and water cycle. This paper introduces and
validates a machine learning approach dedicated to the upscaling of
observations from the current global network of eddy covariance towers
(FLUXNET). We present a new model TRee Induction ALgorithm (TRIAL) that
performs hierarchical stratification of the data set into units where
particular multiple regressions for a target variable hold. We propose an
ensemble approach (Evolving tRees with RandOm gRowth, ERROR) where the base
learning algorithm is perturbed in order to gain a diverse sequence of
different model trees which evolves over time.
&lt;br&gt;&lt;br&gt;
We evaluate the efficiency of the model tree ensemble (MTE) approach using
an artificial data set derived from the Lund-Potsdam-Jena managed Land
(LPJmL) biosphere model. We aim at reproducing global monthly gross primary
production as simulated by LPJmL from 1998–2005 using only locations and
months where high quality FLUXNET data exist for the training of the model
trees. The model trees are trained with the LPJmL land cover and
meteorological input data, climate data, and the fraction of absorbed
photosynthetically active radiation simulated by LPJmL. Given that we know the
&quot;true result&quot; in the form of global LPJmL simulations, we can effectively
study the performance of the MTE upscaling and associated problems of
extrapolation capacity.
&lt;br&gt;&lt;br&gt;
We show that MTE is able to explain 92% of the variability of the global
LPJmL GPP simulations. The mean spatial pattern and the seasonal variability
of GPP, which constitute the largest sources of variance, are very well
reproduced (96% and 94% of variance explained, respectively), while the
monthly interannual anomalies, which account for much less variance, are less well
matched (41% of variance explained). We demonstrate the substantially
improved accuracy of MTE over individual model trees, in particular for the
monthly anomalies and for situations of extrapolation. We estimate that
roughly one fifth of the domain is subject to extrapolation; even there, MTE
still reproduces 73% of the LPJmL GPP variability.
&lt;br&gt;&lt;br&gt;
This paper presents for the first time a benchmark for a global FLUXNET
upscaling approach that will be employed in future studies. Although the
real-world FLUXNET upscaling is more complicated than for the noise-free,
reduced-complexity biosphere model presented here, our results show that
an empirical upscaling from the current FLUXNET network with MTE is feasible
and able to extract global patterns of carbon flux variability.</abstract>
	<references>
		<reference numeration="1" content_type="text"> Akaike, H.: A new look at the statistical model identification, IEEE Transactions on Automatic Control, 19(6), 716–723, 1974. </reference>
		<reference numeration="2" content_type="text"> Bates, J. M. and Granger, C. W. J.: The combination of forecasts, Operations Research Quarterly, 20, 451–468, 1969. </reference>
		<reference numeration="3" content_type="text"> Bondeau, A., Smith, P. C., Zaehle, S., et al.: Modelling the role of agriculture for the 20th century global terrestrial carbon balance, Glob. Change Biol., 13(3), 679–706, 2007. </reference>
		<reference numeration="4" content_type="text"> Breiman, L.: Bagging predictors, Mach. Learn., 24(2), 123–140, 1996. </reference>
		<reference numeration="5" content_type="text"> Breiman, L.: Random forests, Mach. Learn., 45(1), 5–32, 2001. </reference>
		<reference numeration="6" content_type="text"> Breiman, L., Friedman, J., Olshen, R., and Stone, C. J.: Classification and Regression Trees, Wadsworth and Brooks, 1984. </reference>
		<reference numeration="7" content_type="text"> Brovkin, V., Sitch, S., von Bloh, W., Claussen, M., Bauer, E., and Cramer, W.: Role of land cover changes for atmospheric CO2 increase and climate change during the last 150 years, Glob. Change Biol., 10(8), 1253–1266, 2004. </reference>
		<reference numeration="8" content_type="text"> Burnham, K. P. and Anderson, D. R.: Multimodel inference – understanding AIC and BIC in model selection, Sociological Methods and Research, 33(2), 261–304, 2004. </reference>
		<reference numeration="9" content_type="text"> Chandra, D. K., Ravi, V., and Bose, I.: Failure prediction of dotcom companies using hybrid intelligent techniques, Expert Syst. Appl., 36, 4830–4837, 2009. </reference>
		<reference numeration="10" content_type="text"> Dietterich, T. G.: An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization, Mach. Learn., 40(2), 139–157, 2000. </reference>
		<reference numeration="11" content_type="text"> Fader, M., Rost, S., and Müller, C.: Virtual water content of temperate cereals and maize: Present and potential future pattern, J. Hydrol., in review. </reference>
		<reference numeration="12" content_type="text"> Freund, Y. and Schapire, R. E.: Experiments with a new boosting algorithm, Proceedings of the 13th International Conference on Machine Learning, 148–156, 1996. </reference>
		<reference numeration="13" content_type="text"> Geurts, P., Ernst, D., and Wehenkel, L.: Extremely randomized trees, Mach. Learn., 63(1), 3–42, 2006. </reference>
		<reference numeration="14" content_type="text"> Hansen, L. and Salamon, P.: Neural network ensembles, IEEE Trans. Pattern Anal. Mach. Intell., 12, 993–1001, 1990. </reference>
		<reference numeration="15" content_type="text"> Haxeltine, A. and Prentice, I. C.: BIOME3: an equilibrium terrestrial biosphere model based on ecophysiological constraints, resource availability and competition among plant functional types, Global Biogeochem. Cy., 10, 693–710, 1996. </reference>
		<reference numeration="16" content_type="text"> Ho, T. K.: The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Mach. Intell., 20(8), 832–844, 1998. </reference>
		<reference numeration="17" content_type="text"> Jones, C., Collins, M., Cox, P., and Spall, S. A.: The Carbon Cycle Response to ENSO: A Coupled Climate-Carbon Cycle Model Study, J. Climate, 14, 4113–4129, 2001. </reference>
		<reference numeration="18" content_type="text"> Jung, M., Verstraete, M., Gobron, N., Reichstein, M., Papale, D., Bondeau, A., Robustelli, M., and Pinty, B.: Diagnostic assessment of European gross primary production, Glob. Change Biol., 14(10), 2349–2364, 2008. </reference>
		<reference numeration="19" content_type="text"> Jung, M., Vetter, M., Herold, M., et al.: Uncertainties of modeling gross primary productivity over Europe: A systematic study on the effects of using different drivers and terrestrial biosphere models, Global Biogeochem. Cy., 21, GB4021, doi:10.1029/2006GB002915, 2007. </reference>
		<reference numeration="20" content_type="text"> Karalic, A.: Employing linear regression in regression tree leaves, Proceedings of the 10th European Conference on Artificial Intelligence, 440–441, 1992. </reference>
		<reference numeration="21" content_type="text"> Knorr, W., Gobron, N., Scholze, M., Kaminski, T., Schnur, R., and Pinty, B.: Impact of terrestrial biosphere carbon exchanges on the anomalous CO2 increase in 2002–2003, Geophys. Res. Lett., 34, L09703, doi:10.1029/2006GL029019, 2007. </reference>
		<reference numeration="22" content_type="text"> Kocev, D., Dzeroski, S., White, M. D., Newell, G., and Griffioen, P.: Using single- and multi-target regression trees and ensembles to model a compound index of vegetation condition, Ecol. Modell., 220, 1159–1168, 2009. </reference>
		<reference numeration="23" content_type="text"> Lasslop, G., Reichstein, M., Kattge, J., and Papale, D.: Influence of observation errors in eddy flux data on inverse model parameter estimation, Biogeosciences, 5, 1311–1324, 2008. </reference>
		<reference numeration="24" content_type="text"> Lasslop, G., Reichstein, M., Papale, D., Richardson, A. D., Arneth, A., Barr, A., Stoy, P., and Wohlfahrt, G.: Separation of net ecosystem exchange into assimilation and respiration using a light response curve approach: critical issues and global evaluation, Glob. Change Biol., doi:10.1111/j.1365-2486.2009.02041.x, in press, 2009. </reference>
		<reference numeration="25" content_type="text"> Liu, F. T., Ting, K. M., and Fan, W.: Maximizing Tree Diversity by Building Complete-Random Decision Trees, Advances in Knowledge Discovery and Data Mining, 9th Pacific-Asia Conference, PAKDD 2005, 2005. </reference>
		<reference numeration="26" content_type="text"> Liu, F. T., Ting, K. M., Yu, Y., and Zhou, Z. H.: Spectrum of variable-random trees, J. Artif. Intell. Res., 32, 355–384, 2008. </reference>
		<reference numeration="27" content_type="text"> Loh, W., Chen, C. W., and Zheng, W.: Extrapolation errors in linear model trees, ACM Trans. Knowl. Discov. Data, 1(2), 6, ISSN:1556-4681, 2007. </reference>
		<reference numeration="28" content_type="text"> Lucht, W., Prentice, I. C., Myneni, R. B., Sitch, S., Friedlingstein, P., Cramer, W., Bousquet, P., Buermann, W., and Smith, B.: Climatic control of the high-latitude vegetation greening trend and Pinatubo effect, Science, 296(5573), 1687–1689, 2002. </reference>
		<reference numeration="29" content_type="text"> Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, M., and Lewandowski, R.: The accuracy of extrapolation (time series) methods: Results of a forecasting competition, J. Forecast., 1, 111–153, 1982. </reference>
		<reference numeration="30" content_type="text"> Malerba, D., Esposito, F., Ceci, M., and Appice, A.: Top-Down Induction of Model Trees with Regression and Splitting Nodes, IEEE Trans. Pattern Anal. Mach. Intell., 26(5), 612–625, 2004. </reference>
		<reference numeration="31" content_type="text"> New, M., Lister, D., Hulme, M., and Makin, I.: A high-resolution data set of surface climate over global land areas, Climate Res., 21, 1–25, 2002. </reference>
		<reference numeration="32" content_type="text"> Österle, H., Gerstengarbe, F.-W., and Werner, P. C.: Homogenisierung und Aktualisierung des Klimadatensatzes der Climate Research Unit of East Anglia, Norwich, Terra Nostra, 6, 326–329, 2003. </reference>
		<reference numeration="33" content_type="text"> Papale, D. and Valentini, R.: A new assessment of European forests carbon exchanges by eddy fluxes and artificial neural network spatialization, Glob. Change Biol., 9(4), 525–535, 2003. </reference>
		<reference numeration="34" content_type="text"> Papale, D., Reichstein, M., Aubinet, M., Canfora, E., Bernhofer, C., Kutsch, W., Longdoz, B., Rambal, S., Valentini, R., Vesala, T., and Yakir, D.: Towards a standardized processing of Net Ecosystem Exchange measured with eddy covariance technique: algorithms and uncertainty estimation, Biogeosciences, 3, 571–583, 2006. </reference>
		<reference numeration="35" content_type="text"> Potts, D. and Sammut, C.: Incremental learning of linear model trees, Mach. Learn., 61(1–3), 5–48, 2005. </reference>
		<reference numeration="36" content_type="text"> Prentice, I. C., Cramer, W., Harrison, S. P., Leemans, R., Monserud, R. A., and Solomon, A. M.: A Global Biome Model Based on Plant Physiology and Dominance, Soil Properties and Climate, J. Biogeography, 19(2), 117–134, 1992. </reference>
		<reference numeration="37" content_type="text"> Qian, H., Joseph, R., and Zeng, N.: Response of the terrestrial carbon cycle to the El Nino-Southern Oscillation, Tellus B, 60(4), 537–550, 2008. </reference>
		<reference numeration="38" content_type="text"> Reichstein, M., Papale, D., Valentini, R., et al.: Determinants of terrestrial ecosystem carbon balance inferred from European eddy covariance flux sites, Geophys. Res. Lett., 34, L01402, doi:10.1029/2006GL027880, 2007. </reference>
		<reference numeration="39" content_type="text"> Reichstein, M., Falge, E., Baldocchi, D., et al.: On the separation of net ecosystem exchange into assimilation and ecosystem respiration: review and improved algorithm, Glob. Change Biol., 11(9), 1424–1439, 2005. </reference>
		<reference numeration="40" content_type="text"> Richardson, A. D., Hollinger, D. Y., Burba, G. G., et al.: A multi-site analysis of random error in tower-based measurements of carbon and energy fluxes, Agr. Forest Meteorol., 136(1–2), 1–18, 2006. </reference>
		<reference numeration="41" content_type="text"> Schaphoff, S., Lucht, W., Gerten, D., Sitch, S., Cramer, W., and Prentice, I. C.: Terrestrial biosphere carbon storage under alternative climate projections, Climatic Change, 74(1–3), 97–122, 2006. </reference>
		<reference numeration="42" content_type="text"> Schwarz, G.: Estimating the dimension of a model, Ann. Stat., 6(2), 461–464, 1978. </reference>
		<reference numeration="43" content_type="text"> Sims, D. A., Rahman, A. F., Cordova, V. D., et al.: On the use of MODIS EVI to assess gross primary productivity of North American ecosystems, J. Geophys. Res.-Biogeosci., 111(G4), G04015, doi:10.1029/2006JG000162, 2006. </reference>
		<reference numeration="44" content_type="text"> Sitch, S., Brovkin, V., von Bloh, W., van Vuuren, D., Eickhout, B., and Ganopolski, A.: Impacts of future land cover changes on atmospheric CO2 and climate, Global Biogeochem. Cy., 19(2), GB2013, doi:10.1029/2004GB002311, 2005. </reference>
		<reference numeration="45" content_type="text"> Sitch, S., Smith, B., Prentice, I. C., et al.: Evaluation of ecosystem dynamics, plant geography and terrestrial carbon cycling in the LPJ dynamic global vegetation model, Glob. Change Biol., 9(2), 161–185, 2003. </reference>
		<reference numeration="46" content_type="text"> Vens, C. and Blockeel, H.: A simple regression based heuristic for learning model trees, Intelligent Data Analysis, 10, 215–236, 2006. </reference>
		<reference numeration="47" content_type="text"> Vetter, M., Churkina, G., Jung, M., Reichstein, M., Zaehle, S., Bondeau, A., Chen, Y., Ciais, P., Feser, F., Freibauer, A., Geyer, R., Jones, C., Papale, D., Tenhunen, J., Tomelleri, E., Trusilova, K., Viovy, N., and Heimann, M.: Analyzing the causes and spatial pattern of the European 2003 carbon flux anomaly using seven models, Biogeosciences, 5, 561–583, 2008. </reference>
		<reference numeration="48" content_type="text"> Vogel, D. S., Asparouhov, O., and Scheffer, T.: Scalable look-ahead linear regression trees, in: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, edited by: ACM, San Jose, California, USA, 2007. </reference>
		<reference numeration="49" content_type="text"> Weber, U., Jung, M., Reichstein, M., Beer, C., Braakhekke, M. C., Lehsten, V., Ghent, D., Kaduk, J., Viovy, N., Ciais, P., Gobron, N., and Rödenbeck, C.: The interannual variability of Africa&apos;s ecosystem productivity: a multi-model analysis, Biogeosciences, 6, 285–295, 2009. </reference>
		<reference numeration="50" content_type="text"> Xiao, J. F., Zhuang, Q. L., Baldocchi, D. D., et al.: Estimation of net ecosystem carbon exchange for the conterminous United States by combining MODIS and AmeriFlux data, Agr. Forest Meteorol., 148(11), 1827–1847, 2008. </reference>
		<reference numeration="51" content_type="text"> Yang, L., Ichii, K., White, M. A., Hashimoto, H., Michaelis, A., Votava, P., Zhu, A., Huete, A., Running, S., and Nemani, R.: Developing a continental-scale measure of gross primary production by combining MODIS and AmeriFlux data through Support Vector Machine Approach, Remote Sens. Environ., 110, 109–122, 2007. </reference>
	</references>
</article>

