Consistent retrieval of land surface radiation products from EO , including traceable uncertainty estimates

Earth observation (EO) land surface products have been demonstrated to provide a constraint on the terrestrial carbon cycle that is complementary to the record of atmospheric carbon dioxide. We present the Joint Research Centre Two-stream Inversion Package (JRC-TIP) for retrieval of variables characterising the state of the vegetation–soil system. The system provides a set of land surface variables that satisfy all requirements for assimilation into the land component of climate and numerical weather prediction models. Being based on a 1-D representation of the radiative transfer within the canopy–soil system, such as those used in the land surface components of advanced global models, the JRCTIP products are not only physically consistent internally, but they also achieve a high degree of consistency with these global models. Furthermore, the products are provided with full uncertainty information. We describe how these uncertainties are derived in a fully traceable manner without any hidden assumptions from the input observations, which are typically broadband white sky albedo products. Our discussion of the product uncertainty ranges, including the uncertainty reduction, highlights the central role of the leaf area index, which describes the density of the canopy. We explain the generation of products aggregated to coarser spatial resolution than that of the native albedo input and describe various approaches to the validation of JRC-TIP products, including the comparison against in situ observations. We present a JRC-TIP processing system that satisfies all operational requirements and explain how it delivers stable climate data records. Since many aspects of JRC-TIP are generic, the package can serve as an example of a state-of-the-art system for retrieval of EO products, and this contribution can help the user to understand advantages and limitations of such products.


Introduction
This special issue addresses the consistent assimilation of multiple data streams into biogeochemical models.Among the available data streams, long-term high-precision observations of the atmospheric carbon dioxide concentration (see e.g.Houweling et al., 2012) provide an indispensable constraint for the (process parameter) calibration of terrestrial biosphere models in carbon cycle data assimilation systems (CCDASs; Rayner et al., 2005).The strength of this constraint is quantified by significant reductions in uncertainty in simulated terrestrial carbon fluxes diagnosed over (Kaminski et al., 2002;Rayner et al., 2005) or predicted after (Scholze et al., 2007;Rayner et al., 2011) the assimilation window.In recent multi-data stream assimilation studies at global scale (Scholze et al., 2016;Schürmann et al., 2016), the constraint through the flask sampling network has proven essential to achieving realistic magnitudes of the terrestrial carbon sink.The flask sampling network alone does, however, only constrain a subspace of the space of unknown process parameters.Thus, additional, complementary constraints are required to further reduce uncertainties in the system.Such complementarity has been demonstrated for Earth observa-Published by Copernicus Publications on behalf of the European Geosciences Union.tion (EO) products (Gobron et al., 2007;Pinty et al., 2011b) of the fraction of absorbed photosynthetically active radiation (FAPAR), which provide information on the vegetation phenology and colour, for example.The effect on carbon and water fluxes of assimilating FAPAR in addition to atmospheric carbon dioxide samples is, for example, quantified by Kaminski et al. (2012) and Schürmann et al. (2016).
The assimilation of an EO product such as FAPAR requires a so-called observation operator.The task of an observation operator is to simulate the counterpart of an observation from the model's prognostic variables, i.e. the variables that the integration scheme of the model's dynamical equations steps forward in time (Kaminski and Mathieu, 2017).For a land product such as FAPAR, the construction of the observation operator requires solving the equations for the radiative transfer (RT) within the canopy-soil system.The RT within the canopy is complicated because the leaves, which scatter the solar radiation, are large (compared to the wavelength) and vary in their orientation and optical properties.For large-scale terrestrial models it is (at least computationally) infeasible to resolve the small-scale 3-D heterogeneity of the canopy.The most advanced RT representations in such models are 1-D approximations relying on so-called two-stream (or two-flux) approaches.
The retrieval of a set of EO products describing the evolution of the canopy-soil system, e.g.leaf area index (LAI) or FAPAR, also has to rely on a RT model, in EO terminology called forward model, to simulate the partitioning of the incoming solar radiation into contributions from the individual radiative fluxes, i.e. those absorbed in, transmitted through, and reflected by the canopy.In order to exploit the full potential of the retrieved variables, this forward model should be as close as possible to the RT model used in the observation operator for assimilation.The joint retrieval of a set of EO products with the same RT model is a prerequisite to ensure physical consistency (including conservation of energy) of the retrieved products.The use in a CCDAS requires the retrieval product to be provided with an (typically spaceand time-dependent) uncertainty estimate.For assimilation of multiple products from a joint retrieval, the correlation of their uncertainty is also required to allow the extraction of the true information content from the jointly retrieved products.The retrieved products must be quality assured, i.e. they need to be validated against independent information.Finally, the retrieval algorithm must be efficient enough to allow globalscale processing, preferably near real time.
The Joint Research Centre Two-stream Inversion Package (JRC-TIP; Pinty et al., 2007Pinty et al., , 2008) is a retrieval package that fulfils the conditions above.It is built around a twostream model (Pinty et al., 2006) of the RT in the canopysoil system (see Sect. 2) and applies a joint inversion (Tarantola, 2005) approach (see Sect. 3) that combines the information in observed radiative fluxes with prior information on the model parameters (see Sect. 4.1).Its products are posterior estimates of the model parameters and all radiant fluxes.The model parameters are the effective LAI and the spectrally variant variables background reflectance and effective canopy reflectance and transmittance (where effective indicates model-dependence, see Sect. 2).The radiant fluxes include (but are not limited to) model counterparts to the ones that have been observed.The retrieved products are available with uncertainty estimates and their covariance (sometimes termed error covariance).The package is highly flexible: it can be operated for any combination of narrowband, broadband, or hyperspectral radiation flux observations (Lavergne et al., 2006) and on all spatial scales above 100 m (when lateral flux components can safely be neglected) even for high canopies.The radiative flux that is accessible to observations from space is the reflected sunlight, i.e. the albedo, once a complex series of procedures to remove atmospheric effects has been applied together with performing the required integration over exiting and/or sun illumination angles.Hence, for EO applications JRC-TIP is typically set up to use observed albedo as input.Healthy green vegetation is characterised by a strong albedo difference between the visible (VIS) and near-infrared (NIR) domains of the spectrum.Accordingly, the system is typically operated on albedo input in these two wavebands.In this configuration it has been applied to broadband albedos derived from MODIS (Pinty et al., 2007(Pinty et al., , 2008(Pinty et al., , 2011a, b), b), MISR (Pinty et al., 2007(Pinty et al., , 2008)), and GlobAlbedo (Disney et al., 2016).Section 4 describes enhancements of robustness and efficiency through the use of so-called TIP tables, i.e. look-up tables (LUTs) of quality-controlled retrievals over a fine discretisation of the input space (Clerici et al., 2010;Voßbeck et al., 2010).Section 4 discusses products from a large-scale processing exercise (Pinty et al., 2011a, b) based on MODIS collection 5 broadband albedo input, with a focus on the reported uncertainty estimates.Validation of JRC-TIP products is described in Sect. 5.

Radiative transfer model
The two-stream model at the core of JRC-TIP is described in full detail by Pinty et al. (2006).Therefore, we restrict ourselves to a brief summary of the main features.The model is designed to solve the radiation balance for the canopy-soil system (see Fig. 1).It simulates the solar radiant fluxes scattered by, transmitted through, and absorbed in a vegetation canopy that is composed of so-called bi-Lambertian leaves (i.e. the radiation scattered from and transmitted through the leaves -featured as flat disks -does not have any angular dependency around the leaf normal vectors), possibly exhibiting a preferred orientation.The bi-Lambertian leaf scattering property is such that the fraction of radiation that is not absorbed is scattered as a cosine distribution around the leaf normal vectors.The top and bottom boundary conditions are specified by the downwelling of direct and diffuse radiant fluxes and the albedo of the background, respectively.This model is constructed from dedicated solutions to three separate problems involving (1) the scattering by the vegetation layer only, identified as the black background contribution; (2) the flux transmitted directly through the vegetation layer involving only the background, that is the black canopy contribution; and finally (3) the contribution to the upward-and downward-scattered and -transmitted fluxes involving multiple interactions between both the vegetation layer and its underlying background (see Fig. 2).
The solution to the black background contribution follows the formulation originally established by Meador and Weaver (1980).It ensures the correct balance between the scattered, transmitted, and absorbed radiant fluxes not only for structurally homogeneous but also for heterogeneous canopies.The applicability to heterogeneous canopies relies on the finding that a solution to a 3-D flux problem satisfying the conditions imposed by a "radiatively independent volume" can always be achieved using a 1-D representation (Pinty et al., 2004, Sect. 3.3).The model's canopy state variables required for the correct flux representation are, however, socalled effective variables.They deviate from the true canopy variables and are thus only meaningful in the context of this model.
These effective variables are a spectrally invariant quantity, namely the LAI and, spectrally dependent parameters.The latter are the leaf single scattering albedo w l = r l +t l and the ratio d l = r l /t l (identified here as the asymmetry factor).r l and t l correspond to the leaf reflectance and transmittance, respectively.The albedo of the background, r g , is itself defined as the true (by contrast to effective) value and retrieved as such.And clearly, for all fluxes true values are simulated.
The possibility to use a (1-D) two-stream representation to solve a flux problem irrespective of the 3-D complexity of the scene conditions means that the model can be operated in inverse mode to retrieve a set of state variables for the canopysoil system that allows an accurate flux representation.The model is implemented in numerically efficient, modular, and portable form to simplify its integration into climate and numerical weather prediction (NWP) models.Meador and Weaver (1980) applies to the black background contribution (left-hand side) only.
3 Inverse model JRC-TIP applies the joint inversion approach of Tarantola (2005) (discussed as Bayesian inversion by Rayner et al., 2016 in this special issue): it estimates the state vector (also called parameter vector in the following) from a given set of observations and the available prior information.The a priori state of information is quantified by a probability density function (PDF) in parameter space, the observational information by a PDF in observation space, and the information from the model by a PDF in the joint space, i.e. the Cartesian product of parameter and observation spaces.The inversion combines all three sources of information and yields a posterior PDF in the joint space.
Prior and observational PDFs are difficult to specify.We use Gaussian shapes with respective mean values denoted by x 0 and d and respective covariance matrices denoted by C(x 0 ) (prior parameter uncertainty) and C(d) (data uncertainty).The data uncertainty is the sum of uncertainties due to errors in the observational process, C(d obs ), and errors in our ability to correctly model the observations, C(d mod ): (1) Some observational products provide uncertainty ranges and their correlation, i.e. the entire C(d obs ).If this is not the case, we often assume uncorrelated uncertainties, i.e. zero off-diagonal elements.The diagonals are populated with the squares (i.e.variances) of the 1σ uncertainty ranges, for which we typically proceed as follows: in C(d obs ) we often use values proportional to d with a floor value.Since the value in C(d obs ) typically considerably exceeds that in C(d mod ) (see Sect. 5), we neglect the latter.The exception is for small values of d, where the floor value is supposed to represent C(d mod ).Note that, in the typical set-up, with d being broadband albedo products, there is no additional contribution from representation error (see e.g.Heimann and Kaminski, 1999;Kaminski et al., 2010), as the model and the observations are defined on the same space-time grid.
For later use it is convenient to have two separate notations for the model simulation of a flux vector from a given state vector x.For simulation of the full vector of all flux components y, we use N and when the flux vector is restricted to those components for which we have observations, y obs , we www.biogeosciences.net/14/2527/2017/Biogeosciences, 14, 2527-2541, 2017 T. Kaminski et al.: JRC-TIP use M, i.e. (3) The inverse model is flexible with respect to the number and width of spectral bands that are simulated and the subset of simulated fluxes, y obs , that are observed.Every combination is feasible; Lavergne et al. (2006) provide examples.
Since the model is only weakly non-linear, we can approximate the posterior PDF by a Gaussian PDF.The corresponding marginal PDF in parameter space is thus also Gaussian, with mean value x and covariance C(x) −1 .The mean x is approximated by the maximum likelihood point, i.e. the minimum of the misfit function: C(x) is approximated by the inverse of the misfit function's Hessian, H , evaluated at x: (5) To understand this relation it is instructive to look at the case of a linear model (denoted by M ), for which differentiating J (x) (Eq.4) twice yields The Hessian is the sum of two terms, one reflecting the strength of the constraint by the prior information, and the other reflecting the observational constraint.Typically adding the observational constraint increases the curvature of the cost function, which via Eq.( 6) translates to a reduction in uncertainty compared to the prior.One of the uncommon counterexamples for the reduction in uncertainty is provided by Lavergne et al. (2006).
From the optimal parameter set we can simulate (see Eq. 3) all radiant fluxes (including the unobserved ones).To assess the strength of the observational constraint on a simulated radiant flux, we use N , the first derivative of N to propagate the posterior parameter uncertainties forward the uncertainty in simulated vector of radiant fluxes C(y): Equation ( 7) is particularly useful for comparing the TIP results with independent observations.Evaluating Eq. ( 7) for the prior uncertainty C(x 0 ) instead of the posterior uncertainty C(x), i.e. for a case without observational constraint, yields a prior uncertainty for the flux: For any component of the flux vector we can quantify the added value of the observations (or the observation impact) by the uncertainty reduction (later also termed knowledge gain or k gain) relative to the prior.
where σ (y i ) and σ (y i,0 ) respectively denote the 1σ uncertainty ranges, the squares of which populate the diagonals of C(y) and C(y 0 ).For example, if σ (y i ) is 90 % of σ (y i,0 ), then the uncertainty reduction is 10 %; i.e. we have increased our knowledge of y by 10 %.
The simultaneous retrieval of all state variables and the associated fluxes within a single model ensures physical consistency between the derived products.This includes simulated counterparts, y obs , of the observed flux components.
This inversion approach is relatively generic, i.e. it similarly applies to further RT models in the optical domain (see e.g.Lavergne et al., 2007;Lewis et al., 2012) or other spectral domains, e.g. the passive microwave domain (see also Kaminski and Mathieu, 2017).
Equation ( 4) is minimised by a so-called gradient algorithm that relies on code for evaluation of J and its gradient.Further derivative code is used to evaluate Eqs. ( 5) and ( 7).
4 Operational processing

Prior information
The radiative flux component that is accessible to observations from space is the reflected flux (albedo).As photosynthesis is driven by absorption in the VIS, our focus is on the flux partitioning in this domain of the spectrum.Pinty et al. (2009) demonstrated that under typical non-snow conditions and with known optical properties at leaf level, the background reflectance largely determines the albedo in the VIS and the effective LAI determines the albedo in the NIR.Hence, it is favourable to operate JRC-TIP in both VIS and NIR, with albedo observations in these two broadbands.Including the NIR brings in one additional observational constraint but also adds three spectrally variant state variables to the inverse problem.This is partly compensated for by (approximately) known relations of the background reflectance across the VIS and NIR domains.This relation translates to our inversion formalism as a correlated uncertainty and is visualised by the ellipsoids (indicating the 1.5σ uncertainty ranges) shown in Fig. 3.We note that this relation, known as the soil line (see e.g.Chi, 2003), changes with the occurrence of snow. Figure 3 further shows the components of the prior PDF, describing the effective canopy single scattering albedo, which is based on a combination of modelled (Jacquemoud and Baret, 1990) and observed (Hosgood et al., 1995) leaf optical properties.They were further modified to best account for the overall effects on the domain-averaged radiant fluxes of needle clumping into shoots, shoots or leaves clumping into crowns, and the pres- ence of woody elements in the canopy (see Pinty et al., 2008, for details).
This ellipsoid describes the 1.5 uncertainty range of our standard prior, denoted as polychromatic leaf scenario.We also operate the system with an alternative prior, denoted as green leaf scenario (not shown in Fig. 3), which is characterised by a much lower uncertainty and a slight shift in the mean value.Both scenarios are operated with snow and nonsnow priors for the background reflectance, depending on additional snow information (both shown in Fig. 3).A summary of the prior information is provided in Table 1.We stress the high prior variance of 25 for the effective LAI, a deliberately conservative assumption that results in a low weight on the prior term in Eq. (4).

Observations
The specific quantities to be discussed in Sect.4.4 have been retrieved (as described by Pinty et al., 2011a) from the MODIS collection V005 (MCD43B3) broadband white sky albedo (WSA) products at 1 km resolution (Schaaf et al., 2002).The WSA product uses a synthesis period of 16 days, in which observed reflectances under various illumination angles are used to calculate the spherical integral (isotropic illumination).The albedo product provides a data set every 8 days such that filtering out every second data set yields a sequence of data sets, in which each member is based on its individual 16-day synthesis period.This procedure maximises the temporal independence of the observational input for JRC-TIP.The MODIS collection V005 WSA product provides a quality flag associated with all spectral bands but no covariance of uncertainty.As described in Sect.3, we populate the non-diagonal elements of C(d) with 0. For the di-Table 1. Mean values x 0 and associated standard deviations σ (x 0 ) used to set the diagonal of the prior uncertainty covariance matrix C(x 0 ).λ 1 and λ 2 correspond to the broadband visible (0.3-0.7 µm) and near-infrared (0.7-3.0 µm) spectral domains, respectively.ω l ( λ 1,2 ), d l ( λ 1,2 ), and r g ( λ 1,2 ) refer to the effective canopy single scattering albedo, asymmetry factor, and background albedo, respectively.
b Value adopted for the bare soil case with a correlation between the two spectral domains of 0.8862 set in C(x 0 ).c Value adopted under occurrence of snow with a correlation between the two spectral domains of 0.8670 set in C(x 0 ).
agonal elements the quality flag "good quality" ("other quality") is mapped onto a 1σ uncertainty range of 5 % (7 %) relative to the flux and a floor value of 2.5×10 −3 is set.All other observations are discarded.In addition, the MODIS snow indicator is used to trigger a switch of the prior for the background reflectance from the non-snow to the snow version (see Sect. 4.1).

Robustness and efficiency
In the set-up described above, all observations are restricted to broadband VIS/NIR albedo pairs, which can theoretically take values in the 2-D domain [0, 1]x[0, 1] (albedo plane).
Observations retained for processing with JRC-TIP fall either in the 5 or the 7 % uncertainty case.For both uncertainty cases we now apply JRC-TIP over a discretisation of the albedo plane with a step size of 10 −3 (i.e. a factor of 2.5 below the minimum uncertainty) on both axes (Clerici et al., 2010;Voßbeck et al., 2010).This provides us with a set of 2×10 3 ×10 3 = 2 million JRC-TIP retrievals, which populate the 2-D space of theoretically possible albedo input observations in a form dense enough for all practical purposes.We note that in practice only a subdomain of the albedo plane is covered by observations.Figure 4 shows the location in the albedo plane of all albedo pairs used by Pinty et al. (2011a) for their processing of the year 2005, excluding those with a snow flag.Switching from the non-snow to the snow prior adds a factor of 2, i.e. there are in total 4 million retrievals for the polychromatic leaf scenario.
www.biogeosciences.net/14/2527/2017/Biogeosciences, 14, 2527-2541, 2017 We denote the set of retrievals described above as the TIP table.Once the TIP table is generated, a retrieval for any given albedo input pair can be performed through a look up in the TIP table.We stress that the role of the TIP table is different from the traditional use of LUTs in retrieval schemes: while traditional LUTs relate input to output of the forward model (i.e.state variables to albedos), the TIP table relates input to output of the inverse model JRC-TIP (i.e.albedos to the complete set of variables retrieved by the JRC-TIP, including uncertainty ranges and auxiliary information).The use of the TIP table in a processing system has four advantages over the use of a standard JRC-TIP retrieval.
1. First, it is faster.For example, processing of one global WSA albedo input field (covering a 16-day period) in 500 m spatial resolution takes less than 2 h on the single core of a standard desktop computer equipped with an Intel Core-i5 CPU running at a maximum frequency of 3.3 GHz.The value of 2 h covers the full input and output (including generation of output files) and yields an average processing time per pixel of less than 0.004 ms.
2. Second, it simplifies quality control.Clerici et al. (2010) and Voßbeck et al. (2010) describe a number of iterative procedures to enhance the quality of the retrievals in the TIP table.They exploit, for example, the requirement of a smooth dependence of the solution on the input albedos to detect outliers.
3. Third, the TIP table approach ensures stability of any Climate Data Record (CDR) that is generated from a stable albedo CDR.By construction, JRC-TIP will always retrieve the same values for all variables and uncertainties from the same albedo input with the same uncertainty range.For the standard JRC-TIP retrieval, this is only guaranteed when the computing environment (e.g.platform, compiler, compiler flags, operating system, and required libraries) remains unchanged.
4. Fourth, a processing system relying on the TIP table is agile.When a component of the JRC-TIP retrieval procedure is improved, the only change required in the processing system is an update of the TIP table.
For albedo input products that provide per-pixel uncertainty ranges, the TIP table uses a finer discretisation and further dimensions in the uncertainty space, but the same general approach applies.For example, Disney et al. (2016) use a 2-D uncertainty space: one dimension each for uncertainties in VIS and NIR.An extra dimension for their correlation was not included.

Understanding uncertainty
We analyse the JRC-TIP products over the range of the albedo input plane that is actually covered by observations, more specifically the range covered by the MODIS collection 5 albedo 1 km input products for the year 2005 that were processed by Pinty et al. (2011a, b).We focus on snow-free background conditions, i.e. all prior values and uncertainties are spatially invariant.We show for effective LAI and background reflectance in the VIS (Fig. 4) as well as for effective canopy single scattering albedo in the VIS and FAPAR (Fig. 5) the retrieved mean values (top panels) and 1σ uncertainty ranges (middle panels) as well as uncertainty reduction (knowledge gain) as defined by Eq. ( 9) and the value of the misfit function (bottom panels) over the albedo plane.The first point to note is the limited subset of the albedo plane that is covered by actual albedo observations.A further point to highlight is the fundamental role of the effective LAI: high effective LAI values correspond to relatively high posterior LAI uncertainty and little knowledge gain because the dense canopy can only be penetrated to a limited extent.For the same reason, we can infer little information on the background under dense canopies, i.e. there is a high posterior uncertainty and little knowledge gain.In contrast, given the large amount of canopy material, we can substantially reduce the uncertainty in the single scattering albedo, i.e. we have a large knowledge gain for this variable.Low effective LAI characterises an almost transparent canopy: uncertainty on LAI and background reflectance is low and there is high knowledge gain from observations.The low amount of canopy material limits the knowledge gain for the single scattering albedo, i.e. we are left with relatively high uncertainty.In this regime the observed albedo is determined by the background reflectance (shown for the visible domain in Fig. 4b).The pattern of the mean value for FAPAR is similar to that for LAI.The uncertainty is, however, different.While the LAI uncertainty grows steadily with LAI itself, the FA-PAR uncertainty exhibits two separated domains with high uncertainty.On the line of constant WSA NIR, one peak is located at WSA VIS around 0.03 and the other peak around 0.13.As pointed out by Pinty et al. (2011b), this reflects the influence of the soil background, which for LAI values in the range from 0.3 to 0.5 exhibits an equally complex uncertainty structure (Fig. 4d).In the minimum of the misfit function J of Eq. ( 4) that is displayed in Fig. 5f, we can clearly distinguish the two regimes.In the regime of low LAI (located on the soil line), the observations are primarily fit by variation of the background reflectance, and the misfit primarily reflects the deviation from the prior background reflectance shown in Fig. 3. Owing to the large prior uncertainty of 5, the contribution of the effective LAI to the prior term of the misfit function is very small, and the remaining canopy variables remain close to the prior.In contrast, in the regime of high LAI, the observations are primarily fit through variation of the canopy variables, and the background reflectance is close to the prior.

Aggregation
For data assimilation on a global scale, products are required in lower spatial resolution than the native resolution of the albedo input product (which is currently in the order of 1 km).JRC-TIP retrieval and aggregation are not commutative, i.e. it makes a difference whether we aggregate JRC-TIP products generated at the native resolution of the albedo product or operate JRC-TIP on the aggregated albedo input.This is illustrated in Fig. 6, which shows retrieved LAI and FAPAR over the albedo plane (with 5 % uncertainty) and indicates two albedo pairs by black points.If we assume the two albedo pairs describe observations over neighbouring pixels of the same area, running JRC-TIP on the average albedo yields LAI and FAPAR values of 0.194 and 0.153, respectively.In contrast, the average LAI and FAPAR for both pixels are 0.532 and 0.275, respectively.Also, when aggregating at the level of JRC-TIP products, it is not guaranteed that the aggregated state variables are still consistent with the aggregated fluxes.Only by operating the JRC-TIP on the aggregated albedo input, can we ensure the physical consistency among the JRC-TIP products and with the 1-D representation of the radiation transfer process in the climate or NWP model.Another point is that the aggregation also needs to be performed on the uncertainty.This requires a specification of spatial uncertainty correlation and is certainly less complicated at the albedo level than at the level of JRC-TIP products.The effect of changing the order of ag-gregation is quantified by Fig. 7, which shows on the global scale and for a northern winter period in 2005 the relative differences of FAPAR products derived by either aggregation order.The pattern of the differences suggests overestimation of FAPAR aggregated on the level of JRC-TIP products over boreal regions and an underestimation over transitional regions.One reason may be that running JRC-TIP on the aggregated albedo product does properly account for Figure 6.Effective LAI (a) and FAPAR (b) over albedo spectral plane retrieved using no-snow prior parameters (Table 1) and for albedo uncertainty of 5 % in both wavebands.Retrievals yielding a misfit function value J above 3 have been masked out.
snow soil conditions at pixel level.Further analyses of underlying mechanisms are foreseen in a future study.

Validation
The validation of the JRC-TIP and its generated products is achieved through a number of complementary stages.The first one consists of assessing the performance of the direct model, namely the two-stream model that is further used in inverse mode to generate the JRC-TIP products.This performance can be thoroughly benchmarked against comprehensive 3-D Monte Carlo models for a series of virtual canopies exhibiting different levels of complexity regarding the radiation transfer regime that these canopies can represent (see Sect. 3 of Pinty et al., 2006).The RAdiation transfer Model Intercomparison (RAMI) initiative (http://rami-benchmark. jrc.ec.europa.eu)offers such a platform for a range of simple and very complex canopy scenarios (Pinty et al., 2001;Widlowski et al., 2007).The 1-D model implemented in JRC-TIP was found to be in very good agreement, i.e. better than 3 % in most cases, with albedos from accurate and realistic simulations of complex 3-D scenarios in both the red and nearinfrared spectral regions.
While this first set of RAMI exercises addressed the accuracy of simulated albedo, i.e.C(d mod ) in Eq. ( 1), a further exercise in the RAMI frame (termed RAMI4PILPS) addressed the accuracy and consistency of the absorbed, reflected, and transmitted radiative fluxes retrieved by inverse models of the soil-vegetation-atmosphere transfer (Widlowski et al., 2011).This exercise thus offers a possibility to assess the performance of the JRC-TIP with regard to its ability to partition the incoming solar radiation.This exercise is conducted for the extreme conditions of computer-reconstructed actual canopy scenarios, with a range of sun zenith angle and vegetation background, including snow-covered conditions.For JRC-TIP, the vast majority of the absorbed flux values (i.e.FAPAR) fall within ±10 % relative to those values estimated by the reference Monte Carlo model (see Sect. 3 of Widlowski et al., 2011).
The capability of the JRC-TIP to reconstruct solar fluxes that can be currently measured in situ by dedicated instruments, e.g.direct or diffuse canopy albedos and transmission, offers a definite solution to assess the performance of the procedure.However, the crux of the matter with such an exercise lies in the large spatial variability of the canopy at various scales such that the spatial and temporal sampling of a given site must be achieved carefully and quite extensively.A first attempt to evaluate the JRC-TIP products generated from MODIS white sky albedo input values over a FLUXNET site is described in Pinty et al. (2011c).In this study the authors have capitalised on an ensemble of LAI-2000 measurements systematically acquired over multiple years along a 400 m transect as well as series of photos taken from a tower emerging from the top of this deciduous mid-latitude forest.Such measurements from LAI-2000 correspond to estimates of the fraction of the radiation that is transmitted through the vegetation canopy layers.When considering the direct radiation -which thus has not collided with the vegetation elementsthe transmitted fraction, T UnColl , can be expressed with the classical Beer-Bouguer-Lambert law, where the exponential attenuation is a function of the effective LAI, LAI: where µ 0 denotes the cosine of sun zenith angle, i.e. µ 0 = 1 when the sun is at nadir (Pinty et al., 2006(Pinty et al., , 2009)).Figure 8 shows these observations together with the direct transmission derived by JRC-TIP from MODIS collection 5 broadband WSA products at 500 m and 1 km resolution.Grey and blue shaded ranges indicate the spatial variability along the transect at which the observations were collected, and the In general, the results show good consistency between the JRC-TIP products and this ensemble of information, given that the MODIS sub-pixel variability corresponds to a range of values that are analogous to the uncertainties associated with the JRC-TIP retrievals.For a single period (from the middle to the end of January) the direct transmission derived by JRC-TIP from both products is completely outside the observed range.For the 500 m resolution, we trace this back to the input albedos (shown in Fig. 9a), where the VIS albedo shows an abnormal increase for this particular period, which is very likely due to snowy background conditions that remained undetected in the MODIS product, i.e. the snow flag was not raised.The inversion procedure, being operated with non-snow priors in this case, needs to minimise the misfit function J (see Eq. 4), which quantifies the misfit between modelled and observed albedos and the deviation of the parameters from their priors.In order to best fit this high observed albedo in the VIS without being penalised by a high prior term in J , the minimisation procedure increases the background reflectance in the VIS (Fig. 9c) and turns off the vegetation contribution by setting LAI close to zero (Fig. 9b).This explains the high direct transmission derived for this period (Fig. 8) and also means that there is no absorption of the incoming radiant flux by the vegetation (Fig. 9e and f).For the time period in question, the graphs also include, in the colour magenta, a second retrieval with snow prior.The corresponding LAI, the transmission in the VIS, and the absorption in the VIS and the NIR are then much closer to the values for the preceding and succeeding periods, and the background reflectance is closer to the soil line for snow.We note that our global-scale processing set-up scans non-snow retrievals using several conditions for outliers, which may then be corrected by a snow retrieval.
While proposing a simple protocol to validate the JRC-TIP products against in situ data, Pinty et al. (2011c) also highlighted the lack of critical, although not challenging, measurements of, for instance, the background albedo and its spatio-temporal variability at site level.This is a typical but very unfortunate situation, as the combination of the direct transmission (i.e.effective LAI) and the background reflectance largely determines the partitioning of the incoming flux between the canopy and the soil.So far it has been very challenging to identify other such sites where compa- to 2008 along a 400 m transect using the so-called random forest algorithm (RFA).Panel (a) also shows, in the colour magenta, a second retrieval for the period from the middle to the end of January using the snow prior.
rable data sets acquired in situ over time are available for in-depth validation exercises.
Another element of the validation strategy consists of the comparison of JRC-TIP products derived over the same location from multiple albedo input products.For example, Pinty et al. (2007) analyse differences between JRC-TIP products derived from MODIS and MISR broadband WSAs.Pinty et al. (2008) apply the same strategy but to high-and midlatitude sites, as their focus lies on the behaviour of JRC-TIP products in the presence of snow.MISR is an instrument flying on the same platform (called Terra) as one of the MODIS instruments, and the procedure for deriving a WSA comparable to the standard MODIS product is described by Pinty et al. (2007).As MISR observes each pixel from several angles, it can collect a high number of samples and thus provide a good angular integral of reflectance, i.e. a good WSA reference.
Another level of validation is the comparison of JRC-TIP products against products derived with alternative retrieval approaches.An example is presented by Disney et al. (2016), who compare effective LAI and FAPAR products derived by JRC-TIP with the operational MODIS LAI and FAPAR products (Knyazikhin et al., 1998) at site, regional, and hemispheric scales.
A final level of validation is implicitly performed by the product users in their respective applications.Such applications include analyses of the consistency of the long-term CDR, and its inter-annual variability as demonstrated for FA-PAR by Gobron (2015).Sippel et al. (2016) use a deviation of the 2012 spring and late summer FAPAR from the respective long-term means to analyse the effect of a drought event on vegetation activity over North America and explain the response mechanism of the carbon balance as inferred from other data streams (Wolf et al., 2016).A consistency check against other data streams and a process model is provided by simultaneous assimilation of the FAPAR product with further data streams, in particular the atmospheric carbon dioxide record (Kaminski et al., 2013;Schürmann et al., 2016).Consistency to further data streams is also implicitly checked in diagnostic model set-ups, for example when the FAPAR product is used as a forcing field for simulation of photosynthesis (Kaminski et al., 2017).

Conclusions
The JRC-TIP is a highly flexible retrieval system that delivers a set of radiatively consistent land surface products.These products include all radiant fluxes (absorbed, transmitted, and reflected) and the complete set of state variables that parameterise the two-stream model at its core.This two-stream model provides a 1-D approximation of the radiative transfer within the canopy soil system, typically implemented in advanced land components of climate models.This renders the retrieved (model-dependent) state variables (such as the effective LAI) as compliant as possible to climate model applications (climate model compliance).The retrieved fluxes have a clear physical definition and are thus model independent.Hence, among the JRC-TIP products the fluxes are particularly suitable for assimilation into terrestrial models.Even in this case it is, nevertheless, crucial to have in the terrestrial model an observational operator that provides a www.biogeosciences.net/14/2527/2017/Biogeosciences, 14, 2527-2541, 2017 correct mapping from the state variables onto the simulated counterpart of the flux component that is being assimilated.All JRC-TIP products include estimates of uncertainty including their covariance that are consistently derived in a fully traceable manner through rigorous uncertainty propagation from prior and observational information in a two-step procedure.The first step derives uncertainty estimates for the state variables, and the second step maps these uncertainty estimates forward to the simulated fluxes.
For global-scale processing, JRC-TIP is operated on broadband albedo products (including snow information) derived from EO with a space-and time-invariant prior (except in the event of snow) such that the retrieved products are exclusively based on the EO input.Owing to this low-dimensional space of observational input, an operational system can be set up to retrieve products from a database (TIP table) of precalculated quality-controlled JRC-TIP solutions (including full uncertainty quantification).Such a system is computationally extremely efficient, robust, and agile.By construction it generates temporally stable climate data records from any albedo input record that fulfils this condition.
JRC-TIP products are typically provided in the native resolution of the albedo input product, i.e. on grids that are much finer (e.g. a few 100 to a few 1000 m) than typical resolutions of continental-to global-scale terrestrial models.To ensure their radiative consistency and climate model compliance, products on grids coarser than this native resolution have to be derived by first aggregating the albedo input and then applying JRC-TIP.
The JRC-TIP methodology is to a large extent generic (see e.g.Kaminski and Mathieu, 2017) and can be generalised to further RT schemes.This holds in particular for the two-step procedure that first solves for the state variables and from there simulates a set of target quantities, both steps including uncertainty propagation.The application of a solution database requires a bounded, low-dimensional space of observational inputs.
Code and data availability.C and Fortran implementations of the two-stream code are available at http://fapar.jrc.ec.europa.eu.The JRC-TIP product based on MODIS collection 5 in 1 km resolution is available upon request to the corresponding author.The MODIS products were obtained from the online Data Pool, courtesy of the NASA Land Processes Distributed Active Archive Center (LP DAAC), USGS/Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota, https://lpdaac.usgs.gov/data_access/data_pool (LP DAAC, 2017).
Competing interests.The authors declare that they have no conflict of interest.

Figure 1 .
Figure 1.Schematic partitioning of the incoming solar radiation in the canopy-soil system.

Figure 2 .
Figure 2. Decomposition of the total flux into three contributing fractions.The two-stream solution byMeador and Weaver (1980) applies to the black background contribution (left-hand side) only.

Figure 3 .
Figure 3. Prior 1.5σ uncertainty for effective canopy single scattering albedo (denoted by leaves) and background reflectance (denoted by snow in case of snow and by soils otherwise).Modified version of Fig. 1 from Pinty et al. (2008).

Figure 4 .
Figure 4. Mean value (upper panel), uncertainty (middle panel), and uncertainty reduction, denoted here as k gain for knowledge gain (bottom panel) of effective LAI (left) and background reflectance in the VIS (right).

Figure 5 .
Figure 5. Mean value (upper panel), uncertainty (middle panel) of effective canopy single scattering albedo (left), and FAPAR (right).Uncertainty reduction (denoted here as k gain for knowledge gain) of effective canopy single scattering albedo (bottom left) and misfit function (see Eq. 4) at minimum (bottom right).

Figure 7 .
Figure 7. Relative differences of JRC-TIP FAPAR differently aggregated to 0.5 • .Wrong denotes aggregation of the JRC-TIP FAPAR generated at 0.01 • , whereas correct denotes aggregation of the input albedo products and subsequent application of JRC-TIP.

Figure 8 .
Figure 8.Comparison of the fraction of solar radiation transmitted to the background as derived by JRC-TIP applied to MODIS collection 5 input albedos and as observed in situ over the Hainich deciduous forest site following the approach by Pinty et al. (2011c).Panel (a) is based on 500 m MCD43 input albedos and panel (b) is based on the MCD43 1 km albedo product.The shaded zones indicate the range (grey) and interquartile (blue) range both estimated from the collected in situ measurements (using a LAI-2000 Plant Canopy Analyzer) from 2002to 2008 along a 400 m transect using the so-called random forest algorithm (RFA).Panel (a) also shows, in the colour magenta, a second retrieval for the period from the middle to the end of January using the snow prior.

Figure 9 .
Figure 9. MCD43 500 m input albedos over the pixel covering the 400 m transect in the visible and near-infrared domains (a), retrieved effective LAI (b), background reflectance VIS (c) and NIR (d), and fluxes absorbed in the vegetation in VIS (e) and NIR (f).Additional snow prior retrieval in panels (b-f) in magenta.