Explorer Integrated radar and lidar analysis reveals extensive loss of remaining intact forest on Sumatra 2007–2010

. Forests with high above-ground biomass (AGB), including those growing on peat swamps, have historically not been thought suitable for biomass mapping and change detection using synthetic aperture radar (SAR). However, by integrating L-band ( λ = 0.23 m) SAR from the ALOS and lidar from the ICESat Earth-Observing satellites with 56 ﬁeld plots, we were able to create a forest biomass and change map for a 10.7 Mha section of eastern Sumatra that still contains high AGB peat swamp forest. Using a time series of SAR data we estimated changes in both forest area and AGB. We estimate that there was 274 ± 68 Tg AGB remaining in natural forest ( ≥ 20 m height) in the study area in 2007, with this stock reducing by approximately 11.4 % over the subsequent 3 years. A total of 137.4 kha of the study area was deforested between 2007 and 2010, an average rate of 3.8 % yr − 1 . The different initial biomass values deforestation projects optical-based remote sensing. Furthermore, given SAR’s ability to penetrate the smoke and cloud which normally obscure land cover change in this region, SAR-based forest monitoring can be relied on to provide fre-quent imagery. This study demonstrates that, even at L-band, which typically saturates at medium biomass levels (ca. 150 Mg ha − 1 ), in conjunction with lidar data, it is possible to make reliable estimates of not just the area but also the carbon emissions resulting from land use change.


Introduction
Tropical forests provide multiple ecosystem services such as climate regulation and water filtration (Naidoo et al., 2008). However, markets fail to value forests and their services fully, with multiple direct and indirect processes driving extensive deforestation (complete removal of tree cover) and forest degradation (removal of a proportion of forest biomass), considered together as "DD" Bulte and Engel, 2006. DD in developing countries accounts for between 7 and 20 % of anthropogenic CO 2 emissions, e.g. 18 % (Grace et al., 2014), 15 % with a range of 8-20 % (van der Werf et al., 2009), and 7-14 % (Harris, 2012). Ultimately this is leading to between 0.9 and 2.2 Pg C yr −1 being transferred to the atmosphere (Houghton, 2010). By contrast, the release of carbon dioxide from fossil fuel burning in the tropics is just 0.74 Pg C yr −1 (Boden et al., 2010), so DD dominates anthropogenic CO 2 emissions from tropical countries. Furthermore, there is extensive evidence that intact and secondary forests in the tropics are acting as a significant carbon sink, absorbing at least as much carbon dioxide as is released through tropical deforestation Grace et al., 2014). Preventing dangerous climate change will therefore be much more difficult if tropical deforestation is not reduced or reversed. DD is also leading to extensive losses of biodiversity and other ecosystem services (Koh and Sodhi, 2010). Hence there are multiple environmental benefits to be achieved by slowing or reversing these processes. Consequently, in the private sector, investors and consumers are pressuring companies trading in commodities like soy, palm oil, and timber (hereafter col-lectively "high-deforestation-risk commodities", HDRCs) to monitor and reduce their impact on the world's forests.
In the public and third sectors (non-governmental organisations, NGOs), there has been intense activity in the development of REDD+ (Reducing Emissions from Deforestation and forest Degradation; the sustainable management, conservation, and enhancement of forest carbon stocks in developing countries; (UNFCCC, 2010)). This currently takes the form of, inter alia, bilateral arrangements (e.g. governments of Norway and Indonesia's USD 1 billion REDD+ Letter of Intent), large multilateral programmes (e.g. the UK government's GBP 3 billion International Climate Fund, the World Bank's Forest Carbon Partnership Facility, and the UN-REDD programme), and localised projects financed through the voluntary carbon market (e.g. see Goldstein et al., 2014).
HDRC compliance and REDD+ will require (Task I) the quantification of above-ground forest biomass (AGB), and both will certainly require (Task II) monitoring change in forest over time. This paper addresses the technical aspects of both of these tasks, focusing on the quantification of AGB and its change over time in the same analysis. However, Task II can be achieved without initially quantifying AGB (Joshi et al., 2015), e.g. measuring solely the area deforested.
The forest areas of concern are vast and remote, necessitating the use of remote sensing (RS) techniques, typically the analysis of images captured from satellites or aircraft. With existing techniques and cloud-free optical data it is relatively simple to detect forest change. However, ideally analysts would use time series of high-resolution AGB maps (e.g. from lidar) to accurately detect DD and any forest regrowth and to quantify the associated biomass changes simultaneously. Yet there are major challenges to measuring biomass: no satellite sensor directly measures it , and relationships between remote sensing data and biomass tend to break down at medium to high AGB levels, meaning there is a loss of sensitivity to high-biomass forest (Mitchard et al., 2009). Hence the initial AGB map (at time t 0 ) will contain errors, as will maps for subsequent time periods (t 1 , t 2 , . . . , n). Therefore detecting biomass change over time is a more troublesome proposition still, since the errors in each map must be well understood in order to be able to correctly infer change over time. In the absence of such well-understood uncertainties, Tasks I and II must be integrated to measure AGB change. We postulate that by distinguishing between these tasks, uncertainty in the carbon stocks (typically quite high) can be separated from uncertainty in the change maps (often low). This should produce change estimates with narrower and better-defined confidence intervals than those created by directly differencing biomass maps

Task I: AGB estimation
For Task I, Mitchard et al. (2012) characterised the options available as (a) the classification of forest into land cover types, which are then attributed a mean AGB value based upon field or remote sensing measurements, or (b) the direct regression, or more complex machine-learning algorithms, between point AGB estimates and a single variable or set of remote sensing variables. Approach (a) largely maps onto the Tier 1 and Tier 2 approaches for REDD+ monitoring proposed by the United Nations Framework Convention on Climate Change (UNFCCC).
Tier 3, which involves local modelling, probably involves approach (b) (Arino et al., 2009). Option (a), forest classification, can be performed using the properties of sunlight reflected from the surface of the forest canopy (passive optical remote sensing, e.g. using the LANDSAT satellite series). It also can be undertaken using active sensing technologies such as synthetic aperture radar (SAR) acquired at low (e.g. L-band) frequencies. Sensors operating at L-band include the ALOS PALSAR or ALOS-2 PALSAR-2. However, this forest-classification approach does not reflect variations in forest within classes, leading to coarse AGB maps. Furthermore, optical imagery typically suffers from interference from cloud and smoke over forest areas; hence multiple image acquisitions are required to make the final forest classification. For these reasons, option (b) (direct estimation) is more attractive.
One of the most promising RS variables for option (b), direct regression, is SAR backscatter. SAR involves focusing a beam of microwave energy at the forest and using the backscattered energy to make inferences about the properties of the target. The longer (than visible light) wavelengths of SAR mean that the signal does not interact with water or particulates in the atmosphere; hence it can "see" through cloud and smoke. Since the radiation interacts with the structure of the forest itself, it can be statistically related to AGB Morel et al., 2011). However the SAR signal saturates at some level of forest biomass typically between 60 and 150 Mg ha −1 for L band, depending on the polarisation and environmental conditions (Lu, 2006;Mitchard et al., 2009). Hence AGB modelling must be limited at this maximum level of sensitivity, or else any pixel with AGB greater than this value can be ascribed a "high-biomass forest" value, e.g. as taken from forest plots. Longer wavelength SAR has potential for much higher saturation points, but no satellite collecting data longer than L-band currently exists. However, the P-band BIOMASS satellite has been funded by the European Space Agency, and should launch in 2020 (Le Toan, 2011).
The only operational RS technology that can estimate the biomass of tropical forest without saturation at this level is light detection and ranging (lidar). This active sensing approach involves emitting pulses of laser light at a target (the forest) to determine structural information and thereafter AGB (Lefsky, 2010;Asner et al., 2010). Yet landscape coverage to make AGB maps is only possible using aeroplanes as the sensor platform. For instance, Asner et al. (2010) measured 514.3 kha in Peru at a spatial resolution of < 1 m, yet these data needed to be integrated with moderate resolution (30 m) satellite data to produce a final landscape-level map. Moreover, using aeroplanes as the platform makes data acquisition very expensive, and costs rise further due to complex data processing requirements, especially when repeated acquisition is required for monitoring. This cost represents a significant barrier to lidar's wide adoption and operationalisation as a forest-monitoring tool.
Hence, there is a monitoring problem: optical imagery typically cannot be related directly to AGB and hence relies on classification techniques, and is plagued with the problems of cloud cover. A SAR signal can penetrate cloud and smoke, increasing the regularity of usable observations (effectively whenever a SAR image is captured, compared with only a small subset of optical images). Moreover, SAR backscatter can be directly related to AGB given a sufficiently long wavelength. However SAR signals saturate at AGB levels well below those found in mature tropical forests. Lidar can provide AGB estimates, yet the mapping of large areas still requires integration with other data sets.
Here we present one solution to this problem that may be implemented now, by combining the options set out above. This possibility arises because lidar footprints from the Ice, Cloud and land Elevation Satellite (ICESat) Geoscience Laser Altimeter System (GLAS) sensor provided dispersed lidar samples across the Earth's surface, including over tropical forests. These data can be statistically related to -and used in conjunction with -other freely available remote sensing data from sensors like SAR which do provide full coverage and actively sense forest structure (Shugart et al., 2010). This is because both approaches actively sense forest structure, measured either through differentiated laser light returns in lidar, e.g. from forest floor and canopy, and, in the present case, the degree of volume scattering in SAR returns. Though these lidar data are the same fundamental input, the method we propose is different to that in Saatchi et al. (2011), which involves a machine-learning approach at a coarse resolution, and that of Mitchard et al. (2012), which uses SAR data to perform a classification and then populate the classes with AGB based on the lidar. Our approach involves two stages: direct regression for AGB mapping using a single year of SAR data, followed by an independent change detection process using multi-temporal SAR data.
Specifically, our approach is to integrate L-band SAR (Phased Array type L-band Synthetic Aperture Radar, PAL-SAR, λ 0.23 m; on board the Advanced Land Observing Satellite, ALOS) with 4 years of data from the spaceborne lidar sensor (ICESat GLAS; 10 944 footprints from 2003 to 2007) in order to greatly supplement a small biomass field data set of 56 field plots. By modelling relationships between these three data sets, we are able to quantify AGB in the reference year 2007, with an increased sensitivity to higherbiomass forest than would be the case using SAR alone.

Task II: forest and AGB change detection
For Task II, the options are to characterise the possible states of the forest system and to measure the change in state over time. Typically this involves some form of categorising or "binning" forest into classes. For instance, an area of forest may change from intact forest to degraded forest, from degraded forest to non-forest, or from non-forest to plantation. The changed pixels can be related back to an original AGB map (if available) and AGB loss and carbon flux calculated; otherwise, statistics on the areas of forest lost and degraded can be generated.
Historically such change assessments have been undertaken by using optical satellite imagery. For instance, in an assessment of the impacts of protected areas (PAs) in Sumatra, Gaveau et al. (2009) used Landsat images from 1990 and 2000 to measure deforestation, whilst more recent efforts integrate Moderate Resolution Imaging Spectroradiometer (MODIS) data in addition to Landsat. Broich et al. (2011a) used this combination to map forest change across Sumatra and Kalimantan. This work highlighted the central problems of both identifying forest type from optical remote sensing imagery and using composite images from several different time periods. Composites are necessary when clouds obscure parts of the study area in the first image collected; using cloud-free sections of later images ultimately allows the creation of a largely cloud-free image. Yet cloud-free imagery for those areas obscured in the first image may not be available for months or even years after the first image is collected. Since forest is being cleared and replaced with plantations very rapidly in places like Indonesia, this means that composite images do not incorporate the deforestation and regrowth that has occurred in the time period during which the composite was created (Hansen et al., 2008(Hansen et al., , 2009). Change detection based on these composites may therefore underestimate the extent of forest change. One solution is to use algorithms to develop pixel forest histories (Broich et al., 2011b), yet this approach seeks a solution more in inference than in data. In a more recent Sumatra-wide study using Landsat and lidar, Margono et al. (2012) reiterate these interacting monitoring challenges of high cloud cover and rapid regrowth. Nonetheless, optical data have been used to produce impressive multi-year global forest change products across habitat types (Hansen et al., 2013).
Our novel solution for Task II is to use interannual threshold-delimited differencing of L-band SAR data to provide annual DD estimates. We use these changes in conjunction with the map produced as a solution for Task I in order to measure AGB loss and estimate CO 2 emissions. We test this approach for a section of Sumatra, Indonesia. This is an ideal study site because Indonesia has an extremely high deforestation rate, which reached 2 Mha yr −1 in 2011 to 2012 (Hansen et al., 2013). At the same time, there is considerable action required to be taken in order to address this, including the proliferation of REDD+ and HDRC monitoring activity, Figure 1. A map of the western islands of the Indonesian archipelago. The island oriented north-west to south-east is Sumatra. The section highlighted is our study area of 10.7 Mha. The underlying data for that section constitute our estimate of AGB for 2007. Dark-green areas are those with high AGB, and lighter coloured areas have very low AGB. In the northernmost tip of the image is the dark green of Berbak National Park, reflecting its relatively intact status. It was from this site that we gathered our field data via ZSL, which operates a pilot REDD+ project there. The south of the study area terminates at the Bukit Barisan mountain range.
for instance with Norway committing USD 1 billion to a bilateral REDD+ deal (Norwegian Embassy, 2011;Solheim and Natalegawa, 2010), and via the Roundtable on Sustainable Palm Oil (RSPO).

Field site
Our forest plot data are from Berbak National Park (BNP; 104 • 20 E, 1 • 27 S), a peat swamp in Jambi province, Sumatra, covering 140 000 ha. It is habitat for the critically endangered Sumatran tiger (Panthera tigris sumatrae; IUCN, 2013) and 23 species of palms, making it the most palm-rich peatland swamp known in SE Asia. The Zoological Society of London (ZSL) has established a pilot REDD+ project here known as the Berbak Carbon Initiative (BCI), managed in partnership with the government of Indonesia (GoI). However, since the SAR data for this study were available at a far larger extent than that of the project site, we expanded the analysis to a scene which covered portions of both Jambi and South Sumatra provinces, covering 10.7 Mha. A map of the study area is provided in Fig. 1.
These provinces were once entirely covered by megadiverse Sundaland lowland rainforest, supporting, inter alia, the world's largest (Rafflesia sp.) and tallest (Amorphophallus sp.) flowers, the Sumatran rhinoceros (Dicerorhinus sumatrensis), and stands of ironwood (Eusideroxylon zwageri; Whitten et al., 1984). The forest types range from mangrove forest, lowland peat swamp forest, and lowland terra firme forest through to hill and montane forest in the Bukit Barisan mountains (ibid.). However, this description is now largely historical: the expansion of industrial logging, followed by transmigration of Javanese settlers, and oil palm (Elaeis guineensis) plantation development has led to extensive DD (e.g. Whitten et al., 1984;Gaveau, 2013;Broich et al., 2011a;Broich et al., 2011b;Miettinen et al., 2011). Hence anthropogenic land cover is increasingly dominant, in particular with oil palm and "fastwood" (Acacia sp.) plantations expanding to meet international food, energy and wood pulp demand, whilst coconut plantations have expanded along the coastline.
Using land use planning GIS shapefiles provided by the Indonesian government to ZSL, we calculated that 1 % of the area is designated as community forest, 26 % as production forest, and 10 % protected forest. The majority is designated for non-forest use (60 %), e.g. for cities and agriculture. It should be noted that these are aspirant land use designations: their implementation in Indonesia is complicated (Collins et al., 2011).

Field plot data
ZSL undertook a carbon stock assessment during the initial phase of the REDD + pilot project, collecting data be-tween October 2010 and August 2011. This involved including forest AGB estimation using forest plots. Plot locations were chosen through stratified random sampling, based upon a habitat classification map of Berbak National Park using 2008 SPOT V imagery analysed by ZSL Indonesia. In the field, plot locations were verified with a Garmin 60CSx handheld GPS unit. A total of 56 plots were sampled, with 36 in primary swamp forest, 14 in swamp bush, and 6 in secondary peat swamp forest. The plots were nested, constituting the following: 1. the main plot of 20 × 125 m plot recording stem The AGB for each tree in each subplot was then calculated using an allometric equation for wet tropical forests, where where ρ is oven-dry wood over green volume (wood density, g cm −3 ), d is diameter (cm) at breast height (DBH; 1.3 m), η is tree height (metres) (Chave et al., 2005). Wood densities were collected from the literature for Indonesia peat swamp trees (Murdiyarso et al., 2010). There were no palms recorded in the plot data, yet they may be among the 5.3 % of the stems that were unidentified by the field team. Future research may identify these species and identify specific allometric equations and wood densities. However, for the present analysis, we followed the Food and Agriculture Organization recommendation of the use of an arithmetic mean for tree wood density where trees are not individually identifiable in the field plots. This is 0.57 g cm −3 for Asia (Reyes et al., 1992), or a generic 0.58 g cm −3 (Chave et al., 2004). We used the former figure.

Calculating tree height
Tree height data were not recorded from the forest plots by the field team. Equations published by Morel et al. (2011) were therefore used to relate tree height to DBH for SE Asian trees, whereby height η, for stems where d < 20 cm, is and where d > 20 cm is The estimated height for each stem was then used to calculate Lorey's height (L) for each of the plots. We did this because L is the closest to what the ICESat GLAS waveforms measure (Lefsky, 2010). Lorey's height weights the contribution of trees to the stand height by their basal area. It is calculated by multiplying tree height η by its basal area α and then dividing the sum of this by the total stand basal area.

Estimating the relationship between the measured biomass and height
The next step was to calibrate the relationship between plotlevel AGB estimates and Lorey's height estimated in the steps above. This involved following the approach of Saatchi et al. (2011) and Mitchard et al. (2012), which is to estimate a non-linear least-squares regression: y = a × (x b ). We estimated this using the NLS function in R (R Core Team, 2013). We performed the final regression excluding the 14 swamp bush plots to avoid bias of elevated R 2 values, and reduced RMSE values. This changed neither the regression coefficient nor exponent.

SAR and lidar data
We downloaded ALOS-PALSAR mosaics from 2007 to 2010 from the Japanese Aerospace Exploration Agency (JAXA) website (JAXA, 2014). The PALSAR data are collected in two polarisations -horizontal send, horizontal receive (HH) and horizontal send, vertical receive (HV) -and are provided at 25 m resolution. We aggregated this by taking the mean of a 4 × 4 pixel window, as a multilooking procedure to reduce speckle. Since an initial change detection produced noisy images, we then used an enhanced Lee filter with a 3 × 3 window on each of the now 100 m HH and HV rasters, using ENVI (Exelis) and the default parameters. Lidar data were taken from the ICESat GLAS sensor. These data were collected between 2003 and 2007, and provide waveforms for transects across the Earth's surface. The final data used here were the estimates of Lorey's height from each waveform derived from coincident tropical ground data, as used by Saatchi et al. (2011). On examining the data in a GIS, there were clearly many footprints over areas that were known to be covered in forest (from field observations) but that were influenced by smoke and cloud cover because they had Lorey's height values of 0 m. To resolve this problem we used an independent land cover data set from the European Space Agency (ESA) called GlobCover (Bicheron et al., 2009). This provides estimated land cover type across the study area, and at 300 m resolution it is the highest-resolution land cover data available. We extracted the GlobCover land cover type for each footprint and then filtered the lidar footprints for any false negatives by removing those footprints which had Lorey's height values of 0 m but which were classified as forest in GlobCover. We excluded any data points which were over non-forest areas. Through use of this pro-6642 M. B. Collins and E. T. A. Mitchard: Integrated radar and lidar analysis cess, 11 031 lidar footprints were removed, leaving 10 944 points remaining for calibrating the SAR data.

Calibration of SAR and lidar data: creating a forest height map
For 2007 we calibrated the SAR data in decibels (dB) with the lidar data by modelling a functional relationship between the Lorey's height measurements and the HV backscatter value of the pixels in which the lidar footprints fell. However, lidar data over this type of mixed and degraded forest landscape typically contain many more data points at lower values of Lorey's height, with very few readings greater than 30 m. We wanted to develop the best functional relationship between these values to allow for a prediction of height. For such an ideal regression a similar number of Lorey's height estimates are necessary at all SAR backscatter levels. By contrast, a regression on all values would be biased towards a fit at smaller values of both variables. Therefore we binned the data, whereby we calculated the mean backscatter at each Lorey's height interval (0, 1, 2, . . ., 25 m) using the aggregate function in R (R Core Team, 2013).
A physical limitation of the L-band SAR data is that they do not fully penetrate the forest canopy, and the signal saturates at higher biomass levels (Mitchard et al., 2009. This is demonstrated by a change in the functional relationship between the Lorey's height measurement from lidar and the HV backscatter, which occurs at approximately 25 m Lorey's height in this instance, corresponding to 190.6 Mg ha −1 , as shown in Fig. 3. Therefore we modelled the relationship using a non-linear regression estimated in R, taking the natural logarithm of the Lorey's height, i.e. The relationships using the HV backscatter were superior to those developed using the HH backscatter, and so we continued the analysis using only this polarisation (e.g. Mitchard et al., 2009). We then applied the functional relationships between backscatter and Lorey's height to the 2007 HV backscatter raster using Eq. 1. In practice this meant calculating Lorey's height L using This created a map for 2007 which estimated Lorey's height per pixel.

Excluding agriculture and plantations from the Lorey's height map
Since our analysis concerns the loss of natural forest only rather than AGB in all land cover types, we excluded those . Figure 2. Relationship between Lorey's height and biomass as measured in the forest plot data from Berbak National Park. We performed the final regression excluding the 14 swamp bush plots to avoid bias of elevated R 2 values, and reduced RMSE values. This changed neither the regression coefficient nor exponent. R 2 = 0.61; RMSE = 113 Mg ha −1 pixels which had a modelled Lorey's height < 20 m from the subsequent analysis. We considered that trees at this height would be natural forest rather than plantation. Further, our model estimates that forest 20 m high has AGB of 123.7 Mg ha −1 , whereas a study on neighbouring Borneo also using ALOS PALSAR found that the mean biomass of plantations was 53 Mg ha −1 , with values above this on average representing natural forests . Therefore, by choosing this forest height limit of 20 m, and hence AGB of 123.7 Mg ha −1 , we greatly increase our confidence that we have excluded plantations from our maps and hence also plantation cropping cycles in the subsequent change analyses. We also deemed our restriction to be in keeping with the definition of "forest" under the Marrakesh Accords (UNFCCC, 2001).
Next we undertook spatial filtering. We wrote a moving window function in R based on the focal function from the raster package (Hijmans, 2013) and applied it to the 2007 Lorey's height map. For each 5 × 5 pixel window, if ≥ 20 (80 %) of the pixels were estimated to contain forest of ≥ 20 m Lorey's height, we included all of those pixels in the subsequent analysis. Otherwise, if < 20 pixels were estimated as forest, we excluded all these pixels. This will result in the exclusion of small patches of remnant natural forest and hence ultimately to underestimation of the 2007 AGB stocks. However, it further allows us to increase our confidence that we are excluding plantations from the analysis, and allows us to focus instead upon mapping the biomass and the deforestation of Sumatra's last intact contiguous highbiomass forest. Visual comparisons of the resulting map with Google Earth data and our own field knowledge suggested that these processes had indeed masked out plantations without removing any large areas of natural forest.

Creating the 2007 biomass map
In order to create the final biomass map for 2007, we applied the relationship between Lorey's height and forest plot biomass (Eq. 10) to the Lorey's height raster created above. We processed all data with UTM projection (48S) at 100 m resolution in order to readily calculate biomass stocks per hectare.
To account for the saturation of the HV backscatter signal and hence functional relationship at this point, we limited the modelled biomass estimate at 196.6 Mg ha −1 . For any pixel > 196.6 Mg ha −1 , we attributed a mean biomass value taken from the Berbak forest plots with > 25 m Lorey's height. This was 236.5 Mg ha −1 (n = 8; SD = 75.7 Mg ha −1 ). This figure is more conservative than the generic 350 Mg ha −1 for Asian forests as suggested by the IPCC (Penman et al., 2003;IPCC, 2006).

Calculating errors and uncertainties
In a study estimating biomass there are a combination of random and systematic errors propagating throughout the calculations. Mitchard et al. (2011) characterise the errors as those concerning (a) accuracy and (b) precision. Accuracy concerns the distance of the mean from the true value and hence systematic biases. Precision concerns the distance of a measurement from the mean of multiple measurements of the same attribute and is this due to random errors. In a comprehensive review of errors in biomass estimations, Chave et al. (2004) highlight how in practice these errors can occur when, for instance, taking the measurements of the individual trees themselves, random errors in the identification of tree species, and spatial errors relating to geo-location.
We considered each of the potential sources of error in turn, namely those deriving from the binary forest map from the ESA; the tree species identification, and height and AGB estimations; errors in the lidar data and Lorey's height estimates; and the relationships estimated between lidar and SAR backscatter. In order to combine these multiple errors, which we assume to be uncorrelated, we used the following formula to determine uncertainty (U ; Saatchi et al., 2011): 2.6 Deforestation detection

Radiometric normalisation of the 2008 : 2010 HV backscatter rasters to the 2007 data set, and additional processing
Annual variations in measurement conditions, such as moisture on the ground and in vegetation, introduce variance in backscatter between years which does not constitute changes in forest state. In the wet tropics these changes can be large. For change analysis this represents a problem. Any differencing between data sets over time for change detection could result in errors whereby backscatter changes reflect differences in moisture rather than real changes in the forest. In order to correct for this, the data need to be radiometrically normalised such that the measured properties of a pixel in year t 0 approximate the properties of the pixel in year t 1 where no land use change has occurred. In order to do this with the SAR data, we randomly extracted 25 000 pixels from all four HV backscatter mosaics from 2007 : 2010 in order to ensure a distribution of backscatter values. We used these data to develop a linear relationship between each pixel over time, using a reduced major axis regression model estimated in R (Legendre, 2014) and on the basis that any sampled pixels which were deforested during the study period would constitute errors in the regression. We applied the resulting relationships to the 2008 : 2010 data so that the mean backscatter of each scene approximated that of 2007. We demonstrate the results of this process in Fig. 6. The figure shows the distribution of values of 48 977 pixels extracted from stable core forest areas of Berbak National Park. Prior to the normalisation procedure, there is interannual variation in the backscatter values, particularly in 2009. However, following the procedure the distributions converge. Hence we increased confidence that any large changes in the backscatter values per pixel were attributable to changes in the properties of the SAR target, specifically deforestation. Finally, the data provided by JAXA are already terrain-corrected and provided in gamma nought (γ 0 ) geometry. Hence we did not apply any further terrain correction.

Exclusion of flooded areas
Seasonal flooding can cause changes in SAR backscatter that could subsequently be misinterpreted as deforestation, which is unlikely to be corrected using radiometric calibration. Flooded forest has high backscatter values in the horizontal-send, horizontal-receive (HH) polarisation relative to the horizontal-send, vertical-receive (HV) polarisation. This is because, in the HH polarisation, there is a double bounce of the SAR signal between the water surface and the structure of the forest which increases the HH backscatter value relative to HV. Thus flooded forest can be detected by looking at changes across space in the ratio of these two polarisations. We excluded any areas identified as natural forest (calculated in the section above) ≥ 20 m height but which had an HH value of > −5 dB. These excluded areas appear as white "ribbons" through the intact forest blocks in Fig. 4, alongside the region's rivers. Additional visual verification of the efficacy of the approach is provided in Fig. S1 in the Supplement.

Change detection: the determination of deforestation
In order to determine deforestation we calculated the difference in Lorey's height for each time step: 2007-2008, 2008-2009, and 2009-2010. We used the Lorey's height maps for two reasons. First, the relationship between Lorey's height and HV backscatter is non-linear. Hence the change in backscatter in a pixel implies a change in Lorey's height and therefore forest state that is conditional upon the original backscatter value of that pixel. This means it was not possible to simply take a difference in the HV backscatter between years to detect change. Second, forest height is a more intuitive property than HV backscatter.
Whilst there is small-scale degradation in addition to deforestation at the study site, we are concerned here with land use change as a binary, exclusive event in natural highbiomass forest. The threshold we used to define change between years represents a tradeoff between sensitivity and uncertainty. The lower the threshold for change detection, the more sensitive the process is. However, the more sensitive the process is, the greater the chances that SAR speckle is detected as false positive deforestation. Ultimately we used a threshold of 10 m reduction in Lorey's height per pixel per year to indicate deforestation. This is because a change of this magnitude in a pixel we had assessed to be natural, non-flooded forest in 2007 would necessarily reflect a movement from high HV backscatter, high Lorey's height, and high biomass (i.e. intact high-biomass natural forest) to a low backscatter value associated with low Lorey's height and low biomass (deforested pixel). This explanation is more readily understood with reference to Fig. 3.
In practice, to detect change, we had to both calculate a series of Lorey's height maps and account for how the errors in the HV Lorey's height relationship would propagate into the change maps. First, to produce Lorey's height maps for each year, we applied Eq. (6) to each of the radiometrically corrected annual SAR scenes 2008, 2009, and 2010. We then considered the proportional errors (δ; ratio of regression error RMSE to maximum height estimated, 25 m) in the relationship between HV backscatter and Lorey's height. To be conservative, for each time step, we calculated the minimum estimated Lorey's height for time t (L t min ), and from this we subtracted the maximum estimated Lorey's height for t + 1 (L t+1 max ). We calculated the minimum Lorey's height estimate by multiplying the Lorey's height estimate map by 1 − δ, and we calculated the maximum Lorey's height estimate by multiplying the Lorey's height map by 1 + δ.
Therefore the forest height change ( L) calculation for a given time step was We may now substitute in Eq. (6) for each of the Lorey's height estimates and apply the minimum and maximum error calculations:  flooded (exclude HH > −5 dB ), had its height reduced by > (10 m) in the subsequent year, and had not experienced deforestation in any of the previous time periods.

The relationships between Lorey's height and forest plot biomass
The forest plot data from Berbak National Park yielded a power relationship between estimated values of Lorey's height and AGB, which explained almost two-thirds of the variation in the data (R 2 = 0.61; RMSE = 113 Mg ha −1 ). The plot data range from those with very few trees and hence low AGB and Lorey's height values through to the primary forest plots of AGB > 300 Mg ha −1 and Lorey's height values of ≈ 30 m. The resulting equation is shown in Eq. (10) and is plotted in Fig. 2. AGB = 0.37L 1.94 (10)

The relationship between SAR HV backscatter and Lorey's height from lidar
The relationship between HV backscatter and Lorey's height appears to be approximately linear from very low values of Lorey's height clustered at a mean of ≈ 0 m through to high values of ≈ 25 m. Figure 3 illustrates this relationship. Values in the upper-right portion of the graph have both high Lorey's height values and high HV backscatter values. We interpret these as representing mature forest with high AGB. In the bottom left of the graph are data points which have low AGB and low Lorey's height values, which we interpret as being deforested. This graph is also central to the change detection procedure since it demonstrates the logic behind the choice of the 10 m height change threshold. If a pixel with forest ≥ 20 m in t (top right of graph) experiences a height reduction of > 10 m in time t + 1 (moves to the lower left of the graph), we interpret this as a deforestation event (though note Eq. 9, which illustrates how we deal with error propagation). This is the deforestation process with respect to HV backscatter and is represented by the arrow pointing downwards to the left. The functional form of this relationship is summarised in Table 1.

Forest biomass stocks
By integrating the field plot data, the Lorey's height data, and the HV backscatter data; excluding flooded forest pixels; and summing the stocks across all the 100 m × 100 m pixels, we estimate 274 Tg AGB stored in forest ≥ 20 m in height across the 10.7 M ha study area in 2007. We provide an AGB and uncertainty map in Fig. 5. Relatively little high-biomass forest remained in 2007, and what did still remain was highly fragmented. The largest block of remaining intact forest in the study area was Berbak National Park/BCI in the northeast tip of the scene. The large treeless area in the centre of the park in this image is a burn scar from the devastating El Niño fires of 1996/1997.

Change detection and AGB loss
Our analyses suggest that a total of 137 367 one-hectare pixels were deforested between 2007 and 2010 in our study area. This represents a loss of 11.4 % of the 2007 high-biomass forest cover, a mean deforestation rate of 3.8 % yr −1 . This deforestation constitutes a loss of 11.3 % of the 2007 AGB. The figures differ since not all (89 %) of the deforested pixels were in the highest biomass forest of 236.5 Mg ha −1 . This suggests first that deforestation is occurring in different forest types, both in the last remaining old-growth high-AGB forest and in the lower-AGB intact forest (the minimum forest height we consider is 20 m; 123.7 Mg ha −1 ). Second, a visual inspection of the patterns of forest loss suggests two different types of deforestation across space. The first may be characterised as scattered losses in forest that was already highly fragmented in 2007. We suggest that this represents clearance by small-scale loggers and farmers. This kind of deforestation is typified by the forest lost between 2007 and 2008 in the central-southern part of Fig. 4b. The second type of deforestation we observe is large-scale geometric patterns, which we suggest are characteristic of timber concessions development and their conversions into plantations, a process through which virtually all AGB is removed. This is typified by Fig. 4d. In the west of this image, we observe forest clearance which advances into the remaining natural forest in annual waves from 2007 to 2010, which we visually verified using high-resolution imagery from 8 May 2009 in Google Earth at 1 • 53 40.71 S, 103 • 52 56.69 E, as shown in Fig. 4e. In 2008-2009 we observe the construction of a road or canal running NE-SW, connecting two large clearings (the feature is shown in the centre of Fig. 4d, zoomed in upon in Fig. 4c). This particular image demonstrates well the level of detail which is possible to map using this approach. Discussions with the ZSL team suggest that the deforestation in the east of Fig. 4d was the result of a road-building project linking Jambi to South Sumatra provinces, whilst the forest either side of the road was affected by illegal logging.
Berbak National Park is experiencing no large-scale deforestation; however, the maps do show more scattered pockets of small-scale forest loss which are more typical of the creation of small fields and small-scale illegal logging operations that affected many of Indonesia's national parks during the study period (Collins et al., 2011).
By aggregating all the changes across the scene we were able to estimate the total amount of AGB removed from the study area annually. We also provide potential emissions from this loss of AGB, based on an extreme scenario in which all the AGB was completely oxidised following its removal from the landscape. However, there are uncertainties involved in these calculations. Their estimation and subsequent integration into the final results are discussed below.

Binary forest map from ESA
We used a binary forest/non-forest map from the 2005 ESA GlobCover (MERIS) to remove lidar points which had a value of zero, but which were over forest areas, hence experienced cloud and smoke interference. This had the potential to cause three potential problems. (1) This land cover classification contains errors, which are introduced into lidarbackscatter relationships for non-forest vegetation. Indeed the classification's creators describe forest area overestimation where data are poor (Bicheron et al., 2009). (2) The lidar data were collected between 2003 and 2007 and thus overlap with the MERIS data set. Nonetheless, given the rate of change observed in this study, land cover change could have occurred between the collection of the two data sets. (3) The GlobCover data have a relatively coarse resolution of 300 m, meaning some non-forest areas will have been classified incorrectly as forest and vice versa. Artefacts relating to these errors will increase noise in the relationship shown in Fig. 3 but should not change the absolute relationship, which is dominated by the signal in the data. We do not believe that these errors are significant; see Fig. 3 for the clear relationship between lidar-derived Lorey's height and HV backscatter, with the fit having an R 2 of 0.91.

Tree species identification, height estimations, and AGB estimations on forest plots
Tree identification is an ongoing endeavour in Indonesian peat swamp forests. Accordingly, the field team botanist had difficulty identifying some tree species (5.3 % stems). Hence it was not possible to specify wood densities for these individuals. Were improved tree identification and wood densities to become available, we would be able to increase the accuracy of the biomass map. In addition, the forest plots data did not contain tree height measurements, requiring use of a published height-to-DBH relationship for SE Asia from Morel et al. (2011). Yet morphological differences between peat swamp trees and those measured by Morel may introduce errors into our biomass estimations. In addition, the model for stems where d < 20 cm was poor, with an R 2 value of only 0.16. This means that the predictions for the smaller stems are likely to have quite low accuracy, which is expected to have introduced further errors into the estimates of height. However, the majority of forest biomass is typically found in large trees (Slik et al., 2013), rendering this problem of marginal importance. Nonetheless, more forest plot data that included tree height measurements would improve our calibrations. A further consideration is that the relatively small plot size may have introduced errors into our calibrations (Rejou-Mechain et al., 2014). Nonetheless, it should be noted that the relationships we detect here between Lorey's height and AGB, and between GLAS footprint-based Lorey's height and radar backscatter, are identical in form and similar in parameter to those described elsewhere Mitchard et al., 2012). This increases our confidence in the robustness of the calibrations. A final issue is that, in order to calculate AGB, it was necessary to use pan-tropical rather than regional allometric equations. In order to account for the errors in the estimation of biomass in our plots and potential regional differences in estimates of biomass, we ascribe a 20.3 % error (Djomo et al., 2010).

Lidar and Lorey's height estimates
The relationship that was used to develop estimates of Lorey's height from lidar returns is based upon field plots in the Amazon (Lefsky, 2010). To deal with the errors that this will create, a 5 % error is ascribed to potential differences in regional estimates of Lorey's height from the waveforms as suggested by Mitchard et al. (2012).

Relationship between lidar and SAR backscatter
There are errors in the estimated relationship between the estimated Lorey's height and SAR backscatter. The RMSE was used to quantify this, which is a measure of the difference between the values implied by an estimator in a statistical relationship and the true value of the parameter being estimated. For the relationship estimated between the 2007 HV backscatter data and the Lorey's height data, the RMSE is 3.3 m. We calculated the percentage by dividing the RMSE by the maximum forest height we used from the lidar data, multiplied by 100. That is, (3.3/25) × 100 = 13.2 %.

Combining uncertainties, and final forest change results
With 20.3 % error for the biomass calculations for the trees and 5 % Lorey's height errors, and 13.2 % error for the relationship between Lorey's height and HV backscatter, we estimate 24.7 % total uncertainty using Eq. (7). We applied these uncertainties to the biomass and change calculations to produce the final results: M. B. Collins and E. T. A. Mitchard: Integrated radar and lidar analysis 6649

Calibration over space
We calibrated the SAR data using ground plots from the peat swamps of Berbak. However, the relationship may differ in other forest types type, and so the analysis may be enhanced by having calibrations in different areas by partitioning the backscatter data and using additional regional plot data. Unfortunately, in the absence of additional forest plot data sets, this was not possible.

Detecting biomass in mangrove forests
Not all ecosystems are equally well detected by SAR. One example is mangrove forest. This may be because mangrove forest's low, open canopy and extensive prop root networks absorb much of the L-band radiation. There is even evidence that the relationship between AGB and HH backscatter is negative (Cohen, 2014). This is relevant because there is a mangrove forest in our study area to the south-east of Berbak, within Sembilang National Park. We removed this mangrove forest from the analysis during the process whereby all pixels in the modelled height raster with a value ≤ 20 m were excluded, verified by visual examination of the resulting maps, as shown in Fig. 4.

Underestimation of biomass loss overall
The biomass loss and emissions estimates provided are conservative. First, the maximum biomass estimate of mature forest is limited, due to SAR backscatter saturation. Second, mangrove forest biomass is excluded. Third, the large below-ground biomass emissions associated with the clearance of forest on peat soils are not included (Page et al., 2002). Fourth, the forest plots from Berbak are not likely to be representative of all the forest types across Sumatra. Fifth, we apply a very restrictive threshold of forest > 20 m height, plus a majority-value window to focus solely on the intact high-biomass forests in the change analyses. Therefore we strongly expect the true carbon loss values to exceed those given here by an undetermined amount. The bottom of our confidence intervals should be considered the minimum emissions that have resulted from this land use change, providing a conservative estimate that could be used in a GHG accounting framework.

Discussion and conclusion
We have demonstrated for the first time that it is possible to employ a fusion of SAR, lidar, and forest plot data to map AGB and its change across a tropical forest landscape. From a broader perspective our findings have implications (a) for forest-monitoring technology and methodologies, as well as (b) for, inter alia, biodiversity and ecosystem services, particularly climate regulation.

Forest monitoring technology and methodologies
Concerning the first set of issues, our results demonstrate the value of integrating multiple existing data sets in order to map AGB in an area with high-biomass forest, including peatlands. This was enabled by the establishment of robust relationships between (i) AGB and Lorey's height estimates from field plots and (ii) HV backscatter and Lorey's height estimates from lidar data, which increases by 2 orders of magnitude the number of observations of Lorey's height which we have from the 56 forest plots alone.
Rapidly changing forest provides a challenging context for analysis: the deforestation rates we observed would appear to substantiate the concern that multi-year optical composites to remove cloud cover may mask the very changes that the researcher intends to detect in the first instance (Hansen et al., 2008(Hansen et al., , 2009. Hence, our approach may be used as either an alternative to traditional optical analyses or as a complement for those areas particularly affected by cloud and smoke. Examining the per-pixel HV backscatter values over time allowed us to make spatially explicit estimates of forest biomass loss annually, and with quantified uncertainties. This represents a methodological deviation from the work to map deforestation using optical data. This provides a contribution to the call for accurate forest-monitoring data for Indonesia to contribute to REDD+ (Broich et al., 2011a). Being able to directly map biomass at 100 m spatial resolution unencumbered by cloud or atmospheric particulates represents a significant advance in the ability to monitor tropical forests for many stakeholders, and should be of interest to governments as well as firms in HDRC sectors, in addition to NGOs interested in forestry.
Nonetheless, there are some technical barriers to continued efforts using the methodology we present. Principally, following the failure of the sensor on ALOS, L-band SAR data were not collected again until 2014 with the launch of ALOS-2, leaving a 3-year data gap. Nonetheless, it appeared whilst browsing the Landsat archives for images of Berbak that the majority of images were obscured by cloud and smoke, meaning that, despite the data gap from ALOS being suboptimal, it is nonetheless comparable with LAND-SAT data over that same period.
Finally, the estimation of per-pixel AGB requires contemporaneous lidar data for calibrating the AGB map. However, the only freely available data set (ICESat) stopped collecting data in 2007. Yet plans are afoot for the deployment of ICESat-2 and GEDI, which will allow calibration of future SAR images. Furthermore, demand for forest monitoring has spurred a development in other options, particularly aerial lidar transect sampling. This takes the same approach as ICE-Sat, using lidar as a sampling tool only, recording data in transects rather than across the landscape. This offers some of the benefits of landscape-level lidar mapping by enabling the provision of accurate AGB estimates for different forest types, but with lower costs since only transects are recorded.
Such data could be used as an alternative to ICESat-2 data for calibrating L-band data to produce AGB maps. In addition, future work may not be restricted to ALOS2 data, with Argentina's SAOCOM and NASA's NISA L-band satellites planned for launch during this decade. These additional satellites may increase data availability and frequency of observations.

Significance of deforestation and forest degradation on Sumatra
Concerning the second set of issues, our results have broad implications. Indonesia is already widely known to have very high deforestation rates. However, even in this context, forest loss in Sumatra is particularly high. By 2010, the eastern regions of Sumatra had lost approximately half of the peat swamp forests existing a decade earlier, an extremely high loss rate of 5 % yr −1 (Miettinen et al., 2011). In one case in June 2013, 140 000 ha of forest was destroyed by fire in a 3.5 M ha area in Riau province (Gaveau, 2013). Even on the conservative and unlikely assumption that the entire area was forested previously, this represents the extraordinary loss of 4 % of the remaining forest in a single month. Our results serve to confirm these findings: the high national means of forest loss in Indonesia mask remarkably high losses on a local scale. Such extensive forest loss on Sumatra is having large impacts on biodiversity losses. Flagship species like tigers (Panthera tigris sumatrae) are critically endangered (IUCN, 2013). Even a decade ago, tiger biologists were already concerned about tigers being scattered as a meta-population living in increasingly disconnected forest fragments (Linkie et al., 2006): the rapid deforestation we have observed thus simply represents a ongoing and unmitigated trend in habitat loss. Our maps show how very little high-biomass natural forest now remains in this part of Sumatra.
As Sumatra's forest is cleared, there are huge associated CO 2 emissions both from fires and organic decomposition of AGB, as well as from below-ground biomass. These emissions are particularly high in the eastern Sumatran lowlands due to the presence of a blanket of peat which may contain an order of magnitude more carbon than the forest growing on it (Page et al., 2002;Jaenicke et al., 2008;Hooijer et al., 2010Hooijer et al., , 2012. Hence there is a spatially explicit issue: deforestation in peat swamps is likely contributing disproportionately highly to climate forcing compared to forest loss elsewhere, with peatland drainage and oxidation now accounting for up to 3 % of total anthropogenic CO 2 emissions (van der Werf et al., 2009).
Optimistically, the increase in the range of technologies available to monitor forest, including peatland forest, irrespective of cloud and smoke cover, may go some way to improving the transparency and sustainability of land use management practices. For instance, better data may contribute to the monitoring and verification of pulp paper and oil palm firms' commitments to zero deforestation, hence mitigating some of the impacts of the very rapid environmental change we have quantified here.

Data availability
The AGB map and the deforestation map are available in a repository at: http://datadryad.org/resource/doi:10.5061/ dryad.4cc5m.
The Supplement related to this article is available online at doi:10.5194/bg-12-6637-2015-supplement.
Author contributions. M. B. Collins wrote the R code, performed the analysis, and prepared the manuscript. E. T. A. Mitchard contributed to the analyses and writing and processed the SAR data.