Impact of human population density on fire frequency at the global scale

Human impact on wildfires, a major earth system component, remains poorly understood. While local studies have found more fires close to settlements and roads, assimilated charcoal records and analyses of regional fire patterns from remote-sensing observations point to a decline in fire frequency with increasing human population. Here, we present a global analysis using three multi-year satellite-based burned-area products combined with parameter estimation and uncertainty analysis for a non-linear model. We show that at the global scale, the dominant effect of increasing population density is to reduce fire frequency. Only for areas with up to 0.1 people per km² do we find that fire frequency increases, by 10 to 20 % relative to its value at zero population. The results are robust against the choice of burned-area data set, and indicate that fire frequency is limited by human ignitions in only very few places on earth. Applying the results to historical population estimates yields a moderate but accelerating decline of global burned area, by around 14 % since 1800, with most of the decline occurring since 1950.


Introduction
Wildfires are an important component of the earth system (Bowman et al., 2009; Harrison et al., 2010) through their impact on biogeochemical cycles, terrestrial ecology (Bond, 2008), the land surface (Bond-Lamberty et al., 2007), and atmospheric constituents (Langmann et al., 2009) and processes (Andreae and Merlet, 2001). Wildfires require the ignition of sufficient amounts of dry fuel to start, and favourable weather conditions and fuel continuity to spread. Humans influence all of these factors directly or indirectly (Bowman et al., 2011): they are the dominant source of ignitions except in sparsely populated regions (Kasischke and Turetsky, 2006), influence fuel load through grazing and other forms of land management (Archibald et al., 2009), and actively suppress fires or impede their spread through roads, clearings and sub-urban structures (Syphard et al., 2007).
Regional studies hint at a complex relationship between fire regime characteristics and human activity. Most published studies relate to fire density (number of fires per area and time) rather than fire frequency (fractional burned area per time interval). For example, Di Bella et al. (2006) found for South America that agriculture suppresses fire density in arid regions, but enhances it in humid environments. Syphard et al. (2007) found for California that fire density was explained well by distance to the wildland-urban interface, by population density and by vegetation type, but fire frequency only by vegetation type. For Russian boreal forests, managed forests generally had considerably higher fire density than remote unmanaged ones (Mollicone et al., 2006). Given its importance as an earth system variable, this study investigates exclusively the impact of population density on fire frequency at the global scale. Fire frequency is a more relevant quantity than fire density, because it describes the probability of a point at the earth's surface being burnt within a given time interval.
A frequently observed pattern for fire density (as opposed to fire frequency) is one with few fires at low population density, a peak at intermediate levels (ca. 20-40 people per km²), and a rapid drop above 100 people per km² (Syphard et al., 2007, 2009; Archibald et al., 2009, 2010). For fire frequency, however, the picture is much less clear: based on an analysis of satellite data, Archibald et al. (2009) found that fractional area burnt decreased above ca. 20 people per km² for Southern Hemisphere Africa, but showed no clear trend below that value. Lehsten et al. (2010), in an analysis of the entire African continent, found that fire frequency generally decreased towards higher population density when other factors were taken into account. In a regional analysis for the Valencia province, Pausas and Fernández-Muñoz (2012) found that an increase in annual area burnt by at least one order of magnitude since 1940 coincided with an 80 % decrease in rural population density, and Moreira et al. (2001) found that reduced grazing activity as a result of land abandonment can lead to a substantial increase in fire activity.
Based on fire scars from tree rings and historical population data, Guyette et al. (2002) identified distinct historical phases in the way humans impact fire frequency in a study area of the midwestern United States. Fire frequency appeared limited by low numbers of human ignitions up to a population density of 0.64 km⁻², then reached an ignition-saturated plateau where fuel was limiting, and then declined again for values higher than 3.4 km⁻² due to human-caused landscape fragmentation. The area has many natural barriers to fire spread, and the authors note that for more homogeneous landscapes, an ignition-saturated fire regime might occur at considerably lower human population density.
Most historical fire regimes still exist today somewhere on earth (Bowman et al., 2011), and therefore global-scale analyses of satellite data of fire activity can be used to identify distinct patterns of human impacts on fire. Such global-scale studies based on satellite observations have found a strong human impact on the seasonality of fire activity (Le Page et al., 2010).
Because of the complexity of the problem, previous studies of the impact of human population on fire frequency have used either regression tree analysis (Archibald et al., 2009) or generalised linear models (Lehsten et al., 2010), together with satellite-derived burned-area products. Here, we use a non-linear parameter estimation technique that combines approaches from global semi-empirical fire models (Thonicke et al., 2001) with optimisation techniques that allow the determination of optimal parameter values with uncertainties (Tarantola, 2005). Non-linear parameter estimation has been applied at the global scale to hydrological models (Yapo et al., 1998), to a semi-empirical model of ecosystem carbon fluxes (Kaminski et al., 2002), and to a process-based ecosystem model coupled to a model of land surface hydrology and plant phenology (Kaminski et al., 2012). Given a suitable mathematical formulation of the process under investigation (e.g. of the impact of population density on fire frequency), the optimised parameter values with their uncertainty ranges can be used to infer not only the strength and direction of the impact, but also whether the impact of the variable under investigation is significant. Here, we consider two estimates to be significantly different if their 1 standard-deviation (1-σ) error ranges do not overlap. Assuming Gaussian probability distributions, this means that the probability of the lower estimate being larger than the higher estimate is less than 2.5 %, and the test corresponds to a significance level of 97.5 %.
The main question addressed in this study is whether similar patterns can be found at the global scale as detected in regional studies: a regime where fire frequency is limited by human ignitions at the low end of population density, and a regime of fire suppression by humans with a decrease of fire frequency towards high population density. A further, related question is whether the association between global historical decline of fire frequency and increasing human population as derived from the charcoal record is consistent with present spatial patterns of fire frequency and human population density. We will address these questions based on an analysis of both observed and modelled spatial patterns of fire frequency at the global scale, while developing and optimising a model of fire frequency at annual time steps.

Methods
Non-linear parameter estimation ideally uses models that reproduce the observations with a minimum number of parameters to be estimated. We choose a simple formulation for fire frequency that is based entirely on reliable, globally available data. We formulate a model based on the assumption that wildfires require a combination of long and hot dry seasons and a continuous cover of sufficient amounts of fuel. Fire frequency is therefore assumed proportional to the product of two factors, the first a function of some quantity approximating fuel continuity and load, the other a function of some indicator of fire risk. The functions are formulated such that the first becomes zero at zero fuel load, and the second at zero fire risk.

Model formulation
Unfortunately, no spatially explicit data of fuel load is available at the global scale and we have to rely on indirect measures of fuel continuity. Because most fuel is either woody plant litter or dried grass (Stocks and Kauffman, 1997), it is co-located with the photosynthesizing vegetation it is derived from. Because low cover fraction of fuel impedes the spread of fire, an area with lower vegetation cover fraction but similar climatic conditions and human impact would be expected to have lower burned area (Spessa et al., 2005).
By vegetation cover fraction, we refer to the fractional ground area covered by vegetation, irrespective of whether the vegetation is active (in a green, photosynthesizing state), dormant or dead. However, because optical remote-sensing techniques only reliably capture vegetation with sufficient chlorophyll content (Gobron et al., 1997), we use the annual maximum FAPAR (fraction of vegetation-absorbed photosynthetically active radiation, which assumes values between 0 and 1) as an approximate measure of vegetation fractional cover. In an approximate model where vegetation consists of dense non-reflecting clumps sparsely distributed among exposed bare ground, FAPAR equals the area of ground shaded under direct sunlight, a good approximation for vertically projected ground cover if the sun is less than 30° from zenith. If FAPAR equals zero, no vegetation exists that can produce flammable plant litter. Because fires cannot spread on bare ground without fuel, a maximum annual FAPAR of 0 can be assumed to lead to zero fire frequency. If the maximum annual FAPAR approaches 1, vegetation cover is continuous and we assume that there is also a continuous ground cover of various fuel types, such as dead plant litter or dried grass.
In addition to using annual maximum FAPAR, we further average this value over the length of the study period of several years in order to represent an average vegetation fractional cover. We assume that this mean fractional cover is close to the mean fractional cover of fuel, which is a reasonable assumption since in the absence of fire, the turnover time of grassy or woody fuel is several years to a few decades (Weedon et al., 2009). We therefore formulate a model where fire frequency equals zero at zero long-term average FAPAR, and then increases in a non-linear fashion with FAPAR increasing towards its maximum value of one.
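The fuel-continuity proxy described above, a multi-year mean of the annual maximum FAPAR, can be sketched in a few lines. This is only an illustration; the array layout and function name are invented:

```python
import numpy as np

def mean_annual_max_fapar(fapar_monthly):
    """Multi-year mean of the annual maximum of monthly FAPAR.

    fapar_monthly: array of shape (n_years, 12) with values in [0, 1].
    Returns a scalar approximating mean vegetation fractional cover.
    """
    annual_max = np.max(fapar_monthly, axis=1)  # one maximum per year
    return float(annual_max.mean())
```

Averaging the annual maxima, rather than taking a single overall maximum, damps out exceptional years and approximates the long-term mean fuel cover.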
As the second, climate-related factor, we choose the annual maximum of the Nesterov index computed by the method of Thonicke et al. (2010), which avoids the use of climate variables that typically vary strongly with local conditions and are not widely available from meteorological station data (e.g. humidity or wind speed). The cumulative index sums the daily value of T_max · (T_max − T_min + 4 °C) over days with less than 3 mm of rainfall, where T_max and T_min are the daily maximum and minimum temperature, respectively. The daily temperature range plus 4 °C is an approximate indicator of dryness because of the strong relationship between the diurnal temperature range and atmospheric humidity. The annual maximum of the index is a measure of the length of the dry season under hot conditions. If either precipitation is 3 mm or higher or T_max − T_min < 4 °C, the index is set to zero and accumulation starts anew. The approach is similar to that of Thonicke et al. (2001), where the length of the dry season is combined with simulated soil moisture. We assume that fire frequency reaches zero when the maximum Nesterov index during a given fire year is zero, but increases non-linearly with increasing annual maximum Nesterov index.
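The accumulation rule can be written out directly. A minimal sketch following the description above (daily inputs in °C and mm; the function name is invented, and the reset condition is taken as stated in the text):

```python
def annual_max_nesterov(tmax, tmin, precip):
    """Annual maximum of the cumulative Nesterov-type index.

    Sums tmax * (tmax - tmin + 4) over consecutive days, resetting the
    running total to zero on days with >= 3 mm rain or with a diurnal
    range tmax - tmin below 4 degrees C.
    """
    cumulative, annual_max = 0.0, 0.0
    for tx, tn, p in zip(tmax, tmin, precip):
        if p >= 3.0 or (tx - tn) < 4.0:
            cumulative = 0.0          # reset: wet day or small diurnal range
        else:
            cumulative += tx * (tx - tn + 4.0)
        annual_max = max(annual_max, cumulative)
    return annual_max
```

The returned annual maximum grows with the length of the longest hot, dry spell, which is exactly the property used as a fire-risk indicator in the model.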
A fire year is determined in the following way: first, we compute for each grid cell the multi-year average of the maximum Nesterov index for the months January to December and choose the month with the lowest value. If the lowest value occurs more than once, we choose the earliest such month within the year. The fire year is then defined as the period starting the following month. (For example, if January has on average the lowest maximum Nesterov index, the fire year is defined as February to January.)

The third factor considered is human population density, for which we choose a formulation that can capture many different functional forms with only two parameters, without any prior assumption about the direction of the impact. We achieve this by using the logistic function logit(x) = 1/(1 + exp(−x)), where x is a linear function of population density. This formulation allows a wide variety of possible responses, including linear, exponential, or step-like, in either increasing or declining direction (see Fig. 1). For example, if |x| is small for all population densities occurring in the study area, we can approximate logit(x) ≈ 1/(1 + 1 − x) ≈ 1/2 + x/4; if x is negative and of large enough magnitude for all cases, we have logit(x) ≈ exp(x); and for very large |x|, logit(x) approximates a step function changing from 0 for x < 0 to 1 for x > 0. Finally, replacing x by −x reverses the direction of the impact of population density on fire frequency.
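The limiting behaviours of the logistic term can be checked numerically; a small sketch with illustrative argument values:

```python
import math

def logit(x):
    """Logistic function as used in the text: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# near-linear regime for small |x|: logit(x) ~ 1/2 + x/4
print(logit(0.1), 0.5 + 0.1 / 4)        # both close to 0.525

# exponential regime for large negative x: logit(x) ~ exp(x)
print(logit(-10.0), math.exp(-10.0))    # both close to 4.54e-5

# step-like regime for large |x|
print(logit(-50.0), logit(50.0))        # close to 0 and 1
```

These three regimes are what allow a single two-parameter term to represent declining, increasing, saturating or threshold-like population responses.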
The fourth factor considered is the dominant land cover type within a grid cell. The impact of land cover type on fire patterns is included through a simple multiplier. Contrary to the other parameters of the model, which are all global, this multiplier is differentiated according to the dominant land cover type.
We thus define the simple global fire model (SIMFIRE) as

f_i,t = a(l_i) · N_i,t^b · F_i^c · logit(d + e P_i),   (1)

where f_i,t is fire frequency (fractional area burned per year) for a given pixel i and fire year t, N_i,t is the annual maximum Nesterov index, F_i the multi-year average of the annual maximum of monthly FAPAR, and P_i population density in people per km². b to e are global model parameters, while a is set according to the land cover class l_i at pixel i. Whenever the optimised parameter d is sufficiently small (strongly negative), such that

logit(d + e P_i) ≈ exp(d + e P_i)

becomes an excellent approximation, only the product a_d(l_i) = a(l_i) · exp(d) can be constrained by observations, but not a(l_i) and d separately. In those cases, we use a_d(l_i) as a new set of parameters replacing a(l_i) and d, and Eq. (1) takes on the form

f_i,t = a_d(l_i) · N_i,t^b · F_i^c · exp(e P_i).   (2)

A summary of the information flow of the fire model is shown in Fig. 1.
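The multiplicative structure of the model can be sketched as a single function. This is a schematic rendering with invented parameter values, not the authors' implementation:

```python
import math

def logistic(x):
    """Logistic function: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def simfire(N, F, P, a, b, c, d, e):
    """Fire frequency as the product of a land-cover scalar a(l), a
    Nesterov-index term N^b, a fuel term F^c and a logistic population
    term logit(d + e*P), following the structure described in the text."""
    return a * (N ** b) * (F ** c) * logistic(d + e * P)
```

With e < 0, the population term declines towards high P; with N = 0 or F = 0 (and positive b, c), the modelled frequency is zero, as required by the formulation.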

Data sets
Monthly FAPAR data were taken for the years 2000 to 2010 from a combination of SeaWiFS and MERIS (starting April 2002) data products (Gobron et al., 2010). As climate data, we use the Max Planck Institute for Biogeochemistry WATCH ERA-Interim daily global climate product at 0.5° by 0.5° resolution for 1999 to 2010, produced according to the method of Weedon et al. (2011). The original ERA-Interim climate data (available from http://data-portal.ecmwf.int/data/d/interim_full_daily) were bias corrected by applying the method introduced by Piani et al. (2005), based on a fitted histogram-equalisation function between the "true" and the "biased" data. Here, the fitted probability density functions at each point across time (expressed as a Gamma distribution) are converted to cumulative distribution functions, which are connected via a transfer function. This method was applied to all variables except temperature. Temperature at monthly time steps is well represented by a Gaussian distribution and can be bias corrected by a linear function, whereas the specific assumptions of the temperature histogram matching can cause large relative errors, which are addressed in detail in the daily temperature cycle section of Piani et al. (2005).
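The histogram-equalisation (quantile-mapping) idea behind the bias correction can be illustrated with a simple empirical version. The actual method fits Gamma distributions; this is only a toy sketch with invented sample data:

```python
import numpy as np

def quantile_map(biased_sample, reference_sample, values):
    """Map values through the biased sample's empirical CDF and back
    through the reference sample's inverse CDF (quantile mapping)."""
    b = np.sort(biased_sample)
    r = np.sort(reference_sample)
    # CDF position of each value within the biased sample ...
    cdf = np.interp(values, b, np.linspace(0.0, 1.0, len(b)))
    # ... mapped back through the reference sample's quantiles
    return np.interp(cdf, np.linspace(0.0, 1.0, len(r)), r)
```

A value that sits at, say, the 30th percentile of the biased distribution is replaced by the 30th percentile of the reference distribution, which removes systematic distributional bias while preserving rank order.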
Human population density was taken from the latest available data of HYDE 3.1, for the year 2005 (Klein-Goldewijk et al., 2010). This data set has a resolution of one-sixth of a degree in longitude and latitude, and was derived using Landscan population data with a resolution of 30 arc seconds (Landscan, 2006). The data represent the presence of humans during the day rather than place of residence, employing ancillary data such as the location of roads.
The grid of land cover classes l i (Eqs. 1 and 2) is derived by aggregating the IGBP land cover classes contained in the 0.5 • by 0.5 • version of the ISLSCP II MODIS dominant land cover type product (Friedl et al., 2010). The data set describes one dominant land cover type for each grid cell, instead of a mixture of land cover types.
4. Mixed forest: IGBP class 5 (mixed forest).

Pixels assigned the IGBP classes ice (11), unclassified (15) and permanent wetland (17) were excluded.

We optimise SIMFIRE against three global burned-area data sets: GFED3 (Giglio et al., 2010), MODIS MCD45 (Roy et al., 2008) and L3JRC (Tansey et al., 2008). The GFED3 data are already given at a spatial resolution of 0.5° by 0.5° and with a monthly time step. L3JRC has a Plate Carrée projection with approximately 1 km grid spacing at the equator. This data set gives one burn date per pixel for repeated annual periods from April to the following March. The data were re-binned into monthly burned area at a resolution of 0.5° by 0.5°. For MCD45, we use the GeoTIFF version produced by the University of Maryland (see http://modis-fire.umd.edu/BA_getdata.html), which is given in an equal-area projection in 24 regional windows with a spatial resolution of 500 m. The data describe one burn date per pixel and month and have been derived from the original MCD45 data by applying temporal filtering as given by Roy et al. (2008). The filtering prevents the same fire from being counted more than once. As with the L3JRC product, the data are re-binned to monthly burned area at 0.5° by 0.5° resolution. All data sets are converted from burned area to fire frequency using the land area of each half-degree grid cell. We use data from January 2000 until December 2010 (GFED3), April 2000 until November 2010 (MCD45), and April 2000 until March 2007 (L3JRC). The MCD45 data product has a gap in June 2001, which was filled, separately for each grid cell, with the average burned area in June of all other years.
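Re-binning the per-pixel burn dates into monthly half-degree burned area amounts to summing pixel areas by grid cell and month. A minimal sketch, ignoring map projections and using a constant pixel area (real pixel areas vary with latitude, so this is an illustration only):

```python
import math
from collections import defaultdict

def rebin_burn_dates(burns, pixel_area_km2=1.0, res=0.5):
    """burns: iterable of (lon, lat, year, month) burn-date records.
    Returns {((cell_lon, cell_lat), year, month): burned_area_km2},
    where cell coordinates are the lower-left corner of the grid cell."""
    monthly = defaultdict(float)
    for lon, lat, year, month in burns:
        cell = (math.floor(lon / res) * res, math.floor(lat / res) * res)
        monthly[(cell, year, month)] += pixel_area_km2
    return dict(monthly)
```

Dividing each cell's monthly sum by the cell's land area then gives the fire frequency used in the optimisation.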

We include an additional burned-area data set in order to assess the possible impact of small fires on the optimisation. By "small fires", we mean fires whose burn scar is too limited in spatial extent to be detected by the satellite sensors used to generate the global data sets, even though they can often be detected via thermal anomalies. Randerson et al. (2012) developed a first algorithm to estimate the burned area caused by such small fires using the ratio of thermal anomalies within and outside of detected burn scars. Here, we use the 10 yr (2001-2010) climatology of burned area by Randerson et al. (2012), based on the GFED3 product but also including small fires. This product shows a globally 36 % higher burned area than GFED3. We used the ratio between the 2001-2010 climatologies of this combined data set and GFED3 at each grid cell as an adjustment factor, which we apply to the GFED3 data described above. A further cap, applied in 0.21 % of cases, limits annual burned area to the area of the entire grid cell. This additional data set is denoted "GFED-SF".
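The small-fire adjustment then reduces to a per-cell multiplication with a cap at the grid-cell area; a sketch with invented numbers:

```python
import numpy as np

def apply_small_fire_ratio(gfed3_burned, ratio, cell_area):
    """Scale GFED3 burned area by the small-fire climatology ratio and
    cap the result at the area of the grid cell."""
    return np.minimum(gfed3_burned * ratio, cell_area)
```

The cap guards against the rare cases where the climatological ratio would push annual burned area beyond the physically possible maximum.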

Optimisation
The function f_i,t(a, ..., e) is optimised simultaneously for all grid cells against one of the four global burned-area data sets described above. One limitation of this approach is that Eq. (1) cannot represent cases with a humped relationship between f and P. Several attempts were made at optimising a form in which the logistic function was applied twice with different parameters (i.e. as logit(d_1 + e_1 P_i) · logit(d_2 + e_2 P_i), where d_1, e_1, d_2 and e_2 are control parameters), which would have allowed a maximum at intermediate values. However, this failed consistently because the optimisation was not able to fit both sets of parameters independently. We deliberately did not introduce any formulation that would have presupposed a humped or peaked shape of the population function, in order to avoid prior assumptions. Instead, we chose a different method to ensure that we capture cases of peaked dependences. For each burned-area data set, we carry out a series of optimisations in which grid cells with a population density above a certain maximum are excluded. This maximum population density is set to infinity (global case), as well as 100, 10, 1 and 0.1 inhabitants km⁻². In the case of a human-ignition-limited fire regime (Guyette et al., 2002), we expect to find an increase in fire frequency with population density for cases where this variable is capped at the appropriate order of magnitude.
We also exclude areas with more than 50 % agricultural land, using data by Ramankutty and Foley (1999) for the latest available year (2005), in order to minimise the impact of agricultural waste burning on the results. The number of grid cells used thus becomes 54 554 (global), 50 728 (up to 100), 39 676 (up to 10), 27 633 (up to 1) and 18 434 (up to 0.1 inhabitants km⁻²).
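The successive population caps and the cropland exclusion can be expressed as a simple filter over grid cells (the field names are invented for illustration):

```python
def select_cells(cells, pop_cap=float("inf"), max_crop_frac=0.5):
    """Keep grid cells at or below the population cap and with less
    than 50 % cropland, as in the study's exclusion rules."""
    return [c for c in cells
            if c["pop"] <= pop_cap and c["crop_frac"] < max_crop_frac]
```

Running the same optimisation over progressively stricter caps (infinity, 100, 10, 1, 0.1 km⁻²) is what lets a monotonic model detect a peaked population response.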
With four burned-area data sets to optimise against, we have a total of 20 optimisations. In addition, we perform a further set of optimisations in which we increase the number of regions for which the parameters a or a_d are differentiated from eight (one region per land-cover grouping) to 32. Here, land-cover groupings 1 (cropland/urban/natural vegetation mosaic), 5 (shrubland) and 6 (savanna) were further split into up to nine socio-economic regions (see Appendix B, Figs. B1 and B2). This is done only for the global cases (no limit on population density) and the data sets GFED3, MCD45 and L3JRC, bringing the total number of optimisations to 23.
We estimate the vector x, containing all model parameters, by minimising the weighted model-data misfit

J(x) = Σ_i,t [f_i,t(x) − D_i,t]² / [σ(D_i,t)² + σ_M²],   (3)

where i and t count grid cells and fire years, respectively, D_i,t denotes observed fire frequency, σ(D_i,t) the uncertainty of the data point (observational error) and σ_M the uncertainty attributed to the model (model error). In order to prevent values below zero, the parameters b and c are represented as the square of the corresponding elements of x. This results in the mapping x = (a(l_1), ..., a(l_k), √b, √c, d, e), where k is the number of regions. We start the optimisations by assuming a constant uncertainty, so that the minimisation of J becomes independent of this value, and we can postpone the specification of the model and data error. In practice, this amounts to minimising the simple model-data misfit

J̃(x) = Σ_i,t [f_i,t(x) − D_i,t]².   (4)

The minimisation is carried out by a Levenberg-Marquardt generalised Newton method using an analytic expression for the model's Jacobian matrix (Press et al., 2007). To investigate the dependence of the iterative minimisation procedure on the choice of starting point, we perform a 20-member ensemble of minimisations, each starting from a randomly selected point in parameter space, but always with e and d initialised at 0 to ensure the procedure is neutral with respect to the direction of the population effect. If at the minimum exp(d) and exp(d + e P_max) are both small enough that the exponential approximation of Eq. (2) holds (where P_max is the maximum of P_i over all grid cells included in the optimisation), we replace Eq. (1) by Eq. (2), using a_d as a new parameter instead of a and d, and restart the minimisation from the final point (using the dependence of a_d on a and d) to assure full convergence.
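The Levenberg-Marquardt update can be sketched generically. This is a toy implementation in plain NumPy, not the authors' code, demonstrated on a small exponential-decay fit with an analytic Jacobian:

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, n_iter=100, lam=1e-3):
    """Minimise sum(residual(x)**2) with damped Gauss-Newton steps."""
    x = np.asarray(x0, dtype=float)
    cost = np.sum(residual(x) ** 2)
    for _ in range(n_iter):
        r, J = residual(x), jacobian(x)
        A = J.T @ J + lam * np.eye(len(x))     # damped normal equations
        x_try = x + np.linalg.solve(A, -J.T @ r)
        cost_try = np.sum(residual(x_try) ** 2)
        if cost_try < cost:                    # accept step, relax damping
            x, cost, lam = x_try, cost_try, lam * 0.3
        else:                                  # reject step, increase damping
            lam *= 10.0
    return x

# toy problem: fit y = a * exp(b * t), true parameters a=2, b=-0.5
t = np.linspace(0.0, 5.0, 20)
y = 2.0 * np.exp(-0.5 * t)
res = lambda x: x[0] * np.exp(x[1] * t) - y
jac = lambda x: np.column_stack([np.exp(x[1] * t),
                                 x[0] * t * np.exp(x[1] * t)])
x_opt = levenberg_marquardt(res, jac, [1.0, -0.1])
```

The damping parameter interpolates between gradient descent (large lam, robust far from the minimum) and Gauss-Newton (small lam, fast near the minimum), which is why the method suits non-linear fits such as Eq. (1).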
After optimisation and determination of the model and data error, we estimate the posterior error covariance matrix of the model parameters by the inverse Hessian matrix of J(x). The Hessian is the matrix of second derivatives of J with respect to the vector x (Tarantola, 2005). This requires prior determination of the model and data errors, σ(D_i,t) and σ_M. We approximate a uniform model error by the residual misfit, using σ_M² = J̃(x_opt)/(n − n_p), where J̃ is the simple model-data misfit of Eq. (4), x_opt is the vector of model parameters at the smallest minimum found for the 20 repeated optimisations, n the number of data points entering Eq. (3) (via the sum over i and t), and n_p the number of parameters; n − n_p is the number of degrees of freedom of the optimisation. Because σ_M turns out to be considerably larger than the uncertainty of the fire frequency, σ(D_i,t), derived from the burned-area data (Giglio et al., 2010), we carry out the posterior error analysis using the approximation σ(D_i,t) ≈ 0 in Eq. (3). (Only the GFED3 data set contains information about observational uncertainty, but we assume that the uncertainty is of similar magnitude for the other burned-area products.)
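In the Gauss-Newton approximation, the Hessian of the misfit is proportional to JᵀJ, so the error analysis reduces to a few lines. A sketch on a toy linear problem, using the σ(D) ≈ 0 approximation described in the text:

```python
import numpy as np

def posterior_covariance(J, r, n_params):
    """Model error sigma_M^2 = residual misfit / (n - n_p), and posterior
    parameter covariance ~ sigma_M^2 * (J^T J)^-1 at the optimum."""
    n = len(r)
    sigma_m2 = np.sum(r ** 2) / (n - n_params)
    return sigma_m2, sigma_m2 * np.linalg.inv(J.T @ J)
```

The 1-σ parameter uncertainties used in the significance test are the square roots of the diagonal of this covariance matrix.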

Impact of historical population change
Based on the global sedimentary charcoal record, Marlon et al. (2008) suggest that the earth is currently in a state of low fire frequency as a result of global population growth. They report a sharp downturn since the later 19th century associated with the generally negative effects of growing human activity and population on fire occurrence since that time.
In a further application of the results of the parameter estimation, we take the optimised model's dependence on population density and apply it to historical population data from the same source (Klein-Goldewijk et al., 2010). In this model application, we translate the spatial dependence of fire frequency on population density into a temporal dependence. We also consider specifically the impact of population density alone, without the possible influence of a changing climate, keeping all other factors constant. The purpose is to test whether current patterns of population density and wildfire frequency are consistent with the hypothesis put forward by Marlon et al. (2008). We avoid the use of climate data altogether by basing our analysis on current observed fire frequency. Assuming that X(P_i,t) describes the dependence of fire frequency on population density, we compute fire frequency f_i,t for year t at grid cell i as

f_i,t = f_i,obs · X(P_i,t) / X(P_i,2005),

where f_i,obs is current observed fire frequency at grid cell i (taken from each burned-area data set averaged over all available fire years), and P_i,2005 population density for the year 2005 as used during the optimisation.
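The scaling of observed frequency by the population response can be sketched directly. The logistic form of X follows the model's population term; all numerical values below are invented for illustration:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def historical_fire_frequency(f_obs, pop_hist, pop_2005, d, e):
    """Scale current observed fire frequency by the ratio of the
    population response X(P) = logit(d + e*P) at historical and
    2005 population densities."""
    X = lambda p: logistic(d + e * p)
    return f_obs * X(pop_hist) / X(pop_2005)
```

With e < 0, a lower historical population density yields a higher inferred historical fire frequency, which is the mechanism behind the reconstructed decline in global burned area.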
We vary the parameters on which X(P_i,t) depends within their uncertainty range using Monte Carlo sampling (a 10 000-member ensemble), also taking into account the error covariance of the parameters. In this way, we derive the median and 95 % confidence intervals of f_i,t.
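The Monte Carlo propagation of parameter uncertainty can be sketched as follows, with an invented posterior mean and covariance for the population parameters d and e:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_interval(f_obs, pop_hist, pop_2005, mean, cov, n=10_000):
    """Sample (d, e) from a Gaussian with the given posterior mean and
    covariance, and return the 2.5th, 50th and 97.5th percentiles of
    the population-scaled fire frequency."""
    d, e = rng.multivariate_normal(mean, cov, size=n).T
    X = lambda p: 1.0 / (1.0 + np.exp(-(d + e * p)))
    f = f_obs * X(pop_hist) / X(pop_2005)
    return np.percentile(f, [2.5, 50.0, 97.5])

lo, med, hi = mc_interval(0.05, 1.0, 20.0,
                          mean=[0.0, -0.05],
                          cov=[[1e-2, 0.0], [0.0, 1e-4]])
```

Sampling from the full covariance, rather than perturbing each parameter independently, preserves the correlations between d and e established by the optimisation.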
In the analysis described so far, we use all global burned area, including areas dominated by agriculture. The aim, however, is to simulate changes in wildfires, not agricultural waste burning. We therefore repeat the exercise in a way that ignores agricultural burning in those 4133 grid cells where the crop fraction is 50 % or higher, and thus assess a hypothetical situation in which those areas experience a wildfire regime. To do so, we replace f_i,obs by the simulated fire frequency averaged over all fire years within January 2000 to December 2010, using the parameter vector x derived from one of the three optimisations for all population densities with eight regions. For consistency, we use the optimisation against the same burned-area data set as the one we replace. We thus reconstruct a hypothetical burned area without the presence of agriculture.

Performance of optimisation
Each of our 23 experiments (four observational data sets times five ranges of population density for eight regions, plus three global optimisations for 32 regions) was repeated for 20 starting points of the iterative optimisation procedure. All 460 optimisations converged to a minimum, indicated by a norm of the gradient of the simple model-data misfit (Eq. 4) below 10⁻³. The smallest minimum was found between 1 and 20 times, depending on the optimisation (see Table 1). In one case (GFED3, up to 0.1 km⁻²), the Hessian matrix of J (Eq. 3) was too close to singular to be inverted numerically, and this case was excluded from further analysis.
Contrary to the value of the simple misfit at the minimum, σ_M accounts for the number of degrees of freedom of the optimisations. σ_M can therefore be used as an indicator of the goodness of fit of the different optimisations (see Table 1). Comparing cases with a maximum population density of 0.1 or 1 km⁻² with those using a higher cap, we find that the former generally show better agreement with observations. For these low-population cases, the model fits the MCD45 data best. For cases with a maximum population density of 10 km⁻² or higher, it is L3JRC that shows the best model-data agreement. The rather small differences in σ_M, however, are also affected by differences in temporal coverage between the burned-area data sets. Despite their larger number of free parameters, the three 32-region optimisations achieve a slightly lower residual model-data misfit than the three other global cases. The optimisations against GFED-SF agree slightly less well with the corresponding observed fire frequency, as indicated by a somewhat higher value of σ_M than, for example, the optimisations against GFED3.

Comparison with observed fire frequency
We first consider a comparison between the multi-year mean fire frequency of the optimised model and that of the burned-area data set used in the optimisation. A cross-comparison, where simulated fire frequency of the model optimised against one data set is compared to the two other burned-area data sets, can serve as an independent validation of SIMFIRE. The first thing to note is that the burned-area data sets show some large differences among each other. For example, L3JRC shows much higher values for the boreal zone than the other two, but lower values for the African savannas and grasslands. GFED3 and MCD45 are more similar, but are based largely on the same original satellite data from the MODIS sensor (Giglio et al., 2010). The algorithms used differ, however, which has a visible impact in some regions, in particular Australia. Here, simulations optimised against MCD45 (Fig. 3b) agree better with observations from GFED3 (Fig. 2a) than with those from MCD45 itself (Fig. 3a). It must also be taken into account that there can be considerable differences between the global products used here and regional inventories (Chang and Song, 2009), in particular for the boreal zone in the case of L3JRC (Giglio et al., 2010).
Some model-data differences, however, stand out. One is the region of the North American Great Plains, Southwest US and northern Mexico when using eight regions: it consistently shows higher modelled than observed fire frequency, with the exception of the optimisation against GFED3 (Fig. 2b) compared to observations from L3JRC (Fig. 4a). This regional mismatch is considerably improved for the 32-region cases (Fig. 5).
Another general difference is that the observations are more "spotty" (i.e. show more spatial variability) than the simulations, in particular in the humid tropics, for shrublands, and in the boreal zone. This feature occurs where the observed fire return time (the inverse of fire frequency) is longer than the observational period of no more than 10 yr (essentially all areas shown as blue, green or yellow). Because the optimisation is carried out globally, the model tries to fit a mean fire frequency over large areas with the same land cover category and a similar climate, effectively smoothing out small-scale features of the observations. Observed and simulated fire frequency including small fires are shown for the global case in Appendix C, Fig. C1. For some regions, there is a marked change in the spatial patterns compared to GFED3 (Central Asia, Siberia, Canada), with GFED-SF appearing more "spotty" than GFED3. Some areas show rather large differences (Colombia, Argentina, SE Australia), but overall the main difference is a marked increase in fire frequency across most of the tropics and the semi-arid mid-latitudes. This is reflected in a higher simulated fire frequency in those areas.

Optimised parameters
The optimised land cover scalar, either a(l_i) or a_d(l_i), shows some patterns that are consistent with the main global distribution of fire frequency previously discussed (Table 2 for the global case and for maximum population densities of 1 and 0.1 km −2 ). Those patterns are also consistent between the data sets used for optimisation. The highest values are found for savanna/grassland and shrubland, and low values for broadleaf forest and tundra. The values for urban/crop/natural vegetation mosaic are somewhat higher than for broadleaf forest, possibly due to some remaining presence of agricultural fires. The value for barren/sparsely vegetated is mostly weakly constrained (high uncertainties) due to low burned area. There are also rather large differences in the optimal parameters between GFED3 and MCD45 on the one hand, and L3JRC on the other.

Impact of population density on fire frequency
All optimisations with a maximum population density of 1 km −2 or higher against the standard burned-area data sets yield an exponential decline of fire frequency with population, i.e. Eq. (2) with negative values of e (Table 2; see Table A1 of Appendix A for further cases). Because the derived relative uncertainty of e is between 1 % and 15 % (i.e. e differs from zero by much more than two standard errors), we infer that the negative impact of population density on fire frequency is highly significant. Only for the two cases with a maximum of 0.1 km −2 is e positive, albeit with a much larger relative uncertainty. However, e is still significantly larger than 0, so that in this case we infer a significant positive impact of population density on fire frequency. The results for eight regions are similar to those with 32 regions (see Table A2). The optimisations against GFED-SF all yield an exponential decline of fire frequency as a function of population density, even for the case up to 0.1 km −2 , and in each case, the value of e is less than zero by many standard deviations (see Table A3).
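The significance criterion on e can be illustrated with a minimal fitting sketch (synthetic data and a single-predictor stand-in for the population term of Eq. 2; the actual SIMFIRE optimisation estimates e jointly with all other model parameters, so names and values here are illustrative only):

```python
import numpy as np
from scipy.optimize import curve_fit

# Stand-in for the population term of Eq. (2): fire frequency scales
# with exp(e * p), where p is population density in inhabitants km^-2.
# Parameter names and values are illustrative, not those of the paper.
def pop_response(p, f0, e):
    return f0 * np.exp(e * p)

rng = np.random.default_rng(42)
p = np.linspace(0.0, 100.0, 200)                 # population density
f_obs = pop_response(p, 0.05, -0.02)             # synthetic "truth" ...
f_obs = f_obs * rng.lognormal(0.0, 0.1, p.size)  # ... with noise

(f0_hat, e_hat), cov = curve_fit(pop_response, p, f_obs, p0=(0.1, -0.01))
rel_unc = np.sqrt(cov[1, 1]) / abs(e_hat)  # relative uncertainty of e
# "Highly significant" in the sense used in the text: e differs from
# zero by much more than two standard errors.
significant = abs(e_hat) > 2.0 * np.sqrt(cov[1, 1])
print(e_hat, rel_unc, significant)
```

With a small relative uncertainty of e, the recovered exponent is negative by many standard errors, mirroring the test applied in the text.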
A summary of the impact of human population density is shown in Fig. 6, differentiated by the maximum population density of each optimisation. All optimisations except those with up to 0.1 inhabitants per km 2 show a decline in fire frequency with increasing population density. In these cases, Fig. 6 shows the ensemble mean and 95 % confidence interval made up of the three optimisations with their respective posterior parameter uncertainties. For better comparison, the displayed sensitivity of simulated fire frequency to population density is shown as a normalised function where by definition the value at 0.1 km −2 equals 1. For maximum values between 1 and 100 km −2 , the inferred dependence declines earlier than for the global case, but only the result with up to 1 km −2 shows a significant difference (marked by non-overlapping confidence ranges). The decline starts earlier when the population density was capped at lower values, but only reaches moderate reductions in fire frequency at the maximum allowed value. A considerable decline (> 50 %) is only predicted for values between 40 and 70 inhabitants km −2 .

Fig. 6. Effect of human population on fire frequency as a function of population density after optimisation against satellite data. The results of optimisations for grid cells with up to 0.1 people km −2 (thick lines) are shown separately for each burned-area data product; here, only two of three optimisations were successful. Optimisations for up to 1, 10 or 100 inhabitants km −2 , or for all population densities, are shown as ensembles containing optimisations against all three burned-area data sets used in the analysis (thin lines, of which dashed lines: median, solid lines: 95 % confidence range). All functions were normalised to 1 at 0.1 inhabitants km −2 . Dashed vertical lines denote the maximum population density for the different optimisations for better visualisation.
For the cases with up to 0.1 inhabitants km −2 , only two optimisations were successful. The optimisation against GFED3 did not yield an uncertainty estimate for numerical reasons, because the condition number (the ratio of the largest to the smallest eigenvalue) of the Hessian at the minimum was larger than 10^15. For the other two optimisations, the uncertainty of each is much smaller than the difference between the two. We therefore show the results of the two optimisations separately (blue and green lines in Fig. 6). Both show only a moderate decrease in fire frequency towards zero population density for the most sparsely populated areas, far smaller than, and occurring at much lower population densities than, the frequently reported decrease in fire density. See Appendix D, Fig. D1 for the areas that enter these optimisations.
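This numerical failure mode can be guarded against explicitly; the following is a small sketch of such a check (with arbitrary example matrices, not the actual Hessian of the optimisation):

```python
import numpy as np

def covariance_from_hessian(hessian, cond_limit=1e15):
    """Invert the Hessian at the cost-function minimum to obtain the
    posterior parameter covariance, but refuse when the condition
    number (ratio of largest to smallest eigenvalue) is too large."""
    cond = np.linalg.cond(hessian)
    if not np.isfinite(cond) or cond > cond_limit:
        return None, cond  # no reliable uncertainty estimate possible
    return np.linalg.inv(hessian), cond

# Well-conditioned case: uncertainties can be derived.
cov_ok, cond_ok = covariance_from_hessian(np.array([[4.0, 1.0],
                                                    [1.0, 3.0]]))
# (Numerically) singular case, mimicking the failed GFED3 optimisation.
cov_bad, cond_bad = covariance_from_hessian(np.array([[1.0, 1.0],
                                                      [1.0, 1.0]]))
print(cond_ok, cov_bad is None)
```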

Sensitivity to historical population changes
In order to test whether our results are consistent with the view that increasing human population since the 19th century has caused a decline in fire frequency, we carry out a temporal extrapolation in which we combine observed burned area from either MCD45 or L3JRC with a normalised population function as shown in Fig. 6, combining the global case with the case up to 0.1 km −2 (sparse-population case), and keeping all other influencing factors constant. For the purpose of consistency testing, we also assume that the way population density is related to human impact on fire frequency remains the same across time. Grid cells dominated by agricultural fields were assigned modelled fire frequency, all others observed values (see Methods). We use Eq. (6), where i denotes the grid cell, t the year, d_s and e_s the parameters for the sparse-population case, and e_g that for the global case. From Eq. (6) we compute the historical fire frequency as described in Sect. 2.4.
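The logic of this extrapolation can be sketched as follows (the functional form and all parameter values below are illustrative placeholders standing in for Eq. 6, not the fitted values of Table 2):

```python
import numpy as np

# Illustrative normalised population response: a slight increase in the
# sparse-population regime below 0.1 inhabitants km^-2, an exponential
# decline (e < 0) above it. E_SPARSE/E_GLOBAL stand in for e_s and e_g.
E_SPARSE, E_GLOBAL, P_SPLIT = 1.5, -0.02, 0.1  # made-up values

def pop_factor(p):
    """Normalised to 1 at p = 0.1 inhabitants km^-2, as in Fig. 6."""
    p = np.asarray(p, dtype=float)
    sparse = np.exp(E_SPARSE * np.minimum(p, P_SPLIT))
    dense = np.exp(E_GLOBAL * np.maximum(p - P_SPLIT, 0.0))
    return sparse * dense / np.exp(E_SPARSE * P_SPLIT)

def historical_fire_frequency(f_obs_today, p_then, p_today):
    """Scale present-day observed fire frequency back in time by the
    ratio of the population factor then and now, all other factors
    (climate, fuel) held constant."""
    return f_obs_today * pop_factor(p_then) / pop_factor(p_today)

# Example grid cell: 0.05 fires/yr today at 20 inhabitants km^-2,
# extrapolated back to a historical density of 0.5 inhabitants km^-2.
f_then = historical_fire_frequency(0.05, 0.5, 20.0)
print(f_then)  # higher than today, since population suppresses fire
```

The key design choice, as in the text, is that observed present-day fire frequency anchors the extrapolation, and only the population factor varies through time.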
The results show a decline of fire frequency by about 14 % since 1800 (1−1/1.16 = 0.14) as a result of changes in population density, with most of the decline happening since 1950 (Fig. 7). There is a small uncertainty range, but a larger difference between the results with different burned-area data sets.
A remaining question is how the presence of agricultural waste burning affects the inferred results. As a test, we repeated the computation without replacing observed by modelled fire frequency for predominantly cropland areas. In this case, the median value of the confidence interval for MCD45 at 1800 changed from 1.137 to 1.124, and for L3JRC, also at 1800, from 1.169 to 1.150. This change is much smaller than the difference between the results with the two different burned-area products.

Discussion
A useful theoretical framework of fire regimes at varying stages of human settlement has been developed by Guyette et al. (2002), where three main regimes were observed: an ignition-limited one at low population density, an ignition-saturated one where fire frequency peaks, and a fuel-fragmentation regime where human activity starts to modify the landscape in a way that limits the spread of fire. Even though we were not able to optimise a peaked function directly, we can interpret the results with varying maximum population density in the light of this framework of distinct fire regimes.
As Guyette et al. (2002) note, their upper limit of 0.67 km −2 for the ignition-limited regime might be unusually high because their study area was located in a region of Missouri where lightning fires were rare and which was also dissected by numerous steep ridges and streams, thus containing many natural barriers to fire spread. Below this limit, the authors found an approximately fourfold increase in fire frequency from the lowest to the highest recorded population density. If there had been more lightning ignitions, the lack of ignitions by humans would have had no impact on fire frequency. In the absence of natural barriers, much larger fires would have been able to spread, so that very low numbers of ignitions could have caused substantial values of fire frequency. In either case, a human-ignition-limited regime would either have occurred at much lower population density, or not at all, because lightning fires would have been sufficient to create an ignition-saturated fire regime.
In our global analysis, based on spatial instead of temporal patterns, we only find an increase of about 20 % between the low end of population density and a threshold value of 0.1 inhabitants km −2 , where the threshold is of a similar magnitude as that found by Guyette et al. (2002). A possible explanation for the much lower increase from the lowest end of population density to the threshold is that only few fire-affected areas below the inferred average threshold of 0.1 km −2 experience an ignition-limited fire regime. This may be because lightning ignitions are common, or because many areas have so few natural barriers that very few human or lightning ignitions are sufficient to create an ignition-saturated regime. As a consequence, the increase in fire frequency up to 0.1 inhabitants km −2 is small. Between 0.1 and 10 km −2 , the ignition-saturated regime dominates, while above 10 inhabitants km −2 the dominant regime appears to be one of fuel fragmentation. These thresholds are naturally not fixed values but carry considerable uncertainty and are merely indicative of the order of magnitude.
Extrapolation of our results back in time yields an estimate of 14 % for the decline in burned area since 1800, or about the same since the late 19th century. While Marlon et al. (2008) do not put any numbers on their observed changes in fire frequency, inferred changes in biomass-burning emissions based on carbon-13 isotope measurements from Antarctic ice cores and mass balance calculations can be used as a quantitative proxy. Of these, methane isotope data indicate an approximately twofold increase since 1800 (Ferretti et al., 2005), but carbon monoxide (CO) data indicate a ∼ 70 % decline since the late 1800s (Wang et al., 2010). A further constraint is given by van der Werf et al. (2013), who used bottom-up calculations and atmospheric transport modelling to conclude that the strong decline in emissions reported by Wang et al. (2010) is difficult to reconcile with what we know about emission sources, and that emissions were likely not as high during historical periods. From this we conclude that a moderate decline in fire frequency and emissions, as suggested by this study, is in general agreement with other studies.
Most global fire models describe fire as a process that is explicitly limited by the number of human or lightning ignitions (Venevsky et al., 2002; Thonicke et al., 2010; Kloster et al., 2012; Li et al., 2012). In all four models cited, fire suppression is modelled exclusively via a reduction of the number of ignitions at high human population density. There is no mechanism allowing human population density to influence the average burned area per fire, and thus, by design, no possibility of an ignition-saturated regime, in which the number of ignitions can increase with no impact on the total area burned (but an increase in the number of fires). However, this study, like others before it (Archibald et al., 2009; Lehsten et al., 2010), demonstrates that fire frequency, which is proportional to the product of the number of ignitions and the average burned area per fire, shows very little tendency to decline when human population density goes to zero. Instead, most fire-prone regions appear to be either in an ignition-saturated or a fuel-fragmentation regime. Either the very few humans present in almost uninhabited areas can cause as much burned area as humans in moderately populated areas, or lightning strikes will always fill the void in ignitions if no humans are present.
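The distinction between ignition-limited and ignition-saturated behaviour can be made concrete with a toy calculation (made-up numbers; total burned area taken as the product of fire count and mean fire size, capped by the flammable area available in a season):

```python
def burned_area(n_ignitions, burnable_km2, free_fire_size_km2):
    """Toy model: each ignition burns free_fire_size_km2 unless the
    seasonally flammable area burnable_km2 is exhausted first, in
    which case extra ignitions only split it into smaller fires."""
    return min(n_ignitions * free_fire_size_km2, burnable_km2)

# Below saturation, burned area scales with ignitions (ignition-limited);
# at and above it, more ignitions leave total burned area unchanged.
areas = [burned_area(n, 10_000.0, 100.0) for n in (1, 10, 100, 1000)]
print(areas)  # [100.0, 1000.0, 10000.0, 10000.0]
```

In the saturated branch, fire frequency (burned fraction) is insensitive to the number of ignitions even though fire counts keep rising, which is the behaviour the cited models exclude by construction.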
One explanation for the difference in behaviour between fire density (with a frequently observed maximum in the range of 10 to 40 inhabitants km −2 ) and fire frequency is that fewer humans tend to mean larger fires. Archibald et al. (2010) found by far the largest fires in the most sparsely populated regions. Long-term observations for Canada show that lightning-caused fires had on average 2.4 times higher burned area than human-caused fires (Stocks et al., 2002) (based on lightning causing 85 % of burned area and 70 % of fires). The presence of humans in sparsely to moderately populated regions (up to ca. 10 inhabitants km −2 ) was found to cause an increase in fire number and a corresponding decrease in fire size (Syphard et al., 2007, 2009; Archibald et al., 2010).
This, and the ubiquitous presence of lightning-caused fires in unpopulated areas (Stocks et al., 2002) suggest that at the global scale, a human-ignition limited fire regime (Guyette et al., 2002;Bowman et al., 2011) is rare or untypical. For the purpose of global modelling of fire, this suggests that a reasonable approach would be to assume that the spatiotemporal density of ignitions is always sufficient to start enough fires to burn the available fuel at some time (e.g. Thonicke et al., 2001), and fuel availability, climate factors and passive or active fire suppression by humans are the primary drivers of fire frequency. Alternatively, more mechanistic models should not only consider the impact of humans on the density of successful ignitions, but also on fire spread, possibly by describing fire as a process with multiple limiting factors (such as fuel availability, fuel moisture, fuel connectedness and ignitions). Even though explicit models of ignitions and human fire suppression can successfully reproduce global patterns of fire frequency (Li et al., 2012), we do not find empirical evidence that representing the processes separately is necessary in a global model.
A possible issue with the present analysis is that the burn scars of the smallest fires cannot currently be detected with the medium-resolution satellite sensors used to create currently available global burned-area data sets. If humans have a tendency to decrease the size of fires, then these global data sets might miss burned area particularly in areas with higher population density. This effect would be expected to counteract the derived negative relationship between population density and fire frequency when the small fires are included. In fact, the inferred parameter e when optimising against GFED-SF is consistently lower compared to the GFED3 optimisation (Table 2 and Appendix A, Table A3). The difference decreases when the optimisation is constrained to lower and lower population densities, and reverses for a limit of 1 inhabitant km −2 . This is in line with the assumption that the difference in e is caused by higher fire density (smaller fires) caused by more humans. However, the effect is not pronounced enough to reverse the generally negative impact of population density on fire frequency, because the inferred value of e is still highly significantly below zero. It is also interesting to note that for the case with the sparsest population density and GFED-SF, the inferred relationship is still negative (but there is no result for GFED3). However, since the data by Randerson et al. (2012) are produced by an indirect method and carry a considerable degree of uncertainty, we will use only the optimisations against the standard global data sets for the remaining discussion.
Based on the assumption that all main fire regimes that have occurred historically still exist on earth (Bowman et al., 2011), we have used a sensitivity analysis to infer that the currently observed patterns of fire frequency and human population are consistent with the view that the dominant cause of the observed decline in global fire activity since industrialisation was the further expansion of human populations, as opposed to the earlier increase in fire activity presumably caused by changes in climate (Marlon et al., 2008). We must note, however, that this kind of space-for-time substitution has its limitations. Low population density in some areas today may be the result of land abandonment, and the fire regime in such cases may not be comparable to the pre-industrial situation preceding population expansion. Also, phases of intensive land clearing tend to occur in tropical forests today, but occurred in temperate forests during the previous centuries (Mouillot and Field, 2005). Nevertheless, our results indicate a general consistency between the historical decline in fire frequency since the late 19th century and present patterns of fire regimes. They do not, however, reproduce the increase prior to this period, which is consistent with the view that this earlier increase was caused by climate change. Because the inferred impact of population density is small compared to the large historical changes in population density, we can expect that future climate change will have a major impact on fire frequency, even with further substantial changes in human population.
A remaining issue is that SIMFIRE with eight regions overestimates fire frequency in a number of regions, such as the shrublands and grasslands of the western US and northwestern Mexico (Figs. 2-4). A possibility is that land management in those land cover types differs substantially between major regions. Differentiating some land cover classes by socio-economic region improves model-data agreement substantially. However, the simple approach chosen here for the global scale necessarily cannot capture many of the complex interactions between socio-economic factors and fire activity. Determination of parameters of a more complex model that is able to capture some of those processes in more detail remains a significant challenge. Nonetheless, we find that even for the 32-region case, parameter e remains highly significantly below zero, leading to the same conclusion that the dominant influence of humans on the global scale is to suppress fires.
The optimisation presented here was designed to capture only the explicit effect of human population density on fire frequency (Eq. 1 or 2), all other factors remaining equal. However, humans also influence vegetation cover (represented by FAPAR), creating an implicit dependence. We tried to minimise the impact of this implicit dependence on the results by excluding predominantly agricultural areas, where the human impact on FAPAR is expected to be largest. However, we also find that the results for a maximum density of 1 inhabitant km −2 , where human impact is expected to be minimal, are qualitatively similar to those that include areas with much higher population densities, including the global cases. If simulations with minimal indirect dependence of fire frequency on population density yield qualitatively the same results as those where such indirect effects must be present, we must conclude that these indirect dependencies are unlikely to have a major impact on the results.

Conclusions
We present the first global analysis showing that the predominant effect of increasing population is to reduce fire frequency, except for extremely sparsely populated areas, where the effect is slightly positive. Posterior uncertainty analysis and variation of the remotely sensed burned-area data both indicate that the results are robust. Our findings suggest that, at least to first order, wildfire should be considered a process that is not limited by ignitions, but rather one that is profoundly modified by humans through active land management, such as deliberate burning, active fire suppression, or landscape fragmentation. Overwhelmingly, these activities appear to reduce rather than enhance fire frequency. This has consequences for the way we perceive the problem of landscape fires. For example, future climate change need not lead to increased fire risk, because of the multitude of fire-reducing impacts of human activities. Also, models aimed at simulating future fire risk should take into account both climate and demographic variables. While the exact mechanisms still need to be explored, such models should allow for the existence of ignition-saturated fire regimes.

Optimal parameters for additional cases
See Tables A1 and A2. The case with 32 regions (Table A2) is defined by a combination of the standard eight land cover classes (Fig. B1), three of which were differentiated by the nine socio-economic regions shown in Fig. B2.

Appendix B
Figure B1. Land cover classes used for the standard optimisations. Regions that were excluded from the optimisation are shown in grey.