The long-solved problem of the best-fit straight line: application to isotopic mixing lines

It has been almost 50 years since York published an exact and general solution for the best-fit straight line to independent points with normally distributed errors in both x and y. York’s solution is highly cited in the geophysical literature but almost unknown outside of it, so that there has been no ebb in the tide of books and papers wrestling with the problem. Much of the post-1969 literature on straightline fitting has sown confusion not merely by its content but by its very existence. The optimal least-squares fit is already known; the problem is already solved. Here we introduce the non-specialist reader to York’s solution and demonstrate its application in the interesting case of the isotopic mixing line, an analytical tool widely used to determine the isotopic signature of trace gas sources for the study of biogeochemical cycles. The most commonly known linear regression methods – ordinary least-squares regression (OLS), geometric mean regression (GMR), and orthogonal distance regression (ODR) – have each been recommended as the best method for fitting isotopic mixing lines. In fact, OLS, GMR, and ODR are all special cases of York’s solution that are valid only under particular measurement conditions, and those conditions do not hold in general for isotopic mixing lines. Using Monte Carlo simulations, we quantify the biases in OLS, GMR, and ODR under various conditions and show that York’s general – and convenient – solution is always the least biased.


Introduction
A common analytical task in the physical sciences is to find the true straight-line relationship underlying independently measured points with normally distributed measurement er-rors in both the ordinate y and abscissa x.The literature on this topic is profuse, but much of it was outdated even before it was written.Looking through it, one discovers a menagerie of least-squares regression methods, none strictly appropriate to the task at hand.What one is unlikely to find, unfortunately, is the general, exact, and convenient least-squares solution published in 1969 by geophysicist Derek York (York, 1969).
York was motivated by rubidium-strontium isochrons but his landmark solution was universal.Unfortunately, while it became standard in geophysics, York's solution has remained largely unknown to the broader scientific community: York's paper has been cited almost 2000 times within the geophysical literature and only about two dozen times in all the rest of the scientific literature combined.Meanwhile, the number of papers and chapters expounding on the subject as if the solution did not exist is staggering.Examples include widely consulted books like Biometry (Sokal and Rohlf, 1995) and Numerical Recipes (Press et al., 2007) as well as articles in fields as diverse as anthropology (Smith, 2009), water resources (Hirsch and Gilroy, 1984), clinical chemistry (Stöckl et al., 1998), marine biology (Laws and Archie, 1981), aerosol science (Leng et al., 2007), and astronomy (Feigelson and Babu, 1992).
One scientific problem wanting for York's solution is the isotopic mixing line.In 1958, Charles Keeling (Keeling, 1958) introduced the idea of using an isotopic mixing line to determine the stable isotopic signature of a trace gas source.If the isotopic composition δ of a trace gas with a net source or sink is plotted vs. the inverse of its mixing ratio in air c, the points describe a straight line whose y intercept gives the desired source or sink signature.The "Keeling plot" became a key method in isotope ecology and biogeochemistry Published by Copernicus Publications on behalf of the European Geosciences Union.and is regularly used, for example, to determine the isotopic composition of respired CO 2 .Unaware of York's general least-squares solution, researchers concerned with Keeling plots have debated which of the more widely known, specialcase least-squares regression methods is best.Some studies have concluded in favor of ordinary least-squares regression (OLS) (Zobitz et al., 2006;Kayler et al., 2010), others in favor of orthogonal distance regression (ODR) (Ogée et al., 2003;Miller and Tans, 2003), and still others in favor of geometric mean regression (GMR) (Pataki et al., 2003;Miller and Tans, 2003;Kayler et al., 2010).The disagreement arises because in each of these special cases, the measurement conditions influence the fit line bias.
Here we introduce the reader to York's solution and its practical application, using the isotopic mixing line as a concrete example.We then use Monte Carlo simulations to precisely quantify the biases inherent to York's solution and to the popular special-case regression methods under various common (and some uncommon) conditions.In Appendix A, we provide a short, fast, and easily implemented algorithm for computing York's best-fit slope and intercept, as well as their associated uncertainties.

The taxonomy of straight-line fitting methods
The goal of straight-line fitting is to retrieve the true straightline relationship between two variables, x and y, from their measured values, x and ŷ (hats will denote measured values throughout this paper).Though it is common to neglect the error in x, doing so can lead to a biased fit, as both x and ŷ are measurements and therefore corrupted by error at some level.The nature of that error might vary from point to point (a situation known as heteroscedasticity), and it might also be that the error in xi is correlated with that in ŷi , where the subscript i specifies a measurement pair (i.e., a data point).Such correlation can arise, for example, if both x and ŷ are derived from the same quantity or measured by the same apparatus, and can be described by a (Pearson's) correlation coefficient r that might also vary from point to point.Finally, it is conceivable that the errors in the various xi might be correlated with one another, and similarly for the ŷi ; that is, the points might not be independent.
If the points are independent of one another and their errors are normally distributed, then the problem can be treated by least-squares estimation (LSE), which is equivalent to maximum likelihood estimation (MLE) (Myung, 2003) in this situation.Most of the literature on straight-line fitting concerns LSE, as it is appropriate to the vast majority of straight-line fitting problems.
York's solution is the general LSE solution: his equations provide the best possible, unbiased estimates of the true intercept a and slope b in all cases where the points are inde-pendent and the errors are normally distributed.We refer to this general solution as exact because no approximation is involved in its derivation, although it cannot be written analytically and must therefore be obtained in practice by numerical iteration.In contrast, the LSE methods commonly used and debated -namely OLS, ODR, and GMR -provide unbiased estimates only in very specific special cases in which the solution can be written analytically.York (1966York ( , 2004) ) considers those situations mathematically and shows that his equations reduce algebraically to OLS, GMR, etc. when appropriate.
Before discussing OLS, ODR, and GMR further, we should note that each is known by other names that add confusion to the literature (York, 1966;Hirsch and Gilroy, 1984): OLS is called "the regression of y on x", ODR is called "major axis regression", and GMR is called "reduced major axis regression"; ODR has also been called "least normal squares" and "the line of closest fit", while GMR has also been called "the line of organic correlation", "the unique solution", and "the equivalence line".The methods are also often categorized, with methods that consider error only in ŷ (e.g., OLS) being called "Model I" regressions and those that consider error in both x and ŷ (e.g., ODR, GMR) being called "Model II" regressions (Sokal and Rohlf, 1995).
OLS is by far the most widely known fitting method.The OLS fit line is unbiased only when there is negligible error in x and when the error variance for the ŷi does not vary with i.In this case, the problem reduces to minimizing the sum of the squares of the vertical distances of the points from the fit line.A variant known as weighted least-squares allows the error variance for the ŷi to vary with i.
The ODR fit line (Pearson, 1901) is unbiased only when the error variances for the xi and the ŷi are equal and independent of i and when the error in xi is uncorrelated with that in ŷi .(The technique Pearson invented to arrive at the ODR fit line is called principal components analysis, which is widely used in its own right.)In this case, the problem reduces to minimizing the sum of the squares of the perpendicular distances of the points from the fit line.However, the condition of equal error variances is almost never satisfied in reality.The excessively restrictive nature of ODR is highlighted by the fact that the ODR result is not invariant under a change of scale.In other words, if one scales the y axis by a factor of 10, the fit line slope does not scale by a factor of 10.This flaw led to the development of GMR (Kermack and Haldane, 1950), which effectively performs ODR on transformed coordinates: the xi divided by the standard deviation of the xi and the ŷi divided by the standard deviation of the ŷi (not by the standard deviations of their errors).However, GMR is also highly restrictive, as its fit line is unbiased only in the peculiar circumstance that the variance of the x error divided by the variance of the xi is equal to the variance of the ŷ error divided by the variance of the ŷi .
Neither OLS, ODR, nor GMR uses estimates of the actual measurement uncertainty in its determination of the best-fit line.
A superior fitting method, called fitexy, is provided as an algorithm in Press et al. (2007) and its earlier editions.Press et al. (2007) started (unknowingly) from a similar point as did York (1969), seeking to minimize a χ 2 function that is identical to S in Eq. (2) of York (1969) if the correlation coefficient there is set to zero.However, rather than taking advantage of York's (1966York's ( , 1969) ) algebra, Press et al. (2007) treat the problem as one of nonlinear minimization.This approach leads them to a more complicated (and slower) algorithm than one based on York's solution (such as that provided here in Appendix A).Moreover, fitexy is less general than York's solution in that it does not allow the errors in xi and ŷi to be correlated.The fitexy method has sometimes been confused with ODR in the literature (Ogée et al., 2003;Pataki et al., 2003).
When the errors in any or all of the xi and ŷi are known to be distributed non-normally, then, strictly speaking, LSE does not apply and one should retreat to the even more general formalism of MLE.However, the bias introduced into York's solution by non-normal error distributions is not always a problem, as we show in Sect.4.1.The bias will vary on a case-by-case basis, but if the correct distribution is known then the significance of the bias can be estimated fairly simply by Monte Carlo simulation, as we do here.
If LSE must be rejected, MLE may or may not be tractable.MLE requires that the correct error distributions be written analytically and that a useable expression be derived for the likelihood function L appropriate to those distributions.We say "useable" because simply applying the definition of L to the case of a straight line with errors in both x and ŷ yields an expression that requires knowledge of the true values x i and is therefore useless in and of itself.Expressing the likelihood function in a form that does not include the true values x i was essentially York's first step in deriving his solution (see Eqs. 7 through 14 of York, 1966, in which S ∝ − ln(L)).

A note on natural variability
York's method deals with the situation in which there is a linear relationship between the true values x and y but those values are measured with random error.When the scatter of the measured values x and ŷ about a straight line exceeds the uncertainty attributed to the measurement technique, the excess scatter is attributed to "natural variability".Natural variability may sometimes be describable as just another part of the measurement error, that is, as a stochastic process that intervenes between the quantity of interest and the measurement of that quantity.For example, a technique called eddy covariance is commonly used to estimate the flux of CO 2 through the two-dimensional plane overlying an ecosystem based on a single-point measurement on a tower, and most of the noise in the estimation comes not from the instrumental measurement uncertainty but from the natural (turbulencedriven) variability in the flux past that single point relative to the flux through the whole plane (Wehr et al., 2013).If the natural variability in x and the natural variability in ŷ are describable as normally distributed measurement error and can be characterized independently (along with any correlation between them), then York's method can be applied and is likely to be very useful.
However, it is often the case that the natural variability is not well characterized, or that it is not well described as additional measurement error.In this case, we argue that one cannot proceed to determine the best-fit line or even to define what "best-fit" means.In general, one can view natural variability as variation in the true x-y relationship due to the influence of other factors that are not controlled for.Thus, a Keeling plot with natural variability is like many true mixing lines all superimposed on the same plot (one line for each set of influencing factors).It is therefore pertinent to consider which true line one is looking for.To define that line is, in effect, to characterize the natural variability in x and y.
If one is interested not in the x-y relationship per se (i.e., not in an intercept or slope) but simply in predicting the most likely value of ŷ given x for the particular data that were sampled and put on the plot, then OLS is the proper fit to use.If differences among the various fit methods are not large enough to matter to the scientific question being posed, then OLS is again a sensible choice, owing to its simplicity.

The isotopic mixing line problem: Keeling and Miller-Tans plots
A Keeling plot is a scatterplot of the stable isotopic composition δ of a trace gas (typically CO 2 ) vs. the inverse of its mixing ratio c in air (Fig. 1).The isotopic composition δ is defined as a relative deviation from a reference material: where R is the ratio of the rare isotope (e.g., 13 C) abundance to the most common isotope (e.g., 12 C) abundance, and R ref is that ratio for the reference material.The standard units for δ are ‰ (parts per thousand), while those for c are ppm (parts per million).It is easy to show that in the case of a source or sink changing the trace gas mixing ratio in the atmosphere, the y intercept of a straight-line fit to a Keeling plot gives the isotopic signature (i.e., the δ) of that source or sink.To wit, if the subscripts t, 0, and s refer to the atmosphere at some time t, the atmosphere at time 0, and the source, respectively, then Multiplying the Keeling fit line equation δ = a + b(1/c) through by c gives an alternate straight-line equation, δc = b + ac, which can be fit to a plot of δc vs. c, called a Miller-Tans plot (Miller and Tans, 2003).In that case, it is the slope www.biogeosciences.net/14/17/2017/Biogeosciences, 14, 17-29, 2017 of the line that gives the source or sink signature.(Ironically, Keeling reported the inverse relationship between δ and c in 1958 but no "Keeling plots" appear in any of his early papers, nor does he discuss straight-line fitting per se; he chose instead to show curved fit lines to plots of δ vs. c.) In many ecosystems, the source-sink signatures of interest differ from one another by just 1 ‰ or less (Bowling et al., 2014), so that Keeling or Miller-Tans plot fit biases of 0.1 ‰ can be important.
The first studies to consider the effect of error in c on isotopic mixing line fits (Miller and Tans, 2003;Pataki et al., 2003;Ogée et al., 2003) advocated use of ODR or GMR on the grounds that OLS is biased by neglecting error in c; however, they quantified only the differences between OLS and the other methods rather than their true biases.Zobitz et al. (2006) then examined the true biases through Monte Carlo simulations and reported that OLS was negligibly biased for all measurement conditions and that the difference between OLS and GMR was in fact due to bias in GMR.Kayler et al. (2010) later revisited the issue and reported that OLS could be non-negligibly biased for both low (< ∼ 50 ppm) and high (> 1000 ppm) CO 2 ranges, depending on the measurement conditions, and they advocated use of GMR on a Miller-Tans plot for the most accurate fits when the CO 2 range is high.One goal of the present article is to inform readers that there is a single general fit method (York's) that is best in all measurement scenarios, so that they do not need to make a choice among biased methods or to switch between those methods depending on the conditions.Another goal is to detail the conditions under which York's general solution can be satisfactorily approximated by OLS, as we recognize that OLS will likely remain more accessible and familiar to most researchers.
Meanwhile the standard deviation of the (not really normally distributed) measurement errors in the Keeling abscissa 1/ ĉ i is given by Eqs. ( 1) and ( 2) are standard expressions for the propagation of uncertainty through the operations of multiplication and division.
For a Keeling fit, the correlation coefficient r i should be set to zero for all i (unless there is correlation between the measurement errors in δ i and c i ).For a Miller-Tans fit, r i is defined as In Eq. ( 3), cov (A, B) denotes the covariance of A and B, the angle brackets denote expectation value, and we have decomposed the measured values as ĉi = c i + c i , δi = δ i + δ i , and (δc) i = (δc) i +(δc) i , where primes denote measurement errors and unaccented variables denote true values.Note that the expectation value here is not an average over the data points, which would involve variations in the true δ and c.
It is rather the expectation value for a given data point, i.e., what the value for the data point would be if it were an average of many measurements with the true δ and c held constant.
It follows immediately from the above decompositions that and so where the second line follows because none of the variables is correlated with another.Thus the correlation coefficient is given by So for the Keeling plot, we have xi = 1/ ĉi with uncertainty θ i , ŷi = δi with uncertainty η i , and r i = 0.For the Miller-Tans plot, we have xi = ĉi with uncertainty ε i , ŷi = δi ĉi with uncertainty ϕ i , and r i = δi ε i /φ i .The approximations in Eqs. ( 1), (2), and ( 5) are usually excellent.Despite the precision afforded our Monte Carlo simulations by using 2.5 × 10 7 data points, we detect bias due to these approximations only under extreme circumstances (see Sect. 4.1).
On a modest 2009-model notebook computer, using the Igor Pro programming language (WaveMetrics, Inc.), 5000 five-iteration York fits to a 5000-point mixing line took 215 s (compare OLS at 14 s and fitexy at 1410 s), while 100 000 such fits to a 20-point mixing line took just 25 s (compare OLS at 66 s and fitexy at 150 s).For small numbers of fits, all of these methods are effectively instantaneous.It is perhaps ironic that for most real-world Keeling or Miller-Tans plots, involving fewer than 100 points, solving the York equations is actually faster than OLS.

Monte Carlo simulations
We tested the York, OLS, ODR, GMR, and fitexy methods using simulated measurements of an isotopic mixing line typical of CO 2 respiration in a forest.In this way, the true line is known and the fit bias can be assessed.Our true mixing line had a source isotopic signature of exactly −25 ‰, a background c of 380 ppm, and a background δ of −9 ‰. (A Keeling plot of this line has a slope of 6080 and a y intercept of −25.)We simulated measurements of this line for a variety of mixing ratio ranges and measurement precisions.For each set of conditions, we generated 5000 "measured" lines, each comprising 5000 points spread evenly over the mixing ratio range c, with normally distributed, uncorrelated errors added to c (with standard deviation ε) and to δ (with standard deviation η).The same values of ε and η were used for all points.We then expressed each line as a Keeling plot and as a Miller-Tans plot and fit each plot by each of the methods.
Real-world mixing line plots are not likely to contain 5000 points each, but using a large number of points per plot can be important when precisely quantifying fit bias in an ensemble of lines.A demonstration of this point is provided in Appendix B for the interested reader.Uninterested readers will be content to know that using more points per plot does not add any new bias to the results -although it might clarify existing bias from an inappropriate fit model, which is what is going on in the discussion under "Sample size effect" in Kayler et al. (2010).

Forest air measurements
In addition to our Monte Carlo simulations, we analyzed 429 Keeling plots composed of real measurements, specifically nighttime measurements of the mixing ratio and 13 C composition of CO 2 in the air within and above a forest canopy.The intercept of such a Keeling plot should give the isotopic composition of nighttime respiration.The measurements were made at each of six or seven heights on a 29 m tower every 40 or 45 min from May through October of 2011, 2012, and 2013, as described elsewhere (Wehr et al., 2013(Wehr et al., , 2016;;Wehr and Saleska, 2015).Each Keeling plot was made from one night's data and included about 50 points.The measurement precisions were ε = 0.05 ppm and η = 0.02 ‰.We also did an alternate analysis in which additional random noise of 0.2 ppm and η = 0.3 ‰ (corresponding to more common spectroscopic instrumentation) was added to the measurements.

Comparison of fit biases using Monte Carlo simulations
Isotopic source signatures retrieved from our simulated Keeling plots for c ≤ 50 ppm are reported in Table 1.The main cells contain three numbers: the upper is the York result (in bold), while the middle is the OLS result and the lower is the GMR result.We have not tabulated the fitexy or ODR results; the fitexy results are discussed below, and it is enough to say that the ODR results were no better than the GMR results (ODR being a seriously flawed precursor to GMR, as explained in Sect.2.1).The units of ‰ here (and throughout this paper) are simply the units of δ and not an indication of relative error in the results.Numbers in parentheses represent the standard error in the last digit, calculated from the distribution of retrieved values rather than by York's equations, although the two results did agree closely (see Sect. 4.3).We have omitted our Miller-Tans results from Table 1 because there were no significant differences between the Miller-Tans and Keeling results for the York and OLS methods and because the GMR results were very poor for both plot types.Indeed, as reported by Zobitz et al. (2006), GMR produces highly biased fit lines (that is, the retrieved source signature falls much more than 3 standard errors from −25 ‰).For CO 2 ranges less than 50 ppm, the GMR bias is non-negligible in practical terms (> 0.1 ‰) unless the measurement uncertainty in δ is extraordinarily low (≤ 0.01 ‰).OLS does better than GMR in most of the tested circumstances because the relative error in the c measurement is usually much less than the relative error in the δ measurement, but OLS still involves important levels of bias when ε is large or c is small.Table 2. Bias in retrieved isotopic source signatures from Keeling and Miller-Tans line fits for c ≥ 100 ppm and η = 0.2 ‰, expressed as differences from the true value of −25, in units of ‰.Each cell contains, from top to bottom, the York (in bold), the OLS, and the GMR result.1σ uncertainties in the final digit are given in parentheses, based on the standard deviation of the ensemble of 5000 fits.

ε (ppm)
CO 2 range (ppm) The York equations, however, produce unbiased Keeling and Miller-Tans fit lines for all conditions in the table.Because the emergence of high-frequency isotopic measurements is starting to raise the issue, we show in detail how some OLS-and York-retrieved isotopic source signatures compare at the lowest c in Fig. 2, where the error bars represent ±2σ , i.e., twice the standard error in the mean of 5000 fits.
Isotopic source signatures retrieved from our simulated Keeling and Miller-Tans plots for c ≥ 100 ppm are reported in Table 2. Again, the York method is by far the best, although the York fit lines do exhibit small but detectible biases for some sets of conditions here.The York Miller-Tans fits are biased by at most −0.020 ‰, while bias in the York Keeling fits is worse, reaching −0.204 ‰ when ε = 20 ppm (an exceptionally high value) and c = 100 ppm.This bias is still an order of magnitude less than the bias from any of the other methods under those conditions.The York Miller-Tans bias is due to the approximations made in Eqs.(1) and ( 5).We have confirmed this by comparing simulations with and without the approximations (a luxury not available with real data).The Keeling bias is due partly to the approximation in Eq. ( 2), but mostly to the non-normal error distribution in 1/ ĉ (see Sect. 3.1): a distribution that is asymmetric, with a nonzero mean.In Table 1, where ε (maximum 0.15 ppm) is always a small fraction of c (380 ppm), the skew of the error in 1/ ĉ is small and its effect on the fit negligible; however, in Table 2, where ε/c can be as large as 5 %, the skew becomes relatively large (the uncertainty on one side of a point is about 10 % larger than on the other side) and its effect on the fit becomes detectible in our simulations.The bias induced in the fit by the non-normal error distribution should increase as ε increases and as c decreases, which are the trends we observe.The preceding explanation is confirmed by the fact that when we alter our simulations by adding normally distributed  measurement error directly to 1/c rather than to c (and giving the correct information to the York fitting algorithm), we find that the York Keeling fits are completely unbiased (results not tabulated).Luckily, the York fit biases we report in Table 2 are very small considering the measurement uncertainties necessary to induce them and are unlikely to be the limiting source of error in any experiment.We also tested fitexy using our Monte Carlo simulations.As expected, the fitexy results were always identical to the York results when fitting to Keeling plots but were in error when fitting to Miller-Tans plots, because the latter plots involved correlation between the x and y errors.For example, with the fairly large measurement uncertainties ε = 0.2 ppm and η = 0.3 ‰, the fitexy Miller-Tans slopes were biased by −4.259 ‰ for c = 1 ppm and −0.027 ‰ for c = 10 ppm.

Comparison of fit biases using real measurements
The intercepts from OLS, GMR, and York fits to our measured Keeling plots are compared in Fig. 3, as a function of the CO 2 range.Given that our Monte Carlo simulations show the York fit to be unbiased, we can use the difference between the OLS (or GMR) and York intercepts as a proxy for the bias in OLS (or GMR).In agreement with our Monte Carlo results, Fig. 3 shows that for the original measurement uncertainties of 0.05 ppm and 0.02 ‰, GMR is negligibly biased (that is, by less than 0.1 ‰) only for CO 2 ranges above about 25 ppm, while OLS is negligibly biased for all CO 2 ranges.Also in agreement with our Monte Carlo results, the figure shows that if the measurement uncertainties are increased to the more common values of 0.2 ppm and 0.3 ‰, then GMR is never negligibly biased, while OLS is negligibly biased only for CO 2 ranges above 10 ppm.Note that the scatter in Fig. 3 is due to the fact that unlike our simulated data points, our real measured data points were not evenly distributed throughout the CO 2 range; in some of the measured Keeling plots, almost all of the points were clustered in a small portion of the range, leading to a higher bias.

4.3
Estimating the fit errors and the goodness of fit York et al. (2004) provide compact equations not only for the slope and intercept but for their standard errors as well.They  further show that these error estimates are algebraically identical to those of the more general error formulation of MLE.Note, however, that while the York equations for the slope and intercept are exact (when the measurement uncertainties are normally distributed), the York-MLE estimates of the errors in the slope and intercept are approximations (Titterington and Halliday, 1973).In Table 3, we compare York's error estimates against the standard deviations of the Keeling plot intercepts retrieved from our Monte Carlo simulations.York's error estimates agree extremely well with the Monte Carlo results except when the measurement error variances are so large as to approach the total variances in x and y (i.e., when ε/ c and η/ δ approach 1).Under those conditions, the agreement is nonetheless within 33 %.The errors estimated by fitexy (not tabulated) were slightly higher than those estimated by the York equations, tending to be farther from the Monte Carlo results for small ε/ c (when the line is well-constrained) and closer to the Monte Carlo results for large ε/ c (likely by coincidence).The York and fitexy estimated errors were of the same order of magnitude, however, and their disagreement may relate to a subtle point raised in York et al. (2004) concerning whether the errors are estimated using the original data points or what York calls the "adjusted" data points, which are the fit method's reconstruction of the true, error-free data points.
The standard error is a measure of precision; it does not speak to how well the straight-line model represents the data -a concept known as goodness of fit.York et al. (2004) note that the goodness of fit of the York solution is estimated by the quantity S/(R − 2), where R is the number of points in the fit and S is given by This goodness of fit metric is a weighted reduced chi-squared statistic, which we denote here by χ 2 W , i.e., χ 2 W = S/(R − 2).χ 2 W is essentially comparing the deviations of the points from the fit line to the assigned measurement error standard deviations.If the variables x and y are in fact related by a straight line, and if the assigned measurement errors are correct (and normally distributed), then χ 2 W will equal 1.A value of χ 2 W significantly different from 1 indicates the failure of one or both of those assumptions, where we suggest that significance be defined relative to the standard error in χ 2 W , which depends only on the number of points per fit, R, and is given by We have tested this equation by Monte Carlo simulation and found it to be correct to within the simulation precision of roughly 1 part in 200.
The right-hand column of Table 3 gives the mean value of χ 2 W for each 5000-line ensemble described in the table.With R = 5000, our σ χ equals 0.02, giving a standard error in our mean χ 2 W of 3 × 10 −4 .Even at this level of precision, χ 2 W indicates that our fits are almost always good, as expected given the nature of our simulations.However, under those conditions for which the non-normal distribution of 1/c is a source of detectible bias, i.e., when ε = 20 ppm, we see the expected deviation in χ 2 W , which reaches a low of 0.986.In practice, without many fit lines of many points each, such a small drop in χ 2 W would be undetectable, as would the associated slope and intercept biases.

Another application of York's solution: comparing two instruments
When comparing measurements of the same quantity by two different instruments, it is common to plot the values obtained by one instrument against those obtained by the other, so that the relative bias between the instruments can be determined from a straight-line fit to the plot.Monte Carlo simulations similar to those used for our isotopic mixing line example confirm that OLS and GMR may incorrectly estimate that bias.For example, if an old, unbiased instrument is being replaced by a new, also-unbiased instrument whose precision is 5 times better, and if the two instruments are compared over a trial period in which the measured quantity varies over a range that is 20 times greater than the precision of the old instrument, then OLS (GMR) will incorrectly indicate that the new instrument is biased low by 4 % (2 %) of the reading.The York equations will correctly indicate no relative bias.

Conclusion
We have shown that the general least-squares solution for the best-fit straight line, published by Derek York in 1969, is the least biased estimator of the isotopic signature of a trace gas source from a Keeling or a Miller-Tans plot, regardless of the measurement conditions.In contrast, the popular regression methods considered in the literature are unbiased only under particular, often unrealistic conditions.The isotopic mixing line illustrates the virtue and convenience of York's solution, which is applicable to line fitting problems in many scientific disciplines.We have provided a short, fast pseudo-code algorithm for computing York's solution and derived simple equations for the required measurement error and correlation parameters in the case of a Keeling or Miller-Tans plot.Being both accurate and convenient, York's solution supersedes all other least-squares straight-line fit methods.

Data availability
No original measurements were used.The forest measurements can be accessed as described in the original study (Wehr et al., 2016).The simulations used the code described in Appendix A. The black curve was obtained from 10 6 two-point plots.The red curves were obtained using the York equations on 10 6 5-point or 20-point plots.The blue curves were obtained using OLS on 10 6 5-point or 20-point plots.The heights of the curves have been adjusted independently for presentation and convey no meaning.

Figure 1 .
Figure 1.An example Keeling plot from the set described in Sect.3.3, showing 55 measurements made on the night of 25 May 2011, spanning a CO 2 range c = 52 ppm, with measurement error standard deviations of ε = 0.05 ppm and η = 0.02 ‰.

Figure 2 .
Figure 2. Mean isotopic signatures retrieved from ensembles of 5000 simulated Keeling plots (each containing 5000 points) using the York and OLS methods, for CO 2 ranges from 1 to 10 ppm and a true isotopic signature of −25 ‰.The standard deviations of the random noise added to the simulated measurements of c and δ are inset.Error bars show 2 standard errors.Note the differing scales for the ordinate.

Figure B2 .
Figure B2.Distributions of the y intercepts retrieved from Keeling plots with a true intercept of −25 ‰ (indicated by the vertical line).The black curve was obtained from 10 6 two-point plots.The red curves were obtained using the York equations on 10 6 5-point or 20-point plots.The blue curves were obtained using OLS on 10 6 5-point or 20-point plots.The heights of the curves have been adjusted independently for presentation and convey no meaning.

Table 1 .
and with ε = 1, 5, and 20 ppm.Bias in retrieved isotopic source signatures from Keeling line fits for c ≤ 50 ppm, expressed as differences from the true value of −25, in units of ‰.Each cell contains, from top to bottom, the York (in bold), the OLS, and the GMR result.1σ uncertainties in the final digit(s) are given in parentheses, based on the standard deviation of the ensemble of 5000 fits.

Table 3 .
Monte Carlo (MC) and York estimates of the error in the isotopic source signature retrieved from an individual Keeling plot, in units of ‰.The mean goodness of fit χ 2 W over each 5000-fit ensemble is also shown.
Difference between the fit intercept obtained by OLS or GMR and that obtained by York's equations, for 429 real measured Keeling plots with measurement uncertainties of 0.05 ppm and 0.02 ‰ and with CO 2 mixing ratio ranges between 2 and 335 ppm.Also shown is the difference obtained for Keeling plots in which random computer-generated noise of 0.2 ppm and 0.3 ‰ was added to the original data.The grey region represents biases that are negligible for most practical purposes (< 0.1 ‰).