Contents - Previous - Next

Chapter 5: Estimates of error

5.1 Types of error
5.2 Variance and covariance
5.3 Ratio plots and product plots
5.4 The variance of CO2 production
5.5 Optimal design to minimise the variance
5.6 Variance and the two-point method
5.7 Bias in the estimation of pool size
5.8 Precision and accuracy in the multi-point and two-point methods
5.9 Constancy of errors
5.10 Data and residual plots
5.11 References


Tim Cole
Mike Franklin
Andy Coward

5.1 Types of error

The calculations required to obtain the pool sizes and flux rates of the two isotopes have been described in Chapter 4. These quantities together with the relevant fractionation rates, lead to an estimate of the CO2 production rate in individual subjects.

To be of most value, each estimate of CO2 production rate requires an associated standard error, to show how closely the data satisfy the assumptions made by the method, and to provide reassurance that the error is small. The calculation of the error assumes that the data provide an adequate fit to the model of constant flux rate and constant pool size, so that the error is a measurement of precision. Quite separately there is the possibility of bias in the estimate, due to imperfections in the model, which defines the accuracy of the estimate.

5.2 Variance and covariance

The fundamental equation for CO2 production rate ignoring fractionation is:


where kO and kD are the flux rates of oxygen and hydrogen respectively, and NO and ND are the body water pool sizes as estimated by 18O and 2H. (Note that the use of a prime in r'CO2 signifies production rates uncorrected for fractionation - see Chapter 2). The four quantities are assumed to be constant, and are estimated from the two regression lines of log isotope enrichment versus time. So long as there are at least 3 data points the estimates have standard errors, based on the departures from linearity of the observed log enrichments. It is assumed that the timing of each sample point is known to full accuracy, although in practice this may not be strictly true.

The departures from linearity are partly physiological, representing genuine changes in flux from day to day, and partly instrument measurement error. In the multi-point method, the two sources of error are combined in the residual variation about the regression line, whereas in the two-point method there is no residual variation.

5.2.1 Slope-intercept covariance

The regression of log isotope enrichment on time can take several forms, depending on whether the errors are assumed constant in enrichment units, or constant in coefficient of variation terms, or somewhere in between (see Section 5.9 and Chapter 4). In the simplest case a constant coefficient of variation is assumed, and kO and kD are estimated as the unweighted linear regression coefficients of log enrichment versus time. Each regression coefficient has a standard error which is obtained from the regression analysis.

Similarly, estimates of IO and ID are obtained from the exponentials (or antilog) of the intercepts from these regressions, and again they have their corresponding standard errors. It is convenient to express the enrichments of 18O and 2H as the fraction of dose per mole of body water. Then the intercepts IO and ID are by definition the reciprocals of the body water pools NO and ND, ie IO = 1/NO and ID = 1/ND.

Given the standard errors for the flux rates and pool sizes, it might be thought straightforward to combine the errors, using Equation 1 to obtain a standard error for the CO2 production. However there are complications, because of the way that the four parameters interact with each other.

Firstly the estimates of the slope k and intercept log I from each regression line are highly correlated (Fig 5.1). The fitted regression line passes, by definition, through the mean value of the data points, so that the line can pivot about its mean quite independently of the latter's location. Thus the mean log enrichment and the regression slope are uncorrelated with each other. However when the slope changes the intercept changes in the same direction - the slope k and intercept log I are correlated. (It should be emphasised that k is treated here as being positive even though it is actually negative - the negative sign is included in the equations). Thus the standard error of the ratio k/I (or equivalently the product kN) needs to reflect this covariance relationship.

From Fieller's theorem 1 the variance of k/I is given by:


Figure 5.1. Schematic illustration of slope-intercept correlation

where Var(k) and Var(logI) are the variances of k and logI, and r* is the correlation between k and logI. This correlation is given by:

where is the arithmetic mean of the sampling times and is the root mean square of the sampling times (ie square the sampling times, add them together, divide by the number and take the square root). This correlation lies between zero and +1, and is a function only of the chosen sampling times, not of the measured enrichments. This takes account of the covariance between k and I for each isotope, but there is the further problem that after fitting the two regression lines, the two residuals at a particular time tend to be correlated.

5.2.2 Correlated residuals

If the observed 18O enrichment is greater than the value predicted by the regression line, there is a high probability that the 2H enrichment will also be elevated. This correlation arises in two ways. Firstly, 2H and 18O are both influenced by water turnover, but only 18O is affected by CO2 production. So if the water turnover from one sampling time to the next is anomalously high say, then the latter hydrogen enrichment is depressed. The oxygen enrichment is also depressed, but less than for the hydrogen because only a fraction of the 18O is lost as water. The converse is true for low water turnovers. So, depending on the fraction of oxygen lost as water, the deviations of 2H and 18O from their fitted lines will be correlated to a greater or lesser extent.

The second source of the correlation is physiological, in that if energy expenditure (ie CO2 production) increases sufficiently, there is likely to be a corresponding increase in water turnover through sweating. Thus both the CO2 and H2O components of 18O may rise together, depending on the precise nature of the increase in expenditure. In practice though, this source is likely to be far smaller than that due to the slope-intercept interaction.

The estimate of error for CO2 production given by Coward et al 2 fails to take either of these sources of correlation between the residuals into account. The correlation cannot be predicted on a theoretical basis, and the best method of incorporating it in the estimate of the variance of CO2 production is by no means clear.

5.3 Ratio plots and product plots

CO2 production rate can be thought of as 18O turnover adjusted for 2H turnover. Thus much of the information about CO2 production is contained in the difference between the isotope flux rates (kO-kD). This can be estimated directly from the log regression of the enrichment ratio dO/dD plotted against time. The appeal of this 'ratio plot' regression is that the errors in 2H and 18O caused by water turnover fluctuations nearly cancel each other out, so that the fit to the line is much closer. In fact the linearity of the plot is simply a measure of the constancy of CO2 production.

To make use of the ratio plot, Equation 1 defining CO2 production has to be recast in terms of the ratio plot slope and intercept, kr and Ir say. (Note that although kr = kO-kD, the relationship between Ir, IO and ID is not a simple ratio). Results from the analogous 'product plot', ie dO-dD plotted on a log scale against time, are also required - here the slope and intercept are kp and Ip.

Although the correlation between residuals is much reduced by analysing the ratio and product plots, it is not removed entirely. There may still be the correlation between water turnover and CO2 production discussed earlier, albeit on a much smaller scale.

5.4 The variance of CO2 production

The assumption of exponential disappearance implies that:


where the suffix (*) is either D or O. The corresponding equations for the ratio plot and product are, respectively:




Equation 1 can then be expressed in terms of Ir, he, Ip and kp as follows:






Using Equation 2 the variance of Equation 7 is given by:





and r* is the slope-intercept correlation given in Equation 3.

Similarly the Var (*) indicate the variances (i.e. the squares of the standard errors) for each of the parameters hr, kp, log Ir and log Ip.

Equation 10 is rather intimidating. Fortunately it can be reformulated in terms of the residual variances of the two regression lines and , and the chosen sampling times ti, where i = 1...n:


Here is the mean of the sampling times ti and St is the corrected sum of squares of the t1, where:

The equation consists of terms which depend on n, and St It is clear that if both n and St are large, then the variance of CO2 production will be small.

5.5 Optimal design to minimise the variance

Equation 13 gives the required variance, but in addition it also defines the optimal choice of sampling times to minimise the variance. For example if the mean of sampling times, , is arranged to exactly equal Ar/Br, then the first term in Equation 13 becomes zero; alternatively if t = Ap/Bp the second term vanishes.

The ratio Ar/Br is given by (NO + ND)/(kONO + kDND), which is the reciprocal of a weighted sum of the flux rates kO and kD. Similarly Ap/Bp, which is the same as (NO - ND)/(kONO - kDND), is a reciprocal of a weighted difference of the two flux rates.

The importance of Equation 13 cannot be overstated - it indicates how to design doubly-labelled water experiments optimally, providing the smallest possible variance for the CO2 production. Typical values for Ar, Br, Sr and Ap, Bp, Sp, are first required for the subjects under study, and then by suitable choice of the sampling times, n and and St can be set to minimise Equation 13.

Based on the 30 sets of data provided for the workshop, the ratio Ar/Br averages about 10 days while Ap/Bp is about -1 days. This poses a problem, for in order to make both the second and third terms in Equation 13 vanish, t needs to be equal to 10 and -1 simultaneously.

The best choice of t is a compromise between these two values, the exact choice depending on the relative sizes of (BrSr) and (BpSp). If they are similar in size, then t should be set to the average, ie 4 or 5 days. The terms Sr and Sp represent the noise about the fitted ratio and product lines, and Sr is typically a tenth of Sp (although it varies widely about this figure). However Br is very much greater than Bp, so that in practice the ratio (BrSr)/(BpSp) for the workshop datasets is close to 6. This makes the optimal choice for t equal to 7.7 days, as compared with the actual mean of 7.5 days, indicating that the experiments were close to optimally designed on the whole.

This discussion has assumed that the flux rate k is known in advance of the experiment, which of course it is not. The optimal value for t can only be chosen approximately. To ensure that the second and third terms in Equation 13 are kept small, the value of St should be as large as possible, which requires the sampling times to be widely spaced. The possible range of times is bounded in both directions of course, at time zero at one end and the time when the signal gets lost in background at the other, so that the optimal design has one group of samples near the start and another group near the end, arranged symmetrically around .

The first term of Equation 13 is inversely proportional to the reciprocal of the number n of sampling times in the experiment. In the workshop datasets, where n averages 14, almost exactly 50% of the total variance was contributed by this term. If n were to be reduced to 5 (while keeping the other variables unchanged) then the first term's contribution would be increased 3 times and the total variance doubled. Put the other way, this indicates that using 14 sampling points, as opposed to 5, halves the variance emphasising the improvement in precision that multi-point sampling provides.

Another consideration is the timing of the first sample. It is argued above that the first sample should be near to zero time, but in fact this is not a strong requirement from the precision point of view. The term St is only of secondary importance compared to n and , and there are strong practical reasons for avoiding a sample immediately after dosing. According to Equation 13, a first sample as late as 1 or even 2 days after dosing would not have much impact on the variance of CO2 production so long as is chosen appropriately. Such a delay means that the pool size is being determined through extrapolation and hence increases the importance of selecting the correct model.

5.6 Variance and the two-point method

The use of Equations 10 or 13 to calculate the variance of the CO2 production rate requires knowledge of the residual variances about the ratio and product regression lines. This precludes its application to two-point data. However it is quite valid to substitute the known instrument measurement errors into Equation 13, where the errors on 2H and 18O, dD and dO are independent so they combine vectorially. Thus:


This variance is only a fraction of the total variance, and takes no account of day-to-day variation in water and CO2 production. Thus the variance estimated from Equation 13 will be a substantial underestimate of the true variance, and for this reason it cannot be used for experimental design purposes.

5.7 Bias in the estimation of pool size

The calculation of CO2 production rate (Equation 1) assumes that the flux rates of 2H and 18O are constant, leading to linear plots of log enrichment against time. It is also valid if the plots are nonlinear so long as the body water pool size N remains constant during the experiment. Water volume is likely to be constant with adults in non-clinical situations unless they are dieting or there is serious dehydration, but in small children who are growing rapidly or in pregnant women there is a chance that the water volume will increase during the study and introduce a bias. This bias shows itself as a curved plot of log enrichment versus time.

The pool sizes, obtained from zero time enrichments, are estimated from the intercepts of the fitted regression lines. In practice the plots may not be linear, but may oscillate systematically above and below the fitted lines. This phenomenon is called serial correlation or auto-correlation. When this happens, the intercepts are biased estimates of the zero time enrichments, leading to a corresponding bias in the estimate of CO2 production.

Systematic non-linearity in the data, and the magnitude of the bias, can be seen most easily from residual plots of the data (see figures in Chapter 11). The plots for 18o and 2H show the biases for the two water spaces separately, but the biases can also be obtained from the 'product' and 'ratio' plots. The advantage of using the product: and ratio plots is that the bias is partitioned into an average pool size effect (bias in Ip) and an effect due to the ND/NO ratio (bias in Ir). When the biases in the two water spaces are covariant, as they often are, the resulting bias in CO2 production rate is contained largely in the product plot, while the ratio plot bias is far smaller.

To see how the biases in Ip and Ir relate to the bias in

r'CO2. Equation 7 can be rearranged to give:


Differentiating Equation 15 and rearranging leads to:


where dr'CO2, dIp and dIr are the fractional or percentage biases in r'CO2, Ip and Ir respectively. If dIr is zero, Equation 16 shows that dIp needs to be halved and its sign changed to give the bias r'CO2 in CO2 production rate. From the analyses performed prior to the workshop we recommend that the value for dIp is given by the residual from the product plot about 5 to 8 hours post-dose.

Similarly, the bias in Ir is shown by a series of residuals from the ratio plot between 5 and 8 hours post-dose which are consistently non-zero. As there is a lot of noise on the ratio plot, a single large residual during this time should not be treated as bias, it may equally be due to measurement error. However if bias is present (i.e. dIr is non-zero), its contribution to Equation 16 needs to be taken into account. The value of the multiplier (kONO + kDND)/(kONO - kDND) is typically 9 or a little more, so that dIr should be multiplied by 9/2 to give the bias in r'CO2.

Thus a bias of say +2% in Ip leads to a -1% bias in r'CO2, whereas a +2% bias in Ir biases r'CO2 by -9%. In practice it is difficult to detect bias in the intercept of the ratio plot. Nevertheless Equation 16 makes clear that if the ND/NO ratio is at all aberrant, its percentage effect on r'CO2 is magnified up some 4 or 5 times.

Unlike the multi-point method, the two-point method explicitly uses an early enrichment to derive the pool size. As a result it avoids the possibility of bias in the estimation of pool size. However this gain in accuracy relative to the multi-point method is offset by a corresponding loss in precision, owing to the smaller number of samples.

Chapter 9 gives further coverage of the potential errors introduced by systematic and random variations in water and CO2 flow rates, and by errors in estimating ND/NO.

Contents - Previous - Next