



Confounding variables and evaluation design


The validity of a research design is a measure of the degree to which its conclusions reflect real phenomena.

"Internal validity" of a design refers to the extent to which the detected outcome changes can be attributed to the intervention or treatment rather than to other causes. Unless the internal validity of a design is high, the finding that a particular relationship is causal will not be particularly convincing. Some of the major threats to the internal validity (i.e. confounding factors) are summarized in table 1.2. The primary reason for the proper choice of a comparison group and for statistical adjustment techniques is to control for these threats as best as possible when randomized allocation is not feasible. The expression "gross outcome" refers to a measured change in the outcome variable in the population without controlling for the threats to internal validity. Gross outcome does not eliminate the effects of confounding variables and therefore does not enable the evaluator to distinguish between change that occurred as a result of the programme and change that would have occurred anyway because of other factors. "Net outcome," however, does explicitly address those factors, other than the programme, that bring about measured changes in outcome variables. Net outcomes thus control for the numerous threats to internal validity.

In addition to internal validity, evaluators must be concerned with the external validity of the evaluation. External validity refers to the generalizability of the conclusions drawn to other populations, settings, and circumstances. Both the internal and external validity of an evaluation are fundamentally functions of the design chosen. Simply, each of the commonly-employed designs for evaluation displays a different ability to control for threats to internal and external validity.
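The distinction between gross and net outcome drawn above can be illustrated with a minimal sketch. It assumes (this is not stated in the original) that net outcome is approximated by subtracting the change observed in a comparison group from the change observed in the programme group; the function names and prevalence figures are hypothetical.

```python
def gross_outcome(pre, post):
    """Measured change in the outcome variable, with no control for confounding."""
    return post - pre


def net_outcome(programme_pre, programme_post, comparison_pre, comparison_post):
    """Programme-group change minus comparison-group change.

    Assumption: the comparison group was exposed to the same non-programme
    influences as the programme group, so its change stands in for the change
    that would have occurred anyway.
    """
    return gross_outcome(programme_pre, programme_post) - gross_outcome(
        comparison_pre, comparison_post
    )


# Hypothetical prevalences of malnutrition (per cent), before and after.
gross = gross_outcome(30.0, 20.0)          # change seen in the programme group
net = net_outcome(30.0, 20.0, 30.0, 26.0)  # part not explained by the general trend
print(gross, net)
```

In this illustration the gross outcome overstates the programme effect, because part of the fall in prevalence also occurred in the comparison group.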

While it is beyond the scope of this chapter to discuss the conventional non-experimental, quasi-experimental, and true experimental designs, and the extent to which they address confounding variables, table 1.3 diagrams six conventional designs. The reader who is unfamiliar with or uncertain about the array of available designs is urged to consult directly standard works on this subject such as those by Cook and Campbell (10), Poister (11), or Judd and Kenny (12). However, let it suffice to suggest that choosing from among the conventional techniques involves a trade-off between the difficulties of data collection (first on comparison groups and, second, over time) and the plausibility of the causal inference drawn. The choice also depends to some extent on the analytical capacity available, as discussed in the next section.

TABLE 1.3. Conventional Evaluation Designs

| Design | Referred to as | Analysis | Delivers |
| 1. XO | One-shot case study | None | Adequacy |
| 2. OXO | One-group pre-test/post-test | Compare before/after | Adequacy |
| 3. Group 1: XO; Group 2: O | Static group comparison | Compare groups | Adequacy |
| 4. X (varies) O | Correlational | (a) Compare sub-groups; (b) Correlate treatment levels with outcome, controlling for those confounding variables measured which are not themselves highly correlated with treatment | Adequacy, some inference on net outcome |
| 5. Group 1: OXO; Group 2: OO | Non-equivalent control group design | Compare groups with statistical control for confounding | More plausible inferences on net outcome |
| 6. OOO X OOO | Interrupted time series | Before/after; time-series | |

* X = treatment. For items 1-3, O = observation of outcome; for items 4-6, O = observation of both outcome and confounding variables.

 


Levels of analysis


Decisions on levels of analysis to be used are important because:

- the necessary skills for complex analyses may be lacking in developing countries;
- interpretation can be improved by advanced analyses, at least in terms of plausibility of conclusions;
- time and cost are in reality related heavily to the extent of analysis.

More advanced analyses have, on occasion, modified conclusions, and often this has been clarifying. In some cases, the clarification of further analysis avoided wrong conclusions that could, in fact, have been detected with a commonsense look at the data. For example, introducing socio-economic status into an analysis of the Narangwal experiment actually reversed the apparent direction of effect of the programme (13). Probably, however, this conclusion could still have been reached simply by dividing the sample into two or more socio-economic groups. On the other hand, unresolved differences in conclusions between investigators using different analytical techniques have sometimes occurred because the assumptions underlying the analyses were not the same. Often the investigators did not realize this discrepancy themselves. This may warn against too much reliance on overly sophisticated techniques, particularly in developing countries, especially since advanced analyses may not be widely feasible. Certainly, efforts should be made to seek the simplest analytical procedures, and this starts with the design of the evaluation.

We distinguish between "basic" and "advanced" analyses. Basic analysis refers to: categorical data analysis for comparison of frequencies (e.g. prevalences) between groups; correlation analysis, for investigating the degree of association between two variables (e.g. whether prevalence of malnutrition is correlated with a possible determinant); and analysis of variance, used to determine whether differences exist between mean values of indicators for a number of groups. The methods of advanced analysis reckoned to be most suitable for the problems we are interested in are the methods of multivariable analysis (e.g. ordinary least squares regression analysis, discriminant analysis, logit analysis, probit analysis, etc.) for investigating associations between outcome and a number of possible determinants, in this case obviously including programme delivery.

In deciding the overall plan of an evaluation, a balance needs to be struck between design, extent of data collection, level of analysis, and plausibility or certainty required of the conclusions. To some extent, good design requires less sophisticated analysis: for example, designs with adequate control groups, or before-after data, may require less investment in both data collection and analysis than an uncontrolled (by design) post-programme correlational analysis. The appropriate analyses by design are indicated in the third column of table 1.3.

When the capacity for advanced statistical analysis is not available, as may frequently be the case, particularly in poor countries, much can still be achieved by commonsense treatment of the data and by comparison of suitably-defined groups. Indeed, even when more advanced techniques are used, it is important to be clear conceptually about which groups are being considered. For example, very often socio-economic status and/or sanitary conditions are a primary determinant of differences in outcome variables such as nutritional status or health. These factors can confound conclusions on programme effects. Both can be measured: socio-economic status, for example, by income, quality of housing, etc. Analyses are then done by suitable groupings.

If programme delivery varies - even if there is no non-programme group as such - tabulation of results as in table 1.4 can be informative and valid. The interpretation of different options could be as follows: Example 1, in which the only group with poor nutritional status is that with low socio-economic status and poor programme delivery, tends to indicate that the programme is having an effect. The conclusion from this is possibly that delivery should be improved to the poor socioeconomic group. Example 2 indicates that socio-economic factors account for most of the difference in nutritional status, and that more detailed examination of whether the programme can have an effect is needed. Example 3 indicates that the programme is related to most of the differences in nutritional status. It also indicates that more efficient delivery is required because those not receiving the programme could benefit from it. Additional confounding variables such as sanitation could be added to such a table, although numbers per cell would decrease. Moreover, information may be lost by categorizing socio-economic status in this way, if it can be measured as a continuous variable.

TABLE 1.4. Comparisons of Outcomes for Different Levels of Programme Delivery and Socio-economic Status

           High Socio-economic Status       Low Socio-economic Status
Example    High delivery    Low delivery    High delivery    Low delivery
1          +                +               +                -
2          +                +               -                -
3          +                -               +                -

+ means satisfactory outcome indicator values - e.g. good nutritional status
- means poor outcome indicator values - e.g. poor nutritional status
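A tabulation of the kind shown in table 1.4 needs no more than counting. The following is a minimal sketch, assuming each survey record carries a socio-economic group, a delivery level, and a yes/no outcome; the record layout, the 50 per cent cut-off for marking a cell "+", and the data are all hypothetical.

```python
from collections import defaultdict


def tabulate(records, cutoff=0.5):
    """Return {(ses, delivery): '+' or '-'} from (ses, delivery, ok) records.

    A cell is '+' when the share of satisfactory outcomes reaches `cutoff`,
    a threshold chosen here purely for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [satisfactory, total]
    for ses, delivery, ok in records:
        counts[(ses, delivery)][0] += 1 if ok else 0
        counts[(ses, delivery)][1] += 1
    return {
        cell: "+" if good / total >= cutoff else "-"
        for cell, (good, total) in counts.items()
    }


# Hypothetical records: (socio-economic status, delivery, outcome satisfactory?)
records = [
    ("high", "high", True), ("high", "high", True),
    ("high", "low", True), ("high", "low", False), ("high", "low", True),
    ("low", "high", True), ("low", "high", True),
    ("low", "low", False), ("low", "low", False), ("low", "low", True),
]
table = tabulate(records)
for ses in ("high", "low"):
    for delivery in ("high", "low"):
        print(ses, delivery, table[(ses, delivery)])
```

With these invented records the only "-" cell is low status with low delivery, which is the pattern of Example 1.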

To combine several variables and make the most use of the available information, multiple regression techniques are often applied. For evaluation, the outcome (nutritional status) is the dependent variable, and programme delivery is treated as one independent variable along with other determinants (confounding variables) such as, in this example, socio-economic status and sanitation. The purpose is then to examine the significance (in a statistical sense) and importance (the magnitude of the effect) of programme delivery when other determinants are allowed for. It must be emphasized, however, that when the substantial computing power required for multiple regression is not available, tabulations by group, as in table 1.4, can still give important results.

It may even be possible to derive some conclusions where there is no difference in delivery but where differences in socio-economic status still exist. The possibilities are given in table 1.5. Here, Example 1 indicates that there is an inadequate effect of the programme and it should be further examined. Example 2 indicates that the programme may be having an adequate effect, although it is possible that socioeconomic status does not account for any differences. Example 3 indicates that the programme is having no effect and should be further examined or discontinued. Example 4 is, in practice, unlikely to occur. Such tabulations give useful insights into the programme adequacy, and also raise questions on targeting and delivery, as discussed in the next section.

TABLE 1.5. Comparisons of Outcomes for Different Levels of Socio-economic Status Where Programme Delivery Does Not Vary

Example    High Socio-economic Status    Low Socio-economic Status
1          +                             -
2          +                             +
3          -                             -
4          -                             +

+ and - as in table 1.4


Definitions of population groups involved


Both for planning and evaluation, it is important to distinguish between different population groups. The main groups of concern are as follows:

  1. the total population in the programme area;
  2. the population targeted by the programme;
  3. the population in need of better nutrition - called "needy" here;
  4. the population receiving benefits from the programme, called "recipients" here.

In some cases additional sub-groups might need to be considered, e.g.:

  5. the needy population who could benefit from the programme; examples of needy who could not might be malnourished children in a supplementary feeding programme whose nutritional problem is due to malabsorption and not to inadequate food intake.

If programme staff have contact with the recipients, obtaining data on these may be relatively easy. This would be the case in the example of a feeding programme, but maybe not in, say, a water supply project. If outcome data are available from recipients these can, to a limited extent, substitute for survey data on the population as a whole.

The distinction between population groups allows construction of a series of 2 x 2 tables that lead to some important indicators for planning and evaluating targeting, as shown in figure 1.1 (see FIG. 1.1. Construction of 2 x 2 Tables Quantifying Target Groups, "Needy," and Programme Recipients. A: Planning (pre-programme). B and C: Evaluation during programme. When delivery is exactly as targeted, recipients = targeted, and table C is exactly like table A). In planning, the two important indicators are: (a) the proportion of total targeted who are needy (needy targeted/total targeted), which indicates the degree of "planned focusing" of the programme towards nutrition; and (b) the proportion of total needy who are targeted (needy targeted/total needy), which reflects the "planned coverage" of the programme.

The concepts of coverage and focusing have commonsense meanings, both for planning and evaluation. Coverage, a basic value that needs to be manipulated for different programme designs, is equivalent to sensitivity in the epidemiological literature (14); evidently the aim is to optimize coverage. Focusing, which is equivalent to positive predictive value (14), is a less familiar concept. If targeting is to focus resources, focusing should be at least greater than the prevalence in the population as a whole. That is, the proportion of needy in the targeted population should be greater than the proportion of needy in the population as a whole; the same could apply (but to our knowledge is seldom done) for any evaluation of "poverty orientation". There are a number of procedures for choosing appropriate indicators and their screening levels to identify the proportion of needy, and for efficiently deciding on cut-off points to define needy (see discussion in [14]).
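The two planning indicators can be computed directly from the counts in a 2 x 2 table of needy against targeted. The sketch below is illustrative only: the function name and the population figures are hypothetical, and the check at the end simply applies the rule that focusing should exceed the prevalence of needy in the whole population.

```python
def planned_indicators(needy_targeted, total_targeted, total_needy, total_population):
    """Return (focusing, coverage, prevalence) as proportions."""
    focusing = needy_targeted / total_targeted   # share of the targeted who are needy
    coverage = needy_targeted / total_needy      # share of the needy who are targeted
    prevalence = total_needy / total_population  # share of the population who are needy
    return focusing, coverage, prevalence


# Hypothetical programme area: 10,000 people, 3,000 needy, 4,000 targeted,
# of whom 2,400 are needy.
focusing, coverage, prevalence = planned_indicators(2400, 4000, 3000, 10000)
targeting_focuses_resources = focusing > prevalence
print(focusing, coverage, prevalence, targeting_focuses_resources)
```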

For evaluation, the delivery is compared with the targeting and with degree of need, in order to generate further indicators, as shown in figure 1.1. This requires determining whether the recipients were in fact targeted and whether they are needy (e.g. malnourished).

An intermediate stage comparing targeted with recipients (part B of fig. 1.1.) gives indicators of delivery, e.g.: c. the proportion of total targeted who are recipients, which should be 100 per cent if the programme is fully implemented; and of leakage, e.g. as: d. the proportion of total recipients who are targeted, or conversely proportion of total recipients who are not targeted. These should be 100 per cent and 0 per cent respectively if there is no leakage to non-targeted groups.

If there is full implementation and no leakage, then the "actual focusing" and "actual coverage" are the same as those planned (see part C of fig. 1.1.). If there is deviation from the plan, then one way of assessing this is to calculate these "actual" indicators, comparing "needy" with "recipients." Again, actual focusing (needy recipients/all recipients) should be at least greater than the population prevalence of needy. For example, if the prevalence of malnutrition in the region served by the programme is 35 per cent, and the actual focusing is 20 per cent, the evaluator is alerted to a serious problem. Even without knowledge of costs, such indicators could give useful means of evaluating process; with costs, as discussed in the next section, they could lead to decisions as to whether the programme is within the range likely to give an adequate or acceptable outcome, even if the expected effects on recipients were achieved. A worked example is given in Mason et al. (15, chap. 4).
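The delivery, leakage, and actual-focusing checks described above reduce to a few ratios. This is a minimal sketch with hypothetical function names and counts; only the 35 per cent prevalence and 20 per cent actual focusing are taken from the text.

```python
def delivery_indicators(targeted_recipients, total_targeted, total_recipients):
    """Return (delivery, leakage) as proportions.

    delivery: share of the targeted population actually reached.
    leakage:  share of recipients who were not targeted.
    """
    delivery = targeted_recipients / total_targeted
    leakage = 1 - targeted_recipients / total_recipients
    return delivery, leakage


def focusing_is_adequate(actual_focusing, population_prevalence):
    """Actual focusing should exceed the prevalence of needy in the population."""
    return actual_focusing > population_prevalence


# Hypothetical counts: 4,000 targeted, 3,500 recipients, 2,800 in both groups.
delivery, leakage = delivery_indicators(2800, 4000, 3500)
print(round(delivery, 2), round(leakage, 2))

# The example from the text: prevalence 35 per cent, actual focusing 20 per cent.
alert = not focusing_is_adequate(0.20, 0.35)
print(alert)
```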

If data on needy, recipients, and targeted populations are available from baseline studies, then some conclusions can also be drawn on outcome during programme implementation based only on outcome data on the recipients. This is so if the assumption can be made that the change in outcome variables in non-recipients is likely to be small compared with that in recipients, and if baseline (pre-programme) data are available. In this case, the need for population surveys for evaluation is reduced. This theory is also given in Mason et al. (15, chap. 4).


Effect/cost


Cost-benefit and cost-effectiveness analyses are commonly used for assessing many types of programmes, both during planning and for evaluation. In the case of food and nutrition programmes, cost-effectiveness is the more suitable approach, since a monetary figure cannot reasonably be put on outcome. This kind of analysis, however, is not often used, and a major advance in these evaluations could be made by much more systematic introduction of the techniques and thinking involved. These do not necessarily depend on accurate data, and indeed some form of cost-effectiveness thinking is implicit in the planning of almost any programme; that there is a level of expenditure per unit of expected outcome that would not be worth it is almost always in the back of someone's mind. We consider that the summary parameter of effect per unit costs (which goes to zero when there is no effect) is a useful start, and this is the one mainly discussed here.

A dose-response type of curve relating effects to cost is likely to apply to intervention programmes. This is familiar in economics (as in total product and utility curves, etc.), but not often considered for nutrition programmes. This means that the relationships shown in figure 1.2 (see FIG. 1.2. Effect/Cost Curves (scale only for illustration). A: Effect. B: Effect/cost.) are likely to apply. Probably there are as yet insufficient data to put a scale on the X axis, but some research on existing data might allow hypotheses to be put forward. In this hypothetical example, a cost per head of the target population of around $13 gives the maximum cost-effectiveness calculated as number of cases prevented per thousand dollars (fig. 1.2. B); but this rate of expenditure gives less than the maximum overall effect (fig. 1.2. A). The two curves are directly related: for example at $10 per head expenditure, if 100 cases per thousand population are prevented (A), this is 100 cases per $10,000, or 10 cases per thousand dollars (B). The effect/cost in B for any value of cost per head is equal to the total effect as can be read off in A, divided by the corresponding cost per head. Put another way, the height of the curve in B at any value of cost per head is the slope of the line joining the origin to the corresponding point on the curve in A.
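The relation between the two curves can be sketched in a few lines. The $10 per head and 100 cases figures come from the text; the points on curve A are invented for illustration, chosen only so that the best effect/cost falls near $13 per head as in the hypothetical figure.

```python
def effect_per_cost(cases_prevented_per_1000_pop, cost_per_head):
    """Cases prevented per thousand dollars (curve B) from total effect (curve A).

    Spending `cost_per_head` dollars on each of 1,000 people costs
    `cost_per_head` thousand dollars, so the ratio is a simple division.
    """
    return cases_prevented_per_1000_pop / cost_per_head


# The worked figure from the text: $10 per head, 100 cases prevented per 1,000.
print(effect_per_cost(100, 10))

# Hypothetical points on curve A: cost per head -> cases prevented per 1,000.
curve_a = {5: 20, 10: 100, 13: 140, 20: 180, 30: 200}
curve_b = {cost: effect_per_cost(effect, cost) for cost, effect in curve_a.items()}

best_cost = max(curve_b, key=curve_b.get)         # most cost-effective spending level
most_effect_cost = max(curve_a, key=curve_a.get)  # spending level with largest total effect
print(best_cost, most_effect_cost)
```

The two maxima fall at different spending levels, which is the point made in the text: the most cost-effective level of expenditure gives less than the maximum overall effect.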

One important advantage of such methods would be to allow assessment of whether the level of effort in a programme is at least in the range in which an outcome effect could be expected, taking account also of the level of malnutrition in the target group. It is our impression that often a programme could reasonably be expected to have little effect because the level of expenditure is too low relative to the expected dose-response. This idea has been referred to as "situation assessment" (see [5]).

Effects per unit cost may also be used to define the extent to which an accurate assessment of outcome is needed. For example (using relationships similar to those in figure 1.2), it might be postulated that a change from 20 per cent prevalence to 10 per cent prevalence after the treatment is the maximum feasible (e.g. from 200 malnourished in a population of 1,000 to 100 malnourished) at a cost of say $10 per head (i.e. $10,000 for the population of 1,000). This is equivalent to proposing an effect per unit cost of 10 cases prevented or rehabilitated per $1,000. Clearly, this should have been regarded as good value for money at the stage of planning the project. Similarly, no change would mean that effect per cost was zero. Somewhere between these two, a level of change could be set below which it was regarded that the programme's resources were not being well spent for reasons which could relate to targeting, type of activity, adequacy of delivery, etc. For example, rehabilitation of 5 cases per $1,000 could be regarded as the minimum effect/cost ratio acceptable. This means that the maximum acceptable post-programme prevalence is 15 per cent (i.e. a maximum of 150 malnourished in the population of 1,000). In this case, it is only necessary to know whether the with-programme prevalence is above or below the adequacy cut-off point of 15 per cent.
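The arithmetic behind the 15 per cent cut-off can be written out as a short sketch. The function name is hypothetical; the figures are those of the example above.

```python
def max_acceptable_prevalence(baseline_prevalence, min_cases_per_1000_dollars, cost_per_head):
    """Highest post-programme prevalence still meeting the minimum effect/cost ratio.

    A spend of `cost_per_head` dollars on each of 1,000 people is
    `cost_per_head` thousand dollars, so the minimum number of cases to be
    prevented per 1,000 population is the ratio times the cost per head.
    """
    min_cases_per_1000_pop = min_cases_per_1000_dollars * cost_per_head
    return baseline_prevalence - min_cases_per_1000_pop / 1000


# Baseline prevalence 20 per cent, minimum 5 cases per $1,000, $10 per head.
cutoff = max_acceptable_prevalence(0.20, 5, 10)
print(round(cutoff, 2))
```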


Appropriate indicators for different objectives


So far we have not defined or commented on specific potential outcome indicators, and have used nutritional status as measured by anthropometry as the general example. This was in line with our brief for contributing to the MIT workshop and with our view that the major problems lie in designs of evaluation rather than in the measurements to be taken. In addition, most of the chapters that follow are devoted to a discussion of various outcome indicators.

Nevertheless, it should be pointed out that the relationship between indicators and objectives often needs to be clarified. Sometimes the indicator precisely measures the objective: for example, a feeding programme aims to increase the weight gain in a target population of pre-school children, and this weight gain itself is measured. In this case, the responsiveness of the indicator is equivalent to the effectiveness of the programme.

In other circumstances, the indicator is a proxy for the main objective: a feeding programme aims to increase the food intake of a target group of children but the food intake itself is not open to measurement, so anthropometry is used as a proxy for the food intake. In this case we need an indicator that responds to increased food intake: thus, for example, Habicht and Butz showed that height gain is more responsive than weight gain (in the statistical sense of greater significance) and therefore a better proxy for food intake (4). Such relations between the indicator and the objective need to be established in advance.

There is an urgent need for research to establish the responsiveness characteristics of indicators. This should be done in a manner similar to table 1.6., where some relevant data were obtained to allow comparison of the responsiveness of different indicators. Although table 1.6. is unsatisfactory in that only a few indicators have been objectively evaluated, it serves to demonstrate the sort of evaluation of indicators that now needs to be undertaken much more widely to establish a firm basis for selection in the future.

Finally, the issue of sample size in relation to the choice of indicators merits careful consideration in attempts to evaluate the results of any administered treatment.


Note on sample size


Investigators must define carefully the unit of reference for which the sample size is to be estimated, clearly differentiating "observational units" from the "unit of interest" for the evaluation. The latter, which is made up of a cluster of observational units, is the principal determinant of sample size. In other words, although information may be collected from individuals (observational units), the evaluation of effects may focus and center interest on aggregates of individuals who constitute, say, families.

Whatever the "unit of interest," the number of such units (sample size) should be estimated under pre-specified conditions of accepted risk of detecting an effect when in fact it does not exist, and of not detecting the effect when it does exist. In the procedures for the statistical testing of specific hypotheses relating to treatment effects, the relative frequency (probability) of occurrence of the first kind of error is used to define the level of significance for performing the test, while the frequency of non-occurrence of the second kind of error is used to define the power of the test (frequency of correct detection of effects).

Under these premises, and provided the investigator can provide a priori information on the magnitude of the minimum treatment effect (expected result of a control-treatment difference) worth identifying, with concomitant information that relates to the variability (standard deviation) of the response under consideration, it is possible to estimate the approximate size of the sample required to detect the treatment effect (for a textbook treatment of this issue, see [9]).

 

TABLE 1.6. Mean Indicator Response to Supplementary Feeding

 

Field Trials

| Type of Malnutrition | Type of Analysis | Indicator | Age | Duration of Suppl. | Per cent Suppl. Diet | Deficit rel. to std. | Response to Suppl. | Pooled SD | Responsiveness = 1/2 (Response/SD)^2 |
| PEM | Suppl. vs. Control | Attained Wt. | 36 mo | 36 mo | 17 per cent Cal, 35 per cent Pro | 4.5 kg (Denver) | 0.9 kg | 1.3 kg | 0.24 |
| | | Ht | 36 mo | 36 mo | 17 per cent Cal, 36 per cent Pro | | 2.3 cm | 3.9 cm | 0.17 |
| | | Arm Circum. | 35 mo | 36 mo | 17 per cent Cal, 36 per cent Pro | | 0.35 cm | 0.9 cm | 0.06 |
| | | Triceps Skinfold | 36 mo | 36 mo | 17 per cent Cal, 36 per cent Pro | | 0.15 mm | 1.1 mm | 0.01 |
| | | Subscapular Skinfold | 36 mo | 36 mo | 17 per cent Cal, 36 per cent Pro | | 0 | 1.1 mm | 0 |
Source: see (16)
| Vit. A (1) | Pre & post Intervention | Serum Retinol | Pre-School | 1-2 yr | >100 per cent Vit. A reg. | Std z 20 mcg/dl | 12.3 per cent decline in prevalence values < 20 mcg/dl | 11.8 | 0.54 |
Source: see (1)
| Iron deficiency anemia | Intervention vs Control Group | Hgb (g/dl) | 9 mo | 6 mo | 15 mg Fe + 100 mg Ascorbic Acid per 100 g full fat milk powder | 1.21 | 1.07 g | 1.0 g | 0.57 |
| | | Sat % | 9 mo | 6 mo | | 8.2 | 4.8% | 6.0% | 0.32 |
| | | FEP (mcg/dl RBC) | 9 mo | 6 mo | | 39 | 26 mcg | 33 mcg | 0.31 |
| | | % children with Hgb < 110 g/dl | 9 mo | 6 mo | | | 27.2% | 2.3% | 63.9 |
| | | Hgb | 15 mos | 9 mo | (as above) | 0.92 | 1.02 g | 0.94 g | 0.5 |
| | | Sat | 15 mos | 9 mo | | 6.7 | 7.2% | 8.0% | 0.405 |
| | | FEP | 15 mos | 9 mo | | 38 | 24 mcg | 41 mcg | 0.17 |
| | | % children with Hgb < 110 g/dl | 15 mos | 9 mo | | | 25.2% | 2.0% | 85.8 |
Source: E. Rios et al., forthcoming. Prevention of iron deficiency in infants by milk fortification. In Nutrition Interventions Strategies, B. Underwood, ed.

Clinical Trials

| Type of Malnutrition | Type of Analysis | Indicator | Age | Duration of Suppl. | Per cent Suppl. Diet | Deficit rel. to std. | Response to Suppl. | Pooled SD | Responsiveness = 1/2 (Response/SD)^2 |
| PEM | Response to protein supplement; difference between means | VO2 max/kg (direct measure) | 33+97 yrs | 2 1/2 mo (80 d) | Protein from 5.6% - 19.6% of calories | 20 | 9.7 | 5.14 | 1.89 |
| | | VO2 max L/min (direct measure) | 39 97 yrs | 2 1/2 mo (80 d) | Protein from 5.6% - 19.6% of calories | 1.49* | 0.75 | 0.314 | 2.85 |
| | | Heart rate response to a workload of 250 kgM/min | 3997 yrs | 124 d | 2240 kcal/d, 357 kcal, 100 Gm protein | 55* | 30 | 6.63 | 10.24 |
* with respect to value in general from workers of the same region (normals)
Sources:
1. Barac-Nieto et al., Am J Clin Nutr 33: 2268-2275 (1980)
2. Maksud et al., Eur J Appl Physiol 35: 173-182 (1976)
3. Spurr et al., Am J Clin Nutr 32: 767-778 (1979)
| PEM | Response to Rx, clinical grp only | Serum prealbumin (mg/100 ml) | 18-30 | 22 days | % not given; Nido + Nesmida in t amt | All with clinical PEM | PA-18mg | 2.79 | 1.64 |
| | | Serum albumin (g/100 ml) | | | | | Alb-15 kg | 0.36 | 8.80 |
Source: Ingenbleek et al., Lancet ii: 106 (1972)
| PEM | Normal vs pre & post PEM | Serum alb | 18-30 months | 22 days | + to plateau of 3.5 g prot & 150 kcal/kg BW/d | 52.8% of control | 148 g/100 ml | 0.38 | 0.18 |
| | | TBA | | | | 285% of control | 1593 mg/100 ml | 2.79 | 1.63 |
| | | RBP | | | | 31.9% of control | 3.79 mg/100 ml | 0.80 | 11.22 |
| | | Plasma retinol | | | | 27.4% of control | 30.64 g/100 ml | 7.49 | 8.37 |
Source: Ingenbleek et al., Clin Chim Acta 63: 61 (1975)
| Kwashiorkor | Comparison of pre & post intervention | 3rd component of complement C3 (mg/100 ml) | 6 m/o - 6 y/o | 2 wks | from 0.8 g prot, 88 kcal/kg/d to 3.5-4 g prot and 140 kcal/kg/d plus multivitamins | 34 mg | 22 mg | 4 mg | 15.13 |
Source: Neumann et al., Am J Clin Nutr 28: 89-104 (1975)
| PEM | Comparison of pre & post intervention | % T-lymphocytes in blood | Children | 6-16 wks | "correction of deficit" | | 37% | 9.2% | 8.1 |
Source: Chandra, Brit Med J 3: 608-609 (1974)
| PEM | Comparison of pre & post intervention | % T-lymphocytes in blood | 1-5 y/o | 50 days | 1 g prot, 100 kcal/kg/d; 4 g prot, 175 kcal/kg/d | | 35.7% | 2.9% | 75.8 |
Source: Kulapongs et al., in Malnutrition and the Immune Response, R.M. Suskind, ed. (Raven Press, New York, 1977), pp. 99-103
| Iron Deficiency Anaemia | Comparison of pre & post intervention | Bactericidal capacity of PMN leucocytes | 1-8 y/o | 1 dose | Iron parenterally | | 33 | 14 | 2.8 |
Source: Chandra, Arch Dis Child 48: 864-866 (1973)

 

The procedure for estimating sample size for comparison of independent samples utilizes the relation:

$$n = \frac{2 K^2 \sigma^2}{d^2}$$

- where n is the number of units required in each group and $\sigma$ is an estimate of the standard deviation of the variable under consideration;
- d is the difference to be detected; this should be a fraction (f) of the responses shown in table 1.6, below which one is indifferent to whether there is a response;
- $K^2$ is a multiplier value for different levels of significance and various associated powers in testing as follows:

VALUES OF K^2

             Level of Significance, Two-tailed Test    Level of Significance, One-tailed Test
Power (%)    1 per cent   5 per cent   10 per cent     1 per cent   5 per cent   10 per cent
80           11.7         7.9          6.2             10.0         6.2          4.5
90           14.9         10.5         8.6             13.0         8.6          6.6
95           17.8         13.0         10.8            15.8         10.8         8.6

Source: Snedecor and Cochran (17)
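The multipliers and the resulting sample sizes can be approximated as in the following sketch. It rests on two assumptions that should be checked against Snedecor and Cochran (17): that the multiplier is the squared sum of the standard normal deviates for the chosen significance level and power, and that the required number of units per group is n = 2 K^2 (SD/d)^2. The function names are hypothetical; the 0.9 kg response and 1.3 kg pooled SD are the attained-weight figures from table 1.6.

```python
import math
from statistics import NormalDist


def k_squared(significance, power, two_tailed=True):
    """Multiplier (z_alpha + z_beta)^2, assumed to underlie the tabulated values."""
    z = NormalDist().inv_cdf
    alpha = significance / 2 if two_tailed else significance
    return (z(1 - alpha) + z(power)) ** 2


def responsiveness(response, sd):
    """Half the squared ratio of response to pooled standard deviation."""
    return 0.5 * (response / sd) ** 2


def sample_size(sd, d, significance=0.05, power=0.80, two_tailed=True):
    """Approximate units per group, assuming n = 2 K^2 (sd / d)^2, rounded up."""
    return math.ceil(2 * k_squared(significance, power, two_tailed) * (sd / d) ** 2)


# Multipliers agree with the table to within rounding.
print(round(k_squared(0.05, 0.90), 1), round(k_squared(0.01, 0.95), 1))

# Attained weight: response 0.9 kg, pooled SD 1.3 kg, detecting the full
# response (f = 1) at the 5 per cent level, two-tailed, with 80 per cent power.
print(round(responsiveness(0.9, 1.3), 2), sample_size(1.3, 0.9))
```

Under these assumptions the sample size is the multiplier divided by the responsiveness when d equals the full response, so indicators with low responsiveness need proportionally larger samples.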

In closing, it should be restated that the above procedure is only an approximation and is usually an underestimate of the required sample size. Its indiscriminate application may lead to absurd answers. It is advisable, therefore, that the question of sample size always be considered in the context of each particular situation and with proper statistical consultation. Ultimately, the successful estimation of sample size is the result of experience that bridges the realms of art and science.


References


  1. J-P. Habicht, "Assurance of Quality of the Provision of Primary Medical Care by Nonprofessionals," Social Science and Medicine, 13B (1): 67-75 (1979).
  2. R.E. Klein, M.S. Read, H.W. Riecken, J.A. Brown, Jr., A. Pradilla, and C.H. Daza, eds., Evaluating the Impact of Nutrition and Health Programs (Plenum Press, New York, 1979).
  3. D.E. Sahn and R.M. Pestronk, A Review of Issues in Nutrition Program Evaluation, AID Program Evaluation Discussion Paper No. 10 (USAID, Offices of Nutrition and Evaluation, Washington, D.C., USA, 1981).
  4. J-P. Habicht and W.P. Butz, "Measurement of Health and Nutrition Effects of Large-Scale Nutrition Intervention Projects," in R.E. Klein et al., eds., Evaluating the Impact of Nutrition and Health Programs (Plenum Press, New York, 1979), pp. 133-182.
  5. H.W. Riecken, "Practice and Problems of Evaluation: A Conference Synthesis," in R.E. Klein et al., eds., Evaluating the Impact of Nutrition and Health Programs (Plenum Press, New York, 1979), pp. 363-386.
  6. G. Deboeck and B. Kinsey, Managing Information for Rural Development: Lessons from Eastern Africa, World Bank Staff Working Paper No. 379 (World Bank, Washington, D.C., 1980).
  7. A.A. Kielmann, C.A. Ajello, and N.S. Kielmann, "Evaluation of Nutrition Intervention Projects," Final Report to the Documentation Coordinator, TA/PPU/EUI, Technical Assistance, AID (USAID, Washington, D.C., 1980) (mimeo).
  8. G.H. Beaton and H. Ghassemi, "Supplementary Feeding Programmes for Young Children in Developing Countries," Report prepared for UNICEF and the ACC Sub-committee on Nutrition of the United Nations (1979) (mimeo).
  9. J. Cohen, Statistical Power Analysis for the Behavioral Sciences (Academic Press, New York, 1969).
  10. T.D. Cook and D.T. Campbell, Quasi-Experimentation (Houghton Mifflin, Boston, 1979).
  11. T.H. Poister, Public Program Analysis: Applied Research Methods (University Park Press, Baltimore, 1978).
  12. C.M. Judd and D.A. Kenny, Estimating the Effect of Social Interventions (Cambridge University Press, Cambridge, 1981).
  13. D. Chernichovsky, "The Economic Theory of the Household and Impact Measurement of Nutrition and Related Health Programmes," in R.E. Klein et al., eds., Evaluating the Impact of Nutrition and Health Programs (Plenum Press, New York, 1979), pp. 227-267.
  14. J-P. Habicht, L.D. Meyers, and C. Brownie, "Indicators for Identifying and Counting the Improperly Nourished," American Journal of Clinical Nutrition (in press) (1982).
  15. J.B. Mason, J-P. Habicht, H. Tabatabai, and V. Valverde, Nutritional Surveillance (Cornell University, Ithaca, NY) (in press) (1982).
  16. R. Martorell, R.E. Klein, and H. Delgado, "Improved Nutrition and Its Effects on Anthropometric Indicators of Nutritional Status," Nutrition Reports International, 21: 219-230 (1980).
  17. G.W. Snedecor and W.G. Cochran, Statistical Methods, 7th ed. (Iowa State University Press, Ames, Iowa, 1980).

Bibliography


Arroyave, G., J.R. Aguilar, M. Flores, and M.A. Guzman, "Evaluation of Sugar Fortification with Vitamin A at the National Level," Scientific Publication No. 384 (PAHO, Washington, D.C., 1979).

Beghin, I. and the FAO, "Selection of Specific Nutritional Components for Agricultural and Rural Development Projects" (Nutrition Unit, Institute of Tropical Medicine, Antwerp, Belgium, 1980) (mimeo).

Casley, D. and D. Lury, A Handbook on Monitoring and Evaluation of Agricultural and Rural Development Projects (Johns Hopkins Press, Baltimore, 1982).

Davis, C.E., "The Effect of Regression to the Mean in Epidemiologic and Clinical Studies," Am. J. Epidemiol., 104: 493-498 (1976).

Drake, W.D., R.I. Miller, and M. Humphrey, "Final Report: Analysis of Community-Level Nutrition Programs," Project on Analysis of Community-Level Nutrition Programs, Vol. I (USAID, Office of Nutrition, Washington, D.C., 1980).

Furby, L., "Interpreting Regression Toward the Mean in Developmental Research," Developmental Psychology, 8: 172-179 (1973).

Gwatkin, D.R., J.R. Wilcox, and J.D. Wray, "Can Health and Nutrition Interventions Make a Difference?" Overseas Development Council Monograph No. 13 (Overseas Development Council, Washington, D.C., 1980).

Saretsky, H., "The OEO P.C. Experiment and the John Henry Effect," Phi Delta Kappan, 53: 579-581 (1972).

Wray, J.D. "Malnutrition is a Problem of Ecology," Bibl. Nutr. Diet., 14: 142-60 (1970).

Wray, J.D., "Twenty Questions: a Checklist for Planning and Evaluating Nutrition Programmes for Young Children," in D.B Jelliffe and E.F.P. Jelliffe, eds., Nutrition Programmes for Preschool Children (Institute Public Health, Zagreb, 1973).

