Methods for evaluating the nutritional impact of food aid projects: lessons from past experience
David Sahn
Visiting Research Fellow, International Food Policy Research Institute, Washington, D.C., and Vice-President, Community Systems Foundation, Ann Arbor, Michigan, USA
PREFACE
The ideas and views expressed in this paper are based on discussions with the staffs of the World Food Programme and the Food-for-Peace Program and on a review of current documents (from the past three years) on efforts to evaluate the nutritional impact of food aid projects (see the list of references). As most of the documents examined were seriously flawed, and as only two weeks were available for their review, my personal experiences as an evaluator were of great importance in reaching conclusions. Therefore, I acknowledge that the nature of the document is partially subjective. It is designed to emphasize general issues that need to be addressed in order to move forward on improving efforts to assess the nutritional impact of food aid projects.
INTRODUCTION
Food aid has undergone a transition from being surplus commodities in search of a purpose to being a valuable development resource with an opportunity cost approaching that of financial aid. This has resulted in a growing interest in evaluating the impact of specific projects administered by the World Food Programme (WFP) and the Food-for-Peace Program (US Public Law 480, Title II). Commensurately, the past few years have witnessed an evolution in the role of evaluation. The teams of experts sent out to the field to perform process-oriented, qualitative assessments have given way to a search for systematic and comparable evaluation methods that provide quantitative information on impact.
This paper will discuss the obstacles and misconceptions that have plagued the art and science of evaluation in the recent attempts to quantify the nutritional impact of food aid projects. Gaining insight from previous experience is intended to provide guidance to those who embark on future evaluations, with all the obstacles and pitfalls that task portends. This document will also address widely held preconceptions and misconceptions concerning evaluation. In doing so it will respond to the arguments of some that evaluation is purposeless and a waste of resources and of others who have become obsessed with searching for project impact.
This paper is limited to addressing experiences in evaluating nutritional impact of project aid, excluding emergency relief. Project aid includes food aid provided on a grant basis. It is used to support specific and defined activities to promote economic and social development, usually in the form of maternal and child health, food-for-work, and school feeding projects. It is distributed through differentiated market channels. Project aid is juxtaposed with programme aid, which is generally sold on a highly concessional basis and is not tied to a specific set of activities. Rather, the value of the transfer in the form of programme aid accrues to a government treasury that has a high degree of freedom in how it is spent. Attention is primarily focused on vulnerable group or supplementary feeding projects. Nutritional improvement is a nearly universal and prominent objective of these interventions.
Other types of food aid projects, however, also fall within the purview of this paper. These include the use of food as wages (i.e., in-kind payment) to develop infrastructure and assets, as an incentive to resettle new lands and adopt new agricultural practices, and as inputs to spur agricultural development at a low cost to the government in terms of foreign exchange and budgetary support (for example, dairy or oilseed development projects). These measures all have indirect nutritional consequences, which may be positive or negative. They are mediated by the general process of economic development. They can and should be measured primarily and observed in the long term, after the active distribution of food in the community has ceased.
FRAMING THE PROBLEM
The resource of food is inextricably linked to nutrition. The result is that there is characteristically a search for nutritional impact of food aid projects, even in those that do not explicitly state nutritional improvement as one of the primary objectives. A review of food aid evaluations, however, tells a rather dismal story. Nutritional impact has rarely been substantiated. In those few instances where evaluations determine that nutritional improvement was observed, a critical review of the methodologies employed inevitably leads one to suspect the conclusions.
The obvious question is: Why have the evaluation experiences yielded unsatisfactory results, and what can be done to improve upon past performance? A review of previous and current evaluation studies of the WFP and PL 480, Title II, provides the data to answer this question. In combination they elucidate the reasons why there has been so much difficulty and controversy in evaluating nutritional impact of projects. These common problems, which are observed in a number of settings, will be expanded upon below. I must stress that not all of the issues addressed are applicable to all projects or evaluations reviewed. The purpose is to recount briefly the recurrent difficulties that plague the evaluation of nutritional impact, thereby laying a foundation for improvements in the future.
Poorly Conceived and Designed Projects
The historical role of food aid in meeting multiple objectives, coupled with the fact that most formulas for the use of food aid were developed to expedite rapid disposal with minimal financial and political costs, has conditioned the current operation of projects. The result is that there are serious deficiencies in the design and theoretical foundation of food aid projects. These explain to a great extent the reason impact evaluations have failed to successfully show marked nutritional benefits. One could make a strong argument that the scarcity of efforts to evaluate impact is appropriate, given the inadequate conceptualization of food aid projects. Just as a moral conviction prompts the impulsive reaction that handing out donated commodities must be beneficial, especially when distributed directly through differentiated market channels, a review of the design of most projects makes it equally easy to adopt the sceptics' viewpoint that nutritional impact is an unrealizable goal for most supplementary feeding projects and may at best be a faint long-term hope in other projects designed to promote economic growth and development.
The problem is most clearly manifested in the conceptual weakness of many projects that fall in the domain of targeted feeding interventions. A number of underlying assumptions remain largely unsubstantiated. They must be accepted before one can reasonably argue that supplementary feeding projects can be expected to reduce malnutrition. These include that (a) it is feasible to identify those at greatest nutritional risk in the community and subsequently encourage their regular participation; (b) through the provision of food commodities, along with efforts involving moral suasion, household nutrient availability can be increased significantly; and (c) the perception of needs within the household can be altered to encourage a new allocation of food commodities among family members.
To illustrate, the value of the transfer from most supplementary feeding projects is usually less than 5 per cent of annual total household expenditures. Empirical evidence also indicates that a 1 per cent increase in income will lead to an increase in calorie consumption of up to 0.7 per cent for the poorest of the poor. This figure tends toward 0.3 to 0.5 per cent for the more typical indigent household. Recent research has shown that income elasticities of demand for calories among the poorest of the poor households are 0.3 in Brazil (1), 0.61 in India, 0.67 in Bangladesh (2), 0.74 in Indonesia (3), and 0.71 in Sri Lanka (4). Similar figures for households near the poverty line (although not the poorest) show markedly lower elasticities of 0.35 in Bangladesh, 0.37 in Indonesia, 0.44 in India, and 0.43 in Sri Lanka. There is little doubt that the participants in food aid projects are poor and at risk. However, participation is less likely by the poorest, who are most inaccessible and hardest to reach because of where they live, time constraints, lack of education, and so forth. Thus, it is assumed that the realistic calorie elasticity figure is between 0.3 and 0.5.
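The arithmetic implied by these figures can be sketched in a few lines of Python. This is an illustration only, under two assumptions that are not stated in the studies cited: that the transfer is equivalent to a 5 per cent rise in household income (the upper bound given above, with expenditure standing in for income), and that the percentage change in calories is approximately the elasticity multiplied by the percentage change in income. The function name is hypothetical.

```python
def calorie_gain_pct(income_gain_pct, elasticity):
    """First-order approximation of the % change in household calorie intake.

    Assumption: % change in calories ~= elasticity * % change in income.
    """
    return elasticity * income_gain_pct


# Assumed upper bound: the transfer is worth 5 per cent of household income.
TRANSFER_PCT = 5.0

for elasticity in (0.3, 0.5, 0.7):
    gain = calorie_gain_pct(TRANSFER_PCT, elasticity)
    print(f"elasticity {elasticity}: calorie intake up by about {gain:.1f} per cent")
```

Under these assumptions, the realistic elasticity range of 0.3 to 0.5 implies a rise in household calorie consumption of roughly 1.5 to 2.5 per cent, which helps explain why a measurable effect is hard to detect.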
Some have argued that, since the income transfer is in the form of food and typically accrues to the women heads of household, consumer behaviour is affected, thereby raising the marginal propensity to consume (MPC) calories. Arguments working in the other direction are that, to the extent that in-kind income is viewed as transitory or the value of the donated commodity is discounted because it is used for resale or bartering of goods, such transfer will actually lead to a lower MPC, and that nutrition education at the feeding centre effectively alters consumer tastes and preferences or intra-household food allocation. Despite the early research that indicated that sources of income do affect consumption behaviour (5), more recent research (6) indicates that the MPC for the food ration is not different from cash income in Panama. However, both studies acknowledge the need for more rigorous research in this area. Therefore, given the typical quantity and composition of the food ration, one must question the expectation that most food transfers will result directly in a measurable impact on net household food consumption.
It is even more uncertain that the intakes of targeted individuals (i.e., mothers and children) will be raised significantly despite increased household calorie consumption. This is especially the case for take-home feeding projects, where both sharing and substitution of the ration are likely. In addition, even if a feeding project increases household and individual caloric consumption, the link between that and measurable changes in anthropometric measurements remains tenuous (7). This is because of the complex aetiology of malnutrition. For example, utilization of the incremental nutrient intake may be low because of infection, or higher calorie intake may result in increased activity or metabolic adjustments, thereby increasing the requirement for energy. Thus, it should come as no surprise that few evaluations have proved nutritional impact.
Indeed, qualifying circumstances may raise appreciably the likelihood of observing improvements in nutritional status. First is the case when ancillary services, such as health education, encourage better sanitary practices, use of oral rehydration solution in cases of diarrhoea, and prenatal care. Past studies have indicated that such efforts can bring about impact mediated through factors such as breaking the malnutrition-infection synergism. Nevertheless, experiences to date reveal that in most circumstances adequate attention is not given to these non-food aspects of programming. Furthermore, in those cases where food acts mainly as an incentive, a whole series of other questions are raised that have yet to be asked, let alone studied. One must consider the alternatives to food aid that can animate the community and encourage participation in self-help schemes, education programmes, and other services that improve access to health care and promote better hygiene and child-feeding practices.
The second qualification is the renewed interest in targeting the household, not the individual, in feeding projects. Some recent projects have been designed to provide a transfer sufficiently large to have a meaningful impact on household expenditure levels. Evaluations have yet to examine this strategy. Nevertheless, its adoption has clear implications, not only for factors such as selecting the most cost-effective commodities, but for broader policy issues. These include (a) the wisdom of various alternative delivery systems for the transfer (e.g., ration shops, subsidized marketing costs using normal distribution channels); (b) alternative forms of assistance (e.g., the shipment of commodities in bulk, rather than the inordinately expensive packaging technique used for Title II and WFP commodities); and (c) the use of cash rather than a food transfer that may be discounted or have negative effects on local agricultural development. If the goal is to improve nutritional status by increasing incomes, fundamental reconsideration of how to do so is required. This should go beyond the selection of commodities and consider the distribution and administrative systems as well.
Targeted supplementary feeding projects are not alone in suffering from fundamental weaknesses in their design. Inherent conflicts in food-for-work projects, such as selecting participants to maximize utility of the food transfer and selecting participants to maximize the marginal productivity of the project are often not reconciled in the project plan. To illustrate, projects installed in regions with greater natural resources (e.g., fertile soil) or that have achieved a critical level of social and economic infrastructure (e.g., market access) will often have greater returns for investments. This is because of the synergism of project inputs with existing endowments. However, the poorest communities are in drought-prone and more isolated areas, where investments may be less economically sound. The result is that both welfare and development objectives are compromised. This is manifested in problems such as true objectives differing from those stated publicly or projects not providing the necessary complementary resources (i.e., tools, equipment, managerial support).
Planners and politicians must address explicitly these contradictions. If this is not done, confusion will inevitably hinder project implementers and functionaries. This will reduce the potential for any measurable achievement. Evaluators may also finally come to assessing project performance based on short-term welfare objectives rather than long-term economic growth.
This is not the place to resolve the issues discussed above. Rather, they are raised for three reasons. First, many evaluations of impact have not proved fruitful because of poor planning and the inadequate attention to underlying assumptions that form the foundation for project design. Technical limitations of evaluation methods are often not the primary reason for the failure to capture nutritional effects.
Second, untested assumptions which form the basic logic of a project's design should be addressed through carefully planned research studies, as juxtaposed to evaluation of operational projects. Third, and most disconcerting, is that many of the design and conceptual problems of food aid projects have come to our attention as a result of previous process and impact evaluations; however, most remain to be acted upon.
Much study has focused on these ingredients for success: the imperative of integrating nutrition education, improving targeting and outreach, promoting diarrhoeal disease control as a component of supplementary feeding projects, and the necessity to increase availability of complementary inputs and managerial support for food-for-work projects.
The fact is, however, that acting upon much of the knowledge gained to date requires further commitments of resources to bolster the quality of projects. Many donors and recipients alike have been reluctant to assume the costs involved. These are mostly financial costs, incurred to alleviate manpower constraints and provide non-food resources. There are, however, also political costs that revolve around closer co-ordination between donors and host governments. Therefore, the first item on the agenda is to make use of the knowledge that already exists concerning constraints to achieving sound project design and performance. Undoubtedly the costs of doing so will extend far beyond those incurred at present. On the other hand, if the evaluations are to be taken seriously, there is an imperative for acting vigorously on their conclusions. We must not allow the call for better and more evaluation in the future to obscure the need for action in the present.
Implementation Constraints
Food aid projects are often not implemented as planned. This problem was noted in a number of countries where evaluations were performed. It is difficult to gauge the actual dimensions of this problem. Experience suggests it is large. But regardless, it is amply evident that one cannot expect to find nutritional impact in environments where projects are not operating effectively.
This issue of poorly functioning projects provides a number of lessons. It is a reminder that process evaluation must precede or be carried out simultaneously with the search for impact. Rather than abandon evaluation efforts altogether, it will be more fruitful to link impact studies more closely with efforts designed to improve project performance.
For example, severe logistical problems were identified in Malawi, a site chosen by the WFP for an in-depth evaluation (8). The response was first to resolve logistical problems; thereafter, procedures to determine impact would be developed and instituted. The evaluation team separated the development of a process-oriented management information system to improve project performance from impact evaluation activities. Similar recommendations have been made to WFP by others who argue that "ongoing management and operational evaluations" should no longer fall within the domain of the Evaluation Service, which should focus on "providing information important to policy formation and project design" (9). This is the wrong approach. Instead, impact indicators should be incorporated into routine data collection procedures that are part of the management process from the outset. This will have numerous advantages. It will create an opportunity to collect baseline information that will prove vital to further evaluation efforts. Similarly, a well-developed management information system will provide secondary data that can subsequently be used to evaluate projects (10-12). And most important, the development of a goal-oriented management system, in which impact data are key elements, is the best source of positive feedback on project performance, and serves as a powerful motivating force for the field staff.
Posing the Appropriate Question
An impact evaluation is only as good as the questions asked and hypotheses posed. Often objectives are ambiguous or do not follow logically from the project inputs and outputs. The result is that evaluators set out to assess the wrong type of impact. This problem is manifested by the use of inappropriate indicators for measurement and comparison. As an illustration, note the case of school feeding projects in general and the recently completed evaluation of the Jordan project in particular (13). The quasi-experimental evaluation study was well designed. Findings indicated that the nutritional status of children in both treatment and non-supplemented villages in Jordan deteriorated, only more so in the treatment villages.
In drawing inference from the lack of reported impact, a few important points must be considered by the reader. First is whether school feeding programmes should or can be justified on nutritional grounds and, thus, whether the evaluation was measuring goal achievement in terms of an appropriate objective. If nutritional improvement in the community is the overriding goal, inevitably resources are better allocated to women and preschool age children, who are the most vulnerable groups. Even the author of the Jordan study acknowledges that "the school children of this study were not at an age where growth can be significantly affected by a food supplement." I would further argue that not only is it of questionable validity to define school feeding objectives (and thus assess impact) in terms of nutritional objectives but, if such an approach is used, indicators other than anthropometry are required.
Justifying school feeding projects on nutritional grounds is precarious. Previous experience indicates that searching for nutritional improvement will yield little or no encouraging results, regardless of the quality of the study design. In the long run, the inadequate proof of goal achievement will reduce or eliminate support for such projects. Instead, educational objectives, such as improved attendance, enrollment, and academic performance, should be espoused for school feeding projects. Impact studies must focus on these questions. The higher goal (although not to be measured) is promoting economic development through investments in human capital.
The problem of asking the right question, which leads to evaluating impact based on the proper criteria, is perhaps nowhere more apparent than with food-for-work projects. There are two difficulties: the first is the need to distinguish between short-term and long-term impacts; the second is the inherent conflict between welfare and growth-oriented objectives of food-for-work projects.
Concerning the former issue, with the exception of the study under way currently in Bangladesh, the few attempts to examine the nutritional implications have focused on short-term effects. The appropriateness of this approach is conditioned by the second issue: the extent to which the project is oriented towards relief versus construction of infrastructure and assets that generate a stream of economic benefits. For many projects that indirectly affect nutritional status through the process of economic development, searching for nutritional impacts, especially in the short-term, is the wrong exercise.
This is well illustrated in the context of four prominent types of food-for-work (FFW) projects: (a) agricultural adjustment projects in which farmers who innovate are provided food to reduce risk and uncertainty; (b) resettlement schemes in which food, along with a number of other incentives, encourages households to migrate to new areas where land is more plentiful and/or opportunities greater; (c) projects where food serves as an input to encourage local production of milk or oilseeds; and (d) price stabilization schemes. There is little reason to expect these projects to improve nutritional status markedly during the short-term when the food assistance is actually being distributed. A combination of factors reduces the likelihood of improvement. These include the following circumstances: (a) the expected stream of returns on investment does not begin to flow immediately; (b) the poorest of the poor are not likely to be recipients (especially in projects designed to construct viable assets and infrastructures); (c) participation is often irregular and intermittent; (d) the income is considered transitory, thereby causing a low marginal propensity to consume; and (e) recipients are often drawn from other low-productivity areas of employment and are rarely completely idle.
Furthermore, it is likely that in the absence of the project, other adjustments will take place to maintain food consumption levels. They include (a) the sale of assets (e.g., livestock), (b) consumption of inferior commodities (e.g., an increase in consumption of famine foods such as wild tubers), (c) consumption of seeds earmarked initially for the next planting, and (d) lower levels of physical activity (especially of the household head) that reduce household demand for energy (as well as possibly alter intra-household allocation in favour of children or women).
In general, then, the practice of assessing the nutritional impact of food-for-work projects that are designed to promote economic or agricultural development should not focus on the short-term. This has been proved by previous experiences. As remarked in a recent review, none of the studies examined that were undertaken during the construction (food distribution) phase "have shown that a FFW project resulted in nutritional improvement in the participants or their families" (14). While this partially reflects the gross deficiencies in the study designs employed, even more compelling is the argument that the goal of such development projects is longer-term in nature, and thus evaluations should reflect that perspective.
Before examining the issue of nutritional impacts over the longer term when the return from investment is expected, one must first acknowledge that FFW is often initiated as a relief mechanism to cope with events such as seasonal hunger or critical shortfalls due to crop failures and man-made disasters. Even for these projects, however, it is of questionable value to judge project achievement (i.e., welfare) on the basis of anthropometric or other more complex nutritional indicators (immunocompetence, biochemical tests).
Rather, in order to assess the short-term welfare impact of relief-oriented food-for-work projects, more direct indicators, such as prices of staple foods and their rate of disappearance from the market, or consumption data from a sample that examines the levels of intake of traditional famine foods or the mix of starchy staples in at-risk households, may prove more appropriate. They will be more responsive to the issue at hand (whether the project is reaching intended relief objectives) than the collection of nutritional status data that reflect a phenomenon of complex and multiple aetiology.
In examining the long-term (during the operational phase) growth and distribution effects of FFW, a number of questions arise as to the appropriateness of measuring nutritional impact. Other than a well-designed study under way currently in Bangladesh (15), there are virtually no good experiences to draw upon where nutritional changes are measured. This is no surprise. The data requirements and rigours of such a study preclude widespread application of the methodology. This suggests that attempts should be made ex ante to assess the nutritional consequences of FFW, as in all development projects. Estimating the effects on key variables such as commodity prices, incomes, and the ratio of home consumption to marketing, disaggregated by a functional classification that distinguishes the nutritionally vulnerable households, should be a normal part of the project appraisal process. Figure 1 presents a scheme for doing so. Integrating nutritional concerns into FFW projects need not become a burdensome and highly quantitative exercise. Rather, it will generally suffice to simply understand the direction of anticipated changes in prices and incomes. Thereafter, qualitative judgements can be made as to whether a project will improve or represent a hazard to the nutritional status of at-risk groups.
FIG. 1. Conceptual Diagram for Modelling Nutritional Consequences of Policies or Projects
(Source: D. E. Sahn and C. P. Timmer 1984, based on work by Anderson 1981)
Effective food-for-work projects will indirectly raise nutritional levels of the poor. The hallmarks of such projects are increased food availability that moderates food prices and higher income among the poor. In a few cases these objectives may be achieved within a single agricultural cycle. In other cases the time horizon will be longer. In either case, I take exception to the recommendation that nutritional status is an appropriate indicator for standard of living because its measurement "is relatively straightforward and unambiguous" (14). The truth is that there are too many factors that reduce the likelihood that a food-for-work project will translate into traditional indicators of nutritional improvement. Although the occasional research project may determine whether a project results in expected nutritional outcomes, emphasis should be placed on formulating a project design that seems likely to increase net food consumption of vulnerable groups. Thereafter policy-makers should concentrate on the myriad of operational problems that have been characteristic of food-for-work schemes.
A review of food aid evaluations provides several instances, other than the school feeding and FFW examples cited above, where questions concerning impact are formulated incorrectly. One example is the use of food as a rehabilitative tool or to prevent malnutrition at the first sign of growth faltering. The food is only distributed to children falling beneath a cut-off point of a reference growth curve or to children who fail to gain weight for two or three consecutive months. Other services are also provided. These include intensive educational efforts (e.g., to improve feeding practices and hygiene; to encourage the use of oral rehydration therapy) and such activities as providing resources for home gardens. The intent is not just to rehabilitate the malnourished child, but to prevent reoccurrences. However, the evaluations reviewed did not examine whether fundamental changes in knowledge or behaviour had any impact on preventing children from becoming severely malnourished so that rehabilitation feeding in the form of food aid becomes obsolete.
These problems reinforce the need to define objectives clearly, based on a logical project design. If this is done, evaluations will undoubtedly respond accordingly.
Selecting the Correct Indicators and Performing Accurate Measurement
Measuring nutritional status is a complicated undertaking. No single measure is completely sensitive and specific to the distinction between well-nourished and malnourished. Selecting the correct indicators and collecting data accurately add considerable complexity to evaluating nutritional impacts. The fact is that the relationship among indicators, and between indicators and changes in nutritional status, was not addressed by many evaluators. The implication is that, even if improvements do take place in nutritional status, they may not be detected.
To illustrate, consider the following examples of the limitations of indicators often employed (16):
- Arm circumference does not respond in the short or medium term to changes in nutritional status. This measurement will therefore probably not reflect improvements due to an intervention.
- Weight-for-age, which is a composite of stunting and wasting, may be low due to deficits incurred years previously and not to present status. Children may be misclassified as malnourished even if their status has improved, since evidence exists that chronic malnutrition during certain susceptible time periods may result in stunting without subsequent catch-up growth (17).
- Weight-for-height measurements are not sensitive to improvements in mildly or moderately malnourished children. A normal ratio of these measurements will be maintained by reduced activity and metabolic adjustments until the child is severely deficient in intake (18).
- Little is known about the dose response of increased caloric intake and how this will be manifested in terms of improvement in growth indicators. Lack of growth response, despite increased energy intake, may be explained by increases in the level of physical activity and metabolic rate (19).
Without going into further detail, the above illustrates that the wrong indicator will result in failure to capture the benefit of a programme. In the future, more careful consideration must be given to identifying the nature and extent of the nutrition problem, the expected scope for improvement, and which indicators would best capture this improvement.
Similarly, there is evidence that random errors in field measurements may have serious implications for the findings of an evaluation. If children in both control and treatment groups are randomly misclassified due to inaccurate field measurement, in most cases this will result in a low estimate of the difference in rates of malnutrition between the two groups (20). This further reinforces the importance of using good measuring equipment and training procedures.
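The attenuation described above can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that measurement error misclassifies children with the same sensitivity and specificity in both groups. The function names and all figures are hypothetical and are not taken from reference 20.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Prevalence an evaluator would record when classification is imperfect."""
    correctly_flagged = sensitivity * true_prev                  # truly malnourished, detected
    wrongly_flagged = (1.0 - specificity) * (1.0 - true_prev)    # well-nourished, misclassified
    return correctly_flagged + wrongly_flagged


def observed_difference(prev_control, prev_treatment, sensitivity, specificity):
    """Difference in recorded prevalence between control and treatment groups."""
    return (observed_prevalence(prev_control, sensitivity, specificity)
            - observed_prevalence(prev_treatment, sensitivity, specificity))


# Hypothetical figures: 40% of control and 25% of treatment children are truly
# malnourished; field measurement is assumed 80% sensitive and 90% specific.
true_diff = 0.40 - 0.25
obs_diff = observed_difference(0.40, 0.25, sensitivity=0.80, specificity=0.90)
print(f"true difference: {true_diff:.3f}, observed difference: {obs_diff:.3f}")
```

Under these assumed figures a true difference of 15 percentage points is recorded as roughly 10.5 points. More generally, when the error is the same in both groups the observed difference equals the true difference multiplied by (sensitivity + specificity - 1), which is less than one whenever measurement is imperfect.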
A final methodological issue that has limited the usefulness of the evaluations reviewed is that analytic procedures such as choice of standards (anthropometric or dietary), cut-off points (below which one is considered underfed or malnourished), and statistical techniques (e.g., the use of percentages of median values versus Z-score cut-off points) differ from one study to the next and are often not based on sound rationale. The implications of these choices for the estimated prevalence of malnutrition are enormous. For example, Drake et al. (21) showed that almost twice the number of children were classified as second- and third-degree malnourished when the Harvard, rather than the Gomez, growth standards were applied to a population in Brazil. Similarly, Sahn (22) found that, even when employing the same growth standard (23), the percentage of children classified as wasted in Sri Lanka was almost twice as high when a -2 Z-score cut-off was used, rather than the traditional 80 per cent of the median weight-for-height cut-off point. As a consequence of these differences in standards and cut-off points, comparisons between impact studies are spurious. Therefore, it is urged that agencies move towards standardizing analytical procedures.
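To see why the two cut-off conventions can classify the same children differently, consider the minimal sketch below. The reference median (12.0 kg) and standard deviation (1.0 kg) for a given height are hypothetical values chosen for illustration; they are not taken from the growth standard cited as reference 23, and the function names are likewise illustrative.

```python
def wasted_by_percent_of_median(weight_kg, ref_median_kg, cutoff=0.80):
    """Wasted if weight falls below a fixed fraction of the reference median."""
    return weight_kg < cutoff * ref_median_kg


def wasted_by_z_score(weight_kg, ref_median_kg, ref_sd_kg, cutoff=-2.0):
    """Wasted if weight lies more than |cutoff| reference SDs below the median."""
    z = (weight_kg - ref_median_kg) / ref_sd_kg
    return z < cutoff


# Hypothetical reference values for one height (assumptions, not real standards).
REF_MEDIAN_KG = 12.0
REF_SD_KG = 1.0

# Hypothetical weights of four children of that height.
weights = [9.4, 9.8, 10.2, 11.5]
by_percent = [wasted_by_percent_of_median(w, REF_MEDIAN_KG) for w in weights]
by_z = [wasted_by_z_score(w, REF_MEDIAN_KG, REF_SD_KG) for w in weights]
print("80% of median:", sum(by_percent), "wasted;  -2 Z-score:", sum(by_z), "wasted")
```

With these assumed reference values, 80 per cent of the median corresponds to 9.6 kg while -2 Z-scores corresponds to 10.0 kg, so the child weighing 9.8 kg is counted as wasted under one convention and not under the other. The size of the gap in practice depends on the spread of the reference population at each height.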
Research Design and the Difficulty of Attribution
One further obstacle is observed in many evaluations. This concerns documenting change and comparing that change to what would be expected in the absence of the project. There are a number of factors that characteristically impede the attribution of improvements in nutritional status to project activities. Such confounding factors can be negative or positive. If negative, successful interventions seem ineffectual; if positive, unsuccessful interventions appear to have improved nutritional status. Thus, the wide range of competing explanations (including the project) for observed changes in nutrition often leads to indeterminate evaluation results.
The most predictable competitor with a project as an explanation for nutritional changes, which was overlooked in many studies reviewed, is the phenomenon of population aging. Rates of malnutrition, according to anthropometric indicators, vary markedly by age. Failure to take this into account (i.e., controlling for the age distribution of the population and limiting comparisons to six-month groupings) will lead to spurious results.
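One way to control for the age distribution is direct standardization, in which the age-specific rates from each survey are applied to a single fixed age distribution. The sketch below uses hypothetical counts and hypothetical six-month age groups; direct standardization is offered here as one common approach, not as the method used in any of the evaluations reviewed.

```python
def crude_rate(groups):
    """groups maps age group -> (children measured, children malnourished)."""
    total = sum(n for n, _ in groups.values())
    cases = sum(c for _, c in groups.values())
    return cases / total


def age_standardized_rate(groups, standard_weights):
    """Weight each age-specific rate by a fixed reference age distribution."""
    return sum(standard_weights[age] * (cases / n)
               for age, (n, cases) in groups.items())


# Hypothetical surveys: age-specific rates are identical (10% and 30%),
# but the follow-up sample contains many more older children.
baseline = {"6-11 mo": (100, 10), "12-17 mo": (100, 30)}
follow_up = {"6-11 mo": (50, 5), "12-17 mo": (150, 45)}
weights = {"6-11 mo": 0.5, "12-17 mo": 0.5}   # assumed standard age distribution

print("crude:", crude_rate(baseline), "->", crude_rate(follow_up))
print("standardized:", age_standardized_rate(baseline, weights),
      "->", age_standardized_rate(follow_up, weights))
```

In this constructed example the crude rate rises from 20 to 25 per cent even though the rate within each age group is unchanged; the standardized rate stays at 20 per cent in both surveys, showing that the apparent deterioration is an artefact of the older sample.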
A second phenomenon observed in malnourished populations concerns the contention that those with the worst nutritional status benefit most from a programme. This statement was nearly universal in the evaluations reviewed. However, naturally occurring changes in populations, including regression toward the mean, can once again explain such findings. That is, individuals who fall at the extremes of a measurement scale (e.g., grade III malnourished) tend spontaneously to move towards the population mean. This is logical. Grade III malnourished children either stay the same or improve (or alternatively die, which in most cases means that they are no longer part of the sample). On the other hand, you would expect a certain number of grade II malnourished children to improve and the remainder to worsen, becoming grade III.
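Regression toward the mean can be demonstrated with a small simulation in which no intervention takes place at all. The sketch below rests on assumptions made only for illustration: each child has a stable underlying Z-score, each survey adds independent random variation (measurement error and short-term fluctuation), and the "worst" group is defined as children below -3 Z-scores at baseline. All parameter values and names are hypothetical.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n_children=20000, seed=0):
    """Return (baseline mean, follow-up mean, group size) for the worst-off children."""
    rng = random.Random(seed)
    baseline, follow_up = [], []
    for _ in range(n_children):
        usual_z = rng.gauss(-1.5, 1.0)                    # stable underlying status (assumed)
        baseline.append(usual_z + rng.gauss(0.0, 0.7))    # first survey
        follow_up.append(usual_z + rng.gauss(0.0, 0.7))   # second survey, no programme
    # Select the children who looked worst at the first survey.
    worst = [i for i, z in enumerate(baseline) if z < -3.0]
    return (mean(baseline[i] for i in worst),
            mean(follow_up[i] for i in worst),
            len(worst))


worst_before, worst_after, n_worst = simulate_regression_to_mean()
print(f"{n_worst} children below -3 Z at baseline: "
      f"mean {worst_before:.2f} -> {worst_after:.2f} with no intervention")
```

Because the worst-off group is selected partly on the basis of unfavourable random variation at baseline, its mean at the second survey is closer to the population mean even though nothing was done. An evaluation that reports the largest gains among the most severely malnourished therefore needs a comparison group selected in the same way before attributing those gains to the programme.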
A variety of other confounding influences were often not accounted for in the evaluations reviewed. For example, there was a failure to consider the addition or attrition (i.e., in- or out-migration) of participants that may have altered the composition of a treatment or matched control group. Furthermore, most evaluations reviewed did not consider the social and economic factors or physical phenomena that are competing explanations for nutritional changes. The complexity and volatility of environments in which programmes are operating require that consideration be given to these historical and environmental factors. Unfortunately, when outside evaluators enter unfamiliar environments and have little knowledge of what secular events transpired during the course of the intervention, it is difficult to account for such confounding factors.
There are numerous other competing explanations and threats to internal and external validity not addressed above. The literature discusses these issues and cautions the reader to be sceptical of evaluation findings that stem from poorly conceived methodologies (24). The question is what to do, given these problems.
In an ideal world, future impact evaluations would employ classical experimental design characterized by randomization in selecting treatment and control groups. Doing so would leave little doubt that observed differences were attributable to the project. Unfortunately, an experimental research protocol in the context of operational projects is all but impossible. Reasons include: (a) programme planners and staff may resist randomization as a means of allocating treatments, arguing for some other criterion, e.g., need or merit; (b) the randomization process is difficult to carry out correctly in highly dynamic environments, resulting in non-equivalent (test and control) groups; (c) there is a high likelihood of spill-over effects from the treatment to control population; and, most persuasive, (d) the expense of running good experiments precludes their use on a wide scale.
Quasi-experimental or non-experimental evaluation designs are therefore relied upon. Most popular among these are:
- the one group pre-test/post-test design, in which initial measurement is performed on the population, followed by the delivery of services and post-programme measurement to see what changes occurred;
- the static group comparison design, in which a project is initiated and, at some time after the programme has been operational, measurements are taken on the treatment group and on other similar populations not receiving services and are then compared;
- the non-equivalent control groups design, in which matched treatment and control groups are both selected and measured prior to the beginning of the project, and then have their nutritional status reassessed at some time after the project has been operational;
- the recurrent institutional cycle design, in which the nutritional status of children who have been participating in the project for a meaningful length of time is compared with that of individuals who have recently enrolled.
The evaluator should be familiar with the strengths and weaknesses of these and all other possible options. With all these techniques it is also plausible to use multivariate statistical techniques (e.g., regression analysis) to control for non-project variables that may be important determinants of nutritional status.
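As an illustration of how data from the non-equivalent control groups design might be summarized, the sketch below computes a simple difference-in-differences estimate: the change in the treatment group minus the change in the control group. This is one common analytic approach and is an assumption of this example, not a procedure prescribed by the evaluations reviewed; the prevalence figures and the function name are hypothetical.

```python
def difference_in_differences(treat_before, treat_after,
                              control_before, control_after):
    """Change in the treatment group net of the change in the control group."""
    treatment_change = treat_after - treat_before
    control_change = control_after - control_before
    return treatment_change - control_change


# Hypothetical prevalence of malnutrition before and after the project.
estimate = difference_in_differences(
    treat_before=0.40, treat_after=0.30,      # treatment group: 40% -> 30%
    control_before=0.38, control_after=0.35,  # control group:   38% -> 35%
)
print(f"estimated project effect: {estimate * 100:.1f} percentage points")
```

In this constructed example the treatment group improves by 10 percentage points, but 3 points of improvement also occur in the control group, leaving an estimated project effect of about 7 points. The estimate is only as credible as the assumption that both groups would have followed the same trend without the project, which is exactly what the confounding factors discussed above can undermine.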
Quasi-experimental designs, however, have met with only limited success in controlling for negative and positive confounding variables. This reflects partially the inherent limitations of the methods. However, there is considerable room for improvement in the application of these techniques. It is imperative that future evaluation studies (a) should place greater emphasis on reducing the extent to which there are competing explanations for relating the delivery of services to nutritional outcomes (i.e., on minimizing the threats to validity), and (b) should acknowledge and discuss candidly the many biases that creep into analytic procedures. Thereafter biases can be dealt with in a qualitative fashion by involving those individuals most familiar with and knowledgeable about the project and the local environment in the interpretation of the data. An example of adhering to these principles is found in the evaluation of the weaning foods project in Sri Lanka (10). Following the data analysis, the outside evaluators engaged local workers in a dialogue. The original interpretations of the data were altered significantly in response to discussion with those on site.
These principles do not militate against attempting to measure nutritional impact. They do, however, warn against the hastily conceived methodologies used in the past, which may grossly over- or under-estimate changes in nutritional status.