
#### Stochastic process models

Many food-related activities can be conceptualized as dynamic processes that occur over time. Jerome (1975), for example, has recently called attention to the importance of food-consumption cycles and rhythms and has described the processes by which foods become incorporated into the dietary patterns of urban Americans. Food procurement, processing, and distribution activities should also lend themselves to processual analysis.

Mathematics is very helpful for representing and analysing processes. Deterministic mathematical models abstract the structure of a process and specify exact outcomes. Stochastic mathematical models represent processes as a set of outcomes occurring randomly according to a set of associated probabilities. Because the theory and data concerning human activities that can be seen as processes seldom allow us to state exactly what will happen, it is reasonable to assume that stochastic process models are more realistic than deterministic models. Thus, we focus exclusively on stochastic models here.

Stochastic processes comprise a vast field of mathematical statistics, and we will restrict our attention to only two, which in their most elementary form are rather easy to grasp and have been widely used in anthropology and other behavioural sciences. These are the finite Markov chain and Poisson processes.

Each stochastic process is derived from a particular set of assumptions. The choice of any one to model a real-life situation should, therefore, be justified, in so far as possible, by whether or not these assumptions appear reasonable in the light of the empirical problem at hand and/or substantive theory regarding it. If it can be established that a stochastic process model fits an empirical process, then the assumptions upon which the model is based can be said to characterize the empirical process. The model can then be used to extrapolate probability distributions of expected occurrences.

Markov Chains

A finite Markov chain is a stochastic process consisting of a finite set of states (outcomes), {s1, s2, ..., sn}, and an associated set of transition probabilities, {pij}, such that the conditional probability of outcome sj of an experiment or trial, given a previous outcome of si, is pij. The states are mutually exclusive, so an element in the process, such as a person or an object, can be in one, and only one, state at one time (or after one trial). All elements in the same state are assumed to have the same probability of remaining in, or changing, state after each trial (homogeneity assumption). The transition probabilities {pij} remain constant throughout the duration of the process (stationarity assumption). And the conditional probability of outcome sj depends at most on the immediately prior outcome si (one-step dependency assumption). Considered together, these assumptions characterize a finite Markov chain process. When these assumptions appear reasonably true of sequential phenomena, then a Markov chain model is worth considering. A number of excellent works can be consulted for detailed explanations of finite Markov chains and their potential uses (Kemeny and Snell, 1960; Bartholomew, 1973). White (1974) describes many uses in anthropology. Here we will flesh out this skeletal description of Markov chains with an empirical illustration.
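To make the three assumptions concrete, the following is a minimal Python sketch of a chain being simulated. The two states, the matrix `toy_P`, and the function name `simulate_chain` are hypothetical and are not taken from the study; the same row of probabilities is used at every step (stationarity) and only the current state is consulted (one-step dependency).

```python
import random

def simulate_chain(states, P, start, n_steps, seed=0):
    """Generate a state sequence from a fixed transition matrix P."""
    rng = random.Random(seed)
    index = {s: i for i, s in enumerate(states)}
    path = [start]
    for _ in range(n_steps):
        row = P[index[path[-1]]]          # depends only on the current state
        path.append(rng.choices(states, weights=row, k=1)[0])
    return path

# Hypothetical two-state chain: "b" is always followed by "a".
toy_states = ["a", "b"]
toy_P = [[0.7, 0.3],
         [1.0, 0.0]]

path = simulate_chain(toy_states, toy_P, "a", 20)
print(" ".join(path))
```

Because the probability of "b" following "b" is zero in this toy matrix, no simulated sequence can contain two consecutive "b" states.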

Example 8

In work on food-consumption patterns and preferences of middle-income Americans, we collected daily records over a six-week period of the major type of meat consumed at evening meals by three female and three male adults. Our objectives are to: (a) estimate the relative frequencies or probabilities of meats used over various time-periods; (b) extrapolate the relative proportion of meats used over the long run (indefinite future); and (c) extrapolate meat-use cycles (expected time for re-use of the same meat type, and time from use of one type to another). Each person made a daily record of one of the following meat types he/she ate the most: (a) beef, (b) pork, (c) poultry, (d) seafood, and (e) other (e.g. variety meats like "cold cuts" or no meat consumption at all).

A discrete-state finite Markov chain was selected to model meat-use sequences because the number of states is finite, given by the five meat types, and the states are mutually exclusive. A person can be in one, and only one, of the states at a given time. This is true by definition (i.e. only one meat type can be the one a person ate most on a given day). Other reasons for selecting this kind of Markov chain are that: (a) observations at more than one time interval (or over more than one daily trial) are available; (b) it is assumed that persons in the same state (i.e. eating the same meat) have the same probability of remaining in or changing state; (c) it is assumed that the meat type a person has at time t + 1 depends at most on the type he/she had at time t; and finally (d) it is assumed that these transition probabilities will remain constant for the duration of the process considered. We also assume, for purposes of illustration, that the number of observations is sufficient to estimate the transition probabilities accurately.

While it certainly can be argued that these assumptions are tenuous, we shall deem them sufficiently reasonable to merit exploration within the context of this example. We will attempt to evaluate the goodness of fit of the Markov chain model by seeing how well it approximates reality by using the data from the first three-week period to construct it. Extrapolations from the model will then be compared with the actual observations over the second three-week period.

Table 5 presents a matrix, F, of 126 transition frequencies of meat types for the six respondents over the first 21-day period. Each state is labelled along the rows and columns and refers to a meat type. Each fij element in the matrix denotes the frequency with which one state is followed by another (including, on the main diagonal, the same state). Thus, the 20 in the first row and column refers to the number of times beef was followed by beef on the succeeding day. The 14 in row 1, column 2 refers to the number of times beef was followed by pork on the succeeding day, and so on. The other rows are interpreted similarly. Since each person's sequence starts with the meat type used on the day the observations began, and that first meat type has no observed predecessor, we classified it as following the meat type that most often preceded it (its modal predecessor) during the first three-week period.

Table 5. Meat-use transition frequency matrix

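| | Beef | Pork | Poultry | Seafood | Other |
|---|---|---|---|---|---|
| Beef | 20 | 14 | 6 | 0 | 8 |
| Pork | 16 | 11 | 6 | 1 | 2 |
| Poultry | 5 | 4 | 3 | 2 | 2 |
| Seafood | 2 | 1 | 0 | 0 | 0 |
| Other | 4 | 6 | 3 | 0 | 10 |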
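A frequency matrix of this kind can be tallied directly from daily records. The sketch below is only an illustration: the one-week sequence `week` is hypothetical (the study's raw records are not reproduced here), and the helper name `transition_counts` is our own.

```python
from collections import Counter

STATES = ["beef", "pork", "poultry", "seafood", "other"]

def transition_counts(sequence, states=STATES):
    """Count how often each state is followed by each state on the next day."""
    pairs = Counter(zip(sequence, sequence[1:]))
    return [[pairs[(a, b)] for b in states] for a in states]

# Hypothetical one-week record for a single respondent (not the study data).
week = ["beef", "beef", "pork", "beef", "poultry", "pork", "beef"]

F_toy = transition_counts(week)
for state, row in zip(STATES, F_toy):
    print(f"{state:8s}", row)
```

A sequence of k days yields k - 1 observed transitions, which is why the first day of each respondent's record needs the special treatment described above.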

Dividing each fij element by its row-sum produces the matrix of transition probabilities, P, in table 6 (i.e. a matrix of probabilities estimated from relative frequencies). Therefore, each pij element denotes the probability of change (or stability) from one day's meal to the next. Thus, the probability of beef being followed by beef is p11 = 20/48 = .417, and p12 = 14/48 = .291 is the probability that beef will be followed by pork. All rows are interpreted similarly. Notice that all pij's are non-negative and each row sums to unity. Thus, P provides the probabilities of remaining in or changing state over a one-day interval.

Table 6. Meat-use transition probability matrix

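| | Beef | Pork | Poultry | Seafood | Other |
|---|---|---|---|---|---|
| Beef | .417 | .291 | .125 | .000 | .167 |
| Pork | .444 | .305 | .167 | .028 | .056 |
| Poultry | .313 | .250 | .187 | .125 | .125 |
| Seafood | .667 | .333 | .000 | .000 | .000 |
| Other | .174 | .261 | .130 | .000 | .435 |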
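The row normalization that turns table 5 into table 6 can be sketched in a few lines of Python. The frequencies are those of table 5; the function name `row_normalize` is our own. Unrounded quotients are printed to three decimals, so a few entries may differ in the last digit from the printed table.

```python
# Transition frequencies from table 5 (rows: beef, pork, poultry, seafood, other).
F = [
    [20, 14, 6, 0, 8],
    [16, 11, 6, 1, 2],
    [5, 4, 3, 2, 2],
    [2, 1, 0, 0, 0],
    [4, 6, 3, 0, 10],
]

def row_normalize(freq):
    """Divide each row by its sum to estimate transition probabilities."""
    probs = []
    for row in freq:
        total = sum(row)
        if total == 0:
            raise ValueError("a state that was never observed has no estimable row")
        probs.append([count / total for count in row])
    return probs

P = row_normalize(F)
for row in P:
    print(" ".join(f"{p:.3f}" for p in row))
```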

With the data transformed into these arrays, we can use some matrix algebra and theorems of Markov chains to ask some pertinent questions of the model. First, what is the expected distribution of meat use after n days, given an initial distribution on some prior day? For example, if two people had beef, three had pork, and one had "other" on one day, what is the expected distribution of meat use two days into the future? To answer this question we use a fundamental equation of a Markov chain:

(72) $p_{t+n} = p_t P^n$

where pt is a row vector, the ordered components of which denote the initial probability distribution across the states; P is the matrix of transition probabilities; n is an exponent indicating the number of trials (or days); and pt+n is the resultant vector, the ordered components of which denote the probability distribution across states after t + n trials (or days). Thus, to extrapolate the meat-use distribution two days ahead we need pt and P², the transition probability matrix squared, or

(73) $p_{t+2} = p_t P^2$

Two matrix operations are involved. The first is powering a matrix (successively multiplying a matrix by itself), which for the square of P can be defined element by element as

$(P^2)_{ij} = \sum_k p_{ik} p_{kj}$

where the p's are the pij matrix elements of P. The second operation is premultiplying the product matrix by the row vector pt. This can be defined as

$(p_t P^2)_j = \sum_i (p_t)_i (P^2)_{ij}$

In our example, the initial probability distribution pt can be found by dividing the frequency of each element by the sum. Thus 2 beef, 3 pork, 0 poultry, 0 seafood, 1 other is transformed into

pt = (2/6, 3/6, 0, 0, 1/6)

or

pt = (.333, .50, .00, .00, .167)

The vector-matrix product, pt+2, which specifies the probability distribution expected after two days, is given by the resultant vector:

pt+2 = (.36, .29, .15, .03, .17)

Thus, in two days it is expected that the proportion of the sample having each meat will be .36 beef, .29 pork, .15 poultry, .03 seafood, .17 other. Frequencies can be found by multiplying the proportions by the sample size and rounding to the nearest whole number. We want to emphasize that a forecast can be made n days into the future simply by raising P to the appropriate n-th power and then pre-multiplying it by pt. We also mention, but will not discuss, the possibility of reversing the procedures and retrodicting prior distributions. In either case, if empirical data are available, these can then be compared to the expected distributions to evaluate the accuracy of the model. We will illustrate this next.
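The forecast in equation (72) can be sketched as follows. Instead of forming P^n explicitly, the sketch multiplies the row vector by P n times, which gives the same result. The matrix holds the rounded entries of table 6, so the output may differ in the second decimal place from the figures quoted in the text; the function names are our own.

```python
# Rounded transition probabilities from table 6.
P = [
    [0.417, 0.291, 0.125, 0.000, 0.167],
    [0.444, 0.305, 0.167, 0.028, 0.056],
    [0.313, 0.250, 0.187, 0.125, 0.125],
    [0.667, 0.333, 0.000, 0.000, 0.000],
    [0.174, 0.261, 0.130, 0.000, 0.435],
]

def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def forecast(p_t, P, n):
    """Distribution n trials ahead: p_t multiplied by P, n times over."""
    p = list(p_t)
    for _ in range(n):
        p = vec_mat(p, P)
    return p

p_t = [2 / 6, 3 / 6, 0.0, 0.0, 1 / 6]   # 2 beef, 3 pork, 1 other
p_t2 = forecast(p_t, P, 2)
print([round(x, 3) for x in p_t2])
```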

Recall that the second objective of our study was to extrapolate the ultimate, long-term distribution of meat use. Remarkably, Markov chains allow us to see this at a glance. How this is done depends on the type of Markov chain. If some power of P has all positive, non-zero elements (as is true here of P²), then the chain is regular and sufficiently high powers of P approach a matrix with identical rows. Once this limit is reached, further exponentiation of P will not change the values of the elements. Any row of this "fixed-state" matrix, Pe, specifies the equilibrium vector, pe, which is the ultimate probability distribution. The equilibrium vector, pe, can be found more easily by solving the matrix equation

(74) pe = pe P

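where pe is the ultimate equilibrium vector and P is the original transition probability matrix. For a chain with two states and pe = (x1, x2), the latter can be written as

$(x_1, x_2) \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix} = (x_1, x_2)$

from which the following system of equations is derived:

(75a) p11x1 + p21x2 = x1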

(75b) p12x1 + p22x2 = x2

Now recalling that x1 and x2 must sum to 1 we add

(75c) x1 + x2 = 1

With these three equations and two unknowns we drop either (75a) or (75b) (the number of equations must equal the number of unknowns), set them equal to zero, and solve the resulting system simultaneously for x1 and x2. In our example the ultimate equilibrium vector, pe, will contain five components, so we pre-multiply P by a five-component row vector of unknowns (x1, x2, x3, x4, x5). A system of five equations is produced, plus the equation

x1 + x2 + x3 + x4 + x5 = 1

or

(76)
.417x1 + .444x2 + .313x3 + .667x4 + .174x5 = x1
.291x1 + .305x2 + .250x3 + .333x4 + .261x5 = x2
.125x1 + .167x2 + .187x3 + .000x4 + .130x5 = x3
.000x1 + .028x2 + .125x3 + .000x4 + .000x5 = x4
.167x1 + .056x2 + .125x3 + .000x4 + .435x5 = x5
x1 + x2 + x3 + x4 + x5 = 1

which when solved simultaneously gives the equilibrium vector

(77) pe = (.375, .286, .143, .027, .169)

This vector specifies the long-term expected proportions of meat use. Using the actual proportions of meat use over the last 21 days, we can compare this expected distribution with that observed, and evaluate the model's goodness of fit to reality (table 7). The actual observations of the meat-use distribution over the last 21 days can be seen to be in 97 per cent agreement with the predicted distribution deduced from the model.
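As a cross-check on the equilibrium vector, one can also approximate it numerically. The sketch below does not solve system (76) by elimination; it uses the swapped-in technique of repeatedly multiplying a starting distribution by P until it stops changing, which converges for a regular chain. It works from the rounded entries of table 6, so agreement with (77) is expected only to about two decimal places; the function name and tolerance are our own choices.

```python
# Rounded transition probabilities from table 6.
P = [
    [0.417, 0.291, 0.125, 0.000, 0.167],
    [0.444, 0.305, 0.167, 0.028, 0.056],
    [0.313, 0.250, 0.187, 0.125, 0.125],
    [0.667, 0.333, 0.000, 0.000, 0.000],
    [0.174, 0.261, 0.130, 0.000, 0.435],
]

def equilibrium(P, tol=1e-12, max_iter=10_000):
    """Approximate the vector pe satisfying pe = pe P by repeated multiplication."""
    n = len(P)
    p = [1.0 / n] * n                      # any starting distribution will do
    for _ in range(max_iter):
        nxt = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    raise RuntimeError("no convergence; the chain may not be regular")

pe = equilibrium(P)
print([round(x, 3) for x in pe])
```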

A χ² test of goodness of fit, with 5 - 1 = 4 degrees of freedom, where

(78) $\chi^2 = \sum_{i=1}^{S} \frac{(O_i - E_i)^2}{E_i} = 1.04$

and where O = observed frequencies, E = expected frequencies, and S = number of states (meats), has a probability p > .90. This test shows that the expected and observed distributions do not differ significantly, and that any difference between them can plausibly be attributed to chance. Overall, the fit is excellent and suggests that the meat sequence for this sample can be modelled quite accurately with a Markov chain.
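The goodness-of-fit statistic can be recomputed from the frequencies in table 7, as in the sketch below. The helper `chi2_sf_even_df` is our own; it uses the closed-form upper-tail probability of the chi-square distribution that holds when the degrees of freedom are even, which covers the 4 degrees of freedom used here.

```python
import math

# Observed and expected frequencies from table 7
# (beef, pork, poultry, seafood, other).
observed = [48, 33, 18, 5, 22]
expected = [47.25, 36.04, 18.02, 3.40, 21.29]

def chi_square(obs, exp):
    """Pearson statistic: sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x); closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

stat = chi_square(observed, expected)
df = len(observed) - 1
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```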

Finally, let us turn to our third objective and see how a Markov chain can be used to study the meat-use cycle. Here we want to know what the expected (mean) number of days will be before meat Sj is used, given that meat Si was used last. For instance, if beef was eaten today, how many days, on average, will it be before beef is eaten again, pork is eaten, poultry is eaten, and so on? With a regular Markov chain these values are obtained from a matrix of mean first passage times, M. Since the mathematical computations are lengthy and involved the reader is referred to Kemeny and Snell (1960), and we will proceed directly to the results in table 8.

Table 7. Comparison of expected and actual meat-use distributions

| | Proportion expected | Proportion observed | Expected frequency | Observed frequency |
|---|---|---|---|---|
| Beef | .375 | .380 | 47.25 | 48 |
| Pork | .286 | .262 | 36.04 | 33 |
| Poultry | .143 | .143 | 18.02 | 18 |
| Seafood | .027 | .04 | 3.40 | 5 |
| Other | .169 | .175 | 21.29 | 22 |

Table 8. Matrix of mean first passage times

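M =

| | Beef | Pork | Poultry | Seafood | Other |
|---|---|---|---|---|---|
| Beef | 2.67 | 3.55 | 24.75 | 35.33 | 8.11 |
| Pork | 2.48 | 3.50 | 23.79 | 33.78 | 9.12 |
| Poultry | 2.96 | 3.92 | 6.99 | 21.52 | 9.75 |
| Seafood | 1.81 | 3.33 | 58.93 | 37.04 | 9.29 |
| Other | 3.57 | 3.70 | 24.61 | 35.33 | 5.92 |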

Each mij element denotes the number of trials (days) it will be (on average) before a particular type of meat is used, according to the meat that was used last (table 8). Thus, m11 = 2.67 is the expected number of days before beef is re-used, m12 = 3.55 is the expected number of days before pork is used if beef was used last, etc. All rows of M are interpreted similarly. It might be noted in closing that, although this analysis has been performed for the group as a whole (n = 6), a Markov-chain analysis of each individual could also be performed and then compared. This procedure would also enable us to examine the assumptions of the model in much more detail.
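For readers who want to experiment, mean first passage times can also be approximated without the matrix formulas of the reference. The sketch below is not the computation referred to in the text: it iterates the defining recurrence mij = 1 + Σ pik mkj (sum over k ≠ j) until the values settle. It uses the rounded entries of table 6, so its output need not match table 8 digit for digit; the function name and tolerance are our own.

```python
# Rounded transition probabilities from table 6.
P = [
    [0.417, 0.291, 0.125, 0.000, 0.167],
    [0.444, 0.305, 0.167, 0.028, 0.056],
    [0.313, 0.250, 0.187, 0.125, 0.125],
    [0.667, 0.333, 0.000, 0.000, 0.000],
    [0.174, 0.261, 0.130, 0.000, 0.435],
]

def mean_first_passage(P, tol=1e-10, max_iter=200_000):
    """Iterate m_ij = 1 + sum over k != j of p_ik * m_kj for each target j."""
    n = len(P)
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):
        m = [1.0] * n
        for _ in range(max_iter):
            nxt = [1.0 + sum(P[i][k] * m[k] for k in range(n) if k != j)
                   for i in range(n)]
            change = max(abs(a - b) for a, b in zip(nxt, m))
            m = nxt
            if change < tol:
                break
        else:
            raise RuntimeError("iteration did not converge")
        for i in range(n):
            M[i][j] = m[i]
    return M

M = mean_first_passage(P)
for row in M:
    print(" ".join(f"{v:7.2f}" for v in row))
```

The diagonal entries are the mean recurrence times, which for a regular chain equal the reciprocals of the corresponding equilibrium probabilities.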

In sum, Markov chains, as one type of stochastic process model, appear useful for linking past, present, and future events in an explicit way. While many of the assumptions are rather stringent and, for long-range forecasting, re-estimates of the models' parameters are usually necessary, over the short run Markov models tend to be quite robust, even when not all assumptions are completely satisfied. We hope this pilot study will stimulate further exploration of the many potential uses of Markov chains in nutritional anthropology.

Poisson Process

We now provide a brief example of the application of another stochastic process model, the common Poisson distribution (Feller, 1957). The Poisson is normally used to represent stochastic processes operating continuously over some unit of measurement such as time or space and to generate the expected number of occurrences of events therein. The major assumptions governing a Poisson process are: (a) there is a positive constant λ (lambda), the average rate of occurrence, which remains the same for all units; (b) the occurrence of events is independent (i.e. the occurrence of one event does not condition the probability of another event); (c) the probability of one occurrence in a single unit is proportional to the size of the unit; and (d) the probability of two or more occurrences in a small unit is negligibly small. The Poisson probability function, p, is derived from these assumptions:

(79) $p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$

where x is the number of occurrences in a given unit of measurement; λ = the mean of the distribution; and e = 2.71828 is the base of the natural logarithm. What can be seen is that the expected number of occurrences depends on λ, the average number of occurrences per unit of measurement.
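The probability function can be evaluated directly, as the sketch below shows. The rate `lam = 0.5` is a hypothetical value chosen only for illustration, and the function name `poisson_pmf` is our own.

```python
import math

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean rate is lam."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if x < 0:
        return 0.0
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 0.5                                   # hypothetical rate, illustration only
probs = [poisson_pmf(x, lam) for x in range(20)]
mean = sum(x * p for x, p in enumerate(probs))

for x in range(4):
    print(x, round(probs[x], 4))
```

Summing the probabilities over a long enough range of x gives a total very close to 1, and the mean of the distribution recovers the rate.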

Example 9

Let us now use a Poisson process to represent an empirical problem, and in so doing illustrate the required computations.

It is widely known, and a matter of international concern, that animal-source proteins are frequently in short supply in the diets of many tropical populations. This is true of a rural parish we studied in the Buganda region of Uganda. The major portion of the diet consists of non-fat carbohydrates: plantains, sweet potatoes, cassava, and yams. Animal proteins (dairy products, eggs, meat, fish, and poultry), though available locally, are expensive to purchase and produce. A 24-hour recall of foods consumed was collected from a social survey of a random sample of 107 household heads (HH) conducted over a six-week period in 1967. The reported frequency and per cent distribution of animal-source protein food use is shown in table 9.

Table 9

| | Frequency | Percentage |
|---|---|---|
| Beef | 11 | 10.28 |
| Fish | 11 | 10.28 |
| Eggs | 2 | 1.9 |
| Milk | 2 | 1.9 |
| Tea/coffee with milk | 19 | 17.76 |
| Poultry | 2 | 1.9 |
| Termites | 1 | .9 |
| None | 70 | 65.42 |

With no other information available we assumed that: (a) the use of animal protein food was a random occurrence; (b) the animal protein use-proneness of each HH was the same; (c) the use of an animal protein by one HH was independent of use by another HH; (d) the use of one animal protein food did not condition the probability of the use of another; and (e) the probability of HH using two or more animal protein foods was small. For these reasons we thought a Poisson process would accurately generate the probability of animal-source protein-use occurrences. The actual use distribution is shown in table 10.

Table 10

| No. of protein foods | Frequency | Proportion |
|---|---|---|
| 0 | 70 | .654 |
| 1 | 28 | .262 |
| 2 | 7 | .065 |
| 3 | 2 | .019 |

Using the probability function (79), with λ estimated by the sample mean of .449 protein foods per HH (48 reported uses among 107 HH), to calculate the expected number of occurrences, we obtain the following:

Table 11.

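| No. of protein foods | Expected number | Observed number | Expected proportion | Observed proportion |
|---|---|---|---|---|
| 0 | 68.27 | 70 | .638 | .654 |
| 1 | 30.71 | 28 | .287 | .262 |
| 2 | 6.85 | 7 | .064 | .065 |
| 3 | 1.07 | 2 | .01 | .019 |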

Comparing these expected occurrences with the actual occurrences reveals close agreement (97.5 per cent) (table 11).

A Chi-square test of goodness of fit, with a value of 1.09 with 2 degrees of freedom (df = number of classes - 1, less 1 for each parameter estimated), shows no significant difference between the two distributions (p > .50).
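The whole Poisson fit can be retraced from the counts in table 10, as in the sketch below. The rate is estimated by the sample mean, expected counts come from the Poisson probability function, and the tail probability uses the fact that, for 2 degrees of freedom, the chi-square upper tail is exp(-x/2). Because the sketch works from unrounded intermediate values, its expected counts and statistic may differ slightly from the printed figures; the variable names are our own.

```python
import math

# Number of protein foods per household head and observed frequency (table 10).
counts = {0: 70, 1: 28, 2: 7, 3: 2}

n = sum(counts.values())                                   # 107 household heads
lam = sum(x * f for x, f in counts.items()) / n            # sample mean, 48/107

def poisson_pmf(x, lam):
    """Poisson probability of exactly x occurrences."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

classes = sorted(counts)
expected = [n * poisson_pmf(x, lam) for x in classes]
observed = [counts[x] for x in classes]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(classes) - 1 - 1                # one further df lost for estimating lam
p_value = math.exp(-chi2 / 2)            # exact upper tail when df == 2

for x, e, o in zip(classes, expected, observed):
    print(x, round(e, 2), o)
print(f"lambda = {lam:.3f}, chi-square = {chi2:.2f}, df = {df}, p = {p_value:.2f}")
```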

We conclude, therefore, that the Poisson distribution provides a close approximation to the distribution of actual occurrences of animal protein-use in this sample population. Further, it is indicated that the assumptions of the Poisson model characterize the animal-source protein-use process in this region.

Before closing, it might be informative to consider some reasons why a discrepancy between the Poisson distribution of expected values and the actual distribution of observed values might have occurred. First, the Poisson assumes that λ = .449, the protein-use proneness value, is identical for all HH. If the sample population were heterogeneous in this respect, a discrepancy could occur. Second, the Poisson requires that the use of one protein food does not affect the use of another. If several protein-source foods were consumed together, or eating one protein food caused others not to be eaten, this could produce a discrepancy. Third, the Poisson requires independence of protein-use occurrence among HH. If protein-food exchanges occurred among several HH, producing simultaneous use (or no use), then this, too, could create a discrepancy.

In sum, departures from a Poisson process often occur when the sample population is heterogeneous in occurrence proneness; reinforcement causes one occurrence to condition the probability of another occurrence; and contagion reduces the independence of cases. Since the differences between the expected and actual distributions were small, these conditions are probably not present to a significant extent in this example. Had they been, or if it were assumed that these conditions do in fact characterize the nature of some process, then other stochastic process models with assumptions predicated on these conditions would have to be explored.