This chapter contains descriptions of the elements that describe a food and its components, using the same format as chapter 4. The elements that identify the food components themselves appear in a separate publication, and elements that describe the data (e.g., the statistics being reported and the quantities of food measured) appear in chapter 6.
The <food> element is an immediate subsidiary of the <infoods 85> element. If <dflt> is used, the first <food> element will come third in the sequence of those immediately subsidiary elements. Each <food> element includes data about a single food. There may be any number of <food> elements in an interchange file.
This is a structural tag, and both start-tag and end-tag are required. The content is a <classif> element, optionally followed by <fddflt>, <comp>, and/or <drvd-comp> in that order; this list is extensible. Normally at least one of <comp> or <drvd-comp> will be present, but for special purposes a valid interchange file might contain food description (classification) information only. All immediate subsidiaries of the <food> element are structural elements.
The <food> element contains all the data in the interchange file about a single food. All of the identifying information is in the <classif> subsidiary element; the data about directly measured and derived components of the food are found in the <comp> and <drvd-comp> elements. If there are any defaults to be applied on a per-food basis, they are specified in the <fddflt> element.
The distinction between <comp> and <drvd-comp> information is whether or not the food "component" in question is a real, directly measurable component, or is instead a measure that is computed from other "real" components. This decision is made when the generic identifier for the component is established. See Identification of Food Components for INFOODS Data Interchange for additional discussion.
Each component has a tag/generic identifier separately established (by registration) that identifies which component is being reported on. But each food does not: new foods can be accommodated into interchange files at will, whereas new components of interest must be registered. The result of this is that there must be associated with each <food> a collection of identifying data. This is the reason for a separate <classif> component with a large number of possible subelements.
The <classif> element is the first immediate subsidiary of the <food> element. It contains information that distinguishes the particular food to which the component data within the <food> element refers from other foods.
Both start-tag and end-tag are required. The content includes one required immediate subsidiary element, <ifri>, and one or more optional subsidiary elements, including <srcdbid>, <bvname>, <exname>, and <image>, or any of the set of elements identified in this document or by subsequent registration as <specific classification> or <food description>. (At least one <bvname> element should be included, possibly associated with an <exname>.) Each of the optional elements may be repeated; <ifri> may appear only once.
<Classif> is a structural tag, so all elements subsidiary to it require end-tags and do not have trailing slashes in their generic identifiers.
The <classif> element identifies the particular food to which the component data within the <food> element refers, and the source record of that data. In particular, the International Food Record Identifier uniquely identifies the food and the data source for the component data included for that food in the interchange file. The IFRI should always be given (see <ifri> in this section). The <srcdbid> elements provide any record identifiers other than the IFRI that may be appropriate. <bvname> and <exname> elements provide various names by which the food is known.
All elements immediately subsidiary to <classif> are treated as structural; that is, they do not contain a trailing slash in their generic identifiers but nonetheless require end-tags.
The <ifri> element is an immediate subsidiary of the <classif> element and the only required subsidiary element of <classif>. It specifies the International Food Record Identifier of the data about the particular food being reported. As discussed below, the International Food Record Identifier, sometimes called the "IFRI" or, less formally, the "food record identifier" or "record identifier", is a critical concept for the monitoring of data as they pass between data bases and
Both the start-tag and end-tag are required.
The content of <ifri> is a single unformatted data item consisting of that character string which is the registered International Food Record Identifier of the data included in the <food> element of which <ifri> is a subsidiary.
Assigning the International Food Record Identifier
While the interchange system can be used for other purposes such as data storage and management or interchange within regions, when interchange files are passed between regions, this element must be supplied and must be valid. The criteria for validity are simple: the food record identifier must be internationally unique, and must be able to be used to reconstruct the data source.
In order to assure this, the record identifier itself is formatted, using ordered facets, the restricted ISO 646 character set, and no embedded blanks or other whitespace characters. The first of those facets is a two- or three-letter code for the region or international organization; these codes are assigned by the INFOODS Secretariat, in consultation with the regional data centre or organization involved. The structure of any remaining facets is the responsibility of the region, as long as uniqueness is guaranteed. We expect that most regions will use a system that results in record identifiers with a structure similar to:
Region.Country.Publisher.SpecificDataBaseAndDate.SequenceNumber
with "publisher" replaced by "laboratory" or "agency" as appropriate. That structure is used in the examples in this book, but remains optional. This type of approach also makes it possible for a regional centre to further delegate the assignment of food record identifiers to individual data compiler organizations by assigning each organization a leading set of facets and then permitting them to assign their own sequence numbers or, possibly, additional intervening facets.
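The formatting rules above can be sketched as a small plausibility check. This is only an illustration under stated assumptions: the facet separator is taken to be the period used in the template above, the restricted ISO 646 character set is approximated by ASCII letters, digits, hyphen, and underscore (the actual set is defined elsewhere in this book), and the function name and sample identifiers are hypothetical. It cannot confirm that a region code is registered or that an identifier is unique.

```python
import string

# Assumption: a stand-in for the restricted ISO 646 character set.
ALLOWED_FACET_CHARS = set(string.ascii_letters + string.digits + "-_")


def is_plausible_ifri(ifri: str) -> bool:
    """Return True if the string has the general shape of a food record identifier."""
    # No embedded blanks or other whitespace characters are permitted.
    if any(ch.isspace() for ch in ifri):
        return False
    # Facets are ordered and, by assumption, separated by periods.
    facets = ifri.split(".")
    if any(not facet for facet in facets):
        return False
    # The first facet is a two- or three-letter region or organization code.
    region = facets[0]
    if len(region) not in (2, 3) or not set(region) <= set(string.ascii_letters):
        return False
    # Every facet must stay inside the assumed character set.
    return all(set(facet) <= ALLOWED_FACET_CHARS for facet in facets)


if __name__ == "__main__":
    # Hypothetical identifiers, not registered ones.
    for candidate in ("XX.YY.PUB.DB1990.0042", "XX. YY.1", "EURO.1"):
        print(candidate, is_plausible_ifri(candidate))
```

A check of this kind only catches malformed strings; uniqueness and traceability to the data source remain the responsibility of the assigning region.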
Where it is sensible, regions are urged to adopt the conventions and identifiers of the ISO system for the identification of organizations. That system uses a similar structure, with more global registration authorities (e.g., national bodies) assigning their own identification and then delegating assignment of particular identifiers (e.g., to different state or provincial organizations).
When dealing with older data, it may be very difficult to determine whether a food record is "original" or whether it is a copy from another data base. When no determination can be made, a new food record identifier should be assigned; it is better to make errors in the direction of asserting that two records are different when they are not than to assert identity when it does not exist. From the user's point of view, this implies that identical food record identifiers (an assertion that data values in records are the same) can be trusted as evidence that the data are identical and come from an identical source. Differing record identifiers are to be taken as no more than a strong hypothesis that the data have different origins. Whether different food record identifiers are to be trusted more or less as an indication that the corresponding data records represent different origins, e.g., separate analyses, must be evaluated on the basis of other information.
Nothing in the system implies that a region cannot assign a food record identifier of the form:
Region.wdkwtcf.Sequence
where "wdkwtcf" can be read as "we don't know where this came from". Again, this is acceptable as long as the record identifier is unique and the regional centre will take responsibility for it. Once the record identifier is assigned, everyone else knows where the data came from: whoever in the region assigned the record identifier.
Identical food record identifiers in two different interchange files do not, however, imply that the two interchange file <food> elements are identical. If a data base compiler inserts food 75 in table B, which contains only proximate values, by copying the proximates (only) from food 200 in table A, then the food record identifier provided when table B is passed between regions should reflect food 200 in table A. This is true even if that identifier in table A refers to a record (a <food> element) that contains many other nutrients.
For composite records, for example, the compiler might use proximate data from food 201 in table A, but substitute protein data from food 3 in table C, to build food 76 in table B. The food record identifier in table B should be a new one, associated with table B. The combination of values from the two other tables causes the food record in table B to be "original values" for which the developers of table B must take responsibility. The principle used here is discussed in more detail in Chapter 1.
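The two cases just described (a straight copy from one source record versus a composite built from several) can be sketched as a decision function. This is a minimal illustration, not a prescribed procedure: the function name, the component names, and the identifiers are hypothetical, and a source of None is assumed here to mean a value that originates with the compiler.

```python
from typing import Dict, Optional


def choose_ifri(value_sources: Dict[str, Optional[str]], new_ifri: str) -> str:
    """Pick the food record identifier for a record in a newly compiled table.

    value_sources maps each component name to the identifier of the record
    its value was copied from, or None for a value original to this table.
    """
    sources = set(value_sources.values())
    # All values copied from one and the same source record:
    # the record keeps that record's identifier.
    if len(sources) == 1 and None not in sources:
        return next(iter(sources))
    # Values combined from several records, or including original values:
    # the combination is a new set of "original values", so a new
    # identifier associated with the new table is used.
    return new_ifri


if __name__ == "__main__":
    copied = {"protein": "XX.A.200", "fat": "XX.A.200"}
    composite = {"fat": "XX.A.201", "protein": "XX.C.3"}
    print(choose_ifri(copied, "XX.B.75"))     # identifier of the table A record
    print(choose_ifri(composite, "XX.B.76"))  # new identifier from table B
```

When the origin of a value cannot be determined, the guidance above favours treating it as original, which in this sketch corresponds to recording its source as None.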
If an interchange file is used for purposes other than data interchange between regional centres, we recommend that valid food record identifiers be assigned nonetheless. Errors are much less likely if the record identifiers are assigned as early and as close to the time and point of first "publication" (circulation of data values outside the laboratory or organization where they are compiled) as possible. If, for some reason, it is desired to retain or transport data in interchange form without the food record identifiers attached, use of <srcdbid> is recommended to transmit any other food record identifying information.
Finally, it is important to understand that the food record identifier does not, in any usual sense, identify a "food"; it identifies a particular collection of data, a "food record". In the most extreme case, an interchange file might contain nothing but a collection of raw laboratory data on samples of a single food. One might then use the <dflt> element to specify name and classification information for the entire interchange file, since this information would be identical.
On the other hand, since the individual <food> elements would contain different data, their food record identifiers would be different. If those data were ultimately aggregated to yield a single set of values for the food in another table, that set would be assigned a new food record identifier, since the process of aggregating the raw laboratory data yields a new set of values.
The International Food Record Identifier value is intended to serve two purposes: for the nutritionist, the value should evolve to provide a definitive answer to the question "Is this value copied from another table and, if so, which one?"; for the data base manager, it is a key that guarantees uniqueness of each food record that is, in fact, unique.
Assignment of Regional Identifiers
Actual identifiers for regional groups and other delegated IFRI allocators will be assigned by the INFOODS Secretariat. Identifiers assigned at the time of this writing are shown in Appendix A.
Since EUROFOODS has not, at the time of this writing, created an IFRI system, the first example is only illustrative of what might be done.
The second example designates the data record associated with boiled, sweetened adzuki beans in the 1972 FAO Food Composition Table for Use in East Asia.
Each <srcdbid> element is an optional immediate subsidiary of a <classif>. It identifies a food and its data with respect to the source data base. "Srcdbid" may be thought of as an abbreviation for "source data base identifier". However, it applies at and is subsidiary to the <food> level, and consequently identifies a food record in the source data base, not the source data base as a whole.
Both start-tag and end-tag are required. The content of <srcdbid> consists of one unformatted data item, followed by optional immediately subsidiary <cmt/> elements. This element is optional, and is provided for the convenience of data base compilers in keeping track of the relationship of their data bases with interchange files prepared from them. The actual content of a specific <srcdbid> is at the discretion of the data base compiler or organization preparing the interchange file.
Each <srcdbid> relates a record in the original data base to the data contained within the <food> element to which this element is subsidiary. In most tables and data bases, the final identification of food information is a sequence number, possibly used in combination with a page number or its equivalent. That sequence number may explicitly reflect a food grouping, as in the USDA Standard Reference Database, or may just reflect the sequential organization of various versions of the table, as in recent editions of McCance and Widdowson.
In most cases, this sequence number will be encoded in the required international food record identifier (see the <ifri> description). However, it need not be: as discussed under that element, assignment of those identifiers is left to regional decisions and conventions. If the sequence number is not incorporated in the international food record identifier and it is desired to keep track of it, or if it is desired to isolate it from the international food record identifier for some other reason, this element would typically be supplied, with the sequence number as its content.
The example below illustrates the relationship in the more typical case, where the sequence number is also coded into the international food record identifier. There are at least two other situations, in which this correspondence is less likely, and a separate <srcdbid> would be important:
The <bvname> element is a repeatable immediate subsidiary of the <classif> element. It specifies a name by which the associated food is known, all of whose characters are drawn from the character table specified in ISO 646 as the "Basic Version". It is complementary to <exname>, which permits a broader selection of characters. "Bvname" can be thought of as an abbreviation of "basic-version name".
Both start-tag and end-tag are required. The content is an unformatted data item, optionally followed by a <lang> element and by optional <cmt/> elements, which would typically be used to describe the use of the particular name.
The <bvname> element contains a single unformatted data item consisting of the characters that make up a name for the food whose data is within the <food> element to which this element is subsidiary. This name must be in or transliterated into the restricted ISO 646 character set normally required of all data in an interchange file. <Lang> may be used, if desired, to designate the language in which the name appears (directly or in transliteration). <Bvname> differs from <exname> in that the latter can support names written in any character set.
<bvname> cake <lang> en </bvname>
This example illustrates the use of <bvname> and <exname> together to express the name of a food in both the minimal character set and a character set more suited to another language. <Bvname> should be used for "cake", since that uses only basic characters; <exname> is required for the other names.
If all three of the names above were used as part of the <classif> element for the same food, they might illustrate the difficulties in "translating" this type of name.
The <exname> element is a repeatable immediate subsidiary of the <classif> element. It specifies a name by which the associated food is known that uses an extended (rather than the basic) character set. "Exname" can be thought of as an abbreviation for "extended name".
Both start-tag and end-tag are required. The content is an unformatted data item, followed by both <lang> and <charset> immediate subsidiaries, which are required. The content may also include one or more <cmt/> elements, typically used to specify the context in which the name is used. <Exname> differs from <bvname> in that the latter uses a restricted Latin-based character set, while <exname> permits a wide range of character sets and languages.
The <exname> element consists of a single unformatted data item consisting of the characters that make up a name for the food whose data is within the <food> element to which this element is subsidiary. It is used when this name requires a character set other than the ISO 646 Basic Character Set to which most of the elements of the interchange file are restricted, as discussed in Chapter 3. The name may be expressed in any language recognized by ISO for which an appropriate computer character coding exists. That data item is followed by required <lang> and <charset> subsidiary elements and may be followed by one or more <cmt/> elements.
The language must be specified by the <lang> immediate subsidiary element. The character set must be specified by the <charset> subsidiary element.
Except in special circumstances, <exname> should not appear without <bvname> at the same level; since many receivers of interchange files will be unable to completely process names in international alphabets, names should always be provided transliterated into the standard restricted alphabet as well as in their original alphabet.
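The naming rules for <bvname> and <exname> lend themselves to a simple consistency check. The sketch below assumes a hypothetical in-memory representation (a list of dictionaries with "element", "text", "lang", and "charset" keys) and approximates the basic character set by plain ASCII; neither the representation nor the function name comes from the interchange specification.

```python
from typing import Dict, List


def check_names(names: List[Dict[str, str]]) -> List[str]:
    """Return a list of problems found in the name elements of one <classif>."""
    problems = []
    for entry in names:
        kind, text = entry["element"], entry["text"]
        # <bvname> is limited to the basic character set (approximated by ASCII).
        if kind == "bvname" and not text.isascii():
            problems.append(f"bvname {text!r} uses characters outside the basic set")
        # <exname> requires both <lang> and <charset> subsidiaries.
        if kind == "exname":
            for required in ("lang", "charset"):
                if not entry.get(required):
                    problems.append(f"exname {text!r} is missing <{required}>")
    # <exname> should normally be accompanied by a <bvname> at the same level.
    kinds = {entry["element"] for entry in names}
    if "exname" in kinds and "bvname" not in kinds:
        problems.append("exname present without any bvname")
    return problems


if __name__ == "__main__":
    # Illustrative values only; "noed" is a hypothetical transliteration.
    names = [
        {"element": "bvname", "text": "noed", "lang": "da"},
        {"element": "exname", "text": "nød", "lang": "da", "charset": "8859 1"},
    ]
    print(check_names(names))
```

The last rule is reported as a problem here, although the text allows exceptions "in special circumstances"; a real checker might downgrade it to a warning.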
See the examples under <bvname> above and under <lang> and <charset> below.
The <lang> element is an immediate subsidiary of any element whose data item may be in a specifiable language. In particular, it is an optional subsidiary element for <bvname> and a required subsidiary element for <exname>.
The start-tag is required; there is no corresponding end-tag. The content of <lang> consists of either one keyword or an unformatted string and ends when another tag is encountered.
The content of <lang> is a keyword drawn from the ISO 639 lower-case two-letter code for the language intended. If no such code exists, then the language may be described by a name or phrase of more than two characters, which must be expressed in the restricted ISO 646 character set generally permitted for interchange file data. If ISO 639 contains a code for the language, the code, rather than a phrase, should be used.
<exname> nød <lang> da <charset> 8859 1 </exname>
See the description of <bvname> for additional examples.
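The two permitted forms of <lang> content can be told apart by length and case. The following sketch is illustrative only: the function name and return labels are hypothetical, the restricted character set is approximated by printable ASCII, and no lookup against the actual ISO 639 code list is attempted, so a well-formed but unassigned two-letter string would still be classed as a code.

```python
def classify_lang(content: str) -> str:
    """Classify <lang> content as 'code', 'phrase', or 'invalid'."""
    text = content.strip()
    # A lower-case two-letter keyword in the style of ISO 639.
    if len(text) == 2 and text.isascii() and text.isalpha() and text.islower():
        return "code"
    # Otherwise a name or phrase of more than two characters,
    # restricted (by assumption) to printable ASCII.
    if len(text) > 2 and text.isascii() and text.isprintable():
        return "phrase"
    return "invalid"


if __name__ == "__main__":
    for sample in (" da ", "Old Norse", "DA"):
        print(repr(sample), classify_lang(sample))
```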
The <charset> element is an immediate subsidiary of those specific elements whose data are permitted to be in character sets other than the ISO 646 character set. <Exname> and <ianame/> are the only such elements in the initial edition of this list. <Charset> may not be used without
A start-tag is required; there is no corresponding end-tag. The content of a <charset> element consists of two formatted data items, both numerals.
Any element permitting <charset> as an immediate subsidiary must have an unformatted data item; the <charset> element specifies a character set (other than the restricted subset of ISO 646 that is normally required) by which the unformatted data item is to be interpreted.
At present the only alternative character sets permitted are those standardized in the ISO 8859 series [48-52]. Until and unless the definition of this element is expanded, the first numeral of the <charset> content must be "8859". The second data numeral designates which of the registered character sets is intended (they are numbered by the ISO standardization process).
Currently, all ISO 8859 character sets retain the "less-than" and "greater-than" signs in their ISO 646 positions. Only such character sets are acceptable under this interchange specification, in order to preserve the ability to easily recognize tags.
Use of character sets other than the ISO 8859 group is not currently anticipated, although a universal character-set standard may be permitted if one comes into general use. The "8859" data item is nevertheless required to permit extension to include other standards if it should become desirable in the future (such extension would require registration [see chapter 7] and amendment of the description of the element) and as a contingency against future changes in the way standard character sets are organized and registered.
<exname> TBOROG JIRHbI <lang> ru <charset> 8859 5 </exname>
See the description of <exname> for additional examples.
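A receiving program might interpret <charset> content along the following lines. This is a sketch under assumptions: the function names are hypothetical, and the mapping from an ISO 8859 part number to a codec name of the form "iso8859_N" is a convention of Python's standard library, not something defined by this specification.

```python
import codecs
from typing import Tuple


def parse_charset(content: str) -> Tuple[int, int]:
    """Split <charset> content into its two numerals and check the first."""
    parts = content.split()
    if len(parts) != 2 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError("charset content must be two numerals")
    standard, part = int(parts[0]), int(parts[1])
    # At present the first numeral must be 8859.
    if standard != 8859:
        raise ValueError("only the ISO 8859 series is currently permitted")
    return standard, part


def python_codec_for(content: str) -> str:
    """Return the name of the Python codec for the designated character set."""
    _, part = parse_charset(content)
    name = f"iso8859_{part}"
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    return name


if __name__ == "__main__":
    codec = python_codec_for("8859 5")
    raw = "творог".encode(codec)  # bytes as they might appear in a file
    print(codec, raw.decode(codec))
```

Keeping the two numerals separate, rather than accepting a fused form such as "88591", follows the requirement above that the content consist of two data items.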
The <image> element is a repeatable immediate subsidiary of the <classif>. It introduces one or more subsidiary elements which, in turn, provide a picture or drawing of the food.
Both start-tag and end-tag are required. The content consists of one or more elements that indicate the picture encoding type and provide the actual image. A <cmt/> element may also be included.
The <image> element contains elements only. The first of these elements must be an image format designating element (see below). Additional elements may appear or even be required depending on which image format element is chosen, but a particular <image> element may contain only one image format designating element. A <cmt/> element may be used, and should be supplied when possible, to describe the image. Images should be used with caution in interchange among regions: while a picture is often worth a thousand words, it may require the equivalent of many thousand characters to store. The large files this implies may be burdensome for some data receivers.
Image Format Designating Elements
At present, there are several different ways to represent pictures and drawings for interchange purposes but none of them have emerged as a clear standard, supported in most computer systems from which image display would be desired. The two formats described below are well-documented and widely available. However, this element may be extended by adding additional (alternative) image format designating elements in the future. Please inquire with the INFOODS Secretariat before using any image formats in inter-regional interchange of food composition data.
<g3fax2/> indicates a content that is the encoded format of a Group III facsimile, as specified in CCITT Recommendation T.4. Only the two-dimensional format is supported. The content is the encoded image itself further encoded into the standard "base64" format so that all information transmitted is in character format and protected from unexpected network transformations. Line break characters are ignored within the content. The end-tag is required; other than <cmt/>, no other element of <image> may accompany <g3fax2/>.
<gif/> indicates a content that is a color graphic image encoded in the CompuServe "GIF" format. The content is the encoded image itself further encoded into the standard "base64" format as discussed above. The end-tag is required; other than <cmt/>, no other element of <image> may accompany <gif/>.
<image> <gif/> image in base64 coding of the CompuServe GIF format </gif/> <cmt/> leaf detail of Conium maculatum </cmt/> </image>
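The base64 step described for both image formats can be illustrated with Python's standard library. The function names and the 64-character line width are assumptions made for this sketch; the specification says only that line break characters within the content are ignored.

```python
import base64


def encode_image(data: bytes, width: int = 64) -> str:
    """Encode raw image bytes as base64 text broken into short lines."""
    encoded = base64.b64encode(data).decode("ascii")
    # Break the text into lines; the line width is an arbitrary choice.
    return "\n".join(encoded[i:i + width] for i in range(0, len(encoded), width))


def decode_image(content: str) -> bytes:
    """Recover the image bytes, ignoring line breaks and other whitespace."""
    return base64.b64decode("".join(content.split()))


if __name__ == "__main__":
    sample = b"GIF89a" + bytes(range(200))  # stand-in bytes, not a real image
    text = encode_image(sample)
    print(len(sample), "bytes ->", len(text), "characters")
    print(decode_image(text) == sample)
```

Base64 produces four characters for every three bytes, which bears out the caution above about the size of interchange files that carry images.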