


ANSI

The American National Standards Institute, the body responsible for ratification and distribution of standards in the United States.


alphanumeric

Consisting of alphabetic and numeric symbols.

closing delimiter

A character or tag that indicates where a data field or element ends. Usually matched to a similar "opening delimiter", e.g., the symbol " is typically used as a closing delimiter for a data field (which is typically called a quoted string or a quotation) that begins with the opening delimiter ".


content

The content of an element may take one of three forms: (1) data alone with no subsidiary elements (data content), (2) data followed by one or more elements (mixed content), or (3) one or more elements without preceding data (element content).

Element with data content:

<VITB12> 1.2 </VITB12>

Element with mixed content:

<NA> 0.12 <unit/> MMOL </unit/> </NA>

Element with (multiple) element content:

<food>
<classif> <ifri> food_record_identifier </ifri> </classif>
<meas> measurement elements </meas>
<comp> food component data elements </comp>
<drvd-comp> derived food component and descriptive data elements </drvd-comp>
</food>
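
The three forms of content described above can be told apart mechanically. The following Python sketch is only an illustration: the function name `content_form` and the regular expression are assumptions introduced here, not part of the interchange system, and empty content is arbitrarily reported as data content.

```python
import re

# Matches any tag, e.g. <NA>, </NA>, <unit/>, </unit/> or <infoods 85>.
TAG = re.compile(r"<[^<>]+>")


def content_form(content: str) -> str:
    """Classify the text found between a start-tag and its end-tag."""
    first_tag = TAG.search(content)
    if first_tag is None:
        # No subsidiary elements at all.
        return "data content"
    # Data is any non-whitespace text that precedes the first subsidiary element.
    leading_data = content[: first_tag.start()].strip()
    if leading_data:
        return "mixed content"
    return "element content"


if __name__ == "__main__":
    print(content_form(" 1.2 "))
    print(content_form(" 0.12 <unit/> MMOL </unit/> "))
    print(content_form(" <meas> measurement elements </meas> "))
```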

conversion specification file

A file that specifies the conversion process between the data formats of a particular system and that of another. It typically includes field locations and lengths and an indication of field content (e.g., particular food components represented).


cross-reference

In cases where a tag is defined in terms of one or more other tags (for example, "similar to another tag but slightly different," "defines a subset of what another tag defines," "refers to the same substance but by a different analytical method," and so on), those other tags are termed cross-references.

data value

A single numeral or string representing the value for a particular field (e.g., a nutrient) for a particular entity (e.g., a food). Sometimes called a "datum".

edible portion

The fraction of a food or food product typically eaten and on which analyses are usually based. The perception of what is edible can differ from one culture to another, so the edible portion should be carefully described.


element

An element consists of a start-tag followed by its content followed, if the start-tag requires it, by a matching (i.e., same generic identifier) end-tag. See examples under "content", above, and the discussion in Chapter 3.


end-tag

An end-tag is a tag that marks the end of an element and is preceded by the content of that element. By convention, an end-tag has the form </generic identifier>. See "start-tag" and "generic identifier" in this section, and Chapter 3.

If the start-tag of an element is <food>, the end-tag is </food>. Likewise, if the start-tag of an element is <unit/>, the end-tag is </unit/>.


extensible

An element is described as extensible if, upon sufficient justification, additions can be made to the keywords or elements that can be used in its content.

fixed-field system

A system for organizing data such that each item occupies a preset, universally agreed-upon number of columns. Each item is located by measuring off a fixed distance from the beginning of the record, a distance determined by the number of characters in each of the preceding fields. These systems can be made quite efficient and are easy to program. They do require that blank space be left for all nutrients that are not supplied, so the number of characters wasted will be very large when various data are not available.
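
As a rough illustration of locating items by column position, the Python sketch below slices a record using a list of field widths. The layout (`code`, `protein`, `fat` and their widths) is purely hypothetical and is not taken from any actual food table; a field left blank is reported as `None`.

```python
# Hypothetical layout: (field name, number of columns). Not an INFOODS definition.
FIELDS = [("code", 6), ("protein", 8), ("fat", 8)]


def parse_record(line: str) -> dict:
    """Cut a fixed-field record into named items by measuring off column widths."""
    record = {}
    start = 0
    for name, width in FIELDS:
        raw = line[start:start + width]
        # A field consisting only of blanks means the value was not supplied.
        record[name] = raw.strip() or None
        start += width
    return record


if __name__ == "__main__":
    # The fat field is absent, so its eight columns are left blank (wasted space).
    line = "A00123" + "3.3".rjust(8) + " " * 8
    print(parse_record(line))
```

Note that the blank `fat` field still occupies its full eight columns, which is the wasted space the definition refers to.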

formatted data

Data content associated with a particular element that must appear in some particular form and order. In general, the data content is everything appearing between a start-tag and the corresponding end-tag or, for elements that have a start-tag but no end-tag, the next tag in sequence. If the data are formatted, the description of the element will specify exactly what may appear, and in what order. The alternative is "free text" (see below), also called "unformatted data" or "unformatted text".

free text

Text, consisting of alphabetic, numeric, and punctuation characters, that is not restricted as to format or structure. The usual alternatives are numeric values, keywords, and elements. In the context of the interchange system, free text is usually referred to as "unformatted data" or "unformatted text".


gateway

Computer software, or the machine on which it runs, that links facilities such as networks with different protocols and performs translations among them. A "mail gateway" is one that converts electronic mail and address formats between one network and another.

generic identifier

That part of a tag which is enclosed between the opening "<" or "</" and the closing ">", exclusive of qualifiers such as the "85" in <infoods 85>, is termed the generic identifier. The generic identifiers of the start-tag and end-tag of an element are identical. For the tag <food>, the generic identifier is "food". For the tag <unit/>, the generic identifier is "unit/". For the tag <infoods 85>, the generic identifier is "infoods".
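
The rule can be sketched in a few lines of Python. The function name `generic_identifier` is an assumption made for this illustration, and the sketch treats everything after the first run of whitespace inside the tag as a qualifier.

```python
def generic_identifier(tag: str) -> str:
    """Return the generic identifier of a start-tag or end-tag."""
    tag = tag.strip()
    if not (tag.startswith("<") and tag.endswith(">")):
        raise ValueError(f"not a tag: {tag!r}")
    inner = tag[1:-1]
    # Drop the solidus that marks an end-tag: </food> -> food
    if inner.startswith("/"):
        inner = inner[1:]
    parts = inner.split()
    if not parts:
        raise ValueError(f"empty tag: {tag!r}")
    # Anything after the first word (e.g. the "85" in <infoods 85>) is a qualifier.
    return parts[0]


if __name__ == "__main__":
    for tag in ("<food>", "<unit/>", "</unit/>", "<infoods 85>"):
        print(tag, "->", generic_identifier(tag))
```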


IFRI

The international food record identifier, a regionally assigned identification code for food data records (tables or data bases).

immediately subsidiary

An element or data value is described as "immediately subsidiary" to another one when there are no intervening nested elements. In "<xx> A <yy> B </yy> </xx>", A and <yy> are immediately subsidiary to the <xx> element, and B is immediately subsidiary to the <yy> element, but, while B is subsidiary to the <xx> element, it is not immediately subsidiary, since <yy> intervenes.

A given data value or element can be immediately subsidiary to only one element.
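
The example above can be checked with a small stack-based scan. This Python sketch is illustrative only: the function name `immediate_pairs` is hypothetical, every start-tag is assumed to have a matching end-tag, and each whitespace-separated word of data is treated as a separate data value.

```python
import re

# A token is either a complete tag or a run of non-blank data characters.
TOKEN = re.compile(r"<[^<>]+>|[^<>\s]+")


def immediate_pairs(text: str) -> list:
    """Return (parent start-tag, item) pairs for every immediately subsidiary item."""
    pairs = []
    open_elements = []  # start-tags of the elements currently open, outermost first
    for token in TOKEN.findall(text):
        if token.startswith("</"):
            open_elements.pop()  # the innermost element ends here
        else:
            if open_elements:
                # The innermost open element is the only immediate parent.
                pairs.append((open_elements[-1], token))
            if token.startswith("<"):
                open_elements.append(token)
    return pairs


if __name__ == "__main__":
    for parent, item in immediate_pairs("<xx> A <yy> B </yy> </xx>"):
        print(item, "is immediately subsidiary to", parent)
```

Because only the innermost open element is recorded as the parent, each item is paired with exactly one element, which matches the statement that a data value or element can be immediately subsidiary to only one element.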

interchange format

The actual structure of a data file used in INFOODS data interchange. One component of the overall "interchange system", where other components include the "tags" or generic identifiers that identify various pieces of information, the conventions for locating and requesting data files, the mechanisms for assigning international food record identifiers, and the computer programs and operational arrangements for regional data centres.


ISO

The International Organization for Standardization, the body responsible for evaluating and setting standards internationally.


See right-justify.


keyword

A word, acronym, or other short sequence of characters that is chosen from a restricted list and that has a specially defined meaning when used in context. See Chapter 7.


macro

A sequence of commands to be applied by some process, typically invoked by a single command or the definition for a (possibly complex) abbreviation. The term is used in the context of text processing to describe the text to be substituted for some other text (usually repeatedly) or the instructions for making the substitutions.


metadata

Where "data" contain information about some topic, the term "metadata" is used to denote data about the data, including how they were obtained, their statistical properties, and special circumstances affecting them.


nesting

When elements are embedded in the scope or range of other elements, they are said to be nested within the ones in which they are embedded. The depth of the embedding is sometimes referred to as the "nesting level" of the elements. For example, if we have:

<PROCNT> 3.3 FAO 638 <cmt/> Note USDA values for same. </cmt/> </PROCNT>

we would describe the <cmt/> element as being nested within the <PROCNT> element. The depth of nesting has an effect on the sophistication of the computer software required to process the elements. See "immediately subsidiary", above.
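
A minimal Python sketch of computing nesting levels follows. The function name `nesting_levels` is hypothetical, the outermost element is arbitrarily counted as level 1, and every start-tag is assumed to have a matching end-tag.

```python
import re

# Matches any tag, e.g. <PROCNT>, <cmt/>, </cmt/>.
TAG = re.compile(r"<[^<>]+>")


def nesting_levels(text: str) -> list:
    """Return (start-tag, nesting level) pairs in order of appearance."""
    levels = []
    depth = 0
    for tag in TAG.findall(text):
        if tag.startswith("</"):
            depth -= 1  # leaving an element
        else:
            depth += 1  # entering an element
            levels.append((tag, depth))
    return levels


if __name__ == "__main__":
    example = "<PROCNT> 3.3 FAO 638 <cmt/> Note USDA values for same. </cmt/> </PROCNT>"
    print(nesting_levels(example))
```

A single counter is enough here; software that must also act on each element's content generally needs a stack of open elements, which is one reason deeper nesting calls for more sophisticated processing.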

opening delimiter

See "closing delimiter", above.


repeatable

The term "repeatable" is applied to an element immediately subsidiary to another element. It means that the given element may occur more than once as an immediate subsidiary to the other element.


right-justify

In a field of prespecified length containing text, when significant information is placed at the right end of the field it is right-justified. Left-justification involves putting the information at the left end of the field. These two notions provide for a distinction that is very significant to computers, even if not to people.

"         right-"
"  justification"

"left-          "
"justification  "

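The distinction between right- and left-justification can be shown with Python's built-in string methods `str.rjust` and `str.ljust`. The field width of 8 and the helper names are arbitrary choices made for this sketch.

```python
FIELD_WIDTH = 8  # arbitrary width chosen for illustration


def right_justify(value: str, width: int = FIELD_WIDTH) -> str:
    """Pad on the left so the value sits at the right end of the field."""
    return value.rjust(width)


def left_justify(value: str, width: int = FIELD_WIDTH) -> str:
    """Pad on the right so the value sits at the left end of the field."""
    return value.ljust(width)


if __name__ == "__main__":
    # Brackets make the padding visible.
    print("[" + right_justify("3.3") + "]")
    print("[" + left_justify("3.3") + "]")
```

The two results contain the same characters but are different strings, which is why the distinction matters to a program that compares or slices fields by position.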

SGML

The Standard Generalized Markup Language, the language for structuring text upon which the Interchange Format was designed. It is specified in International Standard ISO 8879 [53].

solidus (slant, or slash)

The solidus, /, identifies an end-tag (</generic identifier>), or a start-tag which requires an end-tag (<generic identifier/>). The term "solidus" is used interchangeably with "slant" and "slash".


start-tag

A start-tag is a tag that marks the beginning of an element and is followed by the content of that element. By convention, a start-tag has the form <generic identifier>. See "end-tag" and "generic identifier", above. <food> and <unit/> are examples of start-tags.

structural element

Structural elements determine the ordering of elements in an interchange file, and define the form, or structure, of that file. Their content consists of one or more subsidiary elements only and no data. The <header>, <sender>, <source>, <food>, and <classif> elements, for example, are all structural elements.

subsidiary element

Any element which forms part or all of the content of another element (see "immediately subsidiary" and "nesting").


tag

In SGML, the particular symbols used to mark up a document and identify its components are called tags. In the system outlined in this memo, the tags correspond to the names for fields in the interchange file.


tagname

The term "tagname" refers exclusively to the text portion of a generic identifier, exclusive of the possible terminating slant. Although the term "tagname" has been used extensively in certain INFOODS documents, it is not interchangeable with "generic identifier".
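
The difference between the two terms amounts to the terminating slant, as this short Python sketch suggests. The function name `tagname` is introduced here only for illustration.

```python
def tagname(generic_identifier: str) -> str:
    """Return the text portion of a generic identifier, without a terminating slant."""
    if generic_identifier.endswith("/"):
        return generic_identifier[:-1]
    return generic_identifier


if __name__ == "__main__":
    print(tagname("unit/"))  # generic identifier of <unit/>
    print(tagname("food"))   # generic identifier of <food>
```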

unformatted data

Data, usually a string of characters, which may contain whitespace characters. The beginning and end of an unformatted data string are typically marked by tags or by some other type of opening and closing delimiters. Equivalent to "free text" and "unformatted text".


value

An informal term for "data value", q.v.


whitespace

A character, or sequence of characters, that is used to separate keywords, numerals, or elements. This term is used because these characters appear as spaces, or sequences of spaces, on the printed page. In computer character coding terms, the whitespace characters are space, horizontal tab, vertical tab, and the new line sequence.
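
Separating items on whitespace, as defined above, might look like the following Python sketch. The function name `split_items` is hypothetical, and the new line sequence is assumed to be a line feed, optionally preceded by a carriage return.

```python
import re

# Space, horizontal tab, vertical tab, carriage return and line feed.
WHITESPACE_RUN = re.compile(r"[ \t\v\r\n]+")


def split_items(text: str) -> list:
    """Split text into items separated by one or more whitespace characters."""
    return [item for item in WHITESPACE_RUN.split(text) if item]


if __name__ == "__main__":
    print(split_items("PROCNT\t3.3\x0bFAO \r\n638\n"))
```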
