A file encoding scheme for formal metadata

Revised 7-February-1997

Since the Content Standard for Digital Geospatial Metadata, as the name implies, specifies only the contents of metadata files and not their encoding, it was necessary to devise this specification for metadata encoding in order to develop and use the metadata compiler. The encoding format is purely textual and the fidelity of the compiler to this format is fanatical.

In general, this encoding format uses an outline-like list of element names in an ASCII text file. The hierarchy of the Standard is encoded explicitly and is expressed using indentation.
Note: mp does not read word-processor documents; it reads only XML or plain text!

Terms:

tab
ASCII 9
space
ASCII 32
element name
A sequence of bytes consisting of alphanumeric characters, the underscore, hyphen, apostrophe, and forward slash. This sequence is one of the formal names given to metadata elements in the formal syntax specification of the metadata content standard. Examples:
  • Citation
  • Identification_Information
  • Data_Set_G-Polygon_Outer_G-Ring
  • Range_of_Dates/Times
value
A text string associated with an element by the author of the metadata record.
compound element
An element that exists to contain other elements. These are used to convey the hierarchical relationships among component elements.
CR
ASCII 13, carriage return
LF
ASCII 10, line feed
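
The character set given in the definition of an element name can be checked mechanically. The following is a minimal sketch in Python, not taken from mp itself; the function name is_element_name_shaped is hypothetical. It tests only whether a string is built from the permitted characters. It does not test whether the string is actually one of the formal names in the metadata content standard.

```python
import re

# Characters permitted in an element name, per the definition above:
# alphanumerics, underscore, hyphen, apostrophe, and forward slash.
# (Hypothetical helper for illustration; not part of mp.)
_ELEMENT_NAME = re.compile(r"[A-Za-z0-9_\-'/]+")


def is_element_name_shaped(text: str) -> bool:
    """Return True if text consists solely of permitted element-name characters."""
    return _ELEMENT_NAME.fullmatch(text) is not None
```

For instance, Range_of_Dates/Times and Data_Set_G-Polygon_Outer_G-Ring pass this check, while a string containing a comma, a space, or a colon does not.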

Arrangement:

  1. Metadata files contain only ASCII characters; lines may be terminated using CR, LF, or CR followed by LF.
  2. Each file contains exactly one metadata record.
  3. The number of characters per line is not limited.
  4. Indentation is accomplished using tab characters or spaces. A single tab character represents the same degree of logical indentation as a single space character.
  5. Blank lines may occur anywhere in the file.
  6. Element names are spelled out in the metadata file exactly as in the syntax rules of the metadata content standard.
  7. A single colon or equal sign may follow each element name.
  8. Spaces or tabs may occur between element name and colon or equal sign, and may occur after the colon or equal sign.
  9. Values are associated with an element in one of three ways:
    1. The value begins at the first nonblank character following the element name (or following the colon or equal sign, if one is present) and extends to the end of the line. Example:
            Originator: Beeblebrox, Zaphod
      
    2. The value begins on the line following the element name. It is indented more than the element name, i.e. there are more spaces or tabs preceding the value than precede the element name. Example:
            Title:
              Geometeorological data collected by the USGS Desert Winds
              Project at Gold Spring, Great Basin Desert, northeastern
              Arizona, 1979 - 1992
      
    3. The value begins on the line containing the element name. It extends onto subsequent lines, where it is indented more than the element name, i.e. there are more spaces or tabs preceding the value on lines following the element name than precede the element name. Example:
            Title: Geometeorological data collected by the USGS Desert Winds
              Project at Gold Spring, Great Basin Desert, northeastern
              Arizona, 1979 - 1992
      
  10. Compound elements must be specified if any of their components (or their components' components, and so on) contain text values.
  11. Compound elements contain only elements and not text. No extra text may be included as a child of a compound element. Example:
    Metadata:
      Identification_Information:
        (no plain text is permitted here, only component elements)
        Citation:
           Citation_Information:
                Originator:
                Publication_Date:
    
  12. Components of compound elements occur on successive lines using the same degree of indentation. In the example below, the components of Citation_Information are all indented the same, and the components of Series_Information are all indented the same, and each group is more indented than its parent element. Degree of indentation does not have to be uniform throughout the file, but all of the children of a specific compound element must be indented the same way.
        Citation:
           Citation_Information:
                Originator:
                Publication_Date:
                Publication_Time:
                Title:
                Type_of_Map:
                Series_Information:
                    Series_Name:
                    Issue_Identification:
                Publication_Information:
                  Publication_Place:
                  Publisher:
                Other_Citation_Details:
                Online_Linkage:
                Larger_Work_Citation:
    
  13. The lines in a textual value may have variable indentation as long as they are all more indented than the element name to which they belong. However, this indentation will be lost in subsequent processing of the metadata, so variable indentation is not recommended. Example:
          Title:
            Geometeorological data
                collected by the USGS Desert Winds
              Project at Gold Spring, Great Basin Desert, northeastern
                           Arizona, 1979 - 1992
    
    In the output of mp and other metadata-processing tools the additional indentation in the text will be lost, and the text of the title will appear as follows:
          Title:
            Geometeorological data
            collected by the USGS Desert Winds
            Project at Gold Spring, Great Basin Desert, northeastern
            Arizona, 1979 - 1992
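
The arrangement rules above can be illustrated with a small reader. The following is a minimal sketch in Python under stated assumptions, and it is not mp's implementation. The names Node, parse, and known_names are hypothetical. The sketch assumes the caller supplies the set of element names from the standard, and it treats any line whose first word is not in that set as part of the value of the most recent element. A value line that happens to begin with an element name would be misread, and the sketch does not enforce rules 11 and 12.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the record (hypothetical structure, not mp's)."""
    name: str
    indent: int
    value_lines: list = field(default_factory=list)
    children: list = field(default_factory=list)

    @property
    def value(self) -> str:
        # Rule 13: extra indentation inside a value is discarded.
        return "\n".join(self.value_lines)


# Leading indentation, an element-name-shaped word, optional blanks,
# an optional colon or equal sign, optional blanks, then the rest (rules 7-9).
_LINE = re.compile(r"^([ \t]*)([A-Za-z0-9_\-'/]+)[ \t]*[:=]?[ \t]*(.*)$")


def parse(text: str, known_names: set) -> Node:
    """Build a tree from an indented metadata record.

    known_names is an assumption of this sketch: the caller provides the
    element names defined by the content standard.
    """
    root = Node("(root)", -1)
    stack = [root]
    # Rule 1: lines end with CR LF, CR, or LF.
    for raw in re.split(r"\r\n|\r|\n", text):
        if not raw.strip():
            continue  # Rule 5: blank lines may occur anywhere.
        match = _LINE.match(raw)
        if match and match.group(2) in known_names:
            # Rule 4: a tab counts the same as a single space.
            indent = len(match.group(1))
            # Close every element that is not less indented than this one.
            while stack[-1].indent >= indent:
                stack.pop()
            node = Node(match.group(2), indent)
            if match.group(3):
                node.value_lines.append(match.group(3).rstrip())
            stack[-1].children.append(node)
            stack.append(node)
        else:
            # A value line must be more indented than its element.
            indent = len(raw) - len(raw.lstrip(" \t"))
            owner = stack[-1]
            if owner is root or indent <= owner.indent:
                raise ValueError("text is not indented under an element: %r" % raw)
            owner.value_lines.append(raw.strip())
    return root
```

Given a record in which Title is followed by several variably indented lines, the value property of the resulting Title node holds those lines with their leading blanks removed, which mirrors the loss of indentation described in item 13.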