A file encoding scheme for formal metadata
Revised 7-February-1997
Since the Content Standard for Digital Geospatial Metadata, as the
name implies, specifies only the contents of metadata files and not
their encoding, it was necessary to devise this specification for
metadata encoding in order to develop and use the metadata compiler.
The encoding format is purely textual, and the fidelity of the
compiler to this format is fanatical.
In general, this encoding format uses an outline-like list of element
names in an ASCII text file. The hierarchy of the Standard is encoded
explicitly and is expressed using indentation.
Note: mp does not read word-processor documents; it reads only XML or plain text.
Terms:
- tab
  ASCII 9
- space
  ASCII 32
- element name
  A sequence of bytes consisting of alphanumeric characters, the
  underscore, hyphen, apostrophe, and forward slash. This sequence
  is one of the formal names given to metadata elements in the
  formal syntax specification of the metadata content standard.
  Examples:
  - Citation
  - Identification_Information
  - Data_Set_G-Polygon_Outer_G-Ring
  - Range_of_Dates/Times
- value
  A text string associated with an element by the author of
  the metadata record.
- compound element
  An element that exists to contain other elements. These are used to
  convey the hierarchical relationships among component elements.
- CR
  ASCII 13, carriage return
- LF
  ASCII 10, line feed
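The character set in the definition of an element name can be checked mechanically. The following is a minimal sketch in Python, not part of mp; the function name `looks_like_element_name` is hypothetical. It tests only which characters are used, not whether the name actually appears in the syntax rules of the Standard.

```python
import re

# Characters taken from the definition of "element name" above:
# alphanumerics, underscore, hyphen, apostrophe, and forward slash.
ELEMENT_NAME = re.compile(r"[A-Za-z0-9_\-'/]+")


def looks_like_element_name(text):
    """Return True if text uses only characters allowed in element names.

    This is a character-set check only (an assumption of this sketch);
    it does not confirm that the name is defined by the Standard.
    """
    return ELEMENT_NAME.fullmatch(text) is not None
```

A name containing a space, such as "Publication Date", fails this check, while Range_of_Dates/Times passes.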
Arrangement:
- Metadata files contain only ASCII characters; lines may be terminated using CR, LF, or CR followed by LF.
- Each file contains exactly one metadata record.
- The number of characters per line is not limited.
- Indentation is accomplished using tab characters or spaces. A single tab character represents the same degree of logical indentation as a single space character.
- Blank lines may occur anywhere in the file.
- Element names are spelled out in the metadata file exactly as in the syntax rules of the metadata content standard.
- A single colon or equal sign may follow each element name.
- Spaces or tabs may occur between element name and colon or equal sign, and may occur after the colon or equal sign.
- Values are associated with an element in one of three ways:
  - The value begins at the first nonblank character following the
    element name (or following the colon or equal sign) and extends to
    the end of the line. Example:
        Originator: Beeblebrox, Zaphod
  - The value begins on the line following the element name. It is
    indented more than the element name, i.e. there are more spaces or
    tabs preceding the value than precede the element name. Example:
        Title:
          Geometeorological data collected by the USGS Desert Winds
          Project at Gold Spring, Great Basin Desert, northeastern
          Arizona, 1979 - 1992
  - The value begins on the line containing the element name. It
    extends onto subsequent lines, where it is indented more than
    the element name, i.e. there are more spaces or tabs preceding
    the value on lines following the element name than precede the
    element name. Example:
        Title: Geometeorological data collected by the USGS Desert Winds
          Project at Gold Spring, Great Basin Desert, northeastern
          Arizona, 1979 - 1992
- Compound elements must be specified if any of their components (or
  their components' components, and so on) contain text values.
- Compound elements contain only elements and not text. No extra
  text may be included as a child of a compound element. Example:
      Metadata:
        Identification_Information:
          (no plain text is permitted here, only component elements)
          Citation:
            Citation_Information:
              Originator:
              Publication_Date:
- Components of compound elements occur on successive lines using the
  same degree of indentation. In the example below, the components
  of Citation_Information are indented the same, and the
  components of Series_Information are indented the same,
  but are more indented than their parent element. Degree of
  indentation does not have to be uniform throughout the file, but all
  of the children of a specific compound element must be indented the
  same way.
      Citation:
        Citation_Information:
          Originator:
          Publication_Date:
          Publication_Time:
          Title:
          Type_of_Map:
          Series_Information:
            Series_Name:
            Issue_Identification:
          Publication_Information:
            Publication_Place:
            Publisher:
          Other_Citation_Details:
          Online_Linkage:
          Larger_Work_Citation:
- The lines in a textual value may have variable indentation as long
  as they are all more indented than the element name to which they
  belong. However, this indentation will be lost in subsequent
  processing of the metadata, so it is not recommended. Example:
      Title:
        Geometeorological data
          collected by the USGS Desert Winds
            Project at Gold Spring, Great Basin Desert, northeastern
              Arizona, 1979 - 1992
  In the output of mp and other metadata-processing tools the
  additional indentation in the text will be lost, and the text of
  the title will appear as follows:
      Title:
        Geometeorological data
        collected by the USGS Desert Winds
        Project at Gold Spring, Great Basin Desert, northeastern
        Arizona, 1979 - 1992
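The arrangement rules above can be illustrated with a small reader. The following Python sketch is not mp and does not reproduce its behavior; the names `parse_outline`, `indent_of`, and `value_of`, and the dictionary layout, are hypothetical. It makes one simplifying assumption that the format itself does not: every element name is followed by a colon or equal sign. Because the format makes that delimiter optional, a real reader would presumably recognize element names from the syntax rules of the Standard instead. The sketch also does not check that siblings share the same indentation.

```python
import re

# Assumption for this sketch: every element name is followed by ':' or '='.
# The format makes that delimiter optional, so a real reader would instead
# recognize names from the Standard's syntax rules.
ELEMENT_LINE = re.compile(r"^([A-Za-z0-9_\-'/]+)[ \t]*[:=][ \t]*(.*)$")


def indent_of(line):
    """Count leading tabs and spaces; one tab equals one space."""
    return len(line) - len(line.lstrip(" \t"))


def parse_outline(text):
    """Parse indented metadata text into nested dicts (hypothetical layout)."""
    root = []
    stack = [(-1, root)]   # (indent, children list) for each open element
    last = None            # (indent, node) of the most recent element
    for raw in re.split(r"\r\n|\r|\n", text):   # CR LF, CR, or LF
        body = raw.strip(" \t")
        if not body:
            continue       # blank lines may occur anywhere
        indent = indent_of(raw)
        match = ELEMENT_LINE.match(body)
        deeper = last is not None and indent > last[0]
        if deeper and (last[1]["lines"] or match is None):
            # Text indented more than its element continues that value.
            last[1]["lines"].append(body)
            continue
        if match is None:
            raise ValueError("text is not indented under an element: " + body)
        node = {"name": match.group(1), "lines": [], "children": []}
        if match.group(2):
            node["lines"].append(match.group(2))   # value on the same line
        while stack[-1][0] >= indent:
            stack.pop()    # close elements at the same or deeper indentation
        stack[-1][1].append(node)
        stack.append((indent, node["children"]))
        last = (indent, node)
    return root


def value_of(node):
    """Join value lines; their original indentation is not preserved."""
    return "\n".join(node["lines"])
```

Each value line is stored with its leading whitespace stripped, so variable indentation inside a value is discarded, in the same spirit as the Title example above. Once an element has text, this sketch treats every more-indented line that follows as more text, which reflects the rule that compound elements contain only elements and not text.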