How was the data set created?
- From what previous works were the data drawn?
- How were the data generated, processed, and modified?
- Does it matter when these modifications were made?
- Did someone other than the formal authors do the processing?
- What similar or related data should the user be aware of?
Describing data gathering and processing
The overall genesis of a data set is described as a series of process
steps in which previously created data sources are used and new data
sources are built. In the diagram below, data sources A and B were used
in process step 1 to create data source C. Data sources C and D were
used in process step 2 to create the subject data set. The Lineage
element in the metadata for the subject data set therefore contains four
Source_Information elements and two Process_Step elements.
So the metadata for the subject data set would contain the following
elements:
Data_Quality_Information:
  (other data quality elements ...)
  Lineage:
    Source_Information:
      Source_Citation:
        Citation_Information:
          (more details about the source's citation ...)
      Source_Citation_Abbreviation: A
      (more details about the source ...)
    Source_Information:
      Source_Citation:
        Citation_Information:
          (more details about the source's citation ...)
      Source_Citation_Abbreviation: B
      (more details about the source ...)
    Source_Information:
      Source_Citation:
        Citation_Information:
          (more details about the source's citation ...)
      Source_Citation_Abbreviation: C
      (more details about the source ...)
    Source_Information:
      Source_Citation:
        Citation_Information:
          (more details about the source's citation ...)
      Source_Citation_Abbreviation: D
      (more details about the source ...)
    Process_Step:
      Process_Description: Process 1 ...
      Source_Used_Citation_Abbreviation: A
      Source_Used_Citation_Abbreviation: B
      Source_Produced_Citation_Abbreviation: C
      Process_Date: 1998
    Process_Step:
      Process_Description: Process 2 ...
      Source_Used_Citation_Abbreviation: C
      Source_Used_Citation_Abbreviation: D
      Process_Date: 1998
Note that Source C is produced in one process step and used in another.
Note also that the elements Source_Used_Citation_Abbreviation and
Source_Produced_Citation_Abbreviation appear only in a Process_Step,
while the element Source_Citation_Abbreviation appears only in a
Source_Information.
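
The relationship between these abbreviation elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (ProcessStep, Lineage, undefined_abbreviations, intermediate_sources) are hypothetical and are not part of the metadata standard or of any metadata tool, and the check that every abbreviation cited in a Process_Step matches some Source_Citation_Abbreviation is a suggested consistency check, not a rule stated above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProcessStep:
    """Hypothetical stand-in for one Process_Step element."""
    description: str                                    # Process_Description
    date: str                                           # Process_Date
    used: list[str] = field(default_factory=list)       # Source_Used_Citation_Abbreviation
    produced: list[str] = field(default_factory=list)   # Source_Produced_Citation_Abbreviation


@dataclass
class Lineage:
    """Hypothetical stand-in for the Lineage element."""
    # One Source_Citation_Abbreviation per Source_Information element.
    source_abbreviations: list[str]
    process_steps: list[ProcessStep]


def undefined_abbreviations(lineage: Lineage) -> set[str]:
    """Abbreviations cited by a process step but not defined by any source."""
    defined = set(lineage.source_abbreviations)
    cited: set[str] = set()
    for step in lineage.process_steps:
        cited.update(step.used)
        cited.update(step.produced)
    return cited - defined


def intermediate_sources(lineage: Lineage) -> set[str]:
    """Sources that are produced by one process step and used by another."""
    produced = {a for step in lineage.process_steps for a in step.produced}
    used = {a for step in lineage.process_steps for a in step.used}
    return produced & used


# The example from the text: four sources and two process steps.
lineage = Lineage(
    source_abbreviations=["A", "B", "C", "D"],
    process_steps=[
        ProcessStep("Process 1 ...", "1998", used=["A", "B"], produced=["C"]),
        ProcessStep("Process 2 ...", "1998", used=["C", "D"]),
    ],
)

print(undefined_abbreviations(lineage))  # empty set: every cited source is defined
print(intermediate_sources(lineage))     # the set containing only "C"
```

Process step 2 lists no produced source here because, as in the outline above, its result is the subject data set itself rather than another cited source.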