DBFmeta is a software tool designed to facilitate the documentation
of data contained in DBF files. These data files are increasingly
common because they are used to store attributes of geographic
features in shapefiles used and produced by ESRI products. DBFmeta
comes with mp, a parser for formal metadata,
and is most effectively used with Tkme, an
editor for formal metadata.
Note that the program you run is called dbfmeta (all
lower-case letters), but as with Tkme, I refer to it using the mixed-case
name DBFmeta.
DBFmeta can be run in "interactive mode" in which the program asks
the user for information that is not carried by the DBF format. To
engage this behavior, include the command-line switch -i.
Use of non-standard elements
DBFmeta includes in its output three elements that are not part of
the FGDC standard itself but are included in the ESRI profile that
is used by Arc8. These elements are
If you want DBFmeta to not include these elements in its output,
add the command-line switch -strict.
Specifying an output file
DBFmeta will try to write its output into a disk file named
dbfmeta.out. If that file already exists or if it cannot be
created, you will be prompted to enter another file name. You may
specify the name of the output file on the command line using the <
If the file name you specify ends with .sgml or .xml,
the output file will be written using XML tags. No SGML or XML
declaration or processing instructions will be included, however.
This output form will work as either SGML or XML.
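As a rough illustration of the file-name rule just described, the following Python sketch decides whether output would be written with XML tags. It is not DBFmeta's own code. The function name uses_xml_tags is hypothetical, and matching the extension without regard to case is an assumption; the text above says only that the name "ends with .sgml or .xml".

```python
def uses_xml_tags(filename: str) -> bool:
    """Return True if the output file name ends with .sgml or .xml."""
    # Assumption of this sketch: the comparison ignores letter case.
    return filename.lower().endswith((".sgml", ".xml"))
```

With this sketch, a name such as roads.xml selects XML tags, while the default name dbfmeta.out does not.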
The full command-line syntax for DBFmeta is therefore
where braces indicate an option that may be omitted.
What it is and what to do with it
DBFmeta creates a snippet of metadata that can be pasted into the
left-side window of Tkme if Entity_and_Attribute_Information
is selected there. After the output of DBFmeta has been pasted into
a record in this manner, the metadata can be completed using Tkme's
normal editing capability.
What you need to write in yourself
DBFmeta has no way of knowing some information, so it includes some
empty elements in the output that the user must fill in:
What are the things that each row of the dbf file describes?
DBFmeta omits those elements that are used to indicate the
published source of a feature class, attribute definition, or
Added htmlspecialchars() function to properly translate <, >, and & in XML output of values listed in Enumerated_Domain as text. Added a command-line option -udom label to enable the user to avoid listing distinct values of a field and instead use Unrepresentable_Domain for them. Thanks to Rob Norheim for suggesting these.
Fixed bug in which XML tag was not properly closed. Thanks to John Graves for letting me know about this problem.
Fixed code calculating the maximum number of nonblank characters in a string field. Added code to read integer fields as long long (64-bit) signed integers. This is more complex but might be more reliable for long numerical code attributes that the DBF file thinks are integers. Thanks to Hugh Phillips for pointing out the issue with long integers.
Modified to allow the user to omit descriptions of specific fields by name.
For character fields, find the length of the longest value, treating trailing blanks as if they didn't exist. Report this to the user.
Fixed bug introduced by last fix; enlarge the space allocated for each string value by one to hold the terminating nul byte (D'oh!).
Fix code so that the single real number can occur more than once and still be considered a special value. Modified dbfopen.c to not strip space from string values by default.
Open the input file before worrying about naming the output file. Fixed bug that caused the verbose statistical report to not show integer values. Describe in an Enumerated_Domain a singleton real number. Thanks to Hugh Phillips for catching these problems.
Modified to output SGML or XML if the output file name looks like it has that extension.
Modified to open dbf file read-only and to report duplicate names to stdout rather than stderr.
Added code to report the number of blank records in a string field.
Revised handling of attribute information by reading all fields at once. This enables me to check for duplicate field names.
Added code to skip attributes whose names match those that are intrinsic to Arc/Info. Fixed bug in counting unique string values.
Modified handling of enumerated domains to give some statistical information about the values.
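Three of the changes listed above (translating <, >, and & for XML output, measuring the longest string value without its trailing blanks, and checking for duplicate field names) can be sketched in a few lines of Python. This is only an illustration of the ideas, not a copy of the program's code. The function names are hypothetical, and comparing field names exactly as written (case-sensitive) is an assumption.

```python
def escape_markup(text: str) -> str:
    """Translate &, <, and > so a value can appear as text in XML output."""
    # Replace & first so the entities produced below are not escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def longest_nonblank_length(values) -> int:
    """Length of the longest value, treating trailing blanks as absent."""
    # default=0 covers a field with no values at all.
    return max((len(v.rstrip(" ")) for v in values), default=0)


def duplicate_names(names) -> list:
    """Return field names that occur more than once, in first-seen order."""
    seen, dups = set(), []
    for name in names:
        # Assumption: names are compared exactly as written.
        if name in seen and name not in dups:
            dups.append(name)
        seen.add(name)
    return dups
```

Reading all of the field names before describing any of them, as the revision note says, is what makes a check like duplicate_names possible.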
To-do list: known bugs and limitations
When a field contains real numbers, the current code does not correctly
discover the presence of one or two negative values and call them out
as special values using an Enumerated_Domain element.
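To make the limitation concrete, here is a minimal Python sketch of the kind of check the item above describes as missing. It assumes that "special values" means a field whose negative entries take at most two distinct values, for example a placeholder such as -9999 among otherwise nonnegative measurements. The function name, the max_special parameter, and that interpretation are assumptions of the sketch, not a description of how DBFmeta works or will work.

```python
def special_negative_values(values, max_special=2):
    """Return the distinct negative values if there are only one or two."""
    # Collect each negative value once, in ascending order.
    negatives = sorted({v for v in values if v < 0})
    if 0 < len(negatives) <= max_special:
        return negatives
    # No negatives, or too many distinct ones to be treated as special.
    return []
```

Values returned by such a check are the ones that would be candidates for an Enumerated_Domain element, with the remaining values described by their range.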
Peter N. Schweitzer
Mail Stop 954, National Center
U.S. Geological Survey
Reston, VA 20192
Tel: (703) 648-6533
FAX: (703) 648-6252