USGS - science for a changing world

Formal metadata: information and software

MP: A compiler for formal metadata

Revision history

Date Description of revision
4-Dec-2014 Modified decode_text() in text.c to check the character encoding of the text. It generates a warning if the encoding was not specified but the characters are not UTF-8, and an error if the encoding was supposed to be UTF-8 and there are non-UTF-8 characters.
Modified write_contact() in html.c to remove a stray greater-than sign. (mp 2.9.28)
24-Jan-2014 Modified write_xml_item() in xml.c to minimize the number of newlines printed when indentation of the XML output is requested using a config file. Thanks to Drew Ignizio for suggesting this revision. (mp 2.9.27) (xtme 2.7.23) (tkme 2.9.29) (mq 2.6.27)
2-Dec-2013 Modified translated() in html.c so that, in the FAQ format, URLs are not enclosed in angle brackets unless that is how they appear in the source metadata. Thanks to Peg Shealy for pointing out this now-obsolete feature. (mp 2.9.26)
Also make live links for https.
24-Sep-2013 Modified read_text() in text.c as well as utf8_of() in encoding.c and cns.c so that after the text is read, a test is carried out to determine whether the text is likely UTF-16, and if so, the UTF-16 is converted to UTF-8 for further processing. The test in this case is to determine whether the text contains any zero bytes; if it does, the text is assumed to be UTF-16. This is because I occasionally see problems in which people using Microsoft Windows have inadvertently converted their text files to UTF-16, which is really a binary format for Unicode characters. This will certainly scramble any input files that are binary but are not UTF-16, but those files aren't going to be readable metadata anyway. (mp 2.9.25) (cns 2.8.5) (xtme 2.7.22) (tkme 2.9.28) (mq 2.6.26)
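The zero-byte test described in this entry might be sketched as follows. The function name looks_like_utf16 is hypothetical and this is not the code in text.c; it illustrates only the detection step, not the conversion to UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the buffer contains any zero byte,
   which the heuristic above treats as a sign of UTF-16 text, else 0. */
int looks_like_utf16(const unsigned char *buf, size_t len)
{
    return memchr(buf, 0, len) != NULL;
}
```

Plain ASCII text stored as UTF-16 has a zero byte in every two-byte unit, so a single scan of the buffer is enough to trigger the conversion.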
12-Sep-2013 Modified element_start() and element_end() in xml.c to consistently calculate the current line number in XML input files. Thanks to Aleta Vienneau for pointing out this problem. (mp 2.9.24)
1-Jul-2013 Modified keyword.h, keyword.c, and syntax.c to accommodate the metadata extensions for lidar data published in USGS TM11-B4. That document does not include long element names, so I made up long element names to match the tags and definitions given in appendix 5 of the report. (mp 2.9.23)
18-Jun-2013 Modified write_text() and write_text_item() in text.c so that they no longer translate from UTF-8 to ISO-8859-1. This change simply reflects my increasing comfort with UTF-8. (mp 2.9.22) (xtme 2.7.21) (tkme 2.9.22) (mq 2.6.25)
25-Mar-2013 Modified parse_text() in text.c so that, if character encoding has not been explicitly declared in a config file, the text is scanned for conformance with UTF-8, and if the text conforms to UTF-8, the encoding is considered to be UTF-8 in further processing. This allows people to process text files that are, in fact, UTF-8 without having to use a config file to say so. (mp 2.9.21) (xtme 2.7.20) (tkme 2.9.21) (mq 2.6.24)
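A scan for UTF-8 conformance can be written in several ways; the following is a minimal sketch under the assumption that the whole text is available in memory. The name is_valid_utf8 is hypothetical and the code is not taken from text.c.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if buf[0..len) is well-formed UTF-8, else 0. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;       /* number of continuation bytes expected */
        unsigned long cp;   /* code point decoded so far */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80)                { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;      /* stray continuation byte or invalid lead byte */

        if (extra >= len - i) return 0;          /* sequence is truncated */
        for (size_t k = 1; k <= extra; k++) {
            if ((buf[i + k] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF) return 0;     /* overlong or out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate code points */
        i += extra + 1;
    }
    return 1;
}
```

Text in ISO-8859-1 that contains accented characters almost always fails this test, because a lone byte such as 0xE9 is not followed by the continuation bytes that UTF-8 requires.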
11-Dec-2012 Added config option "head" within output:html; if specified, the contents of that config file element will be included verbatim at the end of the head element of the HTML output. Could be used for any static <meta>, <style>, <link>, or <script> information. (mp 2.9.20)
7-Dec-2012 Modified allow() and check_extension() in syntax.c to include in the variable message only the first 512 characters of the textual value on the given input line if that value would be longer than 512 characters; this prevents a buffer overflow when reporting errors. (mp 2.9.19)
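One way to limit how much of a long value goes into a message is a precision on the string conversion, as in this minimal sketch. The function name format_value_error and the wording of the message are hypothetical; only the 512-character limit comes from the entry above.

```c
#include <stdio.h>

#define MAX_VALUE_CHARS 512   /* limit quoted in the entry above */

/* Hypothetical helper: formats a message into out, quoting at most
   MAX_VALUE_CHARS characters of value. Returns what snprintf returns:
   the length the complete message has, excluding the terminator. */
int format_value_error(char *out, size_t out_size,
                       const char *element, const char *value)
{
    return snprintf(out, out_size, "%s: unexpected value \"%.*s\"",
                    element, MAX_VALUE_CHARS, value);
}
```

Because snprintf() also honors out_size, the output buffer cannot be overrun even if the element name is unexpectedly long.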
8-Nov-2012 Modified check_Metadata() in syntax.c to generate a warning if there is no Data_Quality_Information, Spatial_Data_Organization_Information, Spatial_Reference_Information, Entity_and_Attribute_Information, or Distribution_Information. (mp 2.9.18)
31-Oct-2012 Modified StartElement() in xml.c to correctly handle the case in which the first element encountered in the metadata record is an unrecognized extension. (mp 2.9.17)
3-Oct-2012 Significant changes to the error reporting system for mp, affecting numerous source files. Now if the error file name ends with .xml, the errors will be written as XML, including, where possible, an xpath description of the location of the element closest to the problem. In addition, within html.c, trailing space is trimmed from name components that might be rearranged in order to present the citation as a typical bibliographic reference entry. (mp 2.9.16)
4-Apr-2012 Modified dif.c to properly account for the possibility that, with an empty Fees element, the corresponding text pointer might be NULL; such is the case when the input file is XML. This caused a seg fault in the online validator for some input files. (mp 2.9.15)
23-Sep-2011 Modified element_start() in xml.c, adjusting the line number to account for the entity declarations we may have parsed in decode_xml() before we began parsing the user's actual XML document. Without this adjustment, line numbers are 66 more than they should be. Thanks to Stuart Giles (USGS) for pointing this problem out. (mp 2.9.14)
19-Aug-2010 Modified text.c and xml.c to disregard a UTF-8 byte order mark if one is present at the beginning of the text or XML input file. (mp 2.9.13)
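The UTF-8 byte order mark is the three-byte sequence EF BB BF. Skipping it could be sketched like this; the function name skip_utf8_bom is hypothetical and the code is an illustration rather than what text.c or xml.c contains.

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer just past a leading UTF-8 byte
   order mark (EF BB BF) if one is present, otherwise buf itself. */
const unsigned char *skip_utf8_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return buf + 3;
    return buf;
}
```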
26-Apr-2010 Modified write_contact() in html.c to replace "c/o" with "Attn:" when there is a Contact_Person within Contact_Organization_Primary. (mp 2.9.13)
11-Feb-2010 Modified check_scalar_children() in syntax.c so that valid values of Clock_Time_Drift are permitted (needed a negative sign in the first clause of the if statement). Thanks to Rudi Gens for pointing this problem out.
Modified write_xml_text() in xml.c so that it does not translate single quote into the entity &apos; and double quote into the entity &quot;. I believe these translations are not necessary. Thanks to Dougl Dale-Johnson for pointing this problem out. (mp 2.9.12)
11-Dec-2009 Modified parse_name() in html.c to detect the situation where there are multiple commas or an "and" in the name given, and consider the value as a non-parseable name (that is, one that cannot be re-expressed as "last, first middle"). (mp 2.9.11)
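Quotes need to be escaped only inside attribute values, not in element content. A minimal sketch of escaping element text that leaves quotes alone follows; the function name escape_xml_text is hypothetical, and the assumption that exactly &, < and > are replaced is mine, not a statement about xml.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies text to out, replacing &, < and > with
   entities and passing single and double quotes through unchanged.
   Returns the number of bytes written, excluding the terminator,
   or (size_t)-1 if out is too small. */
size_t escape_xml_text(const char *text, char *out, size_t out_size)
{
    size_t n = 0;
    for (; *text; text++) {
        const char *rep;
        char single[2] = { *text, '\0' };
        switch (*text) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  rep = single;  break;   /* includes ' and " */
        }
        size_t rlen = strlen(rep);
        if (n + rlen + 1 > out_size) return (size_t)-1;
        memcpy(out + n, rep, rlen);
        n += rlen;
    }
    if (out_size == 0) return (size_t)-1;
    out[n] = '\0';
    return n;
}
```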
2-Nov-2009 Modified html.c to remove unnecessary hyphen from "Frequently-anticipated questions". Modified html.c to enlarge the fixed sizes allowed for personal names in the FAQ HTML format. Modified xml.c to cope properly with UTF-8 characters at the end of an element's text. Under Microsoft Windows XP these were truncated due to unexpected behavior of the isspace() function on that system. Thanks to Aleta Vienneau for information leading to this fix. Modified build process for Microsoft Windows to use the MinGW compiler, creating Makefile.mgw for this purpose. (mp 2.9.10) (mq 2.6.23) (xtme 2.7.19) (Tkme 2.9.24)
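One common way to keep the bytes of a UTF-8 character from being mistaken for whitespace is to test for the ASCII whitespace characters explicitly instead of calling isspace(). The sketch below illustrates that approach; the function names are hypothetical and it is not necessarily how xml.c was changed.

```c
#include <string.h>

/* Returns 1 only for the ASCII whitespace characters, regardless of the
   locale and of whether char is signed on this platform. */
static int is_ascii_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\f' || c == '\v';
}

/* Hypothetical helper: removes trailing ASCII whitespace from s in place.
   Bytes above 0x7F, such as the second byte of a two-byte UTF-8
   character, are never removed. */
void trim_trailing_space(char *s)
{
    size_t n = strlen(s);
    while (n > 0 && is_ascii_space(s[n - 1]))
        n--;
    s[n] = '\0';
}
```

For example, the UTF-8 form of the letter a with a grave accent ends in the byte 0xA0, which some single-byte code pages treat as a non-breaking space; an explicit ASCII test leaves that byte in place.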
30-Jan-2009 Modified write_sgml_text() in sgml.c to behave like write_xml_text() in xml.c: don't convert characters to entity references. This allows UTF-8 characters to be put into the SGML output as they are, and we just hope that software downstream can be told they are UTF-8. (mp 2.9.9)
16-Sep-2008 Modified mp.c to write the date and time into the error log file as information. (mp 2.9.8)
5-May-2008 Modified keyword.c, adding support for Portuguese element names kindly supplied by
 Luis Cavalcanti Bahiana
 Pesquisador em Informações Geográficas
 IBGE- Coordenação de Geografia
 Av. República do Chile 500-12 andar
 Brazil
(mp 2.9.7) (cns 2.8.4) (xtme 2.7.18) (tkme 2.9.23) (mq 2.6.22) (err2html 2.1.9)
22-Jan-2008 Modified element_start() and element_end() in xml.c to not null-terminate the output data buffer if the buffer pointer is NULL. This condition arose when an end tag was encountered without the parser ever seeing any character data. Thanks to Gennady Khokhorin for noticing the problem. (mp 2.9.6) (xtme 2.7.17) (tkme 2.9.22) (mq 2.6.21)
27-Sep-2007 Modified main() in mp.c to include the original input file name in the error output. Modified write_html() and write_html_faq() in html.c to include the original input file name as a <meta> tag in the HTML output, with name="generated-from".
Modified upgrade() in upgrade.c so that Metadata_Standard_Version is upgraded only if it appears within Metadata_Reference_Information. The only way I can imagine it being anywhere else is in the metadata for mp itself, where it had occurred within the text of a couple of Process_Description elements. (mp 2.9.5)
20-Sep-2007 Modified check_scalar_children() in syntax.c so that if an element that is supposed to contain some value contains instead another element (this can happen if someone misplaces an element in XML), it doesn't crash but instead generates a "misplaced element" error. Came to attention by examining web server error logs of the online metadata validation service. (mp 2.9.4)
5-Mar-2007 Modified syntax.c to allow Oblique_Line_Azimuth and Oblique_Line_Point within Map_Projection_Parameters instead of Oblique_Line_Latitude and Oblique_Line_Longitude. Thanks to Matthew McCready (Census Bureau) for pointing this mistake out. (mp 2.9.3)
3-Oct-2006 Modified upgrade() in upgrade.c so that no change is made if there is exactly one *_Keyword_Thesaurus in a Theme, Place, Stratum, or Temporal element, regardless of its position within that element. (mp 2.9.2)
30-Jun-2006 Modified mp.c so that -fixdoc is detected before -f in the list of command-line arguments. This re-enables -fixdoc (which should be rarely needed, but I needed to use it today!) (mp 2.9.1)
11-May-2006 Modified encoding.c to fix UTF-8 encoding. Thanks to Olga Vasik (TINRO, Vladivostok, Russia) for persistence in following up on this issue. Because this has the effect of making UTF-8 encoding really work for HTML output, I'm bumping up the 2nd-level version number. (mp 2.9.0)
4-May-2006 Fixed a few problems with the implementation of UTF-8 conversion, particularly in HTML output files. (mp 2.8.30)
3-May-2006 Modified config.c to recognize element "encoding" to be used under input (as a substitute for 'codeset') or under output:html. Modified html.c to allow the output to be written as UTF-8 if output:html:encoding is set to UTF-8 in the config file, otherwise HTML output defaults to ISO-8859-1. I believe this will make it possible in principle to support Russian text. (mp 2.8.29)
31-Mar-2006 Modified element_start() in xml.c to flag as an error any non-blank text found in the text buffer when an element start tag is found in XML input. Previously, mp was holding onto the text and would assign it to the next element for which a close tag was encountered. So it was possible to have text in the wrong place but mp would put it somewhere else. This is surely a rare and strange condition, but it is one that ought to be flagged as a significant error if it occurs. (mp 2.8.28) (xtme 2.7.15) (tkme 2.9.20) (mq 2.6.18)
28-Feb-2006 Integrated support for German element names and composed a German help file for the editors. This is based on the work of Peter Korduan (University of Rostock). The help file includes information about the precision agriculture extension that was the subject of his work. Created a help file for French as well, using the translation of the Standard provided by Environment Canada.
Modified html.c to avoid a crash when reading an XML file whose Citation Title contains blank lines. Thanks to Hanna Habashy (Univ. South Carolina) for pointing to this problem.
(mp 2.8.27) (cns 2.8.3) (xtme 2.7.14) (tkme 2.9.19) (mq 2.6.17)
27-Jan-2006 Modified actions.c to eliminate duplication in the remote-sensing profile element Instrument_Information that was causing XML files to contain duplicate information. Also modified upgrade.c to replace the Metadata_Standard_Version value only if the existing value does not begin with "FGDC-STD-" rather than the more specific "FGDC-STD-001". Thanks to Pete Keehn (NOAA) for pointing these problems out. (mp 2.8.26) (mq 2.6.16) (xtme 2.7.12) (tkme 2.9.18)
21-Nov-2005 Modified syntax.c to flag as an error the co-occurrence of more than one type of domain within Attribute_Domain_Values. Modified upgrade.c to try to patch affected metadata by inserting additional Attribute_Domain_Values elements where there are extra domain values sections. (mp 2.8.25)
26-Sep-2005 Modified mq.c to add a subcommand prune for a parsed metadata record. (mq 2.6.15)
22-Sep-2005 Modified subcommands of mq. Fixed "attach" so that it returns an error when called without the address of a snippet to attach. Changed "name", "line", "parent", "child", "prev", and "next" so that these work when the address given is a detached subtree (snippet). Previously these subcommands required the address to be a valid node in the current parse tree. (mq 2.6.14)
29-Aug-2005 Modified mp.c to not upgrade metadata by default if the file format is XML. The upgrade process was making mischief when a keyword thesaurus element was out of order. (mp 2.8.24)
15-Jun-2005 Modified the French-language name of the element Quantitative_Attribute_Accuracy_Assessment. Thanks to John Cree (Environment Canada) for pointing this problem out. (mp 2.8.23) (cns 2.8.2) (xtme 2.7.12) (tkme 2.9.17) (mq 2.6.13)
5-Apr-2005 Modified write_html_faq() to pass the value of Unrepresentable_Domain through write_html_value() rather than simply writing it out. My opinion is that the text in Unrepresentable_Domain ought to be short and uncomplicated (that is, not containing >'s) but I see no reason not to accommodate those who would put lists or other semi-structured information there. (mp 2.8.22)
11-Feb-2005 Modified URL fragments in links.c so that spaces and punctuation are replaced with the corresponding hex code, like '%20' for space. (err2html 2.1.7)
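Replacing characters with their hex codes might be sketched as follows. The function name encode_fragment is hypothetical, and the choice to encode every byte that is not an ASCII letter or digit is an assumption; the entry above does not say exactly which characters links.c replaces.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies fragment to out, replacing every byte that
   is not an ASCII letter or digit with %XX (uppercase hex).
   Returns 0 on success, or -1 if out is too small. */
int encode_fragment(const char *fragment, char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)fragment; *p; p++) {
        if (*p < 0x80 && isalnum(*p)) {
            if (n + 2 > out_size) return -1;   /* byte plus terminator */
            out[n++] = (char)*p;
        } else {
            if (n + 4 > out_size) return -1;   /* %XX plus terminator */
            out[n++] = '%';
            out[n++] = hex[*p >> 4];
            out[n++] = hex[*p & 0x0F];
        }
    }
    if (n + 1 > out_size) return -1;
    out[n] = '\0';
    return 0;
}
```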
10-Feb-2005 Modified write_html() and write_html_faq() in html.c to accept the stylesheet element in the config file in much the same way as it is handled for xml. In this case the stylesheet type can be omitted, in which case the type will be assumed text/css. The value href will be used as the link to an external CSS. This information is written into the head element as a link element: <link rel="stylesheet" type="text/css" href="..."> Furthermore it is possible to have more than one stylesheet element within output:html; each generates a separate <link> element in the HTML output. (mp 2.8.21)
28-Jan-2005 Modified keyword.c, replacing array fr_std[] with new French-language element names provided by John Cree. This completes support for French for both standard elements (those contained within the 1998 CSDGM) and the Biological Data Profile. At this writing, elements from the shoreline and remote sensing profiles remain untranslated (that is, the "French" versions of those extended elements are the English). (mp 2.8.20) (cns 2.8.1) (xtme 2.7.11) (tkme 2.9.15) (mq 2.6.12) (err2html 2.1.6)
12-Jan-2005 Modified config.c to recognize a new directive "schema" within output:xml.
Modified xml.c to write the value given in output:xml:schema as one of the attributes of the root element (Metadata) when writing XML. If a schema is specified in the config file and the root element has no other attributes, then the root element will be assigned the attribute xmlns:xsi with the value http://www.w3.org/2001/XMLSchema-instance along with the attribute xsi:noNamespaceSchemaLocation, whose value will be the text given to output:xml:schema in the config file. (mp 2.8.19) (mq 2.6.11) (tkme 2.9.14) (xtme 2.7.10)
4-Nov-2004 Modified equalize_indented_scalars() in text.c so that parent links and indent values are set properly when correcting text containing variously-indented lines. Thanks to Jo Anne Stapleton for giving me enough data to find and correct this rare problem. (mp 2.8.18) (mq 2.6.10) (xtme 2.7.9) (tkme 2.9.13)
17-Sep-2004 Modified check_Map_Projection() in syntax.c to not allow Other_Projection's_Definition within Map_Projection. Modified check_Map_Projection_Parameters() to allow Other_Projection's_Definition to occur there.
Modified numbers.c to assign the correct new section number for Other_Projection's_Definition.
Modified actions.c to move Other_Projection's_Definition from Map_Projection to Map_Projection_Parameters. This allows tkme and xtme to know where to let you put it.
Modified write_html_faq() in html.c to better handle the occurrence of Other_Projection's_Definition, and also of Map_Projection_Parameters in Map_Projection.
Thanks to James W. Allor (US Census Bureau) for pointing this out.
Modified upgrade(). If Other_Projection's_Definition appears within Map_Projection, move it into Map_Projection_Parameters. If no Map_Projection_Parameters exists, create one for this. (mp 2.8.17) (err2html 2.1.4) (tkme 2.9.12) (xtme 2.7.8)
15-Sep-2004 Modified write_html() and write_html_faq() in html.c to translate the title from UTF-8 to ISO-8859-1 if the metadata are stored in UTF-8. The code already carried out this translation for the text within the metadata but I had neglected to do the same for the title, specifically when writing the html <title> element within the document head and when writing the heading tag <h1> in outline style, <h3> in FAQ style. (mp 2.8.17)
14-Sep-2004 Expat automatically converts ISO-8859-1 to UTF-8, so if you read an XML file, the internal representation of characters will be UTF-8. I still think it's good to store the plain text as ISO-8859-1, so I now need to convert UTF-8 to ISO-8859-1 when writing plain text.
Modified write_text_item() in text.c to convert UTF-8 characters to ISO-8859-1. Created a separate module encoding.c to contain the function unicode_of_utf8(), now global in scope. Removed the copies of that function found in html.c and xml.c. (mp 2.8.16) (mq 2.6.9) (tkme 2.9.11) (xtme 2.7.7)
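Converting UTF-8 to ISO-8859-1 amounts to decoding each sequence to a code point and writing it as a single byte when it is no larger than 0xFF. A minimal sketch follows; the function name utf8_to_latin1 and the use of '?' for characters that cannot be represented are assumptions, and this is not the code of unicode_of_utf8() or write_text_item().

```c
#include <stddef.h>

/* Hypothetical helper: converts the null-terminated UTF-8 string src to
   ISO-8859-1 in dst. Code points above U+00FF and malformed sequences
   are written as '?'. Returns the number of bytes written, excluding the
   terminator, or (size_t)-1 if dst is too small. */
size_t utf8_to_latin1(const unsigned char *src, unsigned char *dst, size_t dst_size)
{
    size_t n = 0;
    while (*src) {
        unsigned long cp;   /* decoded code point */
        size_t extra;       /* continuation bytes still expected */
        unsigned char c = *src++;

        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else                         { cp = '?';      extra = 0; }  /* invalid lead byte */

        while (extra > 0 && (*src & 0xC0) == 0x80) {
            cp = (cp << 6) | (*src++ & 0x3F);
            extra--;
        }
        if (extra > 0 || cp > 0xFF)
            cp = '?';       /* truncated sequence or not in ISO-8859-1 */

        if (n + 1 >= dst_size) return (size_t)-1;  /* need room for byte and terminator */
        dst[n++] = (unsigned char)cp;
    }
    if (dst_size == 0) return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```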
13-Sep-2004 Modified character_data() in xml.c to allow text of more than 8k characters to be ingested at a time. This became a problem if you had an XML metadata record containing a textual value that was larger than 8192 bytes all on the same line. (mp 2.8.16)
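An XML parser may deliver the text of one element in several pieces, and each piece is not null-terminated, so the text has to be accumulated in a buffer that can grow. The sketch below shows one way to do that; the struct and function names are hypothetical, the 8192-byte starting size simply echoes the figure above, and this is not the code of character_data().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator for element text delivered in pieces. */
struct text_buffer {
    char  *data;      /* null-terminated text, or NULL before the first append */
    size_t length;    /* bytes used, excluding the terminator */
    size_t capacity;  /* bytes allocated */
};

/* Appends len bytes of s, doubling the allocation as needed.
   Returns 0 on success, or -1 if memory cannot be allocated. */
int text_buffer_append(struct text_buffer *b, const char *s, size_t len)
{
    size_t needed = b->length + len + 1;
    if (needed > b->capacity) {
        size_t cap = b->capacity ? b->capacity : 8192;
        while (cap < needed)
            cap *= 2;
        char *p = realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->capacity = cap;
    }
    memcpy(b->data + b->length, s, len);
    b->length += len;
    b->data[b->length] = '\0';
    return 0;
}
```

A buffer is started as { NULL, 0, 0 } and released with free(b.data) when the element's text has been consumed.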
16-Aug-2004 Modified check_Digital_Form() in syntax.c so that Digital_Transfer_Option is not repeatable. Thanks to Eric Compas (NPS, UWisc) for pointing this problem out. (mp 2.8.15)
11-Aug-2004 Modified check_Range_of_Dates_Times() within syntax.c so that it generates a missing element error if you have a Beginning_Date but no Ending_Date. This problem only occurred when using the bio profile. Thanks to Terry Giles (USGS) for pointing this out. (mp 2.8.14)
3-Jun-2004 Modified parse_sgml() within xml.c so that parsing of SGML actually begins with <metadata> rather than at the beginning of the file. This is because expat chokes on the SGML DOCTYPE declaration, which doesn't have the same form as an XML DOCTYPE. Modified decode_xml() in xml.c to first parse a set of entity declarations if the record begins with <metadata>. These entity declarations are not included if the file begins with an XML declaration. The chief use of them is to help parse SGML records previously created using mp. Thanks to Sarah McGuire (NPS, Madison) for pointing this issue out. (mp 2.8.13) (xtme 2.7.6) (tkme 2.9.9) (mq 2.6.8)
12-Apr-2004 Modified element_cmd() in mq.c to return the section number from the standard document (1998 version) using the element command like [element $name -number]. Note that section numbers are highly problematic; they only refer to a specific version of the standard document, some of the numbers have changed from the 1994 version to the 1998 version, one of the elements now has two different section numbers (Online_Linkage occurs both within Citation_Information and within Metadata_Extensions), and section numbers do not occur at all in the biological or remote sensing profiles. (mq 2.6.7)
5-Apr-2004 Modified array rule_table in syntax.c to properly direct mp to run check_Bounding_Altitudes when a Bounding_Altitudes is detected (previously this entry was not present, and so the Bounding_Altitudes element was checked as though it were a scalar element). Thanks to Terry Giles for pointing this problem out. (mp 2.8.12)
13-Feb-2004 Modified config.c and local.c to send to an internally-managed buffer any error or warning messages generated while parsing the config file or the extension files. The contents of these buffers can be retrieved by calling config_errors() or ext_errors(). So the calling program (typically mp or Tkme) decides if, when, and how those messages are shown to the user. Modified mp.c to show those messages after the banner is displayed. (mp 2.8.11)
29-Jan-2004 Created a new extension file ESRI-ISO.ext within tools/ext. I built this from the document type definition file found inside http://www.esri.com/metadata/esri_iso01dtd.zip
Since I don't understand the ISO documents fully I cannot judge how well the DTD corresponds to the ISO standard. I have used the new ext file to parse metadata output by ArcCatalog 8.3, and it appears to work, causing mp to understand and not discard the ISO elements that ESRI has used. In principle this should also have the effect of allowing people to edit ArcCatalog metadata in place using Tkme without having to export the record first. However caution is advised: make a backup copy beforehand, and keep it until you are certain that both mp and ArcCatalog can read the XML file successfully.
29-Jan-2004 Modified the element command in mq so that it distinguishes among the known profiles (biological data, shoreline, remote-sensing) that are built into mp. (mq 2.6.5)
26-Jan-2004 (sigh) Fixed bug in xml input code, where a textual value larger than 8k bytes would cause a segmentation violation. No new behavior other than this. (mp 2.8.10) (tkme 2.9.6) (mq 2.6.4) (xtme 2.7.4)
23-Jan-2004 Modified decode_xml() in xml.c to use expat, a freely-available XML parser, rather than the original parser code. This will make the input of XML documents more consistent, and fix the bug in which the occurrence of a comment within the text of an XML element caused the parser to discard the entire text (thanks to Hugh Phillips for noticing this problem).
Backed off a prior change in which unrecognized extensions were recognized and not discarded. Tkme wasn't written to distinguish unknown compound extensions from scalar elements, so its behavior became inconsistent when unknown extensions were permitted. (mp 2.8.9) (mq 2.6.3) (tkme 2.9.5) (xtme 2.7.3)
7-Jan-2004 Modified allow() in syntax.c to correct bug in the previous change. Modified decode_xml() in xml.c to generate a warning when an unknown extension is encountered. (mp 2.8.8)
18-Dec-2003 Modified keyword.c to add a function is_standard() which, given an element's numerical key, tells whether the element is part of the FGDC standard (through FGDC-STD-001-1998) or not.
Modified allow() in syntax.c so that only a warning is issued, not an error, when a non-standard element appears where it is not expected to be. This is a significant change from the regulatory perspective. It means that mp will not call an error the occurrence of an extension even if mp doesn't know what the extension is or where it's supposed to go. The chief benefit of this is that ESRI and other creators of metadata tools can now invent and add elements to the metadata they create without me having to play catch-up and try to get accurate descriptions of what these extensions are and where they should go.
This doesn't mean that downstream software will understand the information in nonstandard elements, but if your metadata record was generated by some software, mp won't tell you it's wrong just because there's a nonstandard element in it that mp doesn't know about ahead of time. (mp 2.8.7)
23-Sep-2003 Modified local.c, creating a function add_extension() that can be called while parsing an XML file to add an unrecognized tag as an extension (albeit one about which we know little). Modified decode_xml() in xml.c to call this function. The net effect of this change is that unrecognized elements in XML and SGML files will be tolerated and can be manipulated, rather than ignored and possibly discarded altogether. The practical effect, I hope, is that XML records produced by any version of ArcCatalog can be edited safely using Tkme and parsed with mp. Ideally, mp would be told exactly where the extended elements go and how they should appear, but with the emphasis on XML, it is possible to include new tags in metadata without documenting them beyond inclusion in a DTD, and the DTD step could conceivably be skipped also, since well-formed XML suits most applications. (mp 2.8.6) (mq 2.6.2) (tkme 2.9.3) (xtme 2.7.2)
13-Aug-2003 Modified insert_item_after(), insert_item_before(), delete_children(), and delete_item() in tree.c to simplify their structure and, I believe, fix a bug that caused an inserted element to not appear in the tree if an adjacent element was deleted. These changes should improve Tkme but have their biggest effect on mq, so mq's version number is incremented. mp would not be affected. (mq 2.6.1)
6-Aug-2003 Modified parts of html.c to make the output html conform to "HTML 4.01 Transitional", giving both the outline-style and the FAQ-style HTML output the DOCTYPE declaration
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">
Thanks to Florence Wong (USGS) for pointing out the need to make some fixes in the code to achieve better conformance. (mp 2.8.5)
5-Aug-2003 Modified actions.c to add Description_of_Geographic_Extent within Spatial_Domain in shprofile_element array. Modified syntax.c to allow Description_of_Geographic_Extent in Spatial_Domain and Barometric_Pressure within Marine_Weather_Conditions when using shoreline profile. Thanks to Mike Moeller (NOAA) for pointing these problems out. (mp 2.8.4)
23-Jul-2003 Modified do_meta_tags() in html.c so that the dc.lang element is not always "en", but could be the value of a Language or Metadata_Language element or the value assigned to the input:language option in the config file. It does not (yet) pick up a language specified on the command line with the -l switch. Thanks to Terry Giles for pointing this issue out. (mp 2.8.3)
28-May-2003 Modified upgrade.c to better handle blank lines within the Keywords sections and multiple Theme, Place, Stratum, and Temporal elements. Thanks to Tobin Smith (Veridyne) for bringing the problem to my attention. (mp 2.8.2)
20-May-2003 Modified write_html() and write_faq_html() to look for an option omit_url within output:html; if present, the URL is not written out at the end of the file. (mp 2.8.1)
7-Mar-2003 Added direct support for the shoreline profile and the remote-sensing profile. This entails modifications in keyword.h, keyword.c, syntax.c, ps8.c, and actions.c. All programs got bumped a minor version number for this.
One of the side effects is a simplification so that all elements defined in any of these profiles will be recognized and their structure checked. However unless you put "profile rs" or "profile sh" into the input section of the config file, standard elements won't be judged according to the rules specified in the profile. So if you use elements from one of the profiles but you don't tell mp you're using that profile, those elements may be flagged as errors because mp thinks they aren't supposed to be there. But that, in my opinion, is better than having them come up as unrecognized elements. (mp 2.8.0) (mq 2.6.0)
25-Feb-2003 Modified decode_xml() in xml.c so that when a comment is found, the buffer is emptied if it only contains whitespace to that point. Thanks to Archie Warnock for showing the problem. (mp 2.7.35)
4-Feb-2003 Modified write_html() and write_faq_html() in html.c to remove carriage returns from included script files and included style files (it already did this to included header files). This avoids having extra end-of-line characters in the output on Windows systems. (mp 2.7.34)
1-Oct-2002 Modified write_xml_item() in xml.c so that when asked to write elements in standard order it uses a more straightforward method, avoiding find_key(). This allows it to discover child elements of the same type as the parent. Thanks to Helena Schaefer for alerting me to this problem. (mp 2.7.33) (mq 2.5.17) (xtme 2.6.10) (tkme 2.8.19)
10-Sep-2002 Modified decode_xml() in xml.c to handle properly a comment at the beginning of the XML input file. Thanks to Daniel Berhanu for bringing this problem to light. (mp 2.7.32) (xtme 2.6.9) (Tkme 2.8.18) (mq 2.5.16)
29-Aug-2002 Modified check_Standard_Order_Process() in syntax.c so that it generates an error (type "missing") message when the given Standard_Order_Process contains neither a Digital_Form nor a Non-digital_Form. Thanks to Terry Giles for pointing this out. (mp 2.7.31)
10-Jul-2002 Modified copy_item() in tree.c so that it properly handles cases where p->d is NULL. This affected copy operations in Tkme where the input file was XML. Thanks to Hugh Phillips for pointing this out.
Also made extensions not case sensitive when encountered in indented text input (XML tags are always case sensitive). (mp 2.7.30) (mq 2.5.15) (cns 2.7.3) (xtme 2.6.8) (tkme 2.8.17) (err2html 2.1.2)
2-Jul-2002 Modified check_date() in syntax.c so that zeros beyond the first four decimal digits are not considered an error IF the date has one of the prefixes bc, cc, or cd. Thanks to Travis Stevens for pointing this problem out. (mp 2.7.29)
13-Jun-2002 Modified config.c to recognize style_file and script_file. Modified html.c to look for these elements within output:html and include the contents of the files they name within the head element of the HTML output. (mp 2.7.28)
10-May-2002 Modified write_html_faq() in html.c to fix bug that caused fault where Single_Date/Time has no Calendar_Date inside it. Thanks to Aleta Vienneau for pointing this out. (mp 2.7.27)
9-May-2002 Modified sgml.c and xml.c so that parsing an SGML document is actually done by the XML parser code. There's a problem with the old SGML parser code that I see no good reason to find and fix. The only negative effect of this change is that ASTM tags for SGML metadata are no longer supported. But I suspect that no one has ever used ASTM tags for metadata, and that no one ever will. (mp 2.7.26) (mq 2.5.14)
16-Apr-2002 Modified html.c to use write_html_value() instead of just munge() for expressing the Entity_Type_Definition and Attribute_Definition in FAQ-style HTML. (mp 2.7.25)
25-Feb-2002 Modified syntax.c to note as an error the use of 00 in the month and day parts of a date, as in "19950000". I see no provision in the available documentation to suggest that 00 is a legitimate value for either the month or the day. (mp 2.7.24)
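A check of this kind might look like the following minimal sketch, which assumes plain dates of the form YYYY, YYYYMM, or YYYYMMDD with no prefix. The function name date_parts_in_range is hypothetical and the code is not taken from syntax.c.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical check for dates written as YYYY, YYYYMM or YYYYMMDD.
   Returns 1 if the month and day parts, when present, are in range,
   and 0 otherwise (including a month or day of 00). It does not check
   that the day exists in the given month. */
int date_parts_in_range(const char *date)
{
    size_t len = strlen(date);
    if (len != 4 && len != 6 && len != 8) return 0;
    for (size_t i = 0; i < len; i++)
        if (!isdigit((unsigned char)date[i])) return 0;
    if (len >= 6) {
        int month = (date[4] - '0') * 10 + (date[5] - '0');
        if (month < 1 || month > 12) return 0;
    }
    if (len == 8) {
        int day = (date[6] - '0') * 10 + (date[7] - '0');
        if (day < 1 || day > 31) return 0;
    }
    return 1;
}
```

With this sketch, "19950000" is rejected, while "1995" is accepted as a date known only to the year.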
28-Jan-2002 Modified attach_subtree_in_order() in mq.c to fix bug in which the subtree was appended to the end of the root node even after it had been placed properly. This caused a loop to occur in the tree. (mq 2.5.13)
26-Nov-2001 Modified mp.c to remember the character encoding of the input file. This is kept as a string, and should normally be either UTF-8 or ISO-8859-1. New global functions set_character_encoding and get_character_encoding access this attribute.
Modified html.c to get the character encoding and output UTF-8 characters correctly. Previous to this, UTF-8 characters were interpreted as ISO-8859-1, which they aren't, so Europeans who used UTF-8 got strange-looking stuff in their HTML. (mp 2.7.23)
9-Nov-2001 Modified mq so that elements or subtrees inserted or pasted without an explicit -before or -after directive will be emplaced in the order specified by the Standard. (mq 2.5.12)
31-Oct-2001 Modified config.c to recognize element "Label". Modified html.c to look for "label" within output:html:link instead of "text" in the same place, to preserve the compound nature of "text" as a config element within "output". (mp 2.7.22)
29-Oct-2001 Modified xml.c so that leading spaces in element values are skipped. This is equivalent to the behavior of mp when reading indented text files, and makes it easy to check for specific text values such as for Progress. Thanks to Travis Stevens (NOAA-NGDC) for pointing this out. (mp 2.7.21)
20-Oct-2001 Modified upgrade.c so that it leaves alone any value of Metadata_Standard_Version beginning with "FGDC-STD-001". This causes it to leave alone the specific version numbers of approved profiles of FGDC-STD-001-1998. (mp 2.7.20)
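The prefix test can be sketched in C as follows. The function name is hypothetical and the comparison is assumed to be case-sensitive; this is an illustration, not the code in upgrade.c.

```c
#include <string.h>

/* Hypothetical helper, not taken from upgrade.c.
   Returns 1 if version begins with "FGDC-STD-001", else 0.
   Assumption: the comparison is case-sensitive. */
int begins_with_fgdc_std_001(const char *version)
{
    static const char prefix[] = "FGDC-STD-001";

    return strncmp(version, prefix, sizeof(prefix) - 1) == 0;
}
```

Both "FGDC-STD-001-1998" and a profile version such as "FGDC-STD-001.1-1999" pass this test and would be left alone.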
9-Oct-2001 Modified local.c to recognize the element "Z39.50_Tag" as the same as "z3950". This tag is not used by mp at present, and may never be used. The change makes the extension file just a little less code-like and a little more readable, but for practical purposes has no effect on the behavior of mp or the other programs. (version not changed)
4-Oct-2001 Modified upgrade.c to avoid seg fault when an empty Enumerated_Domain element is encountered in the input. Thanks to Tirumal R. Jallepalli for pointing this out. (mp 2.7.19)
5-Sep-2001 Modified html.c to strip carriage-returns from the text included via the header_file directive of the config file. (mp 2.7.18)
3-Aug-2001 Fixed buffer overflow in extension_of() and extension_of_sgml() in local.c. All programs affected. (mp 2.7.16) (mq 2.5.10) (cns 2.7.2) (xtme 2.6.5) (tkme 2.8.12) (err2html 2.1.1)
19-Jul-2001 Modified keyword.c to include element names in Catalan provided by Dr. Ing. Carlos Lopez of Uruguay. This includes updated Spanish element names as well. (mp 2.7.15) (mq 2.5.9) (cns 2.7.1) (xtme 2.6.4) (tkme 2.8.11)
26-Jun-2001 Modified upgrade.c to remove blank lines within Enumerated_Domain elements. The code there was creating an extraneous container when a blank line immediately followed Enumerated_Domain and preceded the corresponding Enumerated_Domain_Value. Thanks to Peg Rawson (USGS and National Atlas) for helping to find this bug. (mp 2.7.14)
21-Jun-2001 Modified write_faq_html() in html.c so that a &nbsp; is written if the Enumerated_Domain_Value_Definition is empty. (mp 2.7.13)
29-May-2001 Modified write_html() and write_faq_html() in html.c to look for a config option header_file if header is not specified under output:html in the config file. If header_file is found, its value is expected to be a readable text file whose contents will be incorporated into the HTML output as if they had been specified as the header in the config file itself. This allows you to maintain the HTML header separate from the config file. A corresponding change was made for footers; if the footer directive is not found under output:html, then if a footer_file directive is present, its contents will be used as the HTML footer. (mp 2.7.12)
25-May-2001 Modified write_xml_item() in xml.c to examine the static variable element_order when writing the children of an element. element_order is an enum that can be either STANDARD or ASIS (in future there could be others). If its value is STANDARD, then the elements are written in the order given in FGDC-STD-001-1998 and extensions are written after all standard elements. This is and has been the default behavior. If the value is ASIS, then elements are written in the order they appear in the parse tree (as input in mp, as modified by Tkme).
Modified config.c to recognize the element "order" which will be used when found under output:xml.
The effect of this change is to allow people to explicitly request that mp and Tkme NOT rearrange the elements in the order given in the FGDC standard. Since most profiles have not put extensions after all standard elements, using the standard order causes the extensions to be put out of the order expected in the profile. This change allows mp and Tkme to retain the input order. (mp 2.7.11) (tkme 2.8.10) (xtme 2.6.3) (mq 2.5.7)
20-Apr-2001 Modified the paste command in mq so that it takes a string or XML string as its final argument rather than the address of a previously-detached subtree. Prior to this the attach command and the paste command did the same thing. (mq 2.5.5)
26-Mar-2001 Modified syntax.c to check the text value of the BDP element Case_Sensitive and to not require Metadata_Review_Date if Metadata_Future_Review_Date is given. Thanks to Terry Giles (Johnson Controls working for USGS) for pointing this out. (mp 2.7.10)
27-Feb-2001 Modified check_Taxonomy() in syntax.c to permit more than one Taxonomic_Classification in Taxonomy. This was done at the request of the authors of the BDP to reflect their intent that the standard permit organisms from more than one kingdom to be documented in the same record. As it is formally written, the BDP makes it essentially impossible to include in the same record taxonomic classification info for both plants and animals, or plants and fungi, for example. (mp 2.7.9)
15-Feb-2001 Modified parse_sgml() in sgml.c to fix bug in handling multiline input files introduced by previous revision. (mp 2.7.8)
14-Feb-2001 Modified parse_sgml() in sgml.c to read the whole input file at once, rather than line-by-line. This avoids a limitation imposed on line length in the input file. Thanks to Margaret Lyszkiewicz for pointing this out. Also added a question to the FAQ-style HTML output, "What similar or related data should the user be aware of?", for which the answer is taken from Cross_Reference. (mp 2.7.7)
9-Feb-2001 Modified check_Citation_Information() in syntax.c so that if the biological profile is being used, Geospatial_Data_Presentation_Form is mandatory. Thanks to Terry Giles for pointing this out. (mp 2.7.6)
7-Feb-2001 Modified check_Identification_Information() in syntax.c to permit Spatial_Domain to be missing if using the Biological Data Profile. In the BDP, Spatial_Domain is mandatory if applicable. Thanks to Diane Schneider and Terry Giles for pointing this out. (mp 2.7.5)
2-Feb-2001 Modified syntax.c to allow free text in Map_Projection_Name. This reflects a change in the domain of that element that was introduced with CSDGM2 (FGDC-STD-001-1998). The 1994 version of the standard restricted the names to a specific set. Thanks to Leslie Bearden for pointing this out. (mp 2.7.4)
14-Dec-2000 Modified mq.c to enable commands detach and attach. $m detach $p detaches the element p from the metadata record m. $m attach $q $p attaches the detached element p to the metadata record m as a child of q. Full syntax for attach is like that for insert: $m attach ?-child | -before | -after? <address> <detached address> and it only works if the detached item has NULL for its parent, next, and prev links. (mq 2.5.3)
8-Dec-2000 Modified translated() in html.c to not include trailing punctuation in a URL except for '/' or '?'.
Modified write_html() and write_faq_html() in html.c to give the internal hyperlinks unique names. (mp 2.7.3)
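The trailing-punctuation rule for URLs can be sketched in C like this. The function name url_length is hypothetical, and treating a URL as running up to the next whitespace character is an assumption; translated() in html.c may decide this differently.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not taken from html.c.
   Given text that starts with a URL, returns how many characters
   belong to the URL. Assumption: the URL runs to the next whitespace.
   Trailing punctuation is then dropped, except '/' and '?'. */
size_t url_length(const char *s)
{
    size_t n = 0;

    while (s[n] && !isspace((unsigned char)s[n]))
        n++;
    while (n > 0 && ispunct((unsigned char)s[n - 1])
           && s[n - 1] != '/' && s[n - 1] != '?')
        n--;
    return n;
}
```

For the made-up input "http://example.com/data/." the final period is left out of the link, but the slash before it is kept.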
27-Oct-2000 Modified keyword.c to correct two of the Spanish-language element names: doubled the second 'r' in Prerequisitos_Técnicos (new spelling Prerrequisitos_Técnicos) and added an acute accent over the i in Cuadricula (in 3 element names, now spelled Cuadrícula). (mp 2.7.2)
10-Oct-2000 When I included the bio profile elements, I made it impossible to use the same elements as extensions even when the bio profile was not being used. This is because the bio profile elements are kept by mp in the same bucket that it uses for standard elements, and you can't use an extension that has the same name as a standard element. Some of my geological data sets use the geologic age extensions, which are taken from the bio profile, but they don't use the rest of the bio profile. One solution would be to simply use the bio profile for these records. In this case the geologic age elements are recognized properly, but a spurious error message is generated because the bio profile includes one mandatory element, Description_of_Geographic_Extent within Spatial_Domain. The missing element is flagged as an error.
The correct solution is for mp and friends not to know the bio profile elements unless the bio profile is used by choice. That way the same elements can be introduced as extensions in the usual way. So I modified keyword.c and made some changes to the main programs as well, to introduce a function use_element_names (language,profile). This function selects standard element names using the requested language, and adds to them the profile element names (any profile you want, as long as it's the bio profile). (mp 2.7.1) (mq 2.5.1) (cns 2.6.1)
6-Oct-2000 Added direct support for the Biological Data Profile (FGDC-STD-001.1-1999) to all programs. This involved modifications to keyword.h, keyword.c, actions.c, config.c, syntax.c, mp.c, tkme.c, cns.c, and xtme.c.
Activate support for this profile by specifying "profile bio" under "input" in the config file. (mp 2.7.0)
25-Sep-2000 Modified local.c to output more informative messages when a problem is encountered in building the extensions list. Since this does not affect the program's normal operation, I'm not changing its version number. Applies to mp, cns, xtme, and tkme.
15-Sep-2000 Modified add_related_file() in both mp.c and cns.c to add a slash to the end of the result of getcwd. Thanks to Robert Wilhite of NOAA-CSC for pointing this out. (2.6.2)
1-Sep-2000 Modified html.c to rephrase the question for which the answer is taken from the Process_Steps. Previously the question was "How were the data processed and modified?" The question is now written "How were the data generated, processed, and modified?" My hope is that this will help people to be comfortable describing data creation as process steps. (mp 2.6.1)
4-Aug-2000 Modified full_text() in html.c to check q->d before calling strlen() on it. Avoids a crash that I believe occurred when reading XML files and generating faq.html from them. This change made the same day as the change to 2.6.0 documented in the previous Process_Step, so I'm not bumping the version number for it.
4-Aug-2000 Added limited support for foreign languages. This is implemented in mp, cns, xtme, and tkme through a command-line option -l <code> where the code is "es" for Spanish and "id" for Indonesian. Preferred language can also be specified in the config file using
input
  language es
Replace es with id for Indonesian; en would be for English, but if the value is unrecognized or missing the software will use English element names.
Spanish-language element names were kindly provided by Dr. Ing. Carlos López of the Clearinghouse Nacional de Datos Geográficos, Uruguay. <http://www.clearinghouse.gub.uy/>
Indonesian-language element names were kindly provided by the Indonesian National Coordination Agency for Surveys and Mapping (BAKOSURTANAL).
French-language element names were kindly provided by the Canadian Center for Remote Sensing, Natural Resources Canada.
Coincidentally added an extra element to the standard: Metadata_Language, at the end of Metadata_Reference_Information. This element is not required, of course, but is permitted by mp. (mp 2.6) (mq 2.4)
3-Aug-2000 mp crashed if you had an empty Data_Set_G-Polygon element. The fix was to modify upgrade.c to look for Data_Set_G-Polygon_Exclusion_Ring only if a Data_Set_G-Polygon_Outer_G-Ring was found. Thanks to Jennifer Lenz for helping me find this bug. (mp 2.5.7)
7-Jul-2000 Modified xml.c to call strcpy() rather than decode_entities() because the entities are decoded inline at an earlier step in the parsing process; the subsequent call to decode_entities() affected only the ampersand character, which was then ignored. Thanks to Frank Roberts for having the perseverance to lead me to find this bug. (mp 2.5.6)
3-Jul-2000 Modified xml.c to write the apostrophe out in XML as &apos; and the quotation mark as &quot;. Modified entities.c to include these symbols in the ISO-8859-1 encoding table, so they are generated in the SGML output and recognized in SGML input as well. Archie Warnock indicates some XML or SGML parsers need the apostrophe and quotation marks to be encoded in this manner. (mp 2.5.5)
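A minimal sketch in C of the character-to-entity mapping this implies for XML output. The function name xml_entity_for is hypothetical; the set shown is the five predefined XML entities, and whether xml.c handles them in one place like this is an assumption.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from xml.c.
   Returns the XML entity to write for c, or NULL if the character
   can be written as is. Covers the five predefined XML entities. */
const char *xml_entity_for(char c)
{
    switch (c) {
    case '<':  return "&lt;";
    case '>':  return "&gt;";
    case '&':  return "&amp;";
    case '\'': return "&apos;";
    case '"':  return "&quot;";
    default:   return NULL;
    }
}
```

A writer would call this for each character of a value and emit either the returned entity or the character itself.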
3-May-2000 Modified html.c to use write_html_value() instead of a simple munge() for the text associated with a Browse_Graphic_File_Description. In most cases, this text is short and unadorned and a simple output would do fine. However if the metadata writer put any &gt;'s in the value, they should be respected as preformat indicators. (mp 2.5.4)
29-Mar-2000 Modified syntax.c to accept the value "infinite" for the element Denominator_of_Flattening_Ratio. This allows people to use the sphere as a geodetic model. Thanks to Aleta Vienneau for pointing out this problem. (2.5.3)
24-Feb-2000 Modified process_metadata() enabling a -list option to the function value_of. When value_of is used with -list, the result is returned as a list of lines rather than as a single block of text. It makes no sense to have both -list and -nonewline, so writing the value_of command with both options results in an error.
 is_compound <address>
This function returns 1 if the element at the address given is a compound element, and 0 if the element is not compound. (mq 2.3.1)
22-Feb-2000 Modified mq.c to include some new functions:
value_set <address> <text>
This assigns the given text to the element at the address given if that element can contain a value. An error is reported if the address is that of a compound element.
insert ?-before | -after | -child? <address> ?<element>?
An element of the specified type is added to the tree before, after, or as a child of the element whose address is specified. If no placement is specified (that is, neither -before, nor -after, nor -child is given), the element is added as a child of the node whose address is specified. This function does NOT check to see whether the element so inserted is permitted to be there by the FGDC metadata standard.
delete <address>
The element at the specified address is removed from the tree and cannot be recovered.
copy <address>
A copy of the element at the specified address is made and its address is returned as the result of this operation. The copy has no links upward, forward, or back, and can thus be attached to any other metadata record using the paste function.
paste ?-before | -after | -child? <address> <subtree>
The subtree given as the final argument is attached to the current metadata record before, after, or as the last child of the element at the address given. This works only if the address argument is a compound element in the tree.
write ?-format text | sgml | xml? <filename>
The metadata record is written out to the disk file whose name is specified as the final argument. If no format is specified using "-format <format>", then the format is text unless the output file ends with ".sgml" or ".sgm" or ".xml" in which cases the file will be written as SGML or XML.
(mq 2.3.0)
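The rule for choosing an output format from the file name, as described for the write command above, might look like this in C. The function names are hypothetical and the suffix match is assumed to be case-sensitive; this is a sketch, not the code in mq.c.

```c
#include <string.h>

/* Returns 1 if s ends with suffix, else 0. */
static int ends_with(const char *s, const char *suffix)
{
    size_t ls = strlen(s), lx = strlen(suffix);

    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

/* Hypothetical helper, not taken from mq.c.
   Picks the output format when no -format option is given:
   ".sgml" or ".sgm" means SGML, ".xml" means XML, anything else text.
   Assumption: the suffix comparison is case-sensitive. */
const char *format_for_filename(const char *filename)
{
    if (ends_with(filename, ".sgml") || ends_with(filename, ".sgm"))
        return "sgml";
    if (ends_with(filename, ".xml"))
        return "xml";
    return "text";
}
```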
17-Feb-2000 Fixed bug in write_citation() in html.c that, when a Larger_Work_Citation was present, passed its address, rather than the address of its child Citation_Information, to write_citation() for formatting. The result was that no Larger_Work_Citation elements were being output. (2.5.2)
11-Feb-2000 Modified mq.c to express the config-file handling in a different and more useful way. Now we have the following commands
config read config_file
reads the named config file, returns 1 if successful, 0 if not
config find_first option
returns the address of the named option in the config tree. This is something you use in other config commands.
config find_next address ?option?
returns the address of the next option after the one whose address is given as the address argument. If no option name is specified, it looks for an option with the same name as the one whose address is given.
config find_in address option
returns the address of the first option of the given name that is within the subtree headed by the node at the address given.
config value_of address
returns the value of the option at the address given.
(mq 2.2.0)
2-Feb-2000 Major change to mq. The syntax is now more like that of Tk widgets:
read_config config_file
This command reads a standard config file, which will apply to all metadata read in this Tcl session. You can read only one config file, and you have to do it before you read any metadata.
metadata m -parse input_file
Here m is a Tcl variable name. On output it is given a unique value that allows mq to keep track of it. This command returns 1 if the parsing was successful, 0 otherwise.
$m find_first element_name
Here m is a Tcl variable previously passed to the metadata command above, and element_name is a standard or extended element name. This command returns the address of a matching element in hex, or zero. Use the address in subsequent commands.
$m find_in address element_name
Here address is the value returned from the find_first subcommand above or a similar subcommand below. If the given address is an element whose name matches the target element_name, the same address is returned. Otherwise it returns the address of the first child, grandchild, or more distant descendant node whose name matches the target element_name.
$m value_of ?-nonewline? address
If the address given matches a data element, this returns its value as a string. If you specify -nonewline, then it comes as one line, with each line separated by a single blank space. Otherwise each line is separated by the newline character (ASCII 10).
$m contains address
This returns 1 if the address corresponds to one of the elements in the tree, 0 otherwise.
$m name address
This returns the name of the element at the given address.
$m next address
$m prev address
$m parent address
$m child address
These subcommands return the address of the next, previous, parent, or child node in the tree, relative to the given address. Zero is returned if there is no corresponding element. These provide a way to walk through the tree manually.
$m forget
This frees all of the memory used for a metadata record. I cannot recommend using the Tcl command unset; it seems to take a long time to complete.
Note that this interface allows you to read and manage more than one metadata record at a time. If you're going to read a lot of records, you will probably want to use the unset command when you're done with each one.
(mq 2.1.0)
Jan-2000 Introduced a new program "mq", built from guts of mp and tkme. This program reads a Tcl script provided by the user and executes it. In addition to the standard Tcl language, the following procedures may be called:
read_config <config_file>
parse_text <input_file>
find_first <element_name>
find_in <address> <element_name>
find_next <address> <element_name>
value_of <address>
forget
find_first returns the address of an element in hex. This should not be modified, but should only be used as input to find_in, find_next, and value_of. forget causes the entire metadata record to be removed from memory, so you can read another record. (mq 1.0.0)
28-Jan-2000 Significant modification of memory management in all programs. In tree.c, the allocate_item() and deallocate_item() routines have been replaced with routines that allocate items in large chunks and manage the chunks themselves. This produces fewer calls to malloc() and free(), which are costly on some systems. Also in the process I think I found and fixed a number of less obvious bugs in the code, particularly in text.c. (2.5)
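The general idea of allocating items in large chunks can be sketched in C as below. This is a generic free-list pool, not the code in tree.c; the struct layout, the chunk size, and the names pool_alloc and pool_free are all assumptions made for illustration.

```c
#include <stdlib.h>

#define ITEMS_PER_CHUNK 1024   /* assumed chunk size */

/* Hypothetical item layout; the real tree node has other members. */
struct item {
    struct item *next_free;    /* link used only while the item is free */
    int key;
    char *d;
};

struct chunk {
    struct chunk *next;
    struct item items[ITEMS_PER_CHUNK];
};

static struct chunk *chunks = NULL;     /* every chunk obtained so far */
static struct item *free_list = NULL;   /* items available for reuse */

/* Hands out one item, calling malloc() only when the pool is empty. */
struct item *pool_alloc(void)
{
    struct item *it;

    if (free_list == NULL) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return NULL;
        c->next = chunks;
        chunks = c;
        for (int i = 0; i < ITEMS_PER_CHUNK; i++) {
            c->items[i].next_free = free_list;
            free_list = &c->items[i];
        }
    }
    it = free_list;
    free_list = it->next_free;
    it->next_free = NULL;
    it->key = 0;
    it->d = NULL;
    return it;
}

/* Returns an item to the pool without calling free(). */
void pool_free(struct item *it)
{
    it->next_free = free_list;
    free_list = it;
}
```

With a scheme like this, one malloc() call serves the next 1024 item requests, and a released item is simply pushed back on the free list for reuse.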
24-Jan-2000 Modified decode_text() in config.c and in local.c to explicitly look for and then disregard comments. Here a comment is any line whose first non-blank character is the pound sign (#). This change keeps comments from disrupting mp's understanding of the indentation of the config file and the extensions files, so that, for example, if you have
output
    html
# this is a comment; note that it is indented less than the
# element immediately above it.
      file %s.html
      faq %s.faq.html
Then the "file" and "faq" elements will now be properly recognized as children of the "html" element. I think the code used to consider them children of the comment, and thus they were silently ignored. I bumped all minor version numbers by one for this fix. (2.4.40)
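The comment test described in this entry (a line whose first non-blank character is the pound sign) can be sketched in C as follows. The function name is hypothetical, and counting tabs as blanks is an assumption.

```c
/* Hypothetical helper, not taken from config.c or local.c.
   Returns 1 if the first non-blank character of line is '#', else 0.
   Assumption: both spaces and tabs count as blanks. */
int is_comment_line(const char *line)
{
    while (*line == ' ' || *line == '\t')
        line++;
    return *line == '#';
}
```

A reader that skips such lines before measuring indentation never sees the comment as a possible parent element.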
21-Jan-2000 Modified decode_xml() in xml.c to increment line_number at the right time, and not at the wrong time. Error messages parsing XML should now be more logical. Thanks to Frank Roberts for pointing this out. (2.4.39)
6-Jan-2000 Modified xml.c to fix bugs in parser's handling of comments and unrecognized elements. One side effect is that now both XML and SGML tags will be case sensitive. XML is case sensitive, but SGML is not; this may cause trouble if there are any applications out there that input metadata to mp with uppercase tag names. (2.4.38)
5-Jan-2000 Modified config.c to recognize "stylesheet", "type", and "href". Modified xml.c to consult config for "output:xml:stylesheet"; if present, looks for "output:xml:stylesheet:type" and "output:xml:stylesheet:href". Puts the values of those strings into an xml-stylesheet processing instruction. (2.4.37)
5-Jan-2000 Fixed bug in xml.c in which UTF-8 characters were being decoded on input. I want to pass these through unmolested, keeping track of the input encoding, and output them with the same encoding. (2.4.37)
17-Dec-1999 Modified html.c to fix bug where, if map projection parameters were specified but no Map_Projection_Name was present, a null pointer was dereferenced. (2.4.36)
18-Nov-1999 Modified config.c to recognize element "info". Within output:info, a file element is used to specify the name of the info file that cns generates. (cns 2.3.2)
16-Nov-1999 Modified write_xml() in xml.c to not include a DOCTYPE declaration by default. Modified config.c to recognize the element "doctype". If specified under output:xml, the value associated with "doctype" will be output after the XML declaration. Typically that will be a DOCTYPE declaration, but you could really screw things up and use something else instead. If the config file has an empty element output:xml:doctype, then the default DOCTYPE is output.
Modified xml.c to not encode characters on output. When XML is input, the default encoding is assumed to be UTF-8. That can be overridden on input by specifying the encoding in the XML declaration. When plain text or SGML is input, the default encoding is ISO-8859-1. Either way, characters above &127; are output AS IS in XML, and the only characters that are encoded as entities are <, >, and &. (2.4.35)
4-Nov-1999 Implemented word-wrap in mp's text output. Added functions wrap_text() and wrap_subtree() to tree.c. Modified config.c to recognize config word "wrap". Modified write_text() in text.c to look for "wrap" within output:text, and if the value given is greater than zero, to wrap all text values to that width. For example,
 output
   text
     wrap 72
in the config file causes all text in the text output file to be wrapped in paragraphs so that it fits in the first 72 characters of the page. This all assumes 2-space indentation in the text output, which is the default. I also had to modify mp.c to call write_text() last among the output formats so as not to muck up any of the other formats. (2.4.34)
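For readers curious how such wrapping can work, here is a minimal greedy sketch in C. It is not wrap_text() from tree.c: the name wrap_in_place is hypothetical, it ignores the indentation that mp accounts for, and it breaks only at single spaces.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from tree.c.
   Wraps text in place by turning spaces into newlines so that lines
   do not exceed width characters wherever a break is possible.
   Assumptions: no indentation handling; words longer than width are
   left unbroken. */
void wrap_in_place(char *text, size_t width)
{
    size_t line_start = 0;   /* index of the first character of the line */
    size_t last_space = 0;   /* index of the latest space on the line */
    int have_space = 0;      /* whether last_space is valid */

    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == '\n') {          /* existing break starts a new line */
            line_start = i + 1;
            have_space = 0;
            continue;
        }
        if (text[i] == ' ') {
            last_space = i;
            have_space = 1;
        }
        if (i - line_start + 1 > width && have_space) {
            text[last_space] = '\n';    /* break at the latest space */
            line_start = last_space + 1;
            have_space = 0;
        }
    }
}
```

Because it only replaces characters, the sketch never changes the length of the text, so no reallocation is needed.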
4-Nov-1999 Modified html.c to insert </dt> and </dd> tags. This may help in the rare circumstance that someone wants to edit the HTML output using an editor that understands HTML tags. (2.4.34)
7-Oct-1999 Modified mp.c to remove the code that replaces whitespace in output file names with underscores when running under MS Windows (as determined by defined symbol _WIN32). Created Makefile.vc for compiling mp and cns with Visual C++. (2.4.33)
15-Sep-1999 Modified decode_tree() in text.c to generate an additional hint to check the indentation when extraneous text is discarded. (2.4.32)
13-Sep-1999 Modified write_html_faq() in html.c to output additional information if some of the biological profile elements are present. (2.4.31)
9-Sep-1999 Modified check_extension() in syntax.c to improve the error message generated when an element name is encountered at the beginning of a line of text in another element's value. (2.4.30)
30-Aug-1999 Modified check_scalar_children() of syntax.c to clarify (?) the famous "reclassified as text" message. (2.4.29)
27-Aug-1999 Modified parts.ext and parts2.ext so that a Data_Set_Part can have other Data_Set_Part elements within it; this allows data set structure to be described hierarchically. Modified write_html_faq() in html.c to handle this hierarchical structure. (2.4.28)
27-Aug-1999 Modified local.c to add a new function extension_is_compound() that returns 1 if the integer given as the argument is the key value of a compound element, 0 otherwise. The simplest way to do this was to add a member to the internal extension_list structure. That member is an integer 0 or 1 that gets set to 0 by default and then gets set to 1 when a child element is recognized.
Modified check_extension() in syntax.c to properly handle element names at the beginning of a line of text in the text of an extension. Previously it marked these as errors and threw away the text following the element name. (2.4.28)
26-Aug-1999 Modified write_links() in html.c to look for an element called "text" within output:html:link in the configuration. If there is one, the value given to text, if present, becomes the lead text in the line containing links to the other formats. The default value of this text is "Metadata also available as". (2.4.27)
23-Aug-1999 Modified write_date() in html.c to correctly handle cases where a month is given but not a day in an ISO 8601 date. (2.4.26)
19-Aug-1999 Modified html.c to write the lead text in the links line as "Metadata also available as" rather than "Available as" and to not output a link to the current file. (2.4.25)
2-Aug-1999 Modified write_links() in html.c to better handle the case in which the input metadata are in directories other than the one from which mp is run. It was implicitly duplicating part of the file path, because the HREF given in the BASE tag contained some of the path elements also present in the value returned from related_file(). The problem shows up only if you specify a BASE tag in output:html and you don't specify a value for link_faq, link_html, link_xml, etc. The current code still won't work well if you specify a BASE tag but you try to put the various output files in directories other than the one that contains the input file. If you want to put alternative output formats in different directories, you must use the config options link_faq, link_html, link_xml, etc. (2.4.24)
27-Jul-1999 Modified decode_xml() in xml.c to free space allocated for the text; that was temporarily used by the parser but wasn't returned to the system.
23-Jul-1999 Modified xml.c to parse the whole XML input file at once rather than one line at a time. This permits start-tags that have attributes to span multiple lines in the input file. Note that the parser code still assumes that the input data are ISO-8859-1; this is not good because XML uses UTF-8 by default.
Modified config.c to parse the config file using copies of the functions that are used in text.c, illustrating the potential benefits of C++. (2.4.23)
15-Jul-1999 Modified write_links() in html.c to output the link even if the related file isn't generated when a value is specified for the elements link_faq, link_html, etc.
Modified upgrade() in upgrade.c to fix a bug introduced recently when I changed it to properly handle multiple Distributors within a Distribution_Information. (2.4.22)
14-Jul-1999 Modified html.c to encapsulate the code that creates links to related files in a separate static procedure write_links(). This procedure looks at the configuration for a section "output:html:link". Within that section, if there are any of the elements link_faq, link_html, link_text, link_sgml, link_xml, or link_dif, then a link line is written to the output HTML file. For each link_type element, if a value is given it is taken as the URL for the link to the related file, with %s in the value being substituted for the name of the input file, not including its extension or its path.
Modified check_extension() in syntax.c to provide the same slightly more informative error messages as allow() does when an element is unrecognized. (2.4.21)
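The %s substitution described for the link_type elements above can be sketched in C like this. The function name expand_link is hypothetical, and replacing only the first %s is an assumption; write_links() in html.c may behave differently.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from html.c.
   Copies pattern to out, replacing the first "%s" with base (the input
   file name without path or extension). Returns 0 on success, or -1 if
   the result does not fit in size bytes.
   Assumption: only the first "%s" is substituted. */
int expand_link(const char *pattern, const char *base, char *out, size_t size)
{
    const char *mark = strstr(pattern, "%s");
    int n;

    if (mark != NULL)
        n = snprintf(out, size, "%.*s%s%s",
                     (int)(mark - pattern), pattern, base, mark + 2);
    else
        n = snprintf(out, size, "%s", pattern);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

For an input file named roads.met, the value "%s.faq.html" would expand to "roads.faq.html" under these assumptions.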
12-Jul-1999 Modified parse_name() and write_date() in html.c to better handle cases in which (a) an ancestral suffix follows the name and (b) the year is followed by non-numeric characters. (2.4.20)
9-Jul-1999 Modified upgrade() in upgrade.c to fix bug in which the first Distributor was identified as being the first child of Distribution_Information. That isn't true if there's a blank line between them. So mp was creating an additional Distribution_Information for the first Distributor. Thanks to Scott Barnwell for pointing this out. (2.4.19)
8-Jul-1999 Modified write_citation() in html.c to output a colon following the title only if there is either a Series_Information or a Publication_Information.
7-Jul-1999 Modified write_xml_item() in xml.c to fix bug in element-ordering code where one of the grandchildren of an element could be output before its children. There may have been only one place this could occur, in Time_Period_Information where a Multiple_Dates/Times was used. (2.4.18)
23-Jun-1999 Modified xml.c to read and store attributes of XML elements read as XML. The primitive XML parser here still doesn't allow an element, including its attributes, to span multiple lines. This means that an element's start tag cannot begin on one line and end on another. This is not a severe limitation if attributes are not used. However, if attributes are allowed, this could easily become intolerable. (2.4.17)
22-Jun-1999 Modified parse_xml() in xml.c to do the same thing as the previous change did to parse_sgml(); when a blank line is encountered in the middle of a text value, insert a Wblank into the parse tree. (2.4.16)
21-Jun-1999 Modified parse_sgml() in sgml.c so that if a blank line is encountered in the middle of a text value, a blank element is added to the parse tree at that point. (2.4.15)
18-Jun-1999 Modified write_html_faq() in html.c to distinguish non-digital form a little more clearly. Removed the <hr> between Standard_Order_Processes. Fixed bug in output of Range_of_Dates/Times within Time_Period_of_Content in which write_html_item(q) was being called for each child of q, where it should have been called as write_html_item(r) where r loops through the children of q. Fixed a nearby bug where Ending_Date was being written with the prefix "Beginning_Date". (2.4.14)
28-May-1999 Modified config.c so that find_option() looks only at and below the node given as the first argument. This also required adding an overarching root node to the config tree so that find_option (NULL,key) would find things in the output subtree if the input subtree were present. (2.4.13)
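A search confined to a node and its descendants can be sketched in C as below. The struct layout and the name find_at_or_below are hypothetical, and the sketch does not reproduce the special handling of a NULL first argument that find_option() has.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical config node; the real structure in config.c may differ. */
struct option {
    const char *key;
    const char *value;
    struct option *child;   /* first child */
    struct option *next;    /* next sibling */
};

/* Hypothetical helper, not taken from config.c.
   Returns the first node named key at or below node, searching depth
   first, or NULL if there is none. Siblings of node are not examined. */
struct option *find_at_or_below(struct option *node, const char *key)
{
    if (node == NULL)
        return NULL;
    if (strcmp(node->key, key) == 0)
        return node;
    for (struct option *c = node->child; c != NULL; c = c->next) {
        struct option *hit = find_at_or_below(c, key);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}
```

With a single root node above both input and output, a search started at the root reaches the output subtree, while a search started at input does not.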
28-May-1999 Modified xml.c to output the elements in the order given in actions.c. This is necessary because although mp does not require elements to appear in any order, the XML DTD does. Of course it does this only because XML makes it needlessly difficult to specify a content model in which the order of elements is not significant but the elements themselves are. (2.4.13)
27-May-1999 Modified write_text() and write_text_item() in text.c to output an element prefix if one is specified in the config file. So if the config file contains output:text:prefix, then its contents will be output immediately preceding every element name in the text output. This would be helpful if a user wanted to allow an unsophisticated person to edit the text file before re-processing with mp. The trick is that if you put a strange prefix like @@ before each element name, then you can run cns on the edited file, and this makes it easier for cns to distinguish an element name from a text value that starts with one of the element names. (2.4.12)
24-May-1999 Modified translate() in html.c to recognize "mailto:address" and process it the same way as it does "http://" and "ftp://", making the link a live hypertext link. (2.4.11)
6-May-1999 Modified xml.c to include an EncodingDecl in the XML declaration. Modified entities.c to supply a numeric version of ISO-8859-1 character encodings, so that it will be possible to automatically output &#160; instead of &nbsp; in xml output.
30-Apr-1999 Modified a variety of routines to remove unused variables and avoid potential use of uninitialized variables pointed out by gcc -Wall.
30-Apr-1999 Modified write_date() in html.c to correctly output dates where only the year is specified.
27-Apr-1999 Modified write_date() in html.c to output date as is whenever the first character of the value is not a digit. Modified parse_name() in html.c to output the name as is when the first character is '<'. These changes made it possible to run a modified template through mp to show how the elements are used to compose the FAQ-style output.
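The date handling described in the last two entries might look roughly like the following. This is only a sketch: the function name format_date and the D-Mon-YYYY output style are assumptions, chosen to match the dates in this page rather than taken from html.c.

```c
#include <stdio.h>
#include <string.h>

static const char *month_name[12] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/* Format a date given as YYYY, YYYYMM, or YYYYMMDD into `out`.
   Year-only values, values that do not start with a digit, and anything
   else that is not recognized are copied through as is.
   Hypothetical helper; the output style is an assumption. */
void format_date(const char *value, char *out, size_t outsize)
{
    size_t len = strlen(value);
    int month = 0, day = 0;

    /* Only all-digit values of length 6 or 8 carry a month (and day). */
    if (strspn(value, "0123456789") == len && (len == 6 || len == 8)) {
        month = (value[4] - '0') * 10 + (value[5] - '0');
        if (len == 8)
            day = (value[6] - '0') * 10 + (value[7] - '0');
    }

    if (month >= 1 && month <= 12 && len == 8 && day >= 1 && day <= 31)
        snprintf(out, outsize, "%d-%s-%.4s", day, month_name[month - 1], value);
    else if (month >= 1 && month <= 12 && len == 6)
        snprintf(out, outsize, "%s-%.4s", month_name[month - 1], value);
    else
        snprintf(out, outsize, "%s", value);
}
```

Under these assumptions, "19990618" becomes "18-Jun-1999", while "1999" and "Unknown" pass through unchanged.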
27-Apr-1999 Modified text.c to make some static procedures public. This enables Doug Bakewell's MetaMerge program to be more easily compiled with the distributed source code of mp.
20-Apr-1999 Modified upgrade.c to account for the possibility that someone might have more than one SDTS_Terms_Description, and within each, more than one SDTS_Point_and_Vector_Information.
14-Apr-1999 Fixed several missing-brace errors in write_html_faq() and write_dif(). These caused seg faults when running the template, because the template has no actual data.
7-Apr-1999 Modified html.c to include the output file name in the BASE HREF.
6-Apr-1999 Modified config.c to accept keyword dif under output. This was a long standing oversight. Modified config.c to accept keyword base under output:html. The value given to this is a URL that will be put into a <base href="url"> tag near the top of the head element in html output. This causes relative links to work when the html document is retrieved through the clearinghouse (otherwise those links are relative to the clearinghouse gateway, usually not what you want). Modified the links to alternative formats given in the html files; these should now work as intended.
5-Apr-1999 Modified write_html_faq() in html.c to print Non-digital_Form info when that is part of the Standard_Order_Process. Modified the question for which Logical_Consistency_Report is the answer to make it more compatible with topological information. Modified main() in mp.c to remember the names of the output files. Each output file is stored along with its type in a private structure. Modified html.c to output below the data set title a line indicating the other formats in which the metadata are available. Modified write_html_faq() in html.c to output dates in a friendlier format if it is possible to do so.
2-Apr-1999 Modified write_html_faq() in html.c to handle the various ways in which an attribute's Enumerated_Domain elements may be nested within its Attribute_Domain_Values elements. The new method puts them all into a single table (one per attribute). Each Range_Domain, Codeset_Domain, or Unrepresentable_Domain gets its own table.
2-Apr-1999 Modified parse_name in html.c to cancel the parse if the first name is the string "U.S."
Modified some of the questions output in FAQ-style HTML for clarity.
1-Apr-1999 Modified translate() in html.c to encode high-bit characters using the long names (for example &eacute; rather than the character 0xE9). Added a META tag at the beginning of the HEAD element to specify encoding using ISO-8859-1. I know these two actions are probably contradictory, but maybe it won't hurt, and anyway we can always undo the first change and stick the character codes in as is.
1-Apr-1999 Modified upgrade.c so that when it changes the value of Metadata_Standard_Version to FGDC-STD-001-1998, it also deletes any additional children of Metadata_Standard_Version. This occurred when people had multiline responses for Metadata_Standard_Version, which is caused by wrapping the element's text to the next line. Thanks to Susan Stitt for pointing this out.
29-Mar-1999 Modified many of the C source files so that they do not explicitly declare the functions stricmp and memicmp, but instead include a new header file stricmp.h that declares these functions. The header file stricmp.h wraps the declarations in conditional code. If the system on which you want to compile already has the function stricmp, for example, then define the symbol HAVE_STRICMP in the CFLAGS of the makefile. Likewise if your system already has memicmp, define HAVE_MEMICMP.
Also I modified the declaration of memicmp to match that given in the mingw32 headers; specifically I made its pointer arguments to be of type const void * and its integer argument to be size_t instead.
This was all motivated by my discovery that I could compile using the cygwin tools on Windows NT but specify -mno-cygwin and thereby not need to pack the cygwin1.dll into the distribution.
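The conditional declarations could be arranged roughly as below. This is a sketch of the idea only; the actual contents of stricmp.h are not shown in this history, and the fallback implementations here are assumptions. HAVE_STRICMP and HAVE_MEMICMP are the symbols named above.

```c
#include <ctype.h>
#include <stddef.h>

/* Declarations, as a header like stricmp.h might wrap them.
   Each one is skipped when the system already supplies the function. */
#ifndef HAVE_STRICMP
int stricmp(const char *a, const char *b);
#endif
#ifndef HAVE_MEMICMP
int memicmp(const void *a, const void *b, size_t n);
#endif

/* Fallback implementations (assumed; not copied from mp). */
#ifndef HAVE_STRICMP
int stricmp(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    /* Advance while both strings agree, ignoring case. */
    while (*p != '\0' && tolower(*p) == tolower(*q)) {
        p++;
        q++;
    }
    return tolower(*p) - tolower(*q);
}
#endif

#ifndef HAVE_MEMICMP
int memicmp(const void *a, const void *b, size_t n)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    size_t i;

    for (i = 0; i < n; i++) {
        int d = tolower(p[i]) - tolower(q[i]);
        if (d != 0)
            return d;
    }
    return 0;
}
#endif
```

On a system that already provides both functions, adding -DHAVE_STRICMP -DHAVE_MEMICMP to CFLAGS removes the declarations and the fallbacks alike, so the system versions are used.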
16-Mar-1999 Modified write_html_faq() to output the Format_Information_Content using write_html_value() rather than simply munge(). Modified write_citation() to include the Online_Linkages, if present, in an unordered list.
16-Mar-1999 Fixed minor bug in write_html_faq() in which the internal links to labels how.1 and how.2 were not coded with the #, making them external instead.
8-Mar-1999 Modified html.c to correct a brace nesting problem near line 1879.
3-Mar-1999 Completed function write_html_faq() in html.c to generate plain-language output in HTML format. This output format will likely be refined in the near future, but the basic idea is to write the metadata into an HTML file according to a series of plain-language questions.
25-Feb-1999 Modified sgml.c to recognize the entity "&break;" while parsing sgml. When it finds this entity, it will add a Wblank element after the text in which the entity was found is converted into a Wunknown. So if the line contains nothing beyond the "&break;", the Wblank will be inserted where the "&break;" is. If any text follows the "&break;" on the same line, the Wblank will be inserted after that text.
This allows people who are using sgml as input to specify where blank lines should occur within text values, since in sgml blank lines are normally ignored.
17-Feb-1999 Modified html.c to add a new function write_html_faq(), which like the function write_html(), writes an HTML output file. The "faq" version casts the metadata in the form of a FAQ list. FAQ here stands for Frequently Anticipated Questions rather than Frequently Asked Questions, since we have no way to know whether anybody will actually ask these questions, but at least we think we can answer them! ;-) At this writing (19990217) it works on all the examples but doesn't handle more than a few of the questions, and so leaves out quite a bit of the metadata. Expect changes soon.
14-Dec-1998 Modified xml.c to output the entire XML declaration in lower case. Thanks to Joel Register for suggesting this.
10-Dec-1998 Modified translated() in html.c to recognize a right parenthesis as the end of a URL.
8-Dec-1998 Modified config.c to recognize the keyword "ext" under "input". That element, which may be repeated, is used to indicate the file extension used for the input metadata file. This allows the user to expand the set of file extensions beyond the default set mentioned in the previous process step.
7-Dec-1998 Modified mp.c to strip common extensions off the input file name when composing output file names from templates specified in the config file. This means that if you specified in your config file:
output
  errors %s.err
  html
    file %s.html
and process the input file "stuff.txt", the errors will be put into a file called "stuff.err" and the html output will be put into a file called "stuff.html". The file option can be specified for html, sgml, xml, text, and dif output, and has the same effect in each case.
The extensions that will be stripped from the input file name are currently only the following, upper or lower case:
.txt
.sgml
.sgm
.xml
.text
.met
.bin
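The naming rule above can be sketched as follows, assuming the template contains at most one %s and that only the final extension of the input name is examined. The function names stem_length and output_name are hypothetical, not the ones used in mp.c.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Case-insensitive equality test for two whole strings. */
static int same_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0' &&
           tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Length of `name` once a recognized extension has been stripped. */
size_t stem_length(const char *name)
{
    static const char *ext[] = {
        ".txt", ".sgml", ".sgm", ".xml", ".text", ".met", ".bin"
    };
    const char *dot = strrchr(name, '.');
    size_t i;

    if (dot != NULL)
        for (i = 0; i < sizeof ext / sizeof ext[0]; i++)
            if (same_ignoring_case(dot, ext[i]))
                return (size_t)(dot - name);
    return strlen(name);
}

/* Expand the first "%s" in `tmpl` with the stem of `input`.
   Returns 0 on success, -1 if `out` is too small. */
int output_name(const char *tmpl, const char *input, char *out, size_t outsize)
{
    const char *mark = strstr(tmpl, "%s");
    int k;

    if (mark == NULL)
        k = snprintf(out, outsize, "%s", tmpl);
    else
        k = snprintf(out, outsize, "%.*s%.*s%s",
                     (int)(mark - tmpl), tmpl,
                     (int)stem_length(input), input,
                     mark + 2);
    return (k >= 0 && (size_t)k < outsize) ? 0 : -1;
}
```

The template is scanned by hand rather than passed to printf as a format string, so a stray % in the config file cannot be misread as a conversion. With the config shown above, output_name("%s.err", "stuff.txt", ...) yields "stuff.err".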
30-Nov-1998 Modified config.c to include option "prune" within "input". This causes mp to prune the whole tree after fixdoc and upgrade. Not that anyone should do prune in combination with fixdoc, but that's where it is in the code. Default is not to prune.
24-Nov-1998 Modified upgrade() in upgrade.c to avoid following bad pointers when decoding G-Ring polygon data. This bug arose when users ran mp on the template, which has no data.
21-Nov-1998 Modified syntax.c to allow free text in Address_Type as per CSDGM v2. Thanks to Matthew Skala for pointing this problem out.
19-Nov-1998 Modified local.c to properly handle cases where more than 64 extensions are added (bug fix--thanks to Matthew Skala for finding this). This bug affected mp, cns, xtme, and tkme, so I have incremented the versions of all of these to 2.3.
27-Oct-1998 Created a new module fixdoc.c designed specifically to fix some specific problems created by DOCUMENT. Used in conjunction with a set of local extensions, assuming also that cns has been run on the output of DOCUMENT FILE using the same local extensions (and some aliases). This is a fairly complicated procedure but one that I hope will save people from a lot of aggravation as they try to recover the information they entered using DOCUMENT.
27-Oct-1998 Modified find_key() in tree.c to search only the given node and its children, not its siblings.
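A search restricted to a node and its descendants might be written like this. The struct layout and the name find_in_subtree are hypothetical; the real item structure in tree.c is not shown in this history.

```c
#include <stddef.h>

/* Hypothetical tree node: first child plus next sibling. */
struct node {
    int key;
    struct node *child;
    struct node *next;
};

/* Depth-first search of `root` and its descendants for `key`.
   The siblings of `root` itself are deliberately never visited. */
struct node *find_in_subtree(struct node *root, int key)
{
    struct node *c, *hit;

    if (root == NULL)
        return NULL;
    if (root->key == key)
        return root;
    /* Siblings are followed only among the children of root. */
    for (c = root->child; c != NULL; c = c->next) {
        hit = find_in_subtree(c, key);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}
```

The point of the design is that the loop over next pointers starts at root->child, never at root->next, so a caller holding one section of the tree cannot accidentally match an element in a neighboring section.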
20-Oct-1998 Modified syntax.c to fix bug in code that reports errors in composition of Digital_Transfer_Option. Thanks to Kerie Hitt and Curtis Price for finding this bug.
19-Oct-1998 Modified mp.c and cns.c to not assign variable "out" to stderr until runtime; in cygwin32 stderr is not a constant.
30-Sep-1998 Modified syntax.c to correctly flag errors in Multiple Dates/Times (was looking at number of Calendar_Date within this element rather than number of Single_Date/Time within this element). Thanks to Kerie Hitt for pointing this out.
29-Sep-1998 Modified sgml.c and xml.c to print newlines for blanks within text values by default, but only when the blank lines occur within the body of the text value, not when they occur at the end of the value, and not when a blank line occurs between elements.
28-Sep-1998 Created xml.c from sgml.c; modified mp.c to read and write XML using code in xml.c.
28-Sep-1998 Modified upgrade.c to print out slightly more verbose informational messages when adding elements to upgrade the metadata to CSDGM v2.
17-Sep-1998 Renamed metadata.dtd as csdgm1.dtd, copied into file csdgm2.dtd. Modified csdgm2.dtd to reflect syntactical structure of version 2 of the CSDGM. Added a file "catalog" to relate public identifiers to the SGML files included with the software, tested with nsgmls on one metadata record. Updated sgml.c to produce a DOCTYPE that refers to the version 2 DTD.
9-Sep-1998 Modified text.c to test whether top_level has an argument rather than simply to use the value in a stricmp, which caused core dump when the value was not present. Thanks to John Heuer for pointing this out.
26-Aug-1998 Modified syntax.c to use CSDGM version 2. Added upgrade() in upgrade.c to automatically restructure those portions of the metadata that need to change to fit version 2. Added version strings in revision.c, declared in revision.h. Set version number to 2.0 for mp, cns, xtme, tkme, and stomp.
24-Jun-1998 Modified several points within dif.c to fix bugs that caused it to crash when generating dif records for the examples.
1-May-1998 Modified write_html() in html.c to place the name tag around only the <hr> at the top of each major section. The concern was that poor parsers of HTML might not properly match the </A> with the most recent <A ...> tag, which would result in an error where links appear within major sections. Thanks to Curtis Price of USGS WRD for pointing this out.
16-Mar-1998 Modified dif.c to conform to version 6 (19980202) of the Directory Interchange Format. Thanks to Lynn Halpern (STX) for prompting and assisting in this upgrade.
11-Mar-1998 Modified html.c to fix bug (found by Susan Stitt) in which the order of options under output:html was significant (the first of {preformat, meta, translate} was interpreted properly, the others ignored).
25-Feb-1998 Reordered the Process_Step elements in this document to be monotonic with time.
20-Feb-1998 Fixed minor bug in text.c. If a textual value began on the same line as the element name but that line was followed by a blank, it always discarded the text following the element name. Workaround was to close up the blank line. It now looks forward in the list of elements to the next non-blank element; if that is Wunknown or EOF, the text following the element name is retained as part of the value of the element. If the next non-blank line is a recognized element, the text following the original element name is discarded with a warning.
17-Feb-1998 Modified html.c to translate bare URLs into live links. This recognizes ftp://url and http://url as forms of urls as well as the older format <URL:theURL>. Note that <http://theURL> will now become live as well.
14-Jan-1998 Modified key_of_ps8_tag() in ps8.c and key_of_astm_tag() in astm.c to call a new function extension_of_sgml() in local.c that returns the key of a local extension. This causes mp and stomp to recognize extensions in sgml input as well as in text input, using the same mechanism. Thanks to Lisa Peoples for helping to find this bug.
12-Nov-1997 Modified write_html_item() in html.c to not display the preformat indicator character in preformatted sections.
6-Nov-1997 Modified write_html_item() in html.c to simplify code and properly preformat all sections preceded by >.
31-Oct-1997 Modified config.c to recognize preformat and meta elements under output:html. preformat causes <pre> tags to enclose sections of textual values whose lines all begin with >. To use a different character, give it as the value of preformat. It is on by default, so specify "preformat off" to disable this feature. meta enables the generation of dublin-core meta elements in html output. It is on by default. To disable dublin-core metadata, use "meta off".
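The test for a preformatted section, as described, amounts to checking that every line starts with the indicator character. A sketch, with the hypothetical name all_lines_marked and the assumption that an empty value does not qualify:

```c
#include <stddef.h>

/* Return 1 if every line of `text` begins with `mark`, else 0.
   An empty string returns 0, and so does any text containing an
   empty line (assumptions; mp's own rules may differ). */
int all_lines_marked(const char *text, char mark)
{
    const char *p = text;

    if (*p == '\0')
        return 0;
    while (*p != '\0') {
        if (*p != mark)
            return 0;
        /* Skip to the start of the next line. */
        while (*p != '\0' && *p != '\n')
            p++;
        if (*p == '\n')
            p++;
    }
    return 1;
}
```

A caller would pass '>' by default, or whatever character was given as the value of preformat in the config file, and wrap the value in <pre> tags only when the function returns 1.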
12-Sep-1997 Modified keyword.c, config.c, and local.c to avoid overflowing the local variable string when trying to recognize an element name.
21-Aug-1997 If you specified the html prefix or suffix for an element value but not for the element name, it didn't use the default element name prefix and suffix, it used nothing. In this change html.c was modified to use the default name prefix or suffix if the output:html:element:name: prefix or :suffix are not specified. If ...:name:prefix is specified but is blank, then the default name prefix is not used. This allows you to specify that no prefix be used even if there is a default prefix (same for suffix).
8-Jul-1997 Modified parse_sgml() in sgml.c to not output the warning about extraneous text if the extra text is entirely composed of whitespace.
10-Jun-1997 Modified print_item() in tree.c to fix bug--was missing the variable to be printed in one of the printf calls.
19-May-1997 Modified parse_sgml() in sgml.c to fix bug in which unrecognized SGML tags were not processed properly. Changed parse_sgml() to permit execution to proceed when unrecognized tags occur. Added code to decode ISO 8859-1 entities in incoming SGML text.
7-May-1997 Modified parse_sgml() in sgml.c to not aggregate lines in the input. Modified write_html_item() in html.c to use <dd> for single-item data values of length 64 or greater. Previously these would be put into the <dd> tag. Modified translate() in html.c to use a managed dynamic buffer rather than a static array.
1-Apr-1997 Modified write_html() in html.c to allocate more scratch space for the title. This caused rare crashes that could not be easily anticipated or duplicated because they depended on the granularity of heap space, on the number of lines in the title, and on the lengths of the title lines.
17-Mar-1997 Modified write_contact_info() in dif.c to use more stack space for lname, mname, and fname (was 32 bytes in each case).
30-Jan-1997 Modified decode_tree() in text.c to allow scalar elements to begin on the line that contains the element name.
8-Jan-1997 Modified keyword.h to define the values of enum fgdc_keyword, so that binary files will have more of a chance at portability. Modified config.c to recognize the element "binary" in the config file. Modified mp.c to read and write binary files on request. Added module binary.c to carry out encoding and decoding of binary files.
28-Oct-1996 Modified config.c to understand element 'body' in output:html; the text given for 'body' will be appended to the <body> tag in the html output, allowing the user to specify background color for the html.
17-Oct-1996 Modified config.c to fix bug in unify_strings() where allocated block dst was not initialized before a call to strcat (crashed on Linux).
Modified sgml.c to output Eric's suggested public identifier for DTD 1.0.
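The unify_strings() bug is a common one: malloc() does not clear the block it returns, and strcat() needs a terminated string to append to. A sketch of the safe pattern follows; join_strings is a hypothetical stand-in, not the real unify_strings().

```c
#include <stdlib.h>
#include <string.h>

/* Concatenate `a` and `b` into a newly allocated string.
   The caller frees the result. Returns NULL if allocation fails. */
char *join_strings(const char *a, const char *b)
{
    size_t n = strlen(a) + strlen(b) + 1;
    char *dst = malloc(n);

    if (dst == NULL)
        return NULL;
    dst[0] = '\0';      /* without this, strcat() scans uninitialized memory */
    strcat(dst, a);
    strcat(dst, b);
    return dst;
}
```

Omitting the dst[0] = '\0' line may appear to work on systems where fresh heap blocks happen to contain zeros, which would be consistent with the crash showing up only on some platforms.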
30-Sep-1996 Modified syntax.c to classify errors and maintain counts of six different kinds of errors: unrecognized elements, misplaced elements, missing elements, superfluous elements, empty elements, and elements with the wrong sort of values. Modified mp.c to write a one-line report to stderr showing the number of each type of error.
Fixed a bug in syntax.c that caused the less-than-intuitive error message "(unknown) is not permitted in <element>". The new text is more informative.
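Counting errors by kind and summarizing them on one line could be sketched as below. The enum names, the labels, and the layout of the report are assumptions; only the six categories come from the entry above.

```c
#include <stdio.h>
#include <stddef.h>

/* The six kinds of error named above; ERR_KINDS is the count. */
enum error_kind {
    ERR_UNRECOGNIZED, ERR_MISPLACED, ERR_MISSING,
    ERR_SUPERFLUOUS, ERR_EMPTY, ERR_BAD_VALUE, ERR_KINDS
};

static const char *error_label[ERR_KINDS] = {
    "unrecognized", "misplaced", "missing",
    "superfluous", "empty", "bad_value"
};

struct error_counts {
    int n[ERR_KINDS];
};

/* Record one error of the given kind. */
void count_error(struct error_counts *c, enum error_kind k)
{
    c->n[k]++;
}

/* Build a one-line summary in `out`. Returns 0, or -1 if it does not fit. */
int format_error_report(const struct error_counts *c, char *out, size_t outsize)
{
    size_t used = 0;
    int k, w;

    for (k = 0; k < ERR_KINDS; k++) {
        w = snprintf(out + used, outsize - used, "%s%d %s",
                     k > 0 ? ", " : "", c->n[k], error_label[k]);
        if (w < 0 || (size_t)w >= outsize - used)
            return -1;
        used += (size_t)w;
    }
    return 0;
}
```

A caller would print the finished line with fprintf(stderr, "%s\n", line) once syntax checking is complete.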
27-Sep-1996 Modified dif.c to output a newline where Wblank appears in the parse tree rather than "(blank):"
26-Sep-1996 Modified config.c to recognize options output:html:header and footer. Modified html.c to output text of header before the title in the body of the html, and to output text of footer after the "generated by mp" line at the end of the html.
18-Sep-1996 Modified sgml.c to swat bug. If a configuration file was used and the output:sgml:blanks was also used, a seg fault could occur.
7-Sep-1996 Modified sgml.c to include a rudimentary SGML parser. It has essentially no flexibility and no error recovery.
Modified mp.c to use this SGML parser if the input file's name ends with .sgml or .sgm (case not sensitive).
6-Sep-1996 Modified syntax.c to not test blank lines in scalar values.
26-Aug-1996 Modified syntax.c to limit the number of Enumerated_Domain to one per Attribute_Domain_Values. This bug spotted by Gerry Daumiller.
9-Aug-1996 Modified text.c to output {single scalar followed by blank} the same way as {single scalar}; immediately following the element name. Modified syntax.c to strip enveloping quotes from scalars that have restricted domains.
6-Aug-1996 Modified config.c to recognize and html.c to use the configuration element output:html:element:value:obeylines which, when present, causes html.c to append a line break <br> to each line of the value of the specified element.
6-Aug-1996 Modified html.c to remove the default formatting of elements. These should be controlled by a configuration file. Also modified this file so that it correctly recognizes when no value is given as the default prefix or suffix of element names.
6-Aug-1996 Modified all source files containing #ifdef's so that these preprocessor directives occur only at the beginnings of lines, to work with non-standard compilers such as the one supplied with OSF/1. The ANSI standard came out in 1987. Haven't these vendors *read* that document by now?
19-Jul-1996 Modified equalize_indented_scalars() in text.c to avoid writing into p->next->prev when p->next is NULL. Modified sgml.c to avoid warning on some systems casting char * to unsigned char *.
5-Jul-1996 Modified main() in xtme.c, mp.c, and cns.c to read more than one local extensions file. This should enable people to choose more carefully which extensions will apply to a given input file.
29-Jun-1996 The code described in the previous process step has been excised. Instead, the function equalize_indented_scalars was added to try to fix up textual values that have indentation. The basic problem is that you don't want blank lines to be parents of text values, nor for that matter do you want text values to be parents of text values. This code is not comprehensive, and probably needs more work.
23-Jun-1996 When a line in a text value was indented relative to those preceding it, the line was correctly recognized by the parser as being a part of the textual value. But when such an indented line immediately followed a blank line, the indented line was considered a sibling of the blank and a child of the preceding text line. This caused the line to be omitted from the output. Not good. The fix is to modify check_unknown in syntax.c so that this case is recognized and the topology is rearranged to fit the situation.
I'm not sure whether this is the right place to fix this problem. I think we could also fix it in the parser, by assigning the indent value of lines that follow blanks differently if the preceding nonblank line has key Wunknown. This would require more sophisticated look-back at that point in the parser, however, and I don't know whether it would correctly handle more complicated situations.
21-May-1996 Modified html.c to fix bug in which default prefix for element names was taken to be "prefix", and default suffix "suffix". Thanks to Hugh Phillips for pointing this out.
5-Apr-1996 Modified astm.c to remove the information about z39.50 numbers. Added the module z3950.c to carry this information appropriately, and modified local.c to allow users to include in the description of extensions a characteristic z3950 whose value is the numeric tag assigned to the element.
28-Mar-1996 Modified html.c and config.c to allow new syntax for specifying the prefix and suffix tags of element names and element values in html. The new syntax is
output
  html
    element
      name
        prefix <html code>
        suffix <html code>
      value
        prefix <html code>
        suffix <html code>
This allows specific html code to precede and follow both the element name and the element value. Created a new file called deluxe.cfg in doc that shows how this new syntax can be used to link every element name back to the correct section of my hypertext rendition of the standard.
15-Feb-1996 Modified config.c to correctly recognize the component "top_level" of "text"; this controls whether the top-level Metadata element is preserved on output or omitted.
1-Feb-1996 Modified syntax.c to permit more than one Entity_and_attribute_Overview in an Overview_Description. Thanks to Chuck Stein for pointing out this bug.
1-Nov-1995 Modified mp.c to remove the generic tree-handling routines to tree.c, which was added to the Makefiles.
26-Sep-1995 Modified sgml.c to make 8-character tags the default, as specified in the GEO attribute set for Z39.50.
14-Jul-1995 Modified config.c to allow the keyword skip_extensions under output:sgml. Modified sgml.c to look for this keyword in the configuration info. If present, then elements that are not part of the 19940608 CSDGM will not be included in the sgml output.
11-Jul-1995 Modified ps8.c to reflect the following tag name changes:
 tempkeyt -> tempkt
 accscons -> accconst
 accsinst -> accinstr
 columns  -> colcount
 vertcnt  -> vrtcount
Rebuilt makedtd and rebuilt ps8.dtd.
30-Jun-1995 Modified sgml.c and html.c and config.c to allow users to specify a string that will be output wherever blank lines occur in the input file. The default for SGML is ""; the default for HTML is "<P>\n". This option is specified by putting the keyword blanks under the keyword sgml or html under output in the configuration file.
30-Jun-1995 Created local.c to replace extend.c for handling local extensions to the standard. This module provides a mechanism for user-specified element names, with corresponding SGML tags, to be handled properly by the parser and by the code generators. Syntax checking is relatively primitive, and is based on a parent list and child list associated with each local element. Essentially this means that local extensions are always optional and repeatable and their children, if any, are always optional and repeatable.
27-Jun-1995 Created config.c and config.h to handle configuration issues through a configuration file, containing key words in the same general form of the metadata. Modified mp.c, text.c, html.c, and sgml.c to consult the information contained in the config file.
26-Jun-1995 Modified text.c to handle blank lines in a more logical and comprehensive fashion. Blank lines are now assigned indentation prior to the disruption of links that forms the overall parse tree. Indentation assigned to a blank line is the larger of the indentation of the previous non-blank element or the next non-blank element in the file. This ensures that blank lines will not have children (if they did, the children would not appear in the output) and only occur as members of lists.
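The indentation rule for blank lines can be sketched over a flat array of lines. The struct and function names are hypothetical, and treating a missing neighbor as indentation 0 is an assumption.

```c
#include <stddef.h>

/* Hypothetical per-line record: indentation and a blank flag. */
struct line {
    int indent;
    int blank;
};

/* Give each blank line the larger of the indentation of the previous
   non-blank line and of the next non-blank line. A missing neighbor
   counts as indentation 0 (an assumption). */
void assign_blank_indents(struct line *lines, size_t n)
{
    size_t i, j;
    int prev = 0;   /* indentation of the most recent non-blank line */
    int next;

    for (i = 0; i < n; i++) {
        if (!lines[i].blank) {
            prev = lines[i].indent;
            continue;
        }
        next = 0;
        for (j = i + 1; j < n; j++) {
            if (!lines[j].blank) {
                next = lines[j].indent;
                break;
            }
        }
        lines[i].indent = prev > next ? prev : next;
    }
}
```

Because a blank line is never indented less than the non-blank line that follows it, that following line cannot end up as its child when the tree is built.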
26-Jun-1995 Modified html.c to correctly skip the Metadata tag at the top of the tree. Added code to correctly handle ampersands and double-quotes.
23-Jun-1995 Added code to text.c, syntax.c, html.c, and sgml.c to handle blank lines. Modified item structure to include a prev pointer within compound types.
22-Jun-1995 Added functions in html.c to handle translation of reserved characters in html output. This converts <URL:theURL> in the input to <a href="theURL"><tt>&lt;URL:theURL&gt;</tt></a> in the output, thus activating URLs embedded in the metadata. It also converts other occurrences of < and > to &lt; and &gt; respectively. This is not necessarily desirable in all cases; there is an internal variable called do_text_translation that controls whether or not this gets done. If do_text_translation is set to 0, textual values are conveyed to the html file as they appear in the input file. If do_text_translation is nonzero, textual values are translated as described.
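A sketch of the translation just described. The function names are hypothetical, and this version also escapes ampersands and double quotes, which goes beyond what this entry states.

```c
#include <stddef.h>
#include <string.h>

/* Append `len` bytes of `s` to `out`, keeping it terminated. */
static int put(char *out, size_t outsize, size_t *used, const char *s, size_t len)
{
    if (*used + len >= outsize)
        return -1;
    memcpy(out + *used, s, len);
    *used += len;
    out[*used] = '\0';
    return 0;
}

/* Append `len` bytes of `s`, replacing HTML reserved characters. */
static int put_escaped(char *out, size_t outsize, size_t *used, const char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        const char *rep = NULL;
        switch (s[i]) {
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '&': rep = "&amp;";  break;
        case '"': rep = "&quot;"; break;
        default: break;
        }
        if (rep != NULL ? put(out, outsize, used, rep, strlen(rep))
                        : put(out, outsize, used, s + i, 1))
            return -1;
    }
    return 0;
}

/* Turn <URL:theURL> into a live link and escape everything else.
   Returns 0 on success, -1 if `out` is too small. */
int translate_text(const char *in, char *out, size_t outsize)
{
    static const char open_a[] = "<a href=\"";
    static const char mid[] = "\"><tt>&lt;URL:";
    static const char close_a[] = "&gt;</tt></a>";
    size_t used = 0;
    const char *end;

    if (outsize == 0)
        return -1;
    out[0] = '\0';
    while (*in != '\0') {
        if (strncmp(in, "<URL:", 5) == 0 && (end = strchr(in + 5, '>')) != NULL) {
            const char *url = in + 5;
            size_t n = (size_t)(end - url);
            if (put(out, outsize, &used, open_a, strlen(open_a)) ||
                put_escaped(out, outsize, &used, url, n) ||
                put(out, outsize, &used, mid, strlen(mid)) ||
                put_escaped(out, outsize, &used, url, n) ||
                put(out, outsize, &used, close_a, strlen(close_a)))
                return -1;
            in = end + 1;
        } else {
            if (put_escaped(out, outsize, &used, in, 1))
                return -1;
            in++;
        }
    }
    return 0;
}
```

A switch comparable to do_text_translation would simply bypass translate_text() and copy the value through unchanged when set to 0.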
10-Jun-1995 Created module extend.c to handle lookup and translation of local element names (i.e. element names not part of the FGDC standard). Modified keyword.c to allow the functions key_of and text_of to return valid data when extensions are given as their arguments. Modified astm.c and ps8.c to report as the sgml tag of an extension the tag name returned from text_of_extension() in extend.c. This is not entirely satisfactory, because extensions might be structured like FGDC keywords, i.e. a long form for textual reports and a short (8- or 10-character) tag name for sgml.
31-May-1995 Modified mp.c to add functions for node handling:
 void deallocate_item (struct item *p);
 struct item *insert_item_after (struct item *r);
 struct item *insert_item_before (struct item *q);
 struct item *add_child (struct item *p);
and added comments explaining these functions.
Added code to main in mp.c to insert a parent Metadata node if one is not already present. This causes the syntax checker to report missing major sections that are required.
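As an illustration of what one of these node-handling functions might do, here is a sketch of insert_item_after() over a doubly linked sibling list. The struct layout is hypothetical (the real struct item has more fields, and its prev pointer was added later), and the body is an assumption rather than the code in mp.c.

```c
#include <stdlib.h>

/* Hypothetical layout, reduced to the links needed here. */
struct item {
    struct item *parent;
    struct item *prev;
    struct item *next;
};

/* Allocate a new item and link it in as the sibling just after `r`.
   Returns the new item, or NULL if `r` is NULL or allocation fails. */
struct item *insert_item_after(struct item *r)
{
    struct item *p;

    if (r == NULL)
        return NULL;
    p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->parent = r->parent;
    p->prev = r;
    p->next = r->next;
    if (r->next != NULL)        /* r may be the last sibling */
        r->next->prev = p;
    r->next = p;
    return p;
}
```

The check on r->next matters: writing through r->next->prev when r is the last sibling would dereference a null pointer.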
26-May-1995 Modified text.c to output a warning if the indentation is ambiguous, as in
A:
  B:
      C:
    D:
Here the parser assumes that D is a member of A, not of B. But the user might have intended D to be a member of B. The parser cannot tell, so it issues a warning.
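One way to detect this situation is to keep a stack of the indentation levels currently open and to flag any line that backs out to a level never seen before. The sketch below only counts such lines; it does not reproduce mp's choice of parent, and the function name and depth limit are assumptions.

```c
#include <stddef.h>

#define MAX_DEPTH 64   /* arbitrary limit for this sketch */

/* Count lines whose indentation drops below the previous line's
   but matches none of the indentation levels still open. */
int count_ambiguous_indents(const int *indent, size_t n)
{
    int stack[MAX_DEPTH];
    size_t depth = 0, i;
    int warnings = 0;

    for (i = 0; i < n; i++) {
        int popped = 0;

        /* Close every open level deeper than this line. */
        while (depth > 0 && stack[depth - 1] > indent[i]) {
            depth--;
            popped = 1;
        }
        if (depth > 0 && stack[depth - 1] == indent[i])
            depth--;            /* a sibling replaces the top entry */
        else if (popped && depth > 0)
            warnings++;         /* backed out to a level never opened */
        if (depth < MAX_DEPTH)
            stack[depth++] = indent[i];
    }
    return warnings;
}
```

For the example above, with indentations 0, 2, 6, 4, the line for D closes the level at 6 but finds no open level at 4, so one warning is counted.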
26-May-1995 Modified text.c to not output the error message about too much indentation. The code already functions correctly for files that are indented consistently within sections.
25-Apr-1995 Modified syntax.c to issue warning rather than error when a scalar value is missing, and to include text of unknown keywords in error messages.
3-Apr-1995 Modified astm.c to include tags for map projection names taken from Doug Nebert's DTD:
 ALBERSCEA
 AZIMUTHAL
 EQUIDISTC
 EQUIRECT
 GENERALVNP
 GNOMONIC
 LAMBERTAZ
 LAMBERTCC
 MERCATOR
 MILLER
 MODSALASKA
 OBMERCATOR
 ORTHO
 POLYCONIC
 PSTEREO
 ROBINSON
 SINUSOIDAL
 SPOBLMERC
 STEREO
 TM
 VANDERG
Excised sgml tags from sgml.c; transferred the eight-character tags to a new file called ps8.c. Choice of tags to use in generating SGML output is governed by a local variable called do_astm in sgml.c; currently this is always set to 1, causing the astm tags to be used.
20-Mar-1995 Modified sgml.c to use eight-character tags that are different from the ASTM tags. The ASTM tags cannot be used in sgml because many are ten characters long, and SGML names are restricted to eight characters.
Created a new module astm.c containing the ASTM tags, Z39.50 numbers, and the code to go between those and FGDC keywords. This module is not currently linked into the executable.
23-Feb-1995 Modified html.c to always output two spaces after the </em> tag in a keyword.
22-Feb-1995 Modified html.c to cope better with titles that span more than one line. The separate lines of the title are concatenated for both the document title and the top-level heading.
23-Jan-1995 MP was used to process metadata for the opening of the USGS node of NSDI.
Generated by whatsnew.tcl 1.2 on 04-Dec-2014 11:51

U.S. Department of the Interior | U.S. Geological Survey
URL: http://geology.usgs.gov/tools/metadata/tools/doc/metadata/mp-revision.html
Page Contact Information: Peter Schweitzer