Editorial: Can I limit the length of metadata fields?

A user outside USGS asked:
[I'm creating a database to hold the metadata. Some fields allow free text.] I would need to allow quite a bit of storage space to accommodate the possibility that someone would enter more than 255 characters for, say, Temporal Keyword Thesaurus. I would rather use my judgement and set up the front-end application to enforce a size restriction determined by a review of the probable nature of entries for each data element. I am hesitant to disallow anything the standard allows. However, some choices need to be made.

Reply by Peter Schweitzer on 27 Nov 2001:

Yes, even though the standard makes no restrictions on field lengths, it makes sense for an implementation to do so for practical reasons. The trick is to avoid unreasonable restrictions. For example, the old DOCUMENT.AML for Arc/Info, written by some folks at USGS long ago, didn't allow you to put more than about 40 characters in the Title. That was nutty, of course--I think in the >800 records on geo-nsdi there aren't more than one or two that have titles that short. However, it's not unreasonable to say the title should be fewer than 256 characters. It ought to be pretty easy to read, since it will show up first in searches and will be shown in fairly large type on HTML renditions of the metadata. So while the standard doesn't say how long the field can be, practical concerns make a compelling case for reasonable limits.

I do think it's important to let some of the narrative text fields be quite a bit longer. 256 characters is way too short for an Abstract or even a Purpose, and while the quality reports are often short, they really shouldn't be, so those fields should allow long texts too. Process_Description is another field that needs to be long, as is Entity_and_Attribute_Overview, if that is used.

But many of the text fields can be limited to 256 characters or fewer. For example, it would be a mistake to allow people to put lots of text into Online_Linkage; that really ought to be just a URL now, even though in 1994 when the standard was written it wasn't clear how well-known URLs would become.
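
One way a front-end might encode this split between short and long fields is a simple lookup table. What follows is only a sketch in Python: the element names come from the standard, but the specific limits, the names SHORT_LIMIT, MAX_LENGTH, within_limit, and looks_like_url, and the list of accepted URL schemes are all assumptions made for illustration, not anything the standard prescribes.

```python
from urllib.parse import urlparse

# Hypothetical limit for short fields; pick a value that suits your database.
SHORT_LIMIT = 255

# Hypothetical per-element limits. None means "no front-end limit",
# which suits the narrative fields that need room.
MAX_LENGTH = {
    "Title": SHORT_LIMIT,
    "Temporal_Keyword_Thesaurus": SHORT_LIMIT,
    "Online_Linkage": SHORT_LIMIT,
    "Abstract": None,
    "Purpose": None,
    "Process_Description": None,
    "Entity_and_Attribute_Overview": None,
}


def within_limit(element, text):
    """Return True if text fits the limit chosen for this element."""
    # Elements not listed above fall back to the short limit (an assumption).
    limit = MAX_LENGTH.get(element, SHORT_LIMIT)
    return limit is None or len(text) <= limit


def looks_like_url(text):
    """Rough check that an Online_Linkage value is a bare URL, not prose."""
    value = text.strip()
    if any(ch.isspace() for ch in value):
        return False  # embedded whitespace suggests free text around a URL
    parts = urlparse(value)
    return parts.scheme in ("http", "https", "ftp") and bool(parts.netloc)
```

For instance, within_limit("Title", some_text) would gate what goes into the Title column, while looks_like_url could be applied to Online_Linkage before the record is saved.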

So I think you're on the right track. Impose sensible limits and encourage your users to let you know when they're tempted to break those limits--it could mean they don't understand some part of the standard and are trying to shoehorn in some information that really belongs somewhere else. For example, I sometimes see people putting details of attributes into Entity_and_Attribute_Detail_Citation. What should go there is a reference (preferably a URL) to a published data dictionary. By limiting the amount of info people can enter, you might actually help channel their thinking in the right direction.
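
To make the limit a prompt for feedback rather than a silent truncation, the front-end could hand back a message when an entry runs long. This is a minimal Python sketch; the function name check_entry and the wording of the message are hypothetical, and the limit is whatever you have chosen for the element in question.

```python
def check_entry(element, text, limit):
    """Return None if text fits within limit, otherwise a message for the user."""
    if len(text) <= limit:
        return None
    over = len(text) - limit
    return (
        f"{element} is limited to {limit} characters; this entry is {over} over. "
        "If the limit seems too tight, please tell the metadata administrator; "
        "the information may belong in a different element."
    )
```

Rejecting the entry with an explanation, rather than cutting it off, gives the user a reason to ask where the extra information should go.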