The date time spec has come out of the bibliographic community, which has the need to record many different kinds of dates and have information about their certainty. Dates support a variety of functions and I will name some, although this is not to be considered comprehensive.
*discovery: to find resources of a certain date or date range or era
*identify and select: to identify that the resource you are looking for is the correct one. This applies to people and organizations as well (e.g. the John Smith I'm looking for was born in the second decade of the twentieth century or was active somewhere around 1720 or 1730).
*obtain: in a resource that is published over time (e.g. a journal) to be able to obtain the appropriate issue/article when that is known (e.g. I want the issue published during the second week of June in a biweekly publication)
*timelines: to provide a view of historical material through a timeline, for instance a time interval corresponding to some event, e.g. the Civil War (1861-1865).
*management of bibliographic resources: to ascertain that the issue of a journal was received when it was expected, knowing its publication pattern (e.g. it is always published on the 3rd Thursday of the month).
As we move into XML formats, many of these dates need to be expressed in structured forms so that they are unambiguous and can be validated or otherwise processed by computers. In many cases there are uncertainties and those need to be known for various purposes.
I have been involved in discussions over many years that have concluded that W3CDTF does not cover in enough detail the kinds of dates that are needed in cultural institutions. The kinds most often noted are: uncertain dates, approximate dates, time intervals where the start or end is not known, and time intervals where the end is open (e.g. for a publication such as a journal that is ongoing). If expressed in an unambiguous standard form then the date can be validated.
The case that first prompted us to develop this proposed spec was the open date. It arose in the PREMIS data dictionary for preservation metadata, where we wanted to express that the rights for use or preservation of a particular resource were good beginning on a set date, but there was no end date applicable (i.e. it was an "open" time interval). We wanted to give the end date as "open" and be able to use that value for processing, e.g. to find all resources with open dates in order to contact the rights holder and verify that it was still legal to use them. In the same way, we might want to know which dates are uncertain so that there is a trigger to reevaluate them later on.
XML metadata formats in the bibliographic community that have elements for many kinds of dates include PREMIS, MODS, METS, EAD and MARC (when expressed in XML).
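As a rough illustration of the open-date processing described above, here is a minimal Python sketch. The record layout, the field names and the resource identifiers are hypothetical and are not taken from PREMIS; the only thing borrowed from the discussion is the idea of recording the end of the interval as the value "open".

```python
from dataclasses import dataclass

OPEN = "open"  # assumed marker for an interval with no applicable end date


@dataclass
class RightsStatement:
    """Hypothetical record: rights on one resource over a time interval."""
    resource_id: str
    start: str  # start date as a string, e.g. "2004-06-01"
    end: str    # end date as a string, or OPEN


def with_open_end(statements):
    """Return the statements whose interval has no end date."""
    return [s for s in statements if s.end == OPEN]


statements = [
    RightsStatement("res-1", "2004-06-01", OPEN),
    RightsStatement("res-2", "2001-01-01", "2010-12-31"),
    RightsStatement("res-3", "2008-03-15", OPEN),
]

# Resources whose rights holder should be contacted for verification.
open_ids = [s.resource_id for s in with_open_end(statements)]
print(open_ids)
```

The same filter would work for an "uncertain" flag, provided the uncertainty is recorded as a value that software can test for rather than as free text.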
For identification purposes you may need to know what the possible dates of birth or death are when you don't know them for certain. For instance you may know that someone was born in either 1825 or 1826. Or you may have a resource that gives the birth or death date in the Hebrew calendar, which would translate to the Gregorian calendar as one of two years. For research purposes, and to establish that person using a controlled form of name, you need to know this information.
Another point is that the library and archival communities have rather complex rules for describing people, organizations and bibliographic resources, all of which support various user tasks. Some of the complex rules involve being as precise as possible when one unambiguous date is not known.
Suffice it to say that in managing access and use of bibliographic materials and creating and developing files of authoritative data about people, organizations and resources, there are many dates that are needed in as accurate a form as possible.
Rebecca
Rebecca S. Guenther
Senior Networking & Standards Specialist
Network Development & MARC Standards Office
Library of Congress
101 Independence Ave SE
Washington, DC 20540
voice: +1.202.707.5092
fax: +1.202.707.0115
-----Original Message-----
From: Discussion of the Developing Date/Time Standards [mailto:[log in to unmask]] On Behalf Of C. M. Sperberg-McQueen
Sent: Tuesday, January 11, 2011 1:14 PM
To: [log in to unmask]
Subject: [DATETIME] comments on draft EDTF spec of 5 Nov 2011
Comments on "EDTF Specification DRAFT FOR REVIEW" of 5 November 2010
(http://www.loc.gov/standards/datetime/spec.html)
...
3 What problem is being solved here? What are the requirements?
The goals and requirements of the work are not clear (to this reader, at least) from the document. The section 'Background' begins with the promising statement
No standard date/time format meets the needs of XML metadata
schemas.
But the document does not seem to provide any list of what its authors believe the needs of XML metadata schemas are, or why no existing standard date/time format meets them.
The Web page "Problem, Requirements, and Basic Approach", at http://www.loc.gov/standards/datetime/requirements.html, does include the heading "Requirements", but the text beneath that heading just lists a number of concrete proposals for functionality and syntax. It does not identify user-level requirements rooted in a particular application domain. Without a better sense of what needs must be met by the EDTF format, it's difficult to say whether the current document is meeting them or not.
At the risk of flogging a dead horse, perhaps a few concrete examples may help clarify the issue I'm raising. The "Problem, Requirements, and Basic Approach" page says, by way of elaborating on the proposition that xsd:date, xsd:time, and xsd:dateTime are inadequate:
The string 2001-02-03, for example, is a valid xs:date value, but
20010203 (without hyphens) is not, even though it is a valid ISO
8601 [6] date. This is a choice that W3C made when defining
xs:date - the hyphenated form was chosen and the non-hyphenated
form excluded.
All of this is true (or would be true if "is" were replaced by "denotes", see comment 9 below), but it does not on its face present an argument leading to the conclusion that the XSD datatypes are inadequate. The string "10 January 2011" is also not a valid xsd:date, nor is the string "possibly late in the reign of Diocletian?". The number 3.141592 is also not a valid xsd:date. Nu?
To make a requirement for defining a hyphenless form of date, you need (or so it seems to me) to identify something that can be accomplished with such a definition, that is impossible otherwise. Something at the domain level, I mean, something other than "representing dates in a form without hyphens". And, given that the opening statement of the problem refers to XML-based metadata vocabularies, the something should probably be related, somehow, to existing or proposed metadata vocabularies.
The "Problem, Requirements, and Basic Approach" page says further
Many dates are coded in database records without hyphens
(conformant with ISO 8601). When extracting a date from a database
record to insert into an XML record, some implementors feel it is
an unnecessary burden to have to insert hyphens.
This seems a less than compelling argument. On the scale of format-conversion difficulties, trivial string manipulations like this one hardly register compared to other challenges caused by mismatches in how the information is modeled. (And in the SQL database management systems I'm familiar with, it would not be correct to say that dates are stored without, or with, hyphens; in all current implementations of SQL dates are as far as I know stored in compact binary forms and translated to character strings only upon export or display.)
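To give a sense of the scale of the string manipulation in question, here is a minimal Python sketch that turns a hyphenless yyyymmdd string into the hyphenated form. The function name is hypothetical, and the sketch assumes plain four-digit years with no time or time-zone part.

```python
import re
from datetime import date

# Eight digits: four for the year, two for the month, two for the day.
BASIC_DATE = re.compile(r"(\d{4})(\d{2})(\d{2})")


def basic_to_extended(value: str) -> str:
    """Convert 'yyyymmdd' to 'yyyy-mm-dd', rejecting impossible dates."""
    match = BASIC_DATE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a yyyymmdd date: {value!r}")
    year, month, day = (int(part) for part in match.groups())
    # date() raises ValueError for impossible dates such as 30 February.
    return date(year, month, day).isoformat()


print(basic_to_extended("20010203"))
```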
The eleven specific items listed under "Requirements" don't have any overt reference to metadata vocabularies, though it seems clear that metadata vocabularies will need to record publication dates (for example) which are uncertain, questionable, or for which only one end-point of a range is known. I think the document would be stronger if these items were motivated by concrete examples rooted in the application domain.
For almost all of the items in the features table, the question arises "when and how is this form needed in XML-based metadata formats?"
A few examples may be worth calling out.
205 Year and ordinal day. Why is this needed? If the requirement is to record a particular date, that date can be recorded in yyyy-mm-dd form, no? When does the requirement arise to record it in ordinal form?
207 Week date. Same question; when and why is it a requirement to record a date using this notation rather than the yyyy-mm-dd notation?
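For reference, both notations map mechanically onto a calendar date, as the following minimal Python sketch shows. The function names are hypothetical; the week-date conversion assumes ISO 8601 week numbering (weeks start on Monday, weekday 1 to 7) and relies on date.fromisocalendar, which requires Python 3.8 or later.

```python
from datetime import date, timedelta


def from_ordinal_date(year: int, day_of_year: int) -> date:
    """Convert a year and ordinal day (1-based) to a calendar date."""
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if result.year != year:
        raise ValueError(f"day {day_of_year} is out of range for {year}")
    return result


def from_week_date(year: int, week: int, weekday: int) -> date:
    """Convert an ISO 8601 week date (weekday 1 = Monday) to a calendar date."""
    return date.fromisocalendar(year, week, weekday)


print(from_ordinal_date(2011, 11).isoformat())
print(from_week_date(2011, 2, 2).isoformat())
```

Both calls denote 11 January 2011, the first as day 11 of the year and the second as the Tuesday of ISO week 2.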