On Jan 11, 2011, at 1:10 PM, Bruce D'Arcus wrote:

> On Tue, Jan 11, 2011 at 3:02 PM, C. M. Sperberg-McQueen
> <[log in to unmask]> wrote:...
> 
>> But both RDF and JSON have facilities for structured information; less
>> convenient, perhaps, for some things than XML, but still present.
>> Surely you don't want to suggest that either is incapable of handling
>> this information in a structured way instead of as atoms?
> 
> No, but:
> 
> a) it would then be completely orthogonal to the W3C/ISO formats in
> wide deployment, rather than an extension of it.

It depends, I guess, on how one takes the term "extension".  

If one creates an atomic datatype whose value space and 
lexical space are supersets of those of the XSD date/time
datatypes, it's presumably an extension in one sense.

If one creates a complex type whose content (or one of whose
attributes) is a value of one of the XSD date/time types and 
whose other attributes include things like indications of uncertainty,
vagueness, etc., then that also seems to me to be an extension,
in some sense.

The latter has the property that

    <date when="2006-04-11" cert="low"/>

(to use the TEI representation for "2006-04-11?") has a simple
relation to the xsd:date value 2006-04-11:  the element represents
a date of 11 April 2006 with low certainty.  
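
To make that relation concrete, here is a minimal sketch in Python
(not in any XSD or TEI tooling).  The function name read_tei_date and
the returned pair are illustrative assumptions of mine, and the sketch
handles only plain YYYY-MM-DD values without a timezone:

```python
import xml.etree.ElementTree as ET
from datetime import date

def read_tei_date(xml_text):
    """Return (date value, certainty) for a TEI-style <date> element.

    Hypothetical helper: assumes a 'when' attribute holding a plain
    YYYY-MM-DD date and an optional 'cert' attribute.
    """
    elem = ET.fromstring(xml_text)
    when = date.fromisoformat(elem.attrib["when"])  # the ordinary date value
    cert = elem.attrib.get("cert")  # None when no certainty is recorded
    return when, cert

when, cert = read_tei_date('<date when="2006-04-11" cert="low"/>')
```

The date comes out as an ordinary date value, and the uncertainty
travels beside it instead of being folded into the lexical form.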

It seems likely to be somewhat more difficult to establish a link
between the certain and uncertain forms of a given date, if a single
atomic datatype is created.  To explain why may involve more tedious
technical details than readers of this list will find interesting; skip
to point b) below if your attention flags.

The easiest way, by far, to create an XSD datatype which has
the EDTF formats as its lexical space would be to create a union
type including the existing date/time types and also one or more
restrictions of string which handle things like uncertain or questionable
dates, ranges, etc.  But in that case, the lexical form "2006-04-11"
maps to a date, while the lexical form "2006-04-11?" maps to a
string which has no particular relation to any date value. 
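
A rough Python analogue of such a union may help; it is a sketch
under assumptions, not XSD.  The pattern UNCERTAIN and the function
union_value are hypothetical names, the pattern covers only one
EDTF-like form (a date followed by "?"), and date.fromisoformat only
approximates the xsd:date lexical space:

```python
import re
from datetime import date

# Hypothetical pattern for one EDTF-like form: a calendar date followed by "?".
UNCERTAIN = re.compile(r"\d{4}-\d{2}-\d{2}\?")

def union_value(lexical):
    """Try the date member of the union first, then the string member."""
    try:
        return date.fromisoformat(lexical)
    except ValueError:
        pass
    if UNCERTAIN.fullmatch(lexical):
        return lexical  # the value is just the string itself
    raise ValueError(f"not in the lexical space: {lexical!r}")

certain = union_value("2006-04-11")     # a date value
uncertain = union_value("2006-04-11?")  # a string value
```

Both lexical forms are accepted, but only the first yields a date;
nothing in the second value says which day it is about unless some
application parses the string again.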

Depending on the uses envisaged, the presence or absence of
a clear relation between the date 11 April 2006 and the string
"2006-04-11?" may or may not be important.  

Another way to create a datatype whose lexical space is the set of
forms defined by EDTF would be for it to be implemented as an
additional primitive datatype, either by being integrated into the
XSD datatypes specification or by being included in one or more
implementations of XSD validators.  Because the value space of
primitive datatypes is defined by prose and not by any formalism,
the relation between "2006-04-11" and "2006-04-11?" can be
established by fiat.  But neither of those will be related to the xsd:date
value for that date, at the XSD level:  the value spaces of all XSD 
primitive datatypes are (again by fiat) pairwise disjoint.
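
The same point as a Python sketch, with the caveat that the class
names XsdDate and EdtfDate and the function parse_edtf are invented
for illustration and model only the "?" suffix, not EDTF as a whole:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class XsdDate:
    """Stand-in for a value of the existing xsd:date type."""
    day: date

@dataclass(frozen=True)
class EdtfDate:
    """Stand-in for a value of a hypothetical new primitive type."""
    day: date
    uncertain: bool = False

def parse_edtf(lexical):
    """Map "YYYY-MM-DD" or "YYYY-MM-DD?" to an EdtfDate value."""
    uncertain = lexical.endswith("?")
    text = lexical[:-1] if uncertain else lexical
    return EdtfDate(date.fromisoformat(text), uncertain)

plain = parse_edtf("2006-04-11")
hedged = parse_edtf("2006-04-11?")

# Within the new type, the two values are related because we said so.
same_day = plain.day == hedged.day

# Across types, equality never holds: dataclass equality compares
# only instances of the same class.
cross_type_equal = plain == XsdDate(date(2006, 4, 11))
```

Here same_day is true and cross_type_equal is false, which mirrors
the situation described: the link between the certain and uncertain
forms exists inside the new type, and no link to xsd:date exists at
the type-system level.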


> 
> b) creating atomic datatypes is relatively easy technically: there's
> an infrastructure for using those in standard XML (XSD, RNG, etc.) and
> RDF technologies, and you can use the exact same values across those
> contexts.  OTOH, creating complete models for different formats is a
> headache (do we really want to create an RDF vocabulary, and XML
> schemas in XSD and RNG, and something analogous in JSON?).

The technical difficulties of defining those formats seem small to
me compared with the task of achieving clarity on the meaning of
the formats.  Aiming at structured representations may have the 
advantage that it makes it harder to duck questions related to the
structure of the information.

> And it also raises a rather obvious question, which is why should this
> effort be any different than the W3C date-time format? E.g. why is it
> OK for the latter to be atoms, but not the former?

Fair question.

Since XPath, XQuery, XSLT, XForms, and other W3C technologies
use the type system defined by the XSD spec, the XML Schema WG
is chartered to work with the working groups responsible for those
specs and satisfy their requirements.  One reason to treat Gregorian
dates as atoms (instead of treating them as triples of day, month, and 
year) is that the kinds of forms for which XForms is designed 
typically want to treat them as atoms, not as tuples.  Another is
alignment with SQL, which has both date and timestamp types.
A third is that (as mentioned in my earlier note) the initial specification
of the value space of date (and all the other date/time types) made
clear that the value space was a set of segments on the terrestrial
timeline (in the case of types like gMonthDay and time, the segments
are discontinuous). The use of the Gregorian calendar was just a
notational convenience; the form of Gregorian dates has nothing 
essential to do with the nature of the value space.  (I should note
that later changes to the spec have destroyed this argument, by
redefining the value space as a set of tuples.  But by then, the treatment
of date, etc. as atomic types had already been established.)

I hope this helps.

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com 
* http://cmsmcq.com/mib                 
* http://balisage.net
****************************************************************