Hello!
Thanks for the comments! Here I reply with some clarifications and
suggestions for simplifications.
> Just as "00" means "00xx" where xx is any two digit number, "-00" would
> mean -"00xx" where xx is any two digit number.
I wonder whether
> "00" means "00xx" where xx is any two digit number,
matches our specification. Let's remember a couple of things. The century
note at
http://www.loc.gov/standards/datetime/spec.html#centuryNote explains that
> "Century" is explicitly undefined in this specification.
which I consider really good! The specification also explains at #203
that
> 00 * first century
Thus, it seems that "00" would be "explicitly undefined" because the term
"century" is explicitly undefined.
> I'd prefer to just say that there is no disinction between negative and
> positive zero, they are both zero.
Do both "+00" and "-00" mean "00"? Does this apply to centuries? What is
the name of the century before the "first century" (noted "00")? If both
"+00" and "-00" mean "00", we should try to clarify how we handle
centuries around year "0001".
> "-00" would mean -"00xx" where xx is any two digit number.
This could lead to "centuries" lasting 199, 200 or 201 years (and the
like) if it is applied together with
> there is no disinction between negative and positive zero, they are both
> zero.
I would prefer to have a 100-year long "00" century called the first
century but still "undefined" as (0000 to 0099) or (0001 to 0100) or the
like. I would also suggest denoting the century before "00" as "-01",
thus skipping "-00", but I would appreciate comments from several people
on the list.
The next question is whether the "-01" century should be called the zeroth,
minus-first or minus-second century, remembering that "01" is
(probably) the second century according to the EDTF specification at #203. I
don't really know which one is best and wonder what people on the list
think.
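To make the two candidate readings of "00" concrete, here is a minimal Python sketch. It is only an illustration: the function name century_span and the flag first_year_is_zero are hypothetical, and nothing here is defined by the specification, which leaves "century" explicitly undefined. Negative centuries are deliberately left open.

```python
def century_span(cc: int, first_year_is_zero: bool = True) -> tuple[int, int]:
    """Return (first_year, last_year) for a non-negative century number cc.

    Assumption: every century is exactly 100 years long. The flag selects
    between the (0000 to 0099) and the (0001 to 0100) reading of "00".
    """
    if cc < 0:
        raise ValueError("negative centuries are an open question here")
    start = cc * 100 if first_year_is_zero else cc * 100 + 1
    return (start, start + 99)


print(century_span(0))         # (0, 99)
print(century_span(0, False))  # (1, 100)
```

Under either reading each century stays 100 years long, which is the property I would like to keep.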
> there is no disinction between negative and positive zero, they are both
> zero.
The context may also be of great importance. Let's consider
zoneOffsetHour, where 2011-04-26T12:00:00+00:30 is not the same as
2011-04-26T12:00:00-00:30, for example. Thus, "+00" is not the same as
"-00" when the sign stands before a zoneOffsetHour.
> can you say with complete confidence that, say 2213-02-29 won't be a
> leap day?
No.
> in a hundred years leap day may be in November, so there could be a
> 2111-11-31.
Maybe.
None of us can be sure that 2213-02-29 or 2111-11-31 will never be valid.
The same (?) degree of uncertainty applies to these two dates. Why, then,
handle these two dates differently and validate one of them but not the
other? Our specification validates 2213-02-29 but not 2111-11-31. I
think it is good that it doesn't validate 2111-11-31. My suggestion was
only to stick to the Gregorian calendar as it is defined today. As of
today, the Gregorian calendar is a usable approximation and it will
probably remain so for a few thousand years. This is why I proposed to
restrict "February 29th" to leap years (as defined by today's Gregorian
calendar) and leave the question of post-Gregorian calendars to future
librarians.
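The restriction proposed above can be sketched in a few lines of Python. This is a minimal illustration, assuming today's Gregorian rules as implemented by the standard calendar module; the function name is_valid_gregorian is hypothetical and not part of the specification.

```python
import calendar


def is_valid_gregorian(year: int, month: int, day: int) -> bool:
    """True if (year, month, day) exists in today's Gregorian calendar."""
    if not 1 <= month <= 12:
        return False
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    return 1 <= day <= days_in_month


print(is_valid_gregorian(2213, 2, 29))   # False: 2213 is not a leap year
print(is_valid_gregorian(2111, 11, 31))  # False: November has 30 days
print(is_valid_gregorian(2212, 2, 29))   # True: 2212 is a leap year
```

With such a check both dates discussed above are treated alike.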
> > > choiceListElement = date | date ".." date | earlier | later
> > choiceListElement = date "," date | date ".." date | earlier | later |
> > date "," choiceListElement | choiceListElement "," date
> Sorry, this one is not making sense to me.
My aim is to avoid choice lists with only one date, like this one:
[2011]
> why do we even need yearMonth, when month alone would be sufficient.
I should have clarified my aim with that. In one way, it would be easier
to only use month, but for long periods, I thought it would be easier to
read /P123Y than /P1476M. As you suggest, it raises the question of c14n
but this may be solved by a reformulation such as:
monthsDuration = oneThru11 "M"
assuming that a Gregorian year always consists of 12 months. A c14n of
this kind would be much more difficult to formulate for daysDuration,
though, and for the simplicity of the BNF, I suggest keeping
daysDuration = positiveInteger "D"
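A c14n of years and months along these lines could be sketched as follows in Python. This is only an illustration under the assumption stated above (a Gregorian year always has 12 months); the function name canonical_year_month_duration is hypothetical.

```python
def canonical_year_month_duration(total_months: int) -> str:
    """Rewrite a month count so that the months part is always 1 to 11."""
    if total_months <= 0:
        raise ValueError("a duration needs a positive number of months")
    years, months = divmod(total_months, 12)
    text = "P"
    if years:
        text += f"{years}Y"
    if months:
        text += f"{months}M"
    return text


print(canonical_year_month_duration(1476))  # P123Y
print(canonical_year_month_duration(1477))  # P123Y1M
print(canonical_year_month_duration(11))    # P11M
```

No comparable rule exists for days, since the number of days in a month or year varies.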
> > > yearMonthDay "/P" daysDuration
> > yearMonthDay "/P" ( ( yearsDuration ( monthsDuration | "0M" )) |
> > monthsDuration )? daysDuration (* maybe *)
> I don't know if I agree here. […] not every month has the same number of
> days
Obviously, my suggestion needs to be clarified. We agree about the fact
that
> not every month has the same number of days
and that is exactly what I consider a use-case for the suggestion I
formulated. Let's consider
189u-01-26/P7Y
In such a case, we cannot compute the number of days for the duration,
because we cannot find out the number of leap days (February 29th) these
seven years will include. The 7-year period beginning on 1891-01-26
includes TWO leap days (1892-02-29 and 1896-02-29), whereas the 7-year
period beginning on 1897-01-26 has NO leap day, since 1900 is not a leap
year and 1904-02-29 falls after the end of the period.
Likewise, the duration in
2011-0u-26/P1M
cannot be translated to a precise number of days. I consider therefore
that we need monthsDuration for reasons of accuracy. I also consider that
we need yearsDuration, but for a different reason, namely readability. As
noted above, it is easier to read /P123Y than /P1476M and a c14n of that
should be quite straightforward using oneThru11, as in the definition of
monthsDuration suggested above.
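The leap-day counts given above for 189u-01-26/P7Y can be checked with a short Python sketch. It assumes today's Gregorian rules and a start date other than February 29th; the function name leap_days_in_period is hypothetical.

```python
import calendar
from datetime import date


def leap_days_in_period(start: date, years: int) -> int:
    """Count the February 29ths in the half-open period [start, start + years)."""
    # Assumption: start is not itself February 29th, so replace() cannot fail.
    end = start.replace(year=start.year + years)
    count = 0
    for year in range(start.year, end.year + 1):
        if calendar.isleap(year) and start <= date(year, 2, 29) < end:
            count += 1
    return count


print(leap_days_in_period(date(1891, 1, 26), 7))  # 2
print(leap_days_in_period(date(1897, 1, 26), 7))  # 0

# All ten years matched by 189u give 0, 1 or 2 leap days.
counts = {leap_days_in_period(date(y, 1, 26), 7) for y in range(1890, 1900)}
print(sorted(counts))  # [0, 1, 2]
```

So the same /P7Y corresponds to 2555, 2556 or 2557 days, depending on the unknown digit.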
> what constraints there are on the components of duration needs more
> study.
This would include a distinction between dates without u's and dates with
u's, for example.
> > not allow non-integer years, such as 1.2345e3
This would lead to a year with the value 1234.5, which I think should be
avoided.
> > longYear = "y" "-"? positiveDigit ( digit )* ( "e" yearExponent |
> > digit digit digit digit)
> positiveDigit ".e" yearExponent
> Will produce 1.7e8
I wonder what in today's BNF corresponds to the decimal "7" of this
example, when there is no room between the dot and the "e". Anyway, I
consider that the dot is error-prone and would prefer to write 17e7 or
170e6 for 170 million years.
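As an illustration, here is a minimal Python sketch of a dot-free reading of the exponent form. It covers only the "e" branch of longYear, not the plain-digits branch, and the pattern and the function name parse_long_year are my own assumptions rather than anything taken from the BNF.

```python
import re

# Assumed shape: "y", optional "-", an integer without leading zero, "e", exponent.
LONG_YEAR = re.compile(r"y(-?)([1-9][0-9]*)e([0-9]+)")


def parse_long_year(text: str) -> int:
    """Parse e.g. 'y17e7' with integer arithmetic only, so no fractions arise."""
    match = LONG_YEAR.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer long year: {text!r}")
    sign, mantissa, exponent = match.groups()
    value = int(mantissa) * 10 ** int(exponent)
    return -value if sign else value


print(parse_long_year("y17e7"))   # 170000000
print(parse_long_year("y170e6"))  # 170000000
```

A string such as "y1.7e8" is rejected by this sketch, which is the behaviour I would prefer.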
> > choosing the [...] character [...] before the qualifier
> I don't see what is gained by doing that.
I thought of it as a mnemonic (or clue) for the kind of qualification we
are dealing with, but this could be done more consistently. My aim here is
NOT TO DEFINE any new qualification, which is out of the scope of this
work. Instead, I consider it would be good to PREPARE for future
extensions. If "q" (for example) was used for "reserved strings" (that is,
strings defined by FUTURE specifications of EDTF), we could use "c" (or
another letter) for "locally, user-defined strings" (not further defined by
future specifications of EDTF) and "^" for expanded names (with curly
braces around the namespace in accordance with the de-facto standard), for
example, such as
2011-21^"{http://www.example.org}some_definition_of_spring"
If we don't prepare for future specifications, there is an obvious risk
that people will put both human readable and (different kinds of) machine
readable information in qualifications if qualifications are only defined
as "strings", in which case this information will later need to be
"manually" reworked when future EDTF versions appear. Consistently chosen
characters before qualifiers would avoid a lot of later work.
The only exception among these non-definitions would be a reserved word
similar to:
q"Gregorian" (* or whatever qualifier character we choose *)
> > year = baseYear ("?" | "~") (* without parenteses around the year *)
> The intention is that "?" or "~" may be used without parenthesis when it
> applies to the entire expression and that parenthesis be used to apply
> it to a part of the expression.
Do years really need parentheses? As I understand it, the question mark
in
2011-(04)?
would apply to April, whereas in
2011-04?
it would apply to the whole date. Parentheses are important for months. But
could a question mark immediately following a year apply to something other
than the year? Are parentheses relevant for years? If there is a semantic
difference between
2011?
on the one hand and
(2011)?
on the other hand, or between
2011?-04
on the one hand and
(2011)?-04
on the other hand, I would appreciate a clarification about that in the
EDTF specification.
> > the numbering #317, #3171 and #316 in our specification
> I have wanted to retain original numbering
My aim was not to focus on four-digit numbers. My suggestion was to move
#316 upwards by two rows to retain original numbering. There, at #316, I
would also suggest a little modification:
> as in the second and third example.
as in the second, third and fourth example.
During a rereading of the BNF, I noticed a few things I comment here:
> dateTimeString = | date [...]
dateTimeString = date [...]
When it comes to lists, I would appreciate a distinction between "dates
with x's" and "dates without x's", which would allow for a more accurate
formulation. Furthermore, I wonder whether we could allow for
temporalExpression in lists and whether we propose any c14n within lists.
In the meantime, I suggest a simplification
> [...] (inclusiveListElement ",")* later | earlier (","
> inclusiveListElement)* "," later [...]
[...] ( earlier "," )? (inclusiveListElement ",")* later [...]
and a necessary reformulation
> consecutives | date ("," inclusiveListElement)+
( inclusiveListElement "," )* ( consecutives | date ","
inclusiveListElement | inclusiveListElement "," date ) ( ","
inclusiveListElement )*
This would thus allow for lists such as:
{2011..2013, 2015}
Such lists are excluded by today's BNF. I also wonder whether we could
clarify and formulate the reasons behind the SYNTACTIC differences between
inclusive lists and choice lists.
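To show what the reformulated production is meant to accept, here is a minimal Python sketch restricted to four-digit years. It assumes that an inclusiveListElement is either a date or a "date..date" range, and it reads the production as "at least one range, or at least two elements"; both are my assumptions, and the function name expand_year_list is hypothetical.

```python
import re

# Assumed element shape: a four-digit year, optionally followed by "..year".
ELEMENT = re.compile(r"(\d{4})(?:\.\.(\d{4}))?")


def expand_year_list(text: str) -> list[int]:
    """Expand an inclusive list such as '{2011..2013, 2015}' into its years."""
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("an inclusive list is wrapped in curly braces")
    elements = [element.strip() for element in text[1:-1].split(",")]
    if len(elements) == 1 and ".." not in elements[0]:
        raise ValueError("a single date is not accepted as a list")
    years: list[int] = []
    for element in elements:
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"not a year or year range: {element!r}")
        first = int(match.group(1))
        last = int(match.group(2) or match.group(1))
        if last < first:
            raise ValueError(f"range runs backwards: {element!r}")
        years.extend(range(first, last + 1))
    return years


print(expand_year_list("{2011..2013, 2015}"))  # [2011, 2012, 2013, 2015]
```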
Another suggestion is to reformulate the second example of #317;
otherwise, the BNF will grow in complexity if it must allow such
constructs.
> uncertOrApprox = date ("?" | "~")
uncertOrApprox = ( dateAndTime | longYear | yearMonth | yearMonthDay ) (
"~" | "?" ) ( qualifier )?
This would allow for uncertain or approximate long years and also avoid
constructs like
2011?? (* with double question marks *)
or
(2011)??
We could prepare for (without defining) the reasons behind the
uncertainty or approximation when combined with
qualifier = ( "q" reservedString | "c" userDefinedString | "^"
expandedName ) (* or whatever qualifier characters we choose *)
expandedName = "{" xs:anyURI "}" xs:NCName
reservedString = xs:token
userDefinedString = xs:token
Some simplifications:
> oneThru14 = oneThru13" | "14"
> oneThru23 = oneThru14 | "15" | [...] | "23"
oneThru23 = oneThru13 | "14" | "15" | [...] | "23" (* one line is enough *)
> oneThru59 = oneThru31 | "33" | [...] | "59"
oneThru59 = oneThru31 | "32" | "33" | [...] | "59" (* "32" is missing *)
The two following ones are unused and could be deleted:
> zeroThru60 = zeroThru59 | "60"
> nonNegativeInteger = "0" | positiveInteger
Comments are welcome!
Regards!
Saašha,