It was late ... I was tired ... and I've never been very good at arithmetic. For those reasons, I managed to forget to include "days" in my suggestion. Here's a corrected and properly formatted version of my previous post.
Ask me "What time is it?" and I'll tell you how to build a watch.
My apologies,
Tony Benedetti
Thu, 25 Jan 2018 01:32:55 -0500
Here he goes again! Please note that this is a suggestion for a subsequent ISO/EDTF -- NOT for the publication looming on the near horizon. ... Thanks, Tony
----------------------------------------------------------------------------------------------------------------------------
After observing the "seasons discussions" for a while, I've come to believe that humans will not
routinely be able to prepare or interpret any values other than 1 through 12 in the "MM" segment when
dealing with a date. That is, unless they use a "cheat sheet" to decipher the meaning of "47".
I've also noted that the current schemes -- while trying to make dates usable for humans -- are problematic for computers. Notably, sorting dates is not possible without first checking to see if rearrangement of the data is necessary and then making any adjustments.
It seems that we are rolling a rock up a hill and are trying to serve two very different goals -- human
usability vs. predictably precise computer-to-computer data interchange.
Thus, I'm certain that there must be at least two representations of a given date. The first is an
internal encoding for the computers and can provide the necessary precision and predictability.
The other representations are for humans to consume and are appropriate for a given application
and audience. There are already several programs available on the Internet that -- with various
degrees of success -- translate EDTF dates into natural language phrases.
So let's develop a scheme that will eventually allow the computers to encode and decode a date
string and allow agents (computer and human) to present an appropriate human readable value.
Here's a (radical?) solution that might work for a "Level 3".
- For now, let ISO/EDTF continue to use 21-99 to represent seasons and other "Divisions of a Year" even if their meanings are ambiguous. However, a plan needs to be developed to first deprecate the current 21-99 codes and eventually eliminate their use in a basic ISO/EDTF date. A mapping from the current "MM" definitions to any new scheme must be provided to help current systems convert their dates.
- Replace the codes (other than 1-12) in the "MM" segment with codes defined in an external source which I will refer to in this suggestion as a {calendar}. I've chosen to enclose {calendar} in braces since I'm not in love with "calendar" but couldn't come up with a better word -- ideas welcome.
- Develop a list of registered {calendars} that represent each of the meanings of the pesky "MM" segment of a date. These {calendars} could be defined with the appropriate: precision; start/end parameters; cultural considerations; etc. These {calendar} definitions are really machine-readable "cheat sheets". They could even be developed to allow localization of a date for human readers.
- Define an initial set of {calendars} to include the needed (wanted?) divisions as mentioned throughout the seasons discussions -- e.g.:
- meteorological seasons (North, South)
- astronomic seasons (North, South)
- astrological periods (Western, Chinese, etc.)
- publishing seasons (when does this Winter begin? 2017? 2018?)
- weeks, months, sixths, quarters, thirds, halves -- tied to the beginning of the calendar year
- weeks, months, sixths, quarters, thirds, halves -- periods that mimic the ISO 8601 definition of the first week of the year (i.e., at least half of the period is within that year)
- Irish seasons ("spring begins on February 1, when they celebrate St Brigid's Day")
- And on and on and on ... ... ...
- Prefix each date with a registered {calendar} identifier followed by a separator (I suggest an exclamation point). Thus, if the "European Association of Left Handed Accountants" (EALHA) has registered a {calendar} with the identifier "157", then perhaps the 18th day of the 2nd quarter of 2018 might be "157!2018-72-18" and the folks who deal with EALHA dates could be presented by their systems with "2018-Q2-18" or "18th day of the 2nd quarter..." or whatever works for that community. All computer programs worldwide would be able to decode that "72" by observing the sanctioned and generally available {calendar} definition files/databases and express that date in whatever way fits their application and audience.
- "Division of Year" codes for each {calendar} would be assigned by the registration authority to eliminate overlaps and thus allow for predictable sorting when dates from different {calendars} are mingled. Yes, if we end up with lots of {calendars} with lots of "Division of Year" codes, we will need more than the two positions provided by today's "MM" segment. But that's why I have reserved 5 positions for Division codes in the internal format described below.
- I would even go so far as to say that ISO's "Y-M-D", "Y-D", "ISO Y-W" and "ISO Y-W-D" {calendars} could be represented for the computers by this scheme. That would allow the computers to avoid the sorting problems caused by the use of 3 positions of "Www" in an otherwise 2 position "MM" segment.
- The internal format that I propose for the computers recognizes that storage is "cheap" in the 21st century and would allow a common ASCII sort without the need to rearrange the date string. A date would be contained in a 32-byte fixed-length string. But don't worry, even a tiny 1 gibibyte flash drive could store over 33,000,000 dates.
- 1 position for sign (+ or -)
- 16 positions for year (i.e., 9 quadrillion years -- that should keep most astrophysicists happy)
- 5 positions for Division codes (remember we may have a lot of {calendars})
- 3 positions for Days (365/366 is typical, but we could handle up to 999)
- 3 positions for the flags -- 1 flag each for each date segment (year, division, day)
- 0 -- exact accuracy & certain confidence
- 1 -- accuracy is approximate
- 2 -- confidence is uncertain
- 3 -- both approximate accuracy & uncertain confidence
- 4 positions for the {calendar} code
Note: This format is meant for computers and not for human consumption -- the EDTF display for humans to read would remain unchanged with the flags interspersed in their current positions and with dashes separating segments of the date. The only addition would be the {calendar} prefix.
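To make the 32-position layout concrete, here is a minimal sketch in Python. It is only an illustration, not a proposed reference implementation. It assumes the fields are stored in the order listed above (sign, year, division, day, flags, {calendar}) and that numeric fields are zero-padded on the left; neither detail is settled in the suggestion. The names InternalDate, encode and decode are made up for this example.

```python
from dataclasses import dataclass

RECORD_LEN = 32  # 1 + 16 + 5 + 3 + 3 + 4 positions


@dataclass(frozen=True)
class InternalDate:
    year: int            # signed year
    division: int        # "Division of Year" code, 0-99999
    day: int             # day within the division, 0-999
    flags: str = "000"   # one digit (0-3) each for year, division, day
    calendar: int = 0    # registered {calendar} identifier, 0-9999


def encode(d: InternalDate) -> str:
    """Build the fixed-length record (assumed field order and zero padding)."""
    if len(d.flags) != 3 or any(c not in "0123" for c in d.flags):
        raise ValueError("flags must be three digits, each 0-3")
    for name, value, width in (
        ("year", abs(d.year), 16),
        ("division", d.division, 5),
        ("day", d.day, 3),
        ("calendar", d.calendar, 4),
    ):
        if not 0 <= value < 10 ** width:
            raise ValueError(f"{name} does not fit in {width} positions")
    sign = "-" if d.year < 0 else "+"
    return (
        f"{sign}{abs(d.year):016d}{d.division:05d}"
        f"{d.day:03d}{d.flags}{d.calendar:04d}"
    )


def decode(record: str) -> InternalDate:
    """Split a 32-character record back into its fields."""
    if len(record) != RECORD_LEN:
        raise ValueError("record must be exactly 32 characters")
    sign = -1 if record[0] == "-" else 1
    return InternalDate(
        year=sign * int(record[1:17]),
        division=int(record[17:22]),
        day=int(record[22:25]),
        flags=record[25:28],
        calendar=int(record[28:32]),
    )


# The EALHA example: {calendar} 157, year 2018, division 72, day 18.
ealha = InternalDate(year=2018, division=72, day=18, flags="000", calendar=157)
record = encode(ealha)
print(record, len(record))
print(decode(record) == ealha)

# A plain string sort orders these positive-year records chronologically.
earlier = encode(InternalDate(2017, 72, 18))
same_quarter = encode(InternalDate(2018, 72, 5))
print(sorted([record, earlier, same_quarter]))

# Storage check: 32-byte records per gibibyte.
print((1 << 30) // RECORD_LEN)  # 33554432
```

The sort demonstration uses only positive years. How negative years should collate under a plain ASCII sort is left open here.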
For humans simply reading a date, the applications could handle the arcane division codes for a particular {calendar} by providing: an explanatory natural language phrase; or a parenthetical explanation; or a footnote or endnote; or by displaying a "tooltip" when the user's mouse pointer hovers over the "confusing" code.
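As a rough illustration of how an application might turn a prefixed date into something readable, here is a small Python sketch. The REGISTRY dictionary is a hypothetical stand-in for the registered {calendar} definition files; its structure, and the function names parse, render_short and render_phrase, are assumptions made up for this example. It handles only the simple case of an unsigned year with all three segments present and no flags.

```python
# Hypothetical stand-in for the registered {calendar} definitions.
# Only the EALHA example from this post is filled in.
REGISTRY = {
    "157": {
        "name": "EALHA",
        "divisions": {"72": {"short": "Q2", "phrase": "2nd quarter"}},
    },
}


def ordinal(n: int) -> str:
    """Return 1st, 2nd, 3rd, 4th, ... 11th, 12th, 13th, ... 21st, etc."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def parse(text: str):
    """Split '157!2018-72-18' into ('157', '2018', '72', '18')."""
    calendar_id, sep, date_part = text.partition("!")
    if not sep:
        raise ValueError("missing {calendar} prefix")
    year, division, day = date_part.split("-")
    return calendar_id, year, division, day


def lookup(calendar_id: str, division: str) -> dict:
    """Fetch the division definition from the (hypothetical) registry."""
    return REGISTRY[calendar_id]["divisions"][division]


def render_short(text: str) -> str:
    calendar_id, year, division, day = parse(text)
    return f"{year}-{lookup(calendar_id, division)['short']}-{day}"


def render_phrase(text: str) -> str:
    calendar_id, year, division, day = parse(text)
    phrase = lookup(calendar_id, division)["phrase"]
    return f"{ordinal(int(day))} day of the {phrase} of {year}"


print(render_short("157!2018-72-18"))   # 2018-Q2-18
print(render_phrase("157!2018-72-18"))  # 18th day of the 2nd quarter of 2018
```

The same lookup could just as easily feed a tooltip or a footnote; only the final formatting step would change.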
For humans entering a date, I expect that they would usually be given a computer interface that presented the user with decoded choices rather than the arcane codes for a particular {calendar}. For those unfortunate humans who need to record a date with "pencil & paper" (or a computer equivalent), they would be forced to rely on a "cheat sheet".
Many (most?) of the potential {calendars} represent an edge case. But I think the only corner case in the scheme is the poor guy with a pencil and a "cheat sheet".
Render unto Caesar (the computer) the things that are Caesar's (the computer's); and to God (the humans) the things that are God's (the humans').
Tony Benedetti