> any character other than whitespace
How is "whitespace" defined in this context? Unicode contains dozens of
"candidate whitespace" characters, that is, characters that could reasonably
be considered whitespace. In Unicode 6.0, for example, 18 different
characters are classified as "Space Separator". Furthermore,
characters such as "Paragraph Separator", "Line Feed", "Tab", "Form Feed",
"Carriage Return", "NULL" or "Line Separator" (none of which Unicode
classifies as a "Space Separator") are likely to cause problems in
many cases. Nor would I recommend allowing field separator, record
separator and many other control characters within qualifying strings.
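To make the ambiguity concrete, here is a small Python sketch that prints the Unicode general category of the characters mentioned above. The labels and helper names are mine, chosen for illustration only, and the Space Separator count depends on the Unicode version bundled with your Python, so it will not necessarily be the 6.0 figure.

```python
import sys
import unicodedata

# Characters mentioned above, keyed by an illustrative label (labels are mine).
CANDIDATES = {
    "Space": "\u0020",
    "No-Break Space": "\u00a0",
    "Paragraph Separator": "\u2029",
    "Line Separator": "\u2028",
    "Line Feed": "\n",
    "Tab": "\t",
    "Form Feed": "\x0c",
    "Carriage Return": "\r",
    "NULL": "\x00",
    "Record Separator": "\x1e",
}


def general_categories(chars):
    """Map each label to the Unicode general category of its character."""
    return {label: unicodedata.category(ch) for label, ch in chars.items()}


def count_space_separators():
    """Count code points in category Zs for this Python's Unicode version."""
    return sum(
        1
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Zs"
    )


if __name__ == "__main__":
    for label, category in general_categories(CANDIDATES).items():
        print(f"{label:20} {category}")
    print("Unicode version:", unicodedata.unidata_version)
    print("Space Separator (Zs) count:", count_space_separators())
```

Only the first two entries come out as Zs; the others fall under Zp, Zl or Cc, so "whitespace" defined as "Space Separator" would let all of them through.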
> there is no need to worry about characters
> in the qualifying string conflicting with
> specially defined characters in the main
> part of the dateTime string.
That is true! But what about conflicts with:
(1) schemes to represent qualifying strings in future versions of EDTF?
(2) characters (not) allowed within the file formats where EDTF is stored?
Of course, it is (nearly) always possible to escape characters to be able
to include them in XML attributes or in JSON, for example, but this is not
Forward compatibility is also something I consider important.
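To illustrate point (2), here is a rough Python sketch using only the standard library. The "qualifier" key, the "date" element and the helper names are hypothetical and not taken from any EDTF draft. JSON can carry control characters once they are escaped, whereas XML 1.0 has no way to represent some of them (NULL, for instance) even as a character reference, which Python's bundled parser reports as an error.

```python
import json
import xml.etree.ElementTree as ET


def json_round_trip(qualifier):
    """Encode a qualifying string as JSON and decode it again.

    Returns the encoded text and the decoded string. The "qualifier"
    key is a made-up name for this example.
    """
    encoded = json.dumps({"qualifier": qualifier})
    return encoded, json.loads(encoded)["qualifier"]


def xml_accepts_char_ref(code_point):
    """Check whether the parser accepts &#N; inside an attribute value.

    The <date qualifier="..."/> element is hypothetical.
    """
    doc = '<date qualifier="&#{};"/>'.format(code_point)
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    encoded, decoded = json_round_trip("circa\tspring\x00")
    print(encoded)                  # control characters appear escaped
    print(decoded == "circa\tspring\x00")
    print(xml_accepts_char_ref(9))  # Tab
    print(xml_accepts_char_ref(0))  # NULL
```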
I think that restricting the definition of "string" with a regex that allows
only safe characters would probably avoid these problems.
The regex I suggested is very simple but quite effective at achieving these goals: