> > Date: Mon, 28 Jun 2004 18:09:45 +0200
> > From: Adam Dickmeiss <[log in to unmask]>
> > http://www.w3.org/TR/2004/REC-xml-20040204/#charsets
> > The XML spec guys really did exclude most chars in the
> > 0-0x01f range.
> > I wonder why.
Well, here is someone claiming that it is a good thing:
> Regarding Rob's actual question, the correct answer is and
> must be that XML is just a broken transport.
Which is probably why it has been fixed in XML 1.1 (remember how we
fixed some stuff when we moved from SRW 1.0 to 1.1?). Some of the WSDL
2.0/XML Schema discussions are about whether to have new string types
(for XML 1.0-valid strings and XML 1.1-valid strings), etc.
> HOWEVER, we clearly also need to be able to send
> base64-encoding CQL queries,
Well no - we just need a relational modifier to indicate that the *term*
is base64 encoded (which we need anyway for sending binary data e.g. my
mimeEncoded modifier at
http://www.ceridwen.com/srw/music-contextset.html) rather than encode
the whole query.
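
To illustrate the difference, here is a rough Python sketch that base64
encodes only the term and leaves the rest of the query as readable CQL.
The helper names and the "=/mimeEncoded" spelling of the modifier are
assumptions made for this example, not something the context set is
known to define in this form.

```python
import base64


def encode_term(raw: bytes) -> str:
    """Base64 encode just the search term, not the whole query."""
    return base64.b64encode(raw).decode("ascii")


def build_query(index: str, raw_term: bytes, modifier: str = "mimeEncoded") -> str:
    # Hypothetical modifier spelling; the real context set may differ.
    return f'{index} =/{modifier} "{encode_term(raw_term)}"'


# A term containing ESC (0x1b), which XML 1.0 cannot carry literally.
query = build_query("dc.title", b"\x1bABC")
```

The index, relation and modifier stay visible to anything that parses
or validates the query; only the term is opaque.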
> Holy moley. All that to fix a bug that was deliberately
> written into the XML specification. Unbelievable. Unbelievable.
Do we really want to go to that extreme? These are somewhat sweeping
changes to SRW and will break a lot of the type-checking/validation we
currently have in the schemas. I'd argue not, since:
A) XML 1.1 already fixes this. XML 1.1 is currently in a final draft,
so it may be 6 months or so before it becomes part of the Web Services
stack, but any fix we make now will be rendered obsolete when XML 1.1 is
adopted.
B) having control characters in the scan seems to me to be such a
pathological (albeit real) case, that taking such a radical knife to SRW
seems out of proportion.
Until XML 1.1 et al. stabilise, we could adopt the practice of the SQLX
guys (ISO/IEC 9075-14:2003) and use _x001b_ to represent this character?
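
A minimal Python sketch of that _xHHHH_ convention, assuming we only
escape the C0 control characters that XML 1.0 forbids (everything below
0x20 except tab, LF and CR). The function names are made up for this
example, and protecting a literal "_x001b_" by escaping its leading
underscore as _x005f_ is borrowed from the same convention so that the
mapping can be reversed.

```python
import re

# Characters XML 1.0 does not allow: C0 controls except TAB, LF, CR.
_FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")
# An underscore that starts something already shaped like an escape.
_LITERAL = re.compile(r"_(?=x[0-9A-Fa-f]{4}_)")
_ESCAPED = re.compile(r"_x([0-9A-Fa-f]{4})_")


def escape_controls(text: str) -> str:
    """Replace forbidden control characters with _xHHHH_ sequences."""
    # Protect literal look-alikes first so unescaping stays unambiguous.
    text = _LITERAL.sub("_x005f_", text)
    return _FORBIDDEN.sub(lambda m: f"_x{ord(m.group()):04x}_", text)


def unescape_controls(text: str) -> str:
    """Reverse escape_controls."""
    return _ESCAPED.sub(lambda m: chr(int(m.group(1), 16)), text)


escaped = escape_controls("a\x1bb")  # "a_x001b_b"
```

Both ends would have to agree to apply this, which is the cost of any
stopgap of this kind.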