On Mon, Jun 28, 2004 at 03:01:32PM +0100, Rob Sanderson wrote:
> I've got a gateway to a Z39.50 server which returns characters which I
> can't serialize to XML and/or successfully deserialize, for example the
> ascii escape character.
Yes, XML does not permit characters from 0-31 (except for tab, carriage
return, and line-feed). It does not matter how you write it (&#27;, etc),
it's just not legal.
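A rough sketch of that rule, assuming the Char production from the XML 1.0 spec. The function name is_xml10_char is made up for illustration:

```python
def is_xml10_char(ch: str) -> bool:
    """Return True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line-feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

# Code points below 32 that XML 1.0 rejects (everything except tab, LF, CR).
illegal_controls = [cp for cp in range(32) if not is_xml10_char(chr(cp))]
```

Here is_xml10_char("\x1b") is False for the escape character, and illegal_controls has 29 entries.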
If it's SRW, theoretically I guess the type for terms could be changed
from string to base64 or hexstring (whatever they are called). (Puke.)
XER "solved" the problem by introducing empty tags for the 29 characters
you are not allowed to use in XML. Eg: <esc/>. This won't work with SOAP
toolkits.
Best solution I can suggest is to find a Unicode character which indicates
an unprintable Unicode character (e.g. empty square box like you see in
Windows sometimes? Or even just a period (".")) and replace unprintable
characters with that character. Or expand to "<escape>". (Er, that is,
"&lt;escape&gt;".)
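The replacement idea could look something like this. It is only a sketch: the function name sanitize_for_xml and the choice of U+25A1 (white square) as the default placeholder are assumptions for illustration, not anything SRW defines:

```python
import re

# Control characters below 32 that XML 1.0 forbids (all but tab, LF, CR).
_ILLEGAL_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

def sanitize_for_xml(text: str, placeholder: str = "\u25a1") -> str:
    """Replace each forbidden control character with a placeholder.

    This is lossy: the original characters cannot be recovered.
    """
    return _ILLEGAL_CONTROLS.sub(placeholder, text)
```

Calling sanitize_for_xml("ab\x1bc", ".") gives "ab.c", while tab, line-feed, and carriage return pass through untouched.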
Yes, it loses information. The only alternative is not to use XML string
types, but rather use binary (base64, hexstring, etc) types, which are
typically a pain to use in XSLT. (Or include a display term with "."
instead of unprintable things, and the real term as a hex string if
clients really need it...)
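A possible shape for the display-term-plus-hex-string idea, again only as a sketch. The function names, the dictionary keys, and the use of UTF-8 for the bytes are all assumptions here:

```python
import re

# Control characters below 32 that XML 1.0 forbids (all but tab, LF, CR).
_ILLEGAL_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

def encode_term(term: str) -> dict:
    """Build a lossy display form and a lossless hex form of a term."""
    return {
        # Safe to put in an XML string, but loses the control characters.
        "display": _ILLEGAL_CONTROLS.sub(".", term),
        # Hex digits only, so always legal in XML; UTF-8 is an assumption.
        "hex": term.encode("utf-8").hex(),
    }

def decode_term(hex_value: str) -> str:
    """Recover the real term from its hex form."""
    return bytes.fromhex(hex_value).decode("utf-8")
```

For the term "ab\x1bc" this gives the display form "ab.c" and the hex form "61621b63", and decode_term on the hex form returns the original term.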
Alan