Hi all,
I was just thinking about how I would implement a CQL parser. There was
a corner case I was not sure of.
How are double quotes released (escaped)? That is, how do I write " as
literal text in a query? Is it by using \?
ti="The \"best\" approach"
If so, when I am writing a parser, how do I decide when to process the \
while parsing the CQL text and when to pass the \ through to the Z39.50
server?
For example,
ti="The \"best\" asterisk (\*) to use"
For the above query, what 105 string should I send through?
The "best" asterisk (\*) to use
or
The \"best\" asterisk (\*) to use
CCL (just as a comparison) uses two " in a row inside a string to produce
one ", which is a different releasing mechanism. But I suspect end users
don't really want to worry about this detail. They would just want one
releasing mechanism.
So, a possible CQL parsing rule would be
when inside a quoted string, if you see '\' followed by '"' then
consume the '\' and keep the '"' for the literal text.
But this goes funny at times. If the query was
ti="The backslash character \\"
then is the closing quote released? The \\ is meant to release the
meaning of the backslash itself, so the " that follows should still end
the string. So I need to count the number of '\'s while parsing. I keep
all characters, but if I see a releasing '\' followed by '"', then in
that case (and only that case) I discard the '\'. Seems to work, I
guess...
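A minimal sketch of the rule just described, in Python (chosen only for
illustration). The function name read_quoted_term is hypothetical, and
treating a lone trailing backslash or a missing closing quote as an error
is my assumption, not something CQL specifies.

```python
def read_quoted_term(text, start=0):
    """Parse a double-quoted term that begins at text[start].

    Returns (term, index just past the closing quote). Every character
    is kept, except that a releasing backslash directly before a double
    quote is discarded.
    """
    if start >= len(text) or text[start] != '"':
        raise ValueError("expected an opening double quote")
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # An unreleased quote ends the term.
            return "".join(out), i + 1
        if ch == "\\":
            if i + 1 >= len(text):
                raise ValueError("dangling backslash")  # assumption
            nxt = text[i + 1]
            if nxt == '"':
                out.append('"')       # drop the backslash, keep the quote
            else:
                out.append(ch + nxt)  # keep both, e.g. \* or \\
            # Consuming the pair means the second backslash of \\ can
            # never release whatever comes after it.
            i += 2
            continue
        out.append(ch)
        i += 1
    raise ValueError("unterminated quoted string")


if __name__ == "__main__":
    print(read_quoted_term(r'"The \"best\" asterisk (\*) to use"')[0])
    print(read_quoted_term(r'"The backslash character \\"')[0])
```

Consuming the backslash and the character after it as a pair does the
"counting" implicitly, so no separate backslash counter is needed.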
Should \" be in 105 or not? What is the interpretation of \x or \p?
Is it illegal (which would leave room for something like \xff later)?
Or does \ always release the following character, no matter what that
character is?
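For comparison, a small Python sketch of that last interpretation, where
\ always releases the following character and is itself dropped. The
function name release_all is hypothetical, and rejecting a trailing
backslash is an assumption.

```python
def release_all(body):
    """Drop every releasing backslash and keep the character after it.

    `body` is the text between the quotes, with the quotes removed.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")  # assumption
            out.append(body[i + 1])  # released character, taken literally
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(release_all(r'The \"best\" asterisk (\*) to use'))
```

With this reading, \* and * come out as the same character, so the parser
would have to record separately which characters had been released if the
server needs to tell them apart.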
Thanks,
Alan
--
Alan Kent (http://www.mds.rmit.edu.au/~ajk/)
Project: TeraText Technical Director (http://teratext.com.au) InQuirion Pty Ltd
Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC Australia.
Phone: +61 3 9925 4114 Reception: +61 3 9925 4099 Fax: +61 3 9925 4098