As an implementor, I'd prefer the general case where backslash releases
whatever character follows it. But, for the user, that means every literal
backslash would itself have to be released. Right now, only our few special
characters need to be released when being searched for literally, and it's
probably best to leave it that way.
If you are passing the string, minus the bracketing double-quotes, to the
Z39.50 server as a term with Truncation of 105, then you don't need to
consume the internal backslash-quotes. The Z39.50 server will process them.
All you need to know is to ignore double-quotes preceded by backslashes.
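
To make that concrete, here is a minimal sketch in Python of a scanner that
finds the end of a quoted string and hands back the contents untouched. The
function name read_quoted_term and its (term, next_index) return value are
made up for illustration. It also assumes that a backslash pairs with
whatever character follows it, so that \\" ends the string; that point is
not settled in this thread, so treat it as one possible reading.

```python
def read_quoted_term(query, start):
    """Return (term, next_index) for the quoted string opening at query[start].

    The term is the text between the bracketing double-quotes, copied
    verbatim: backslashes are kept so the server can interpret them.
    """
    if start >= len(query) or query[start] != '"':
        raise ValueError("expected an opening double-quote")
    i = start + 1
    while i < len(query):
        ch = query[i]
        if ch == "\\" and i + 1 < len(query):
            # Assumption: a backslash pairs with the next character,
            # so step over both without changing either of them.
            i += 2
        elif ch == '"':
            # A double-quote not covered by a backslash closes the string.
            return query[start + 1:i], i + 1
        else:
            i += 1
    raise ValueError("unterminated quoted string")
```

Called on Alan's second example with start pointing at the opening quote, this
should return the text between the outer quotes with every backslash still in
place, which is the second of his two candidate 105 strings.
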
Ralph
> -----Original Message-----
> From: Alan Kent [mailto:[log in to unmask]]
> Sent: Thursday, September 05, 2002 11:19 PM
> To: [log in to unmask]
> Subject: CQL parsing question
>
>
> Hi all,
>
> I was just thinking about how I would implement a CQL parser. There was
> a corner case I was not sure of.
>
> How are double quotes released? That is, ", as literal text in a query.
> Is it using \?
>
> ti="The \"best\" approach"
>
> If so, when I am writing a parser, how do I decide when to do \ processing
> when parsing the CQL text, versus passing the \ over to the Z39.50 server?
> For example,
>
> ti="The \"best\" astericks (\*) to use"
>
> For the above query, what 105 string should I send through?
>
> The "best" astericks (\*) to use
>
> or
>
> The \"best\" astericks (\*) to use
>
> CCL (just as a comparison) used two " in a row in a string to generate
> one " so a different releasing mechanism would be used. But end users
> don't really want to be worried about this detail I suspect. They would
> just want one releasing mechanism.
>
> So, a possible CQL parsing rule would be
>
> when inside a quoted string, if you see '\' followed by '"' then
> consume the '\' and keep the '"' for the literal text.
>
> But this goes funny at times. If the query was
>
> ti="The backslash character \\"
>
> then are the quotes released? The \\ is meant to release the meaning of
> the backslash (if you get what I mean). So I need to count the number
> of '\'s while parsing. So I keep all characters, but if I see a releasing
> '\' followed by '"', then in that case (and only that case) I discard the
> '\'. Seems to work I guess...
>
> Should \" be in 105 or not? What is the interpretation of \x or \p?
> Is it illegal? (leaving possible \xff for example later). Or does \
> always release the following character no matter what the character is?
>
> Thanks,
> Alan
> --
> Alan Kent (mailto:[log in to unmask],
> http://www.mds.rmit.edu.au/~ajk/)
> Project: TeraText Technical Director (http://teratext.com.au)
> InQuirion Pty Ltd
> Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
> Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC Australia.
> Phone: +61 3 9925 4114 Reception: +61 3 9925 4099 Fax: +61 3 9925 4098
>