

ZNG Archives



ZNG@C4VLPLISTSERV01.LOC.GOV



Subject: Re: CQL implementation details
From: Hedzer Westra <[log in to unmask]>
Reply-To: Z39.50 Next-Generation Initiative
Date: Thu, 9 Dec 2004 18:22:44 +0100
Content-Type: text/plain

Hi Mike & Rob (and potentially other interested human beings),

On Fri, 3 Dec 2004, Hedzer Westra wrote:
>>>> - how are words separated? The description hints at splitting on
>>>> (white)space only.
>> CQL tutorial, Section 2
[Rob]
> Which is a fine description for an introductory tutorial, but not
> complete.  Especially as it lacks (as discussed) any mention of relation
> modifiers and has the version 1.0 proximity syntax. Implementers might
> use the tutorial as a guide, but in the end it's the specification which
> matters.
Okay, I'll just ignore what I read there then :-)

[Mike]
>>   [space] (separates words of a CQL expression)
> Yes; but this refers to the words that make up the entire query, not
> those embedded within a term.  So this is talking about breaking up the
> query
>        dc.author any "kernighan ritchie"
> into the three tokens
>        index-name: dc.author
>        relation: any
>        term: "kernighan ritchie"
> and not at all about how that term "kernighan ritchie" is to be
> interpreted.
Ah oww. Sorry 'bout that mixup.. My bad.
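
To make the distinction concrete, here is a minimal Python sketch of the query-level split Mike describes. The function name is hypothetical, and it assumes only the simple "index relation term" shape (no boolean operators, parentheses or relation modifiers). How the quoted term itself is broken into words is deliberately left alone.

```python
import shlex


def split_search_clause(clause):
    """Split a simple CQL search clause into index, relation and term.

    Hypothetical helper: whitespace separates the parts of the query,
    but a double-quoted term is kept as one token.
    """
    tokens = shlex.split(clause)  # honours double quotes
    if len(tokens) != 3:
        raise ValueError("expected exactly: index relation term")
    index, relation, term = tokens
    return {"index": index, "relation": relation, "term": term}


parts = split_search_clause('dc.author any "kernighan ritchie"')
print(parts["index"], "|", parts["relation"], "|", parts["term"])
```

The term comes back as the single string "kernighan ritchie"; whether the server then treats it as words or as a string is a separate decision.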

[Rob]
> However, it makes very little difference how the server splits a
> string into words so long as it does it consistently.
I'll make sure our server does that.

Here are Rob and Mike's replies to my implied modifier behaviour:

>> a. operator = with a multi-word term (word separation implementation
>> dependent, should be described in the implementation profile) as well
>> as cql.all and cql.any operators -> default modifier is cql.word
[Rob]
>Yup.

>> b. operator cql.exact -> default modifier is cql.string.
[Mike]
> Well.  We're not talking about it in these terms.  To say "default
> modifier" is misleading as there may legitimately be zero, one or more
> modifiers on a relation.  But, yes, the term _structure_ implied by the
> cql.exact relation is indeed "string".
Noted.

>> Question: does this refer to
>>    1. exact searching w.r.t. splitting of words (which would imply
>>       that cql.word and cql.string are mutually exclusive)
[Mike]
> Yes, they are.  String vs. Words is a fundamental dichotomy that we've
> thrashed out neatly on this list and which should be described in both
> the official documentation and the tutorial.
[Rob]
> They are mutually exclusive.  A cql.string is an opaque set of
> characters that the server should not try to interpret.
Good, those two answers both say the same thing ;-)

>>    2. exact searching w.r.t. pattern matching (which would imply that
>>       cql.masked and cql.string are mutually exclusive)
[Rob]
> I believe so. exact is treated as anchored at both ends, and may not
> have any masking characters.
> =/cql.word   is adjacency.
> =/cql.string is exact.
[Mike]
> No, a masked string is just fine. (Why would we prohibit such a useful
> thing?)
>       dc.title exact "the adventures of *"
> will find
>        The Adventures of Hulk
>        The Adventures of Baron Munchausen
>        The Adventures of the Famous Five
> but _not_
>        The Amazing Adventures of Captain Gladys Stoatpamphlet and her
>        Intrepid Spaniel Stig.
> because the extra word "amazing" breaks the "exact" condition.
Hmm, these two answers differ. I'd prefer Rob's description if nobody
minds; coincidentally, that is the way I've already implemented it :-)
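
The two readings of "exact" can be put side by side in a small Python sketch. Everything here is an assumption for illustration: the function name, the case-insensitive comparison (suggested only by Mike's example), and the treatment of '*' as "any run of characters" and '?' as "one character". Backslash escaping of masking characters is not handled.

```python
import re


def exact_match(term, value, allow_masking):
    """Anchored comparison of a whole field value against a term.

    allow_masking=False: every character of the term is literal
                         (Rob's reading: no masking under exact).
    allow_masking=True:  '*' and '?' act as masking characters
                         (Mike's reading: a masked string is fine).
    """
    parts = []
    for ch in term:
        if allow_masking and ch == "*":
            parts.append(".*")   # any run of characters, possibly empty
        elif allow_masking and ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    # fullmatch anchors the pattern at both ends of the value
    return re.fullmatch("".join(parts), value, flags=re.IGNORECASE) is not None


term = "the adventures of *"
print(exact_match(term, "The Adventures of Hulk", allow_masking=True))
print(exact_match(term, "The Adventures of Hulk", allow_masking=False))
```

With masking allowed, the Hulk title matches and the "Amazing Adventures" title does not; without masking, the term only matches a title that literally ends in an asterisk.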

>> c. operator = with a single term and all other operators -> default
>> modifier is cql.masked
[Mike]
> The masked-vs.-unmasked dichotomy is orthogonal to string-vs.-words.
Good!
[Rob]
> Yes.

>> d. cql.masked implies ??: cql.word or cql.string or none? Maybe this
>> is orthogonal, i.e., cql.masked can be supplied *together* with one of
>> the other five (word, string, isoDate, number, uri) - assuming b.1. is
>> true.
[Rob]
> I think that it only applies by default to word, but that should
> probably be further discussed :)  For example, I would not want it to be
> applied to number, date, or uri.
Makes sense.

>> But then you'd also need to be able to specify cql.unmasked or
>> something to disable pattern matching.
[Mike]
> Yes; there should be a cql.unmasked relation modifier.
[Rob]
> You can escape the pattern characters, or define a new modifier that
> overrides the masking -- for example you might want foo.regexp as a
> different set of masking rules.
If I understand it correctly, Mike suggests extending the CQL context
set, and Rob suggests defining it in our own context set. Let's have
the SRW 1.2 people decide upon this!

>> e. only one of word, string, isoDate, number and uri can be set at
>> the same time for one searchClause
[Rob]
> Yes.
[Mike]
> Correct, because these particular modifiers all represent alternative
> points along the same axis.
Same answer: good!
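
A server could enforce point e with a check along these lines. This is only a sketch: the function name and the exact spelling of the modifier names are assumptions, and cql.masked is let through untouched because it was described above as orthogonal to the structure axis.

```python
# The five modifiers that sit on the same "term structure" axis.
STRUCTURE_MODIFIERS = {
    "cql.word", "cql.string", "cql.isoDate", "cql.number", "cql.uri",
}


def structure_modifier(modifiers):
    """Return the single structure modifier on a relation, or None.

    Raises ValueError if more than one structure modifier is present.
    Other modifiers (e.g. cql.masked) are ignored by this check.
    """
    found = [m for m in modifiers if m in STRUCTURE_MODIFIERS]
    if len(found) > 1:
        raise ValueError("mutually exclusive modifiers: " + ", ".join(found))
    return found[0] if found else None


print(structure_modifier(["cql.word", "cql.masked"]))
```

Passing both cql.word and cql.string raises ValueError; what diagnostic the server should return in that case is not settled in this thread.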

>> - why is sorting defined on XPaths?
[Rob]
> Mostly because it was an easy, existing specification to use to
> specify a path to some data in a structured document. This doesn't mean
> that the server has to actually -do- XPath, just that it should accept
> them and respond appropriately.  For example, if you can sort by exact
> title, author and date, then you might hard wire /record/title,
> /record/creator and /record/date to these sort routines. Then you could
> just respond with unsupported tag path to all other requests.
I've done this indeed. Too bad there isn't a separate spec for sorting
on context set indexes.
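
The hard-wiring Rob describes might look roughly like this in Python. It is a sketch under assumptions: records are plain dicts, the exception class and function names are made up, and no real XPath evaluation takes place; the three paths are simply looked up in a table.

```python
class UnsupportedSortPath(Exception):
    """Raised for any sort path the server has not hard-wired."""


# The only XPaths this hypothetical server accepts, each mapped to a sort key.
SORT_KEYS = {
    "/record/title": lambda rec: rec["title"].lower(),
    "/record/creator": lambda rec: rec["creator"].lower(),
    "/record/date": lambda rec: rec["date"],
}


def sort_records(records, xpath, ascending=True):
    """Sort records by a hard-wired path; reject everything else."""
    try:
        key = SORT_KEYS[xpath]
    except KeyError:
        raise UnsupportedSortPath(xpath) from None
    return sorted(records, key=key, reverse=not ascending)


records = [
    {"title": "Unix", "creator": "Ritchie", "date": "1974"},
    {"title": "The C Programming Language", "creator": "Kernighan", "date": "1978"},
]
print([r["creator"] for r in sort_records(records, "/record/creator")])
```

Any other path, such as /record/subject, raises UnsupportedSortPath, which the server would translate into the "unsupported tag path" response.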

>> - is there an Open Source SRU/SRW tester (like
>> http://oai.dlib.vt.edu/cgi-bin/Explorer/oai2.0/testoai for OAI) or a
[Rob]
> Not yet, but it's on my list of things to do.
See below:
[Mike]
>>> I see that Marc has already answered your questions about open
>>> source clients.
>> Did he?
> Yes.  He recommended the fine YAZ command-line client ("yaz-client")
> for SRW, and the web-browser of your choice, or wget, for SRU.
Very good! I guess his e-mail got lost in my spam filter; I retrieved
the message from the archive and got CQLJava, which contains a set of
XSLs that turn IE into an SRU browser. It needed a few tweaks, but works
great! I didn't implement SRW, so there was no need for yaz.
BTW I haven't tried creating a unit testing program yet, but I expect it
to be quite simple; Marc sent me a Unix shell script that will do that. I
haven't tested it on my Cygwin yet.
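
Since SRU is just an HTTP GET, the URL to paste into a browser or hand to wget can be built in a few lines of Python. This is a sketch only: the base URL is a made-up placeholder, version 1.1 is assumed, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, maximum_records=10):
    """Build a searchRetrieve URL for an SRU server (assumed version 1.1)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,            # urlencode takes care of spaces and quotes
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# http://sru.example.org/db is a placeholder, not a real server.
print(build_sru_url("http://sru.example.org/db", 'dc.author any "kernighan ritchie"'))
```

A test script could loop over a list of CQL queries, fetch each URL and check the response, which is roughly what a shell script with wget would do.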

>> - the ZeeRex documentation is a bit concise on configInfo. What
>> settings exactly are 'setting', 'default' and 'supports'?
[Rob]
> setting:  Something which cannot be changed.  You might have a setting
>   of 'maximumRecords' -- the maximum records you can retrieve at once.
> default:  Something which can be changed but has a default value.  For
>   example 'retrievalSchema' -- the default schema you'll get your
>   records in unless you specify one.
> supports:  Some feature of the protocol which the server supports.
>   For example sorting, proximity or the scan operation.
Yes, I understand that from the description; I just hoped for a
definitive list saying which configInfo @type needs which element. For
now I've just guessed those, and I hope SRU client implementors made the
same guesses. BTW: if this is implementation dependent, I would have
chosen to express those three values as an attribute, not as an element
name. But that's another discussion...
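
For what it's worth, this is roughly how a client might read such a configInfo block. It is a sketch full of guesses: the sample XML leaves out the ZeeRex namespace, the pairing of each @type with an element is exactly the kind of guess mentioned above (the type names are taken from Rob's examples), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; namespace omitted, type/element pairing guessed.
SAMPLE = """
<configInfo>
  <setting type="maximumRecords">50</setting>
  <default type="retrievalSchema">dc</default>
  <supports type="proximity"/>
  <supports type="scan"/>
</configInfo>
"""


def read_config_info(xml_text):
    """Collect settings, defaults and supported features by their @type."""
    config = {"setting": {}, "default": {}, "supports": set()}
    for element in ET.fromstring(xml_text):
        type_name = element.get("type")
        if element.tag in ("setting", "default"):
            config[element.tag][type_name] = (element.text or "").strip()
        elif element.tag == "supports":
            config["supports"].add(type_name)
    return config


config = read_config_info(SAMPLE)
print(config["setting"], config["default"], sorted(config["supports"]))
```

A client written this way breaks as soon as a server puts, say, maximumRecords in a default element instead, which is why a definitive list would help.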

Best regards,

Hedzer Westra, Systems Developer

Adlib | Information Systems
Reactorweg 291
3542 AD Utrecht
Postbus 1436
3600 BK Maarssen
tel: +31-30-241 1885
www: http://www.adlibsoft.com
