ZNG September 2002

Subject: Re: Where Are We Now?
Reply-To: Z39.50 Next-Generation Initiative
Date: Mon, 30 Sep 2002 11:01:42 +1000
Content-Type: text/plain

On Fri, Sep 27, 2002 at 11:01:07AM -0400, LeVan,Ralph wrote:
> I am content with case-insensitive index names.  I only suggested
> case-sensitivity because I'm in the middle of doing a bunch of XML stuff and
> that is all case-sensitive.

Does CQL allow Unicode in names? If so, there is a benefit to
case-sensitive names: you just do an exact string comparison. No need
to decompose, map case, etc., or worry about Unicode rules. I am not
really stressed about it, just pointing out that with Unicode,
case-insensitive checks are harder than with good old ASCII.
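
To illustrate the difference, here is a minimal Python sketch. The function names are hypothetical, and the caseless recipe (normalize, case-fold, normalize again) is one common approach, not something CQL or SRW specifies.

```python
import unicodedata


def exact_match(a: str, b: str) -> bool:
    # Case-sensitive names: a plain code-point comparison, no Unicode tables.
    return a == b


def caseless_match(a: str, b: str) -> bool:
    # Case-insensitive names: decompose, case-fold, then decompose again,
    # because folding can yield text that is no longer normalized.
    def fold(s: str) -> str:
        decomposed = unicodedata.normalize("NFD", s)
        return unicodedata.normalize("NFD", decomposed.casefold())

    return fold(a) == fold(b)
```

With ASCII names the two differ only by a lowercase call; with Unicode the caseless version has to know about decomposition and about folds such as "ß" to "ss".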

> I believe we still need first-in-field and last-in-field indicators.  I
> still like the caret ('^') and dollar-sign as new special characters at the
> beginning of words to do that.

If these operators map onto attributes, I would prefer to keep them
out of the string literals. That is, I believe '^' will map onto
the first-in-field attribute, not be included in the actual term.
My preference is to keep the text in the string literals exactly what is
sent to the server - no local processing required. If this is desirable,
then can anchors be turned into modifiers too? lanc/leftanchored/first/start,
ranc/rightanchored/last/end, lranc, etc. Or else something outside the
string literal?

    ( dc.title = ^ "hello there" ^ )
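
As a rough sketch of the modifier approach, the Python below maps anchor modifiers to first-in-field and last-in-field flags and never touches the string literal. The modifier spellings are the alternatives floated above; the function name and the tuple result are assumptions made for illustration only.

```python
# Candidate spellings from the discussion above; none of them is settled.
LEFT = {"lanc", "leftanchored", "first", "start"}
RIGHT = {"ranc", "rightanchored", "last", "end"}
BOTH = {"lranc"}


def anchor_attributes(modifiers):
    """Return (first_in_field, last_in_field) for a list of modifier names."""
    names = {m.lower() for m in modifiers}
    first = bool(names & (LEFT | BOTH))
    last = bool(names & (RIGHT | BOTH))
    return first, last
```

Because the anchors arrive as modifiers, the term text can be sent to the server exactly as written, with no local stripping or escaping of '^' and '$'.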

> Ralph

Also, Robert Sanderson wrote:
> how is
>    indexset.index:token = "term"
> any less extensable than:
>    indexset.index token "term"

It's not a silly idea at all. In fact, if you remove the ':' character
and allow words or a set of symbols as modifiers, then '=' just becomes
the attribute 'equals'. Most servers would default to this anyway
(or has it changed in AA?). It means queries could be written as

    dc.title stem = "hi there"
or
    dc.title stem "hi there"

The attribute list would be a little different (the first would include
the 'equals' attribute value whereas the second would not), but it does
give query writers very good control over the attribute lists, while
giving some flexibility of what individuals think is 'good style' when
writing queries.
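
A minimal Python sketch of how such clauses might turn into attribute lists follows. The token pattern, the symbol names other than 'equals', and the function name are all assumptions for illustration; nothing here is defined by CQL.

```python
import re

# Hypothetical tokens: a quoted term, a run of symbols, or a dotted word.
TOKEN = re.compile(r'"(?P<term>[^"]*)"|(?P<symbol>[=<>]+)|(?P<word>[\w.]+)')

# 'equals' comes from the text above; the other names are made up.
SYMBOL_NAMES = {"=": "equals", ">": "greater-than", "<": "less-than"}


def attribute_list(clause):
    """Split `index modifier... "term"` into (attributes, term)."""
    attrs, term = [], None
    for m in TOKEN.finditer(clause):
        if m.lastgroup == "term":
            term = m.group("term")
        elif m.lastgroup == "symbol":
            sym = m.group("symbol")
            attrs.append(SYMBOL_NAMES.get(sym, sym))
        else:
            attrs.append(m.group("word"))
    return attrs, term
```

Calling `attribute_list('dc.title stem = "hi there"')` includes 'equals' in the list, while the form without '=' leaves it out, so the server is free to apply its own default.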

Someone else asked what does the following mean?

    dc.title relevance > "a"

I have no idea. But is it necessary in the grammar to disallow such
constructs? The idea is to have a general way to form attribute lists.
Servers can choose to reject/munge/tweak/ignore whatever combinations they
do not support (or whatever SRW mandates about such behaviour). I would
prefer to keep CQL open to new attributes etc. (I keep thinking about GEO),
so I think it's out of scope to work out what attribute combinations don't
make sense.


Grammar thought: going back to the idea of what an attribute list is
in the first place, why not include dc.title as a modifier?

query-term = "(" modifier* string-literal ")"
modifier = index-set "." access-point     dc.title, etc
         | named-modifier                 stem, relevance, etc
         | symbolic-modifier              =, >, <, etc

I will admit it does allow some bizarre-looking queries, but it
would also allow, for example, simply omitting the index name,
and it would allow nested attribute lists to be supported
(you include multiple access points, defining an attribute path
as supported in AA for nested attributes).

A more conservative grammar would probably be better. E.g., force
zero or more dc.title etc. to come first, followed by zero or more
textual modifiers, followed by an optional (zero or more?) symbolic
modifier, followed by the search text. Some power, but a more consistent
look to queries.

query-term =
    "(" access-points* named-modifiers* symbolic-modifier? string-literal ")"

This stops people writing
    (stem=dc.title "hi")
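
One way to check a query term against this conservative ordering is sketched below in Python. The token definitions (what counts as an access point, a named modifier, or a symbolic modifier) are assumptions chosen to fit the examples in this message, not part of any agreed CQL grammar.

```python
import re

# Hypothetical tokens; an access point is assumed to look like set.index.
TOKEN = re.compile(
    r'\s*(?:(?P<lparen>\()|(?P<rparen>\))|(?P<string>"[^"]*")'
    r'|(?P<access>\w+\.\w+)|(?P<named>\w+)|(?P<symbolic>[=<>]+))'
)

# One letter per token kind, so the ordering rule becomes a tiny regex.
CODES = {"lparen": "(", "rparen": ")", "access": "A",
         "named": "N", "symbolic": "S", "string": "T"}

# "(" access-points* named-modifiers* symbolic-modifier? string-literal ")"
SHAPE = re.compile(r"\(A*N*S?T\)")


def tokenize(text):
    """Return the list of token kinds, or raise ValueError on bad input."""
    text = text.rstrip()
    pos, kinds = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kinds.append(m.lastgroup)
        pos = m.end()
    return kinds


def is_conservative_term(text):
    """True if the term follows the conservative ordering above."""
    try:
        shape = "".join(CODES[kind] for kind in tokenize(text))
    except ValueError:
        return False
    return SHAPE.fullmatch(shape) is not None
```

Relaxing `SHAPE` to `\([ANS]*T\)` would give the more permissive grammar, where modifiers may appear in any order.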

Alan

PS: Putting parentheses around everything is pretty ugly in my opinion
too, and quite unusual for such grammars, unless you like Lisp.
I have not read the web site recently, relying instead on this list for
people to raise things that get changed, so I had not realized the
meeting had gone this way.
--
Alan Kent (mailto:[log in to unmask], http://www.mds.rmit.edu.au/~ajk/)
Project: TeraText Technical Director (http://teratext.com.au) InQuirion Pty Ltd
Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC Australia.
Phone: +61 3 9925 4114  Reception: +61 3 9925 4099  Fax: +61 3 9925 4098
