ZNG Archives

ZNG February 2002

Subject: Betr.: CQL - what do people want?
From: Theo van Veen <[log in to unmask]>
Reply-To: Z39.50 Next-Generation Initiative
Date: Thu, 14 Feb 2002 11:19:39 +0100
Content-Type: text/plain

Some general remarks after reading the other discussions on CQL.

1.
Search fields identified by a two-part identifier, where the first part identifies the scope of the second part, carry the risk that there will be many scopes, and it will become very difficult to make sure that "one search fits all". I have a very strong preference for forcing search terms into a single scope (a default attribute set).
Maybe I see it wrong. Are there examples where it makes sense to distinguish between title and dc.title in searching?

2.
We will use SRU to query multiple databases, and I would like to send a query unchanged to all these targets. That is the main reason for standardisation and our main reason for using SRU.

3.
A very important scenario will be: users with a web browser that supports XSL/XML and an HTML page in which the user enters one query that is sent to multiple servers. The results are caught in multiple browser windows that do the XSL transformation, monitored by the main search window. We have this scenario working, but the main problem now is indeed that different (external) targets require different queries.
NB. This scenario is called the "personal portal" because it does not require any central portal for querying different SRU targets and every user can have his own portal (being just an HTML page and one or more XSL stylesheets) as long as this portal speaks SRU.
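
As a rough illustration of "one query, many targets", the sketch below builds one search URL per target from the same, unchanged query string. The target URLs are made up, and the parameter names (operation, query, maximumRecords) are an assumption borrowed from later SRU drafts, not something fixed in this discussion.

```python
from urllib.parse import urlencode

# Hypothetical SRU base URLs; replace with real targets.
TARGETS = [
    "http://sru.example.org/catalogue",
    "http://sru.example.net/articles",
]


def build_search_urls(query, targets=TARGETS, max_records=10):
    """Return one search URL per target, all carrying the same query string."""
    # Parameter names are an assumption (they follow later SRU drafts).
    params = urlencode({
        "operation": "searchRetrieve",
        "query": query,
        "maximumRecords": max_records,
    })
    return [f"{base}?{params}" for base in targets]


if __name__ == "__main__":
    for url in build_search_urls('title:"power and fame"'):
        print(url)
```

Each URL could then be opened in its own browser window (or frame) that applies the XSL stylesheet, which is all the "personal portal" needs.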

4.
The idea that in SRU users must type a query exactly as it is sent to the targets is not correct. The HTML search page can do all the processing required to convert a user query to CQL. I would like, however, CQL and what the user types in to be the same, or as much alike as possible.

5.
The preferred language for index types is English rather than numbers. We use lots (>100) of index types, among them the most conventional types such as title, author, keyword, subject, ISBN and ISSN. Some of the other index types are specific to particular databases, and it does not hurt when an index type that does not exist in all databases is sent to multiple targets: the others just do not give any hits. This seems to be a very attractive approach for most people (users, developers, database owners) involved in the development of websites for different projects/databases.
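
A toy sketch of that behaviour, with made-up targets and data: a target that does not know an index type simply returns no hits instead of raising an error, so the same query can go to every target. The Target class and search_all helper are hypothetical names used only for illustration.

```python
class Target:
    """Toy search target holding a few in-memory indexes (made-up data)."""

    def __init__(self, name, indexes):
        self.name = name
        self.indexes = indexes  # {index_type: {term: [record_id, ...]}}

    def search(self, index_type, term):
        """Return matching record ids; an unsupported index type gives no hits."""
        postings = self.indexes.get(index_type.lower(), {})
        return list(postings.get(term.lower(), []))


library = Target("library", {"title": {"fame": [1, 7]}, "isbn": {"0123456789": [7]}})
journals = Target("journals", {"title": {"fame": [3]}, "issn": {"1234-5678": [3]}})


def search_all(targets, index_type, term):
    """Send the same (index type, term) pair to every target."""
    return {t.name: t.search(index_type, term) for t in targets}


if __name__ == "__main__":
    print(search_all([library, journals], "issn", "1234-5678"))
    print(search_all([library, journals], "title", "fame"))
```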

6.
I prefer queries like:

title:power and fame (boolean)
title:"power and fame" (phrase)
creator:xyz and subject:standards
author:smith (it is up to the target to convert this to creator:smith)

I think it makes sense to use the Dublin Core fields as index types (so we do not need the dc. prefix). This can be complemented with namespaces from relevant Application Profiles (like the Library Application Profile).
Everybody is free to use exotic index types in the query as well as in their databases, but it is obviously in everyone's own interest to conform to a single standard for the conventional index types.
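
A minimal sketch of how a target might read queries in this style. It is not a CQL grammar: the tokenizer, the ALIASES table and the rule that a bare term inherits the preceding index type (so that "title:power and fame" searches both words in the title) are assumptions made for illustration. Only the author to creator mapping comes from the examples above.

```python
import re

# Hypothetical alias table a target might apply; only author -> creator
# is taken from the examples above.
ALIASES = {"author": "creator"}

# Optional "index:" prefix, followed by a quoted phrase or a single word.
TOKEN = re.compile(r'(?:(\w+):)?(?:"([^"]*)"|(\S+))')
BOOLEANS = {"and", "or", "not"}


def parse(query):
    """Split a query into ("op", word) and ("term", index, text) tuples."""
    tokens, current = [], None
    for match in TOKEN.finditer(query):
        index, phrase, word = match.groups()
        if index is None and phrase is None and word.lower() in BOOLEANS:
            tokens.append(("op", word.lower()))
            continue
        if index is not None:
            # Normalise the index type and apply the target's aliases.
            current = ALIASES.get(index.lower(), index.lower())
        # Assumption: a bare term inherits the most recent index type.
        tokens.append(("term", current, phrase if phrase is not None else word))
    return tokens


if __name__ == "__main__":
    for q in ('title:power and fame', 'title:"power and fame"', 'author:smith'):
        print(q, "->", parse(q))
```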

Theo




>>> [log in to unmask] 13-02-02 08:02 >>>
I snuck some questions about CQL into a separate mail, but got no response
yet. So I thought I would try a more direct route. What are the goals for
CQL in terms of syntax?

Is a goal to make it reasonably human readable?

Is a goal to make it very internationally friendly (eg: by using numbers
in preference to symbolic identifiers that mean something in English)?

Do people want to be able to use multiple attribute sets in one query?

Do people want to be able to take one query and issue it against
multiple collections without change?

Is a goal to have a direct mapping to Z39.50 constructs?

My personal preferences (influenced no doubt by being English speaking)
is to make it human readable instead of numbers. The field names for
searching on would be symbolic names (not full text) and would relate
I guess to metadata standards.

Note, there is a CQL page up already under ZiNG, but it's pretty terse.
And I would prefer a few things to be different (as always! :-)

To make things concrete, here are some queries:

    dc.Title = "Power and Fame"
    dc.Title = ("Power" AND "Fame")
    dc.Contributor = "LOC" AND dc.Subject = "Standards"
    dc.Contributor = "LOC" AND agls.Identifier = "xyzzy"
    bib1.Author, dc.Contributor = "Smith"

Basically, I suggest:
- Queries are Unicode text (UTF-8 or whatever).
- All text to be searched to always be inside quotes. This allows new
  reserved words to be added later without breaking old queries.
- All reserved words to be upper case only (debatable).
- Fields to be searched to be identified by a two-part identifier
  where the first part identifies the scope for the second part.
  Eg: dc.title.

What I don't know is how to define a set of scope names (are they
attribute sets? Or just a logical grouping for names? Eg: dublin core
attributes are defined in the Bib-1 attribute set at present)

I am not sure if I would want to use the current exact Z39.50
attribute sets etc for mapping onto CQL field names. (opinions?)

Then how to manage the population of field-set names? Should there
be a central CQL registry of such names? If it can change per server,
then reusing a query against multiple servers seems doomed.
Should sites be able to define their own new, local sets without
going to the global registry? Instead of 'dc.Title', should it be
a URL? That is, dublin core XML namespace URI + DC element name?
Or should queries be CQL text plus a set of definitions for mapping
"dc" to "Dublin Core URI" etc.
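
To make the last option concrete, here is a small sketch in which a query travels with its own prefix definitions and each two-part field name is expanded to a full URI. The "dc" URI is the Dublin Core 1.1 element namespace; the "agls" URI is a placeholder, and the resolve function is a hypothetical helper, not part of any CQL proposal.

```python
# Prefix definitions that would accompany the query. The "agls" URI is a
# placeholder, not the real namespace.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "agls": "http://example.org/agls/",
}


def resolve(field, prefixes=PREFIXES):
    """Expand 'scope.name' to a full URI; raise KeyError for an unknown scope."""
    scope, sep, name = field.partition(".")
    if not sep or not name:
        raise ValueError(f"expected a two-part identifier, got {field!r}")
    return prefixes[scope.lower()] + name.lower()


if __name__ == "__main__":
    print(resolve("dc.Title"))
    print(resolve("agls.Identifier"))
```

A server that has never heard of the "agls" scope could still tell, from the URI, whether it supports the field, which is the point of shipping the definitions with the query.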

The pattern match chars don't seem to follow any existing standards.
(To be more precise, it mixes several existing standards). I would
stick to CCL (which is the # and ?) and drop '*'. My rationale
is I want to map it to Z39.50 easily. Z39.50 has got a CCL regex
attribute already. I don't mind using a different one - but I think
it's important to be able to map the patterns through to some existing
syntax in Z39.50.
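
As a sketch of what "mapping the patterns through" could look like, the function below turns a CCL-style masked term into a regular expression. It assumes '#' stands for exactly one character and '?' for any run of characters; that reading should be checked against the CCL and Z39.50 documents before relying on it. Every other character, including '*', is treated literally.

```python
import re


def ccl_to_regex(pattern):
    """Translate a masked term into a compiled, case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "#":
            parts.append(".")       # assumed: exactly one character
        elif ch == "?":
            parts.append(".*")      # assumed: any run of characters
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE)


if __name__ == "__main__":
    print(bool(ccl_to_regex("wom#n").fullmatch("woman")))
    print(bool(ccl_to_regex("comput?").fullmatch("computers")))
```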

Enough to spark off some conversation?

Alan

--
Alan Kent (mailto:[log in to unmask], http://www.mds.rmit.edu.au)
Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC Australia.
Phone: +61 3 9925 4114 Reception: +61 3 9925 4099 Fax: +61 3 9925 4098
