Subject: CQL Simple Names (IMPT!)
From: Robert Sanderson <[log in to unmask]>
Reply-To: Z39.50 Next-Generation Initiative
Date: Tue, 12 Nov 2002 18:46:52 +0000

As I unfortunately predicted, an extremely thorny problem was discovered
yesterday.  Although this mail is very long, I wanted to cover all of the
ground that we did yesterday so as to hopefully not go over it again.
The end result is a more functional CQL specification and a more
interoperable SRW.


The current situation:

Record schemas and index sets are referred to by simple names, which are
described in the Explain record.  The client is expected to retrieve and
parse that record, then use the information in later queries to the server.

The issue:

There is no session for which the Explain record can be declared valid.
Even by the time that the client reconnects to perform its first
query, the Explain may be invalid.  While this particular situation
may be unlikely, it is significantly more likely that the Explain
information will change over the course of a single user's interaction
with the database.  The only resolution using the current system is to
fetch the Explain record repeatedly -- the more often it is fetched
the more likely it is to be valid.  This is simply not supportable.

'Solutions'

The following solutions were discussed:

* Have a guaranteed valid-until time for Explain records.
  BUT this is just more 'session' support, which has already been rejected
  (a la authenticationToken).  Also, servers with constant connections
  would never be able to change.

* Retrieve the last modified time for the Explain record.
  BUT this doesn't solve the underlying issue.  It just reduces the amount
  of data transferred, making it easier to ask repeatedly, but you still
  need to ask.

* Maintain a registry of names for all schemas and indexsets.
  BUT it's a lot of work and people will ignore it.  First in gets the best
  short names.  So we end up with SRW-Rob-IndexSets-myIndexSet.  Or
  'SRW cybersquatters'.

* Send the URI for the indexSet/Schema in place.
  BUT ugly for Schemas and intolerable for IndexSets. Bad for SRU.

* Send the mapping between URI and simple name in a separate parameter
  BUT CQL needs to be able to stand alone.  This ties it to SRW again and
  already that is looking to be an unacceptable solution.  Mike reports
  that people are asking for CQL support in non-SRW-focused products /already/.
  The query simply -has- to somehow stand by itself and not rely on other
  information in the request.


Final Solution:

The only way to be sure is to send the URIs, not simple names.  This has
been rejected in the past for SRU because of URL length, but that is
very unlikely ever to be an issue in practice: far less likely than the
problem it solves.

Schema URIs can be sent directly and should always be used, making the
simple names for schemas redundant.  The URI appears in the recordSchema
request parameter, the schema parameter of sort and the schema field of
the returned record.
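
To give a feel for the URL length concern, here is a rough Python sketch of
an SRU-style request that carries the schema URI directly.  The base URL,
the helper name sru_url and the use of the Dublin Core URL as a schema
identifier are placeholders for illustration only; recordSchema is the
parameter named above and query is assumed to carry the CQL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def sru_url(base, query, schema_uri):
    """Build a request URL that sends the schema as a full URI."""
    # 'base' is a hypothetical endpoint; the parameter values are
    # percent-encoded by urlencode.
    params = {"query": query, "recordSchema": schema_uri}
    return base + "?" + urlencode(params)

# Placeholder endpoint and schema URI, for illustration only.
url = sru_url("http://example.org/sru",
              'dc.title = "fish"',
              "http://www.dublincore.org/")

print(len(url))  # total length of the request URL
print(parse_qs(urlsplit(url).query)["recordSchema"])  # URI survives the round trip
```

Even with the URI percent-encoded, a request of this shape stays far below
the lengths at which URLs typically become a problem.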

Indexsets on the other hand need to be typable.  Our solution for this is
to send the mapping used to the server, rather than hoping that the
server's mapping hasn't changed since it was last fetched.  This cannot be
done in a separate parameter, so we need to change X/CQL.

This is a relatively simple change -- we remove the simple names from the
Explain record for record schemas and everywhere that they were used we
now use the full URI identifier.  The change for CQL was designed (with
much agonising) to be completely backwards compatible.  All currently
valid CQL queries will remain valid after this change.

CQL Specifics:

The change for CQL is to allow a mapping to be sent before any cql-query
or searchClause.  The mapping applies to anything contained within that
searchClause or boolean triple.

After trying many possibilities, we arrived at the following syntax, which
we believe to be unambiguous and not to require multiple-token lookahead:
        '>' [identifier '='] term
Here identifier is the simple name and term is the URI to which it is assigned.
If "identifier =" is omitted, the term gives the default index set URI.
The construct can be repeated to give multiple name definitions.

For example:

> dc="http://www.dublincore.org/" > b="http://www.loc.gov/.../bath/"
(dc.title = "fish" and b.author = "^Smith, J*")

which is equivalent to

( > dc="http://www.dublincore.org/" dc.title = "fish" and
  > b = "http://...bath/" b.author="^Smith, J*" )

Other examples:

( > "http://www.dublincore.org" title = "fish" )

( > b="http://.../bath/" > "http://www.dublincore.org"
    (b.author = "smith" and title = "fish")
)
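
As a rough illustration of the syntax above (not part of the proposal
itself), the leading definitions can be peeled off a query with a single
token of lookahead.  This Python sketch makes several assumptions: the names
TOKEN and parse_prefixes are invented here, only definitions at the very
start of the query are handled (not ones nested inside parentheses), and the
default index set is stored under the key None.

```python
import re

# One token: '>', '=', a double-quoted string, or a bare word.
TOKEN = re.compile(r'\s*(>|=|"[^"]*"|[^\s>="()]+)')

def parse_prefixes(query):
    """Split leading  '>' [identifier '='] term  definitions off a query.

    Returns (prefix_map, rest_of_query).
    """
    prefixes = {}
    pos = 0
    while True:
        m = TOKEN.match(query, pos)
        if not m or m.group(1) != ">":
            break  # no further definitions
        first = TOKEN.match(query, m.end())
        if not first or first.group(1) in (">", "="):
            raise ValueError("expected identifier or URI after '>'")
        # Single-token lookahead: is the next token '='?
        nxt = TOKEN.match(query, first.end())
        if nxt and nxt.group(1) == "=":
            uri = TOKEN.match(query, nxt.end())
            if not uri or uri.group(1) in (">", "="):
                raise ValueError("expected URI after '='")
            prefixes[first.group(1)] = uri.group(1).strip('"')
            pos = uri.end()
        else:
            # No 'identifier =': the term is the default index set URI.
            prefixes[None] = first.group(1).strip('"')
            pos = first.end()
    return prefixes, query[pos:].strip()

print(parse_prefixes('> "http://www.dublincore.org" title = "fish"'))
```

A full CQL parser would do the same thing at the start of every searchClause
or boolean triple rather than only once at the top.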

These index set definitions are optional.  If you're sending the search to
a database that you are confident has not changed its configuration, then
you can still use the current method.

We tried MANY other variants, but this was the neatest with the least
impact on the current specification.

In XCQL this translates as an optional element 'prefixes' at the beginning
of either searchClause or triple, which contains a sequence of 1 or more
'prefix' elements, each of which contains a name/identifier map.

<triple>
  <prefixes>
    <prefix>
      <name>dc</name>
      <identifier>http://www.dublincore.org/</identifier>
    </prefix>
    <prefix>
      <name>bath</name>
      <identifier>http://www.loc.gov/.../bath/</identifier>
    </prefix>
  </prefixes>
  <boolean><value>and</value></boolean>
  <searchClause>
     ...
  </searchClause>
  <searchClause>
     ...
  </searchClause>
</triple>
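
For anyone generating XCQL programmatically, the 'prefixes' element above
might be built along these lines.  This is only a sketch using Python's
standard xml.etree.ElementTree; the function name build_prefixes is invented
here, and how a default index set (one with no name) should be serialised is
not covered above, so this sketch leaves that case out.

```python
import xml.etree.ElementTree as ET

def build_prefixes(mapping):
    """Return a <prefixes> element with one <prefix> per name/URI pair."""
    prefixes = ET.Element("prefixes")
    for name, uri in mapping.items():
        prefix = ET.SubElement(prefixes, "prefix")
        ET.SubElement(prefix, "name").text = name
        ET.SubElement(prefix, "identifier").text = uri
    return prefixes

# Attach the definitions as the first child of a triple, as in the example.
triple = ET.Element("triple")
triple.append(build_prefixes({"dc": "http://www.dublincore.org/"}))

print(ET.tostring(triple, encoding="unicode"))
```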


Other side effects:

This makes broadcast searches possible. You can send the same query to all
servers and ask for DC records back.
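
To sketch why this helps broadcast searching: once the mapping travels with
the query, each server can resolve an index by URI and ignore whatever short
names it uses locally.  The Python below is illustrative only; resolve_index,
SERVER_A, SERVER_B and the internal index names are all made up, and a query
with no definitions would fall back on the server's own configuration, which
is not shown.

```python
def resolve_index(index, prefixes):
    """Turn 'dc.title' into (index_set_uri, 'title') using the sent mapping."""
    if "." in index:
        name, local = index.split(".", 1)
        return prefixes[name], local
    # Unprefixed index: use the default index set (stored under None here).
    return prefixes[None], index

# Hypothetical servers that key their indexes by URI, not by short name.
SERVER_A = {("http://www.dublincore.org/", "title"): "a_title_index"}
SERVER_B = {("http://www.dublincore.org/", "title"): "b_ti"}

sent = {"dc": "http://www.dublincore.org/"}
key = resolve_index("dc.title", sent)
print(SERVER_A[key], SERVER_B[key])  # same query, each server's own index
```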

With no centrally maintained lists, the uptake will likely be greater as
it doesn't rely on a single point.  Communities can define their own
record schemas and indexsets and not have trouble when the bibliographic
community starts asking the homewares community for bath.author as opposed
to bath.manufacturer. This is also less work for Ray and index set/record
schema authors.  A centrally maintained list of record schemas would be
impossible.

Rob
--
      ,'/:.          Rob Sanderson ([log in to unmask])
    ,'-/::::.        http://www.o-r-g.org/~azaroth/
  ,'--/::(@)::.      Special Collections and Archives, extension 3142
,'---/::::::::::.    Twin Cathedrals:  telnet: liverpool.o-r-g.org 7777
____/:::::::::::::.              WWW:  http://liverpool.o-r-g.org:8000/
I L L U M I N A T I
