ZNG  May 2002

Subject: Re: revised Bath/CQL searches
From:
Reply-To: Z39.50 Next-Generation Initiative
Date: Wed, 22 May 2002 11:20:56 +1000


On Tue, May 21, 2002 at 07:51:53AM -0400, LeVan,Ralph wrote:
> > 1. bath.authorWord (can be used with relation(?) and
> > truncation operators)
> >
> >     Attribute Type    Attribute Values    Attribute Names
> >     --------------    ----------------    ---------------
> >     Use (1)           1003                author
> >     Relation (2)      3                   equals
> >     Position (3)      3                   any position in field
> >     Structure (4)     2                   word
> >     Truncation (5)    100                 do not truncate
> >     Completeness (6)  1                   incomplete subfield
>
> I understand that this is much more in line with the explicit intent of the
> Bath Profile folks, but I don't like it.  Specifically, I don't want the
> truncation rules to change from index to index.
>
> Besides, I don't believe the intent of the Bath Profile was to prohibit
> truncation, just to define a type of search that did not require it.
>
> Ralph

I agree with you 100%. I want to support right truncation on bath.authorWord.
My mail must not have been clear enough. I will expand with motivation etc.

I want CQL to support the Bath profile.
I want CQL to support the Bath profile without any special rules
built into the definition of CQL that are just for Bath.
I don't want CQL to have Bib-1 knowledge (I would like it to be generic).

To achieve this, a bath.authorWord definition as follows is not sufficient:

    Attribute Type    Attribute Values    Attribute Names
    --------------    ----------------    ---------------
    Use (1)           1003                author
    Position (3)      3                   any position in field
    Structure (4)     2                   word
    Completeness (6)  1                   incomplete subfield

(Note: "Bib-1" should really have been included as a column in the table
above because the type/value pairs should really be attrset/type/value
triples. I have added a new column in all the following tables.)

With this definition, CQL needs to know Bib-1 and/or Bath in order to
insert the truncation and relation type/values. Otherwise the query

    bath.authorWord=smith

will not be a valid Bath query because it is missing the relation and
truncation attribute values. Since I am trying to avoid special Bib-1
and Bath knowledge, I dislike CQL having to have special knowledge
that it must insert missing attribute values. So instead, I proposed
that the full definition be used for index names.

    Attribute Set   Attribute Type    Attribute Values    Attribute Names
    -------------   --------------    ----------------    ---------------
    Bib-1           Use (1)           1003                author
    Bib-1           Relation (2)      3                   equals
    Bib-1           Position (3)      3                   any position in field
    Bib-1           Structure (4)     2                   word
    Bib-1           Truncation (5)    100                 do not truncate
    Bib-1           Completeness (6)  1                   incomplete subfield

If you specify a query such as

    bath.authorWord=smith

then to me the '=' symbol means nothing (it's just a separator between
the index name and the term to search on). So just grab the full attribute
list above and search on it. This is Bath conformant, and no special
knowledge is required in CQL.
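
As a rough illustration of that lookup (a sketch only; the INDEX_DEFS
table and the resolve() helper are hypothetical names made up for this
example, not part of CQL or any toolkit), the index name keys straight
into a stored attribute list and '=' is treated purely as a separator:

```python
# Hypothetical index table: each index name maps to its full list of
# (attribute set, attribute type, attribute value) triples.
INDEX_DEFS = {
    "bath.authorWord": [
        ("Bib-1", 1, 1003),  # Use: author
        ("Bib-1", 2, 3),     # Relation: equals
        ("Bib-1", 3, 3),     # Position: any position in field
        ("Bib-1", 4, 2),     # Structure: word
        ("Bib-1", 5, 100),   # Truncation: do not truncate
        ("Bib-1", 6, 1),     # Completeness: incomplete subfield
    ],
}


def resolve(query):
    """Split 'index=term' on the first '=' and return (attribute list, term)."""
    index_name, term = query.split("=", 1)
    # Nothing is derived from '=' itself; every triple comes from the table.
    return list(INDEX_DEFS[index_name]), term


attrs, term = resolve("bath.authorWord=smith")
```

The code never inspects attribute types or values, so it needs no
knowledge of Bib-1 or Bath; all of that lives in the table.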

So how to introduce truncation? To avoid Bib-1 knowledge (because there
are truncation attributes defined in other attribute sets such as GEO),
I proposed CQL have the concept of index names and operator names.
Index names are the bath.authorWord etc names. Operator names are
symbolic names that CQL uses to map concepts CQL implements (such as
'?' meaning truncation, '>' meaning greater-than) onto attribute lists.

So I proposed operator definitions to have attribute lists (just like
index names) such as

    Operator: Right Truncation

    Attribute Set   Attribute Type    Attribute Values    Attribute Names
    -------------   --------------    ----------------    ---------------
    Bib-1           Truncation (5)    1                   right-truncation

So a CQL query such as

    bath.authorWord=smith?

is turned into an attribute list for the word "smith" by first taking
the attribute list for "bath.authorWord", then adding/overlaying the
attribute list for "operator: right truncation". The add/overlay rule
is: if the same attribute-set/type already has a value, replace it
with the new value from the operator. Otherwise append a new triple
to the end of the attribute list. For the above definitions, you end
up with

    Attribute Set   Attribute Type    Attribute Values    Attribute Names
    -------------   --------------    ----------------    ---------------
    Bib-1           Use (1)           1003                author
    Bib-1           Relation (2)      3                   equals
    Bib-1           Position (3)      3                   any position in field
    Bib-1           Structure (4)     2                   word
    Bib-1           Truncation (5)    1                   right-truncation
    Bib-1           Completeness (6)  1                   incomplete subfield
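
The add/overlay rule can be sketched in a few lines. This is only an
illustration under my own assumptions: triples are represented as
(set, type, value) tuples, and overlay() is a hypothetical helper name.

```python
def overlay(index_attrs, operator_attrs):
    """Apply an operator's triples on top of an index's attribute list."""
    result = list(index_attrs)  # copy, so the index definition is untouched
    for op_set, op_type, op_value in operator_attrs:
        for i, (a_set, a_type, _) in enumerate(result):
            if (a_set, a_type) == (op_set, op_type):
                # Same attribute-set/type already present: replace the value.
                result[i] = (op_set, op_type, op_value)
                break
        else:
            # Not present: append a new triple to the end of the list.
            result.append((op_set, op_type, op_value))
    return result


BATH_AUTHOR_WORD = [
    ("Bib-1", 1, 1003),  # Use: author
    ("Bib-1", 2, 3),     # Relation: equals
    ("Bib-1", 3, 3),     # Position: any position in field
    ("Bib-1", 4, 2),     # Structure: word
    ("Bib-1", 5, 100),   # Truncation: do not truncate
    ("Bib-1", 6, 1),     # Completeness: incomplete subfield
]
RIGHT_TRUNCATION = [("Bib-1", 5, 1)]  # Truncation: right-truncation

combined = overlay(BATH_AUTHOR_WORD, RIGHT_TRUNCATION)
```

Because the match is on the (set, type) pair, a truncation attribute
from a different attribute set would be appended rather than replacing
the Bib-1 one.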

As a comparison, *if* we define "dc.title" to be just a USE attribute
(that is, the other types are not defined - we are not doing the Bath
thing of mandating all type/values are specified):

    Attribute Set   Attribute Type    Attribute Values    Attribute Names
    -------------   --------------    ----------------    ---------------
    Bib-1           Use (1)           4                   title

Then the query

    dc.title=smith?

would map to the attribute list

    Attribute Set   Attribute Type    Attribute Values    Attribute Names
    -------------   --------------    ----------------    ---------------
    Bib-1           Use (1)           4                   title
    Bib-1           Truncation (5)    1                   right-truncation

(Note: I am not proposing what dc.title should be, just using it as
an example of how operators either add or replace attribute values to
form the full attribute list - in this example a new type/value is
added because it was not there before).

The same thing is done for other operators such as '>' etc. In my previous
mail, I proposed that '=' NOT be mapped onto an operator. Instead, it
is just syntactic sugar to separate the index name from the term.
In practice this is not a problem. Most systems default to 'equals'
if that attribute value is not specified. So I consider '=' to mean
nothing special at all. However, other symbols '>', '<', '>=' etc
DO have special meaning. They map onto operator names (greater-than etc).
The 'greater-than' operator would be defined as:

    Attribute Set   Attribute Type    Attribute Values    Attribute Names
    -------------   --------------    ----------------    ---------------
    Bib-1           Relation (2)      5                   greater-than

So a query such as

    dc.title>smith

would map onto

    Attribute Set   Attribute Type    Attribute Values    Attribute Names
    -------------   --------------    ----------------    ---------------
    Bib-1           Use (1)           4                   title
    Bib-1           Relation (2)      5                   greater-than

So in my previous mail, you have to look at both the operator definitions
*and* the index definitions for a database (not just the index definitions).


Is this scheme perfect? No. I can come up with weird semantics easily.
But I don't think there is a perfect scheme because Z39.50 itself is
not perfect. But it seems like a sensible sort of compromise. It avoids
Bib-1 knowledge (because all such knowledge is built into the index and
operator definitions - I could replace Bib-1 with GEO in all the above
tables and it would just work). It avoids Bath rules (CQL does not have
to extend index attribute lists to make sure all type/values are included
as required by Bath). And it's extensible to support other operators that
may be introduced (for example GEO region-overlaps etc operators).
(I am not trying to propose a syntax for other operators (such as overlaps)
here - I want to defer that orthogonal discussion till later - but I think
other operators are important to support somehow.)

I hope this clears up any confusion.

Alan
