Subject:

Re: Adlib Base profile

From:

Mike Taylor <[log in to unmask]>

Reply-To:

Z39.50 Next-Generation Initiative

Date:

Tue, 14 Dec 2004 14:46:35 GMT

> Date: Tue, 14 Dec 2004 14:12:40 +0100
> From: Hedzer Westra <[log in to unmask]>
>
> attached is the preliminary version of the Adlib Base Profile.
> [...]
> I'd appreciate it if you could take some time to read it and check
> if I didn't do or write anything stupid...

OK.  Thanks for running this past us.

> Also, I'n not a native speaker, nor is our technical writer, so if
> made some errors in my English you can correct me if you want to..

Your English is superb; so is your writer's.

--

> - The meta-index cql.anywhere searches all indexes defined in the
>   Adlib database at once. It does not search all indexes in all
>   context sets, as the CQL context set suggests. This might be a slow
>   search if there are a lot of indexes.

It is at best inadvisable, and probably just wrong, to _re_define the
meaning of an existing index like this -- especially such a core one.
If cql.anywhere doesn't meet your needs, it would be better to define
your own index that does (or ask to have it added to the CQL set if
you think it's of general interest).

> - The adlib.record meta-index searches the whole record. The
>   operator doesn't matter.  This is a slow search since no index can
>   be used.

Perhaps we should consider adding cql.record for whole-record
searching (where supported).

You can't really say "the operator doesn't matter" as this is
overriding established semantics of CQL and the CQL context set.  It
would be much better to say "the operator must be '=': all others will
be rejected".

(And by the way, it is conventional in CQL to talk of "relations"
rather than "operators".  Unless you have a compelling reason, you
should probably stick to this convention.)
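
To make the "must be '=': all others will be rejected" suggestion
concrete, here is a minimal Python sketch.  The table, the exception
class and the function name are all invented for illustration; nothing
here is taken from Adlib or from any SRU toolkit.

```python
class UnsupportedRelation(ValueError):
    """Raised when a relation is not allowed for the given index."""


# Hypothetical table: indexes that accept only a restricted set of relations.
ALLOWED_RELATIONS = {
    "adlib.record": {"="},
}


def check_relation(index: str, relation: str) -> None:
    """Reject the clause unless the relation is permitted for this index.

    Indexes not listed in ALLOWED_RELATIONS are left unrestricted here.
    """
    allowed = ALLOWED_RELATIONS.get(index)
    if allowed is not None and relation not in allowed:
        raise UnsupportedRelation(
            f"index {index!r} supports only {sorted(allowed)}, not {relation!r}"
        )
```

A server would presumably turn the exception into a diagnostic rather
than silently treating every relation alike.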

> - The Adlib thesaurus operators 'adlib.generic', 'adlib.broader',
>   'adlib.narrower', 'adlib.related', 'adlib.topterm' and
>   'adlib.parents' do thesaurus-enabled searches. These only work
>   correctly on indexes with thesaurus links defined. Otherwise, they
>   fall back on '=' searching.

Are these relations or relation modifiers?  If you don't already have
this nailed down, I would recommend the latter, as they are all
refinements on the general relation of equality.
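
As a rough illustration of the relation-modifier form, this Python
sketch builds a clause using the usual relation/modifier syntax.  The
helper name and the index "subject" are made up for the example;
'adlib.narrower' is one of the names from the draft profile.

```python
def cql_clause(index, relation, term, modifiers=()):
    """Build a single CQL search clause as a string.

    Modifiers are appended to the relation, each preceded by '/'.
    The term is always double-quoted, with backslashes and quotes escaped.
    """
    rel = relation + "".join(f"/{m}" for m in modifiers)
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{index} {rel} "{escaped}"'


# A thesaurus expansion expressed as a refinement of equality:
print(cql_clause("subject", "=", "painting", ["adlib.narrower"]))
```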

There really should be a thesaurus-use context set defined outside of
Adlib, for use in this and other profiles (or the relevant elements
should be added to the existing Zthes context set).  We actually
started this process a month or two back, but got sidetracked -- or
maybe mired in excess complexity.

Depending on the urgency of your Adlib work, you might try to restart
that process and use the resulting "official" thesaurus-expansion
support.  Otherwise, if you need to push on with the Adlib-specific
approach, I hope you will change this in version x.y of your profile,
when the official version comes out, as this will promote
interoperability between Adlib and other SRU implementations.

> - The 'encloses' and 'within' operators are implemented using the
>   Adlib WHEN operator. Some examples:
>   'term encloses "2000 2004"' translates to 'term >= 2000 WHEN term
>   <= 2004'

Nope -- "encloses" is the converse of "within", so
        term encloses "2000 2004"
translates to
        term <= 2000 WHEN term >= 2004

>   'term within "2001 2005"' translates to 'term > 2001 WHEN term <
>   2005'.

My reading of the CQL context set indicates that this relation is
inclusive of endpoints, so you should translate to
        term >= 2001 WHEN term <= 2005
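
The two translations above can be sketched in Python as follows.  The
function name is invented, and the assumption that the term is two
whitespace-separated endpoints comes only from the examples in the
draft; the output strings are the WHEN expressions given above.

```python
def range_to_adlib(index, relation, term):
    """Translate a CQL 'within' or 'encloses' clause to an Adlib WHEN expression.

    Assumes the term holds exactly two whitespace-separated endpoints,
    low first.  Both relations are treated as inclusive of endpoints.
    """
    parts = term.split()
    if len(parts) != 2:
        raise ValueError(f"expected two endpoints, got {term!r}")
    low, high = parts
    if relation == "within":
        # The indexed value lies inside the range given in the term.
        return f"{index} >= {low} WHEN {index} <= {high}"
    if relation == "encloses":
        # The converse of 'within': the comparisons are reversed.
        return f"{index} <= {low} WHEN {index} >= {high}"
    raise ValueError(f"unsupported relation {relation!r}")
```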

> - there are two types of modifiers: data type modifiers and pattern
>   modifiers [...]

This whole section belongs in the CQL context-set document.

>   The pattern modifiers are:
>    cql.masked
>    cql.unmasked (not defined in CQL context set)

We should fix that!

>   Note that the CQL context set is not required by the SRW Base
>   Profile!

That's not really true, as the CQL context set provides some of the
key elements used in pretty much all CQL queries, e.g. the meaning of
all the relations.  Probably the base profile should make this explicit.

>   The modifiers cql.word and cql.string can not relate directly to
>   Adlib term or word matching because this is defined per index by
>   the user; in Adlib each index can be either word or term
>   indexed. If required, a field can be indexed by term as well as by
>   word.

[I don't understand this fully, I think because it assumes you know
something about Adlib.  Is Adlib's "term" searching similar to what we
mean by "string"?]

>   These two indexes can be be reflected using two separate CQL
>   indexes. It is not possible to use modifiers to switch from one to
>   the other.

Why not?  It seems an eminently sensible way of expressing the
difference.

>   Adlib interprets terms in the following manner:
>   + operator 'exact': implied modifiers are cql.unmasked [...]

No, we all agreed that "exact" does _not_ imply unmasked.

> + operator '=': implied modifiers are cql.masked and either cql.word
>     or cql.string, depending on the index type. This cannot be seen
>     in the explain information but must be described in a profile.

This is _not_ what "=" means in CQL.  It means that the term is
word-structured, irrespective of the index being searched, unless
overridden by a relation modifier.

>   + operators 'any' and 'all': implied operators are cql.word and cql.masked.
>     The words are combined using OR (for 'any') or AND (for 'all').

Yup.  This was always the intent and should probably be explicit in
the CQL context-set document.
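
For what it's worth, the 'any'/'all' expansion can be sketched in
Python like this.  The function name is invented, and splitting the
term on whitespace is a simplification of mine; a real implementation
would use whatever word-splitting rule the profile defines.

```python
def expand_any_all(index, relation, term):
    """Expand an 'any' or 'all' clause into '=' clauses joined by OR or AND.

    Words are found by splitting on whitespace (a simplifying assumption).
    """
    joiners = {"any": " OR ", "all": " AND "}
    if relation not in joiners:
        raise ValueError(f"unsupported relation {relation!r}")
    words = term.split()
    return joiners[relation].join(f"{index} = {word}" for word in words)
```

So 'title any "fish frog"' behaves like 'title = fish OR title = frog',
and the 'all' form is the same with AND.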

>   + adlib.record meta-index: implied operators are cql.string and
>     cql.unmasked.

There is nothing in CQL that allows you to infer different
term-structure and masking semantics from an index name.

>   Implied modifier cql.word means:
>    Words are split and then re-combined using the Adlib separators and concatenators rule.
>    Separator characters are: [];,!@()|{}<>? carriagereturn newline space tab
>    Concatenator characters are: `-=\./~#$%^&_+:"'*
>    Please note that the CQL context set says nothing about how words are to be split.

... and that therefore what you specify here is a perfectly good
refinement of what the CQL context set says.
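
A minimal Python sketch of the splitting half of that rule, using the
separator list quoted above.  How Adlib re-combines words around
concatenator characters is not spelled out in this message, so the
sketch simply leaves those characters inside the words; treat that as
my assumption, not as Adlib's behaviour.

```python
import re

# Separator characters as listed in the draft profile.
SEPARATORS = "[];,!@()|{}<>?\r\n \t"

# One or more separators in a row count as a single break.
_SPLIT = re.compile("[" + re.escape(SEPARATORS) + "]+")


def split_words(text):
    """Split text into words at separator characters.

    Concatenator characters such as '-' and '.' are not separators,
    so they stay inside the words they appear in.
    """
    return [word for word in _SPLIT.split(text) if word]
```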

>   This implied behaviour will remain intact in future versions, even
>   if modifiers will be supported then.

Aha!  Finally, I spot a tiny, tiny error in the English :-)  That
sentence should say "... even if modifiers ARE supported then".

>   - operator 'exact' does not imply cql.string, since cql.string or
>     cql.word is index dependent on Adlib.

The correct way to handle this is to have a single CQL index be mapped
to either one of two different underlying Adlib indexes, dependent on
whether string or word structure is used.
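
Roughly, in Python, and with every name below invented for the example
(the CQL index, the two Adlib index names and the function itself):

```python
# Hypothetical mapping: one CQL index, two underlying Adlib indexes.
ADLIB_INDEXES = {
    "dc.title": {"word": "title_words", "string": "title_terms"},
}


def adlib_index_for(cql_index, modifiers=(), default="word"):
    """Pick the underlying Adlib index for a CQL index.

    The term structure starts at 'default' (chosen by the caller from
    the relation) and is overridden by an explicit cql.word or
    cql.string relation modifier, the last one given winning.
    """
    structure = default
    for modifier in modifiers:
        if modifier == "cql.string":
            structure = "string"
        elif modifier == "cql.word":
            structure = "word"
    return ADLIB_INDEXES[cql_index][structure]
```

The client then sees one index and states the structure it wants; the
choice between the two Adlib indexes stays inside the server.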

I think that's everything.  Despite my having complained about so many
things, I think this is really nice work, and the document is very
clear about what it's saying.

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  <[log in to unmask]>  http://www.miketaylor.org.uk
)_v__/\  "In the Sixties people took acid to make the world weird.
         Now the world is weird, people take Prozac to make it normal"
         -- Damon Albarn.

--
Listen to free demos of soundtrack music for film, TV and radio
        http://www.pipedreaming.org.uk/soundtrack/