Good afternoon to Zing Implementors!

I think this discussion is really about "searching use cases".
You seem to propose that the user may wish to say
WHY (s)he searches rather than HOW to search.

Has anyone drawn up a list of the use cases for a search engine?

I can see (at least):
* search on an identifier (a very small answer set is expected).
   The identifier can be an ISBN, a large part of the title, etc.
* search through a linked object, this object being identified
   by an identifier like its name or its acronym (author, publisher, etc.)
* search on a "subject" (in its broadest sense). I suppose this is what
   you are proposing as a cql keyword search.
* search on a combination of "clues"

I take this opportunity to emphasize that word searches are sensitive to
word boundaries. This is particularly bad with proper names and with
languages like Dutch and German, where words may be glued together.
A "name search", distinct from the "word search", is often needed.
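
To make the distinction concrete, here is a minimal Python sketch. The
function names and the sample strings are my own illustration, not part of
any real system: the "word search" only matches whole words, while the
"name search" ignores case, spaces and punctuation and matches anywhere.

```python
import re

def word_search(term, text):
    """Match only when `term` equals a whole word of `text` (case-insensitive)."""
    words = re.findall(r"\w+", text.lower())
    return term.lower() in words

def name_search(term, text):
    """Match when `term` occurs anywhere, ignoring case, spaces and punctuation."""
    def squash(s):
        # Hypothetical normalisation: drop every non-word character.
        return re.sub(r"\W+", "", s.lower())
    return squash(term) in squash(text)

# Dutch compound: "ziekenhuis" (hospital) glued to "apotheek" (pharmacy).
title = "Ziekenhuisapotheek Brussel"
print(word_search("apotheek", title))   # False: no word boundary inside the compound
print(name_search("apotheek", title))   # True

# Proper name written with or without spaces.
print(word_search("vandermeer", "Van der Meer, J."))    # False
print(name_search("vandermeer", "Van der Meer, J."))    # True
```

A real name search would probably also need accent folding and a proper
index; this only shows why the two kinds of search give different answers.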

I have to organize searches for a dozen medical doctors at the Brussels
Poison Centre. I introduced the distinction between "name search" and
"word search", but it is nearly impossible to teach! In the end, we will
probably build a "Simple search" that tries different kinds of searches
to obtain a result set within an acceptable size range
(in the 80's Stalton proposed different tricks for this),
together with an "Advanced Search" where the different indexes and
methodologies (phonetic search, for instance) are exposed to the user.
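
As a rough sketch of what I mean by such a "Simple search" (Python, with
invented names; the strategies, the thresholds and the sample records are
assumptions for illustration only): try each kind of search in turn and
keep the first one whose result size falls within the acceptable range.

```python
def simple_search(term, strategies, min_hits=1, max_hits=50):
    """Try each (name, search_fn) pair in order.

    Return (name, hits) for the first strategy whose result size lies in
    [min_hits, max_hits]. If none qualifies, fall back to the smallest
    non-empty result, or (None, []) when every strategy found nothing.
    """
    fallback = None
    for name, search in strategies:
        hits = search(term)
        if min_hits <= len(hits) <= max_hits:
            return name, hits
        if hits and (fallback is None or len(hits) < len(fallback[1])):
            fallback = (name, hits)
    return fallback if fallback else (None, [])

# Hypothetical data and strategies, ordered from most to least specific.
records = ["Paracetamol 500 mg", "Paracetamol codeine", "Ibuprofen 200 mg"]

def exact(term):
    return [r for r in records if r.lower() == term.lower()]

def substring(term):
    return [r for r in records if term.lower() in r.lower()]

strategies = [("exact", exact), ("substring", substring)]
print(simple_search("paracetamol", strategies))
```

Here the exact search finds nothing, so the substring search is used
instead. The "Advanced Search" form would let the user pick the strategy
directly rather than leave the choice to this kind of cascade.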

All this to say that it may be more than a "general word index search"
that people are asking for. They want a "Do what I mean" simple search,
complemented by a more structured search form for advanced searches.
It is not a question of "index", it is a question of purpose: many indexes
and other automatic decisions may be involved in a
"What are you looking for?" form.

Wishing you a very nice day,

Christophe Dupriez

Mike Taylor wrote:
> I pretty much agree with all of this.
>
> Robert Sanderson writes:
>  > Following some discussion with Mike Rylander, Ross Singer, Jenn Riley
>  > and Ryan Scherle off list, we came up with some thoughts following on
>  > from the discussion below.
>  > 
>  > There should be an index (called for example cql.keywords) that is a
>  > subset of the terms from the record, as determined by the server, with
>  > the intent of providing an access point that acts like a general keyword
>  > sort of search.  It might include points such as subject, title, author,
>  > description, date, but not metadata level access points such as
>  > the record's lastModificationDate or information about the current
>  > physical location of the item and so forth.
>  > 
>  > Although this is possible via cql.serverChoice (as the server can choose
>  > this sort of index) it's not guaranteed and the main tenet of the query
>  > language is to foster specificity whenever possible.
>  > 
>  > If a server gets the query:
>  > 
>  >   cql.serverChoice = 2006
>  > 
>  > it can appropriately choose to search only dc.date, when the user
>  > wants a very general search for 2006, but not so general as to extend to
>  > things like the date the record was entered into the database (the
>  > nature of 'generalness' being determined by the server) which would be
>  > cql.anywhere.
>  > 
>  > For cql.serverChoice, the server could also choose a random index (say
>  > an ISBN index) and get 0 hits for one query and then choose date and get
>  > 1000 hits next time.
>  > 
>  > However if the server gets the query:
>  > 
>  >   cql.keywords = 2006
>  > 
>  > It must choose the same index every time, it can't change based on the
>  > term, whereas with cql.serverChoice this is quite legitimate.
>  > 
>  > cql.serverChoice is potentially unable to be scanned, as it can be
>  > dynamically determined based on the term.  cql.keywords is able to be
>  > scanned, as although it consists of a server determined set of terms, it
>  > is a single, relatively persistent set.  cql.serverChoice will very
>  > commonly be pointed at cql.keywords (as per Ralph's BasicIndex).
>  > It's also not a default index, as you might default to a title search,
>  > rather than a general search.
>  > 
>  > Rob
>  > 
>  > (discussion from: irc://irc.freenode.org:6667/#code4lib)
>  > 
>  > 
>  > On Wed, 12 Jul 2006, Rob Sanderson wrote:
>  > 
>  > >On Tue, 2006-07-11 at 16:30 -0400, Ray Denenberg, Library of Congress
>  > >wrote:
>  > >> > 2) Add a bib.keywords index (or cql.keywords?), and note how it differs
>  > >> > from cql.anywhere. This is reasonable,  but may be confusing.
>  > >> It may be that the simplest approach is to simply define bib.keyword and let
>  > >> the server decide what it maps to (and explain it in explain).   I don't
>  > >> think this would be confusing.
>  > >
>  > >
>  > >I think this is related to the serverChoice distinctions previously
>  > >discussed.  In this case, the excludeOriginInfo is a data format
>  > >specific way to improve relevance of very general searches.
>  > >In other words: we don't have a particular index in mind, just search
>  > >what you think is right, but not OriginInfo.
>  > >
>  > >Which is one of the definitions for cql.serverChoice.
>  > >
>  > >
>  > >  http://listserv.loc.gov/cgi-bin/wa?A2=ind0501&L=ZNG&D=0&I=-3&P=4512
>  > >
>  > >
>  > >Follow on a couple of messages and Ralph says:
>  > >
>  > >"I produce an index for my databases named BasicIndex.  It is the index
>  > >that is searched when the user doesn't specify an index.  It is the
>  > >union, more or less, of a number of subject rich indexes, but is by no
>  > >means all the indexes.  It is the index that you get when you ask for
>  > >cql.serverchoice."
>  > >
>  > >If you have half an hour, read through the rest of the thread as well
>  > >which is enlightening, IMO.
>  > >
>  > >Therefore, I don't think that we need either bib.keywords or
>  > >bib.excludeOriginInfo.  bib.keywords has the same semantics, as far as I
>  > >understand exactly what is wanted, as cql.serverChoice (or
>  > >cql.anyIndexes below)
>  > >
>  > >Compare:
>  > >
>  > >cql.allFields    -- Search all fields in the data
>  > >cql.allIndexedFields - Search all indexed fields, exposed via CQL or not
>  > >cql.allIndexes   -- search in all indexes exposed via CQL (n)
>  > >cql.anyIndexes   -- search in any indexes you think appropriate (1..n)
>  > >cql.anyIndex     -- search in any single index you think appropriate (1)
>  > >cql.defaultIndex -- search in the index declared as default in Explain
>  > >
>  > >
>  > >Rob
>  > >
>  > >--
>  > >Dr Robert Sanderson
>  > >Dept of Computer Science, University of Liverpool
>  > >Home:     http://www.csc.liv.ac.uk/~azaroth/
>  > >Cheshire: http://www.cheshire3.org/
>  > >