On Thu, 22 Aug 2002, Robert Sanderson wrote:
> > >The problem I'm having is that I no longer remember how we thought we
> > >could distinguish between "first-characters-in-field" and
> > >"first-words-in-field" searches in Bath. In other words, what would
> > >Were we specifying the "firstWords" search with proximity?
>
> > My meeting notes say:
> > Left anchored is assumed.
> > Exact match is assumed.
> > Unanchored begins with a question mark
>
> > My notes do say that our prose would relate different SRW/U queries to Bath
> > searches. We did not want specific Bath elements in the query syntax. I
> > agree with Mike that we should not load the index names.
>
> I agree with a somewhat rambling caveat:
>
> Index Names are supposed to represent the attribute combinations in
> Z39.50. So we can say 'titleWord' not (1=4, 3=3, 4=6)
Actually, 'titleWord' would represent (1=4, 3=3, 4=2, and 6=1).
But you're correct, we're loading these names with some attribute
values.
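
To make the "loading" concrete, here is a minimal sketch of an index name standing in for an attribute combination. The table and function names are hypothetical, and only 'titleWord' is filled in, using the values given above; nothing here is agreed SRW/U syntax.

```python
# Hypothetical lookup table: index name -> Z39.50 attribute type/value pairs.
# Only 'titleWord' is listed, with the values stated in this message.
INDEX_ATTRIBUTES = {
    "titleWord": {1: 4, 3: 3, 4: 2, 6: 1},
}


def attributes_for(index_name):
    """Return the attribute combination behind an index name as a string."""
    attrs = INDEX_ATTRIBUTES[index_name]  # raises KeyError for unknown names
    pairs = ", ".join(f"{atype}={value}" for atype, value in sorted(attrs.items()))
    return f"({pairs})"


print(attributes_for("titleWord"))
```

A client would then only ever write the name, and the server would own the mapping.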
> So we're already loading the difference between structure 1, completeness
> 1 and structure 3 completeness 6 into 'title' vs 'titleWord'
>
> I think this is an acceptable level of semantics in index names, so long
> as there is a recommendation to use one particular naming scheme (foo,
> fooWord). Obviously this can be ignored, but so can attribute
> combinations; it just means that searches may produce
> unexpected results due to lousy configuration.
Or, searches other than "foo", "fooWord" would generate diagnostics.
>
> titleFirstWordsIncludingLeadingArticles is a search, not an index.
> 'First' or any other description of the location of the term should be
> part of the query language, not the name of the index. And as we can
> express it with either proximity or truncation, there's no need for it to
> be in the index.
There's no need for it to be in the index _name_, but don't you think
it's much clearer if we state that left anchored is assumed for the
"foo" indexes?
>
> This brings me to my next questions:
>
> 1.
> Why even fooWord? We can express it with an unanchored search with
> spaces.
> eg: foo="? term *"
>
> I don't think that this is sensible, but it's the logical conclusion and
> there needs to be an answer to give to it.
>
> The index name describes what is in the index. fooWord contains the
> individual words from 'foo', which implies that the server best knows how
> to extract a 'word' from its data with respect to which normalisation and
> extraction routines to use. The search shouldn't have to know which
> normalisation/extraction routines are used, so there needs to be a 'word'
> index. For first words, it does know the extraction routine -- take the
> first words in the field. Right?
I think there will be server differences even with the left-anchored
searches. Some may not include initial articles; some will.
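
A rough sketch of how that difference could show up, assuming a server that lowercases, collapses whitespace, and optionally drops one leading article from the indexed field. The article list and function names are illustrative assumptions, not anything from Bath or the SRW/U drafts.

```python
# Assumed, English-only list of leading articles; real servers will differ.
LEADING_ARTICLES = ("the ", "a ", "an ")


def normalise(text, strip_articles):
    """Lowercase, collapse whitespace, and optionally drop one leading article."""
    text = " ".join(text.lower().split())
    if strip_articles:
        for article in LEADING_ARTICLES:
            if text.startswith(article):
                return text[len(article):]
    return text


def left_anchored_match(term, field_value, strip_articles):
    """True if the normalised field begins with the normalised search term."""
    field = normalise(field_value, strip_articles)
    return field.startswith(normalise(term, False))


# Same query, same record, different answer depending on server policy.
print(left_anchored_match("hobbit", "The Hobbit", strip_articles=True))
print(left_anchored_match("hobbit", "The Hobbit", strip_articles=False))
```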
>
> 2.
> If someone searches with:
> titleWord="search term"
>
> The agreement was to fail it, if my memory serves? As there's no single
> word which matches 'search term'?
> The search should have been titleWord="search" and titleWord="term" ?
Agree.
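
For illustration, a minimal sketch of that check, assuming words are separated by whitespace. The function name and return shape are hypothetical, and how a real server reports the failure (which diagnostic, if any) is left open.

```python
def check_word_term(index_name, term):
    """Check a term against a word index.

    Returns (True, None) for a single word, otherwise (False, suggestion),
    where suggestion is the equivalent one-word-per-clause query, or None
    if the term contains no words at all.
    """
    words = term.split()
    if len(words) == 1:
        return True, None
    suggestion = " and ".join(f'{index_name}="{word}"' for word in words)
    return False, suggestion or None


print(check_word_term("titleWord", "search term"))
```

Under these assumptions, titleWord="search term" is rejected and the two-clause form is offered instead.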
>
> Rob
>
> --
> ,'/:. Rob Sanderson ([log in to unmask])
> ,'-/::::. http://www.o-r-g.org/~azaroth/
> ,'--/::(@)::. Special Collections and Archives, extension 3142
> ,'---/::::::::::. Twin Cathedrals: telnet: liverpool.o-r-g.org 7777
> ____/:::::::::::::. WWW: http://liverpool.o-r-g.org:8000/
> I L L U M I N A T I
Larry
------------------------------------------------------------
Larry E. Dixson Internet: [log in to unmask]
Network Development and MARC
Standards Office, LM639
Library of Congress Telephone: (202) 707-5807
Washington, D.C. 20540-4402 Fax: (202) 707-0115