First Word in Field in a word-based index means that we need anchoring
characters. It also means we do not need word masking, as the term is
already masked by the semantics of the index. This would need to be
dc.titleAdjacentWords in the latest revision of the indexes.
I suggest ^ and $ as recognisable anchoring characters.
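To make the anchoring idea concrete, here is a minimal sketch in Python of
how ^ and $ might behave against an adjacent-words index. The names
parse_anchors and matches are hypothetical, and the whitespace tokenisation
and case folding are assumptions for illustration only, not anything
defined by CQL or the dc index document.

```python
def parse_anchors(term):
    """Split a term into (words, left_anchored, right_anchored).

    Assumption: a leading ^ anchors the term to the start of the field
    and a trailing $ anchors it to the end of the field.
    """
    left = term.startswith("^")
    right = term.endswith("$")
    core = term[1:] if left else term
    core = core[:-1] if right else core
    return core.split(), left, right


def matches(field, term):
    """Return True if the term's words occur adjacently in the field,
    honouring any anchors. Words are compared case-insensitively."""
    words, left, right = parse_anchors(term.lower())
    field_words = field.lower().split()
    n = len(words)
    if n == 0 or n > len(field_words):
        return False
    for start in range(len(field_words) - n + 1):
        if field_words[start:start + n] != words:
            continue
        if left and start != 0:
            continue  # ^ requires the match to begin the field
        if right and start + n != len(field_words):
            continue  # $ requires the match to end the field
        return True
    return False
```

Under these assumptions, against a title of "The Cat in the Hat" the term
"^the cat" matches, "^cat" does not, "the hat$" matches, and "^the hat$"
does not. No separate word masking character is involved, since the index
is already word based.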
At first I agreed with Rob K about the confusion between semantics and
user interface, but we already do that (e.g. title vs titleWord). If it
doesn't make sense in Chinese, then there's no need to list that index in
your set of defined indexes -- there's no need to support every dc index.
It allows servers and clients to choose the appropriate operator for
multi-word terms, as opposed to hoping for the best.
Rob
On Tue, 24 Sep 2002, Ray Denenberg wrote:
> I'm going to re-write the dc index document based
> on what I think best represents the collective
> thinking. So I need to try to focus on specific
> questions. Right now I have the following:
>
> 1. Do we want "anchor" characters in the cql
> syntax, or is "anchored" (left and right) the
> default?
>
> 2. If anchored is the default, do we want mask on
> word boundaries? (expansion/interpretation 1 and
> 3)
>
> 3. If so, do we need the word masking character
> "|" that's been proposed or can I withdraw that
> proposal?
>
> 4. AdjacentWords, AllTheseWords, AnyOfTheseWords
> are format/structure values. Shouldn't they be
> expansion/interpretation?
>
> --Ray
>
--
,'/:. Rob Sanderson ([log in to unmask])
,'-/::::. http://www.o-r-g.org/~azaroth/
,'--/::(@)::. Special Collections and Archives, extension 3142
,'---/::::::::::. Twin Cathedrals: telnet: liverpool.o-r-g.org 7777
____/:::::::::::::. WWW: http://liverpool.o-r-g.org:8000/
I L L U M I N A T I