----- Original Message ----- 
From: "Edward C. Zimmermann" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Tuesday, November 29, 2005 10:24 AM
Subject: Re: Proximity search


> Quoting Mike Taylor <[log in to unmask]>:
>
>>
>> >
>> > I don't semantically see "same element" (in the same leaf container)
>> > as proximity.
>>
>> Hmm.  I am trying to think of a polite way to say "then you are
>> mistaken", but I can't find one.  :-)
>
> We are not here to be polite but to create good systems..
>
>> > Proximity is distance.. Within X characters.. Within X
>> > words.. within some metric. Same element is NOT a metric.
>>
>> If you really want to push this point, you'll have to overturn an
>> ANSI/NISO standard going back a full decade and ratified by ISO.  See
>> http://www.loc.gov/z3950/agency/markup/09.html#3.7.2
>>
>
> I am, as you know, quite familiar with it. It was in many ways wrong but
> reflected models of thought widespread decades ago, when we had a hard
> enough time doing proximity with characters and words and most of us had
> little to no support for paragraph, section, chapter. It might have seemed
> to make sense to go from "chapter" to an abstract "element" but it does
> not. Byte streams, characters, words, lines etc. have a concept of unit,
> distance and order.
>
> For something to be a proximity there must be (as is specified here too)
> a scalar (distance) and a relation.
>
> In a structured document such as
>
> <person>
>  <name> Edward Zimmermann  </name>
>  <address>   </address>
>  <network>
>    <email> [log in to unmask]</email>
>  </network>
> </person>
> <company>
>  <primary>
>    <name> Nonmonotonic labs </name>
>   .
>

Actually, I did the first draft of the proximity spec for Z39.50, and I based
it on our working production systems, in which we supported proximity
searching over both textual data and MARC data.  We had "same element" as
distance=0, unit=element (although I think we used the term "field").  We
also supported sections, chapters, footnotes, and captions in our e-book
project, and all of the proximity operations over them.  So, the spec was
based on real-world experience.  I don't think we were the only ones doing
this, based on my recollection of the discussions we had on Z39.50 (anyone
remember AT&T (Lucent) Bob's electronic books?)

-markh
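
For readers following the thread, here is a minimal sketch of how "same
element" can be expressed as distance=0, unit=element alongside an ordinary
word-distance test. It is only an illustration and not the production system
described above: the Hit record, the proximate function, the short relation
names and the example positions are all hypothetical, and it assumes each
term hit has already been assigned an ordinal position in every unit.

```python
import operator
from dataclasses import dataclass

# Hypothetical short names for the comparison applied to the measured gap.
RELATIONS = {
    "lt": operator.lt, "le": operator.le, "eq": operator.eq,
    "ge": operator.ge, "gt": operator.gt, "ne": operator.ne,
}


@dataclass(frozen=True)
class Hit:
    """One term occurrence, positioned in each supported unit (assumed layout)."""
    word: int      # ordinal of the word within the record
    element: int   # ordinal of the enclosing element (or "field")


def proximate(a, b, unit, distance, relation="le", ordered=False):
    """Return True if hits a and b satisfy the proximity test in the given unit."""
    gap = getattr(b, unit) - getattr(a, unit)
    if ordered and gap < 0:
        return False  # b must not precede a when order matters
    return RELATIONS[relation](abs(gap), distance)


# Two hits inside the same element but 38 words apart (made-up positions).
name_hit = Hit(word=2, element=1)
email_hit = Hit(word=40, element=1)

# "Same element": distance=0, unit=element.
print(proximate(name_hit, email_hit, unit="element", distance=0, relation="eq"))  # True
# Within 5 words: fails for the same pair.
print(proximate(name_hit, email_hit, unit="word", distance=5))  # False
```

In this sketch the unit only selects which ordinal is compared; the distance
and relation are applied the same way for words and for elements.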