## [email protected]


Subject:

More ZINGing on the simple model for element of retrieval

From:

SRU (Search and Retrieve Via URL) Implementors

Date:

Fri, 8 Dec 2006 11:46:37 +0100

Content-Type:

text/plain

Partitioning: Let me see if I can make the point a bit easier.

In searching for a query there are the records matching the specification, and there are the coordinates of each match within those records in the document space. Each of these "hit" coordinates is somewhere within the document tree. With a named path one can identify an ancestor of a hit in the tree (using the Shakespeare XML again):

    <PLAY>
      ...
      <ACT>
        ...
        <SCENE>
          ...
          <SPEECH>
            <SPEAKER>LADY MACBETH</SPEAKER>
            <LINE>Out, damned spot! out, I say!--One: two: why,</LINE>
            <LINE>then, 'tis time to do't.--Hell is murky!--Fie, my</LINE>
            <LINE>lord, fie! a soldier, and afeard? What need we</LINE>
            <LINE>fear who knows it, when none can call our power to</LINE>
            <LINE>account?--Yet who would have thought the old man</LINE>
            <LINE>to have had so much blood in him.</LINE>
          </SPEECH>
          ...

The hit "spot" is in a LINE (PLAY\ACT\SCENE\SPEECH\LINE). The line is "Out, damned spot! out, I say!--One: two: why,". It's in a speech: the speech is the above fragment. It's in a scene: its content is the set of all speeches in that scene, including this one where "out" and "spot" were said in the same line. And so on up the tree.

At any point going up the tree from one of my hit coordinates I may also ask for the content of any of the descendants of a named ancestor, to the extent that it has any children. LINE, for instance, is a final leaf and has none. The ancestor SPEECH of my hit "spot" (for "out" and "spot" in the same LINE) is the above speech. Its LINE descendants are the LINEs above: not just the one line but all 6 of them. Its SPEAKER descendant is the content of SPEAKER, and that's LADY MACBETH.

We may also map these coordinates to the storage offsets of the document as a serial object on disk, viz. the "document order". This mapping lets us sort the list of coordinates in document order, which gives us next and previous among the siblings.
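To make the ancestor/descendant idea concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is only an illustration under stated assumptions, not how any particular engine does it: the fragment is abridged to two LINEs with an extra speech added for the previous/next step, and the helper names (`ancestor`, `retrieve`) are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged fragment in the style of the Shakespeare XML.
DOC = """<PLAY><ACT><SCENE>
<SPEECH>
<SPEAKER>Doctor</SPEAKER>
<LINE>Hark! she speaks: I will set down what comes</LINE>
</SPEECH>
<SPEECH>
<SPEAKER>LADY MACBETH</SPEAKER>
<LINE>Out, damned spot! out, I say!--One: two: why,</LINE>
<LINE>then, 'tis time to do't.--Hell is murky!--Fie, my</LINE>
</SPEECH>
</SCENE></ACT></PLAY>"""

root = ET.fromstring(DOC)

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent = {child: node for node in root.iter() for child in node}


def ancestor(node, tag):
    """Nearest ancestor (or the node itself) named `tag`, else None."""
    while node is not None and node.tag != tag:
        node = parent.get(node)
    return node


def retrieve(hit, ancestor_tag, descendant_tag):
    """Text of every `descendant_tag` element under the hit's named ancestor."""
    anc = ancestor(hit, ancestor_tag)
    if anc is None:
        return []
    return [d.text for d in anc.iter(descendant_tag)]


# Hits: LINE elements in which both "out" and "spot" occur.
hits = [line for line in root.iter("LINE")
        if "out" in line.text.lower() and "spot" in line.text.lower()]
hit = hits[0]

speaker = retrieve(hit, "SPEECH", "SPEAKER")  # SPEAKER of the hit's SPEECH
lines = retrieve(hit, "SPEECH", "LINE")       # all LINEs of that SPEECH

# iter() walks the tree in document order, which gives previous/next.
speeches = list(root.iter("SPEECH"))
i = speeches.index(ancestor(hit, "SPEECH"))
previous_speech = speeches[i - 1] if i > 0 else None
```

Here `retrieve(hit, "SPEECH", "LINE")` returns every LINE of the enclosing speech, not only the one that matched, which is the behaviour described above.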
By selecting a path for an ancestor of a hit (node), a path for a descendant of that ancestor, and an order of hits (previous/next), we have, I suggest, an easy-to-express and sufficient model of search segmentation for the concept of unit of retrieval for a "result". Since we no longer have the restriction of OIDs for element retrieval, we can encode this as a string within the named element retrieval.

In RSS search, if I'm interested in items (which I am) and not feeds as my element of retrieval, then I want to get the ancestor ITEM (as ..\channel\item), and to follow the link to the story I want that ancestor's LINK and TITLE descendant elements. For a brief summary I may also want the DESCRIPTION element of the particular ITEM, etc. If my search query matched, say, some text in the TITLE element of a CHANNEL and not in any item, then I don't have an ITEM for that hit. But I do have, should the user request it, a channel.

This makes sense, since I may ask questions like "What news stories does the BBC currently have in its top feed?" in contrast to "What news stories are currently talking about the BBC?". The questions are different (which is why we have contextual search), but the intended element of retrieval is different too.

It's so simple to express and yet delivers, I think, exactly what we want, or rather need.

Counter-examples anyone?

--
Edward C. Zimmermann, Basis Systeme netzwerk, Munich
Office Leo (R&D):
   Leopoldstrasse 53-55, D-80802 Munich,
   Federal Republic of Germany
Telephone: Voice:= +49 (89) 385-47074 Corp.Fax:= +49 (89) 692-8150
Nomadic (SMS/MMS/Fax):= +49 (176) 100-360-55 Alt.Mobile:= +49 (179) 205-0539
http://www.nonmonotonic.net
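The RSS case from this message can be sketched in the same way. This is a rough Python illustration using xml.etree.ElementTree; the feed content, the example.org URLs and the helper names (`ancestor`, `result`) are all invented for the example, and the substring test stands in for a real query.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; titles and URLs are made up for illustration.
FEED = """<rss version="2.0"><channel>
<title>BBC News</title>
<link>http://example.org/feed</link>
<item>
<title>Election results announced</title>
<link>http://example.org/story/1</link>
<description>Counting finished overnight.</description>
</item>
<item>
<title>Critics discuss BBC funding</title>
<link>http://example.org/story/2</link>
<description>A debate about the licence fee.</description>
</item>
</channel></rss>"""

root = ET.fromstring(FEED)
parent = {child: node for node in root.iter() for child in node}


def ancestor(node, tag):
    """Nearest ancestor (or the node itself) named `tag`, else None."""
    while node is not None and node.tag != tag:
        node = parent.get(node)
    return node


def result(hit, ancestor_tag, descendant_tags):
    """Requested child elements of the hit's named ancestor, or None
    when the hit has no such ancestor."""
    anc = ancestor(hit, ancestor_tag)
    if anc is None:
        return None
    return {tag: anc.findtext(tag) for tag in descendant_tags}


# Hits: any TITLE mentioning "BBC", whether in a channel or in an item.
hits = [t for t in root.iter("title") if "BBC" in t.text]

# Element of retrieval = item: a hit in the channel title yields nothing.
as_items = [result(h, "item", ["title", "link"]) for h in hits]

# Element of retrieval = channel: every hit has a channel ancestor.
as_channels = [result(h, "channel", ["title"]) for h in hits]
```

The first hit sits in the channel's own TITLE, so it produces no ITEM result but does produce a channel, which is the distinction between the two BBC questions.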