Ed, I take Peter's point, and probably agree, that "container" is a more semantically appropriate term than "element" for its proposed use. But I don't recall anyone raising that point in the discussion. The reason was entirely different: "element" is too suggestive of XML, and "container" is more neutral. And though I also take Mike's point that "element" can mean whatever you want it to, when someone says "name and date in the same element" that suggests XML to too many people. Your shoe example is more complicated than anything we are trying to represent in CQL. There won't be containers within containers.
--Ray

-----Original Message-----
From: SRU (Search and Retrieve Via URL) Implementors [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann
Sent: Thursday, July 08, 2010 4:15 AM
To: [log in to unmask]
Subject: Re: CQL example for prox/unit=

On Tue, 6 Jul 2010 16:52:26 +0000, Peter Noerr wrote
> I have only one question:
> 
> Why on earth would one prefer to define an "element" as a "thing which 
> contains other things", and "container" as a "thing which

I would suggest that asking about information "in the same container" makes no sense if "container" means element..

Elements can be nested.. What's
  <parent><child><grandchild> .. </grandchild></child></parent> ?

Is the value of grandchild also contained in child and parent? 
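
One way to make the question concrete is a minimal Python sketch using the standard xml.etree.ElementTree module. It assumes a "subtree" reading of containment, where an element contains everything found anywhere beneath it; the function name containing_paths and the sample text "value" are made up for illustration and are not taken from any CQL draft.

```python
import xml.etree.ElementTree as ET

# The nested structure from above, with a sample value in the innermost element.
doc = ET.fromstring("<parent><child><grandchild>value</grandchild></child></parent>")

def containing_paths(root, term):
    """Return the path of every element whose subtree text contains term."""
    hits = []

    def walk(elem, path):
        here = path + [elem.tag]
        # itertext() yields the text of this element and all its descendants.
        if term in "".join(elem.itertext()):
            hits.append("/".join(here))
        for sub in elem:
            walk(sub, here)

    walk(root, [])
    return hits

paths = containing_paths(doc, "value")
print(paths)  # ['parent', 'parent/child', 'parent/child/grandchild']
```

Under that reading the value is "in" all three elements at once, which is why "the same container" needs a rule for which level is meant.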


And to the other extreme we have empty elements.. <empty></empty> (or <empty /> )

In SGML/XML an element can have many or no children.

We can talk about an atomic element as an element that has no children but may or may not have content, but we, of course, are only interested in elements with content..

In SGML/XML we have "content" in the value of the element and its attribute
values:
  <element attribute="attribute value">Element value</element>

Now.. We should be interested in information structures even more abstract than SGML/XML (e.g. overlaps etc.) and so things can get even more twisted.. 

How would I ask the question that gives me the answer "Edward Zimmermann"
below:

 <person><name>Edward Zimmermann</name><address><street>Leopoldstrasse
</street><city>Munich</city></address> </person>

In the same name instance? But what if I don't know that the "container" is called "name"?

In the above, are Zimmermann and Leopoldstrasse not contained in the same person element instance?

How should I distinguish these two? Zimmermann is contained in name and Leopoldstrasse is contained in street, street is a child of address, and address and name are siblings, both children of person..
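
The distinction can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions: the helper names leaf_paths and common_prefix are hypothetical, terms are matched as plain substrings, and comparing tag paths only identifies a shared ancestor when tag names are not repeated within the record, as in this single-person example.

```python
import xml.etree.ElementTree as ET

record = ET.fromstring(
    "<person><name>Edward Zimmermann</name>"
    "<address><street>Leopoldstrasse</street><city>Munich</city></address>"
    "</person>"
)

def leaf_paths(root, term):
    """Paths (as tuples of tags) of childless elements whose own text contains term."""
    found = []

    def walk(elem, path):
        here = path + (elem.tag,)
        if len(elem) == 0 and term in (elem.text or ""):
            found.append(here)
        for sub in elem:
            walk(sub, here)

    walk(root, ())
    return found

def common_prefix(a, b):
    """Longest shared leading part of two tag paths."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return tuple(shared)

zimmermann = leaf_paths(record, "Zimmermann")[0]   # ('person', 'name')
street = leaf_paths(record, "Leopoldstrasse")[0]   # ('person', 'address', 'street')
munich = leaf_paths(record, "Munich")[0]           # ('person', 'address', 'city')

print(common_prefix(zimmermann, street))  # ('person',)
print(common_prefix(street, munich))      # ('person', 'address')
```

Zimmermann and Leopoldstrasse meet only at person, while Leopoldstrasse and Munich already meet at address, and none of the three terms share a leaf.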

These are the kinds of questions we ask all the time.. right? Give me the Zimmermanns on Leopoldstrasse in Munich. Give me cookbooks written by Schiller.. Tell me which books were edited by Johann Wolfgang Goethe?

Sure, if we know the questions ahead of time we can structure (or transform) a database to make such queries easy (and flat).. But then what about the next questions, which don't fit that structure?

And when we start to talk about heterogeneous data.. that is, when the user is not even 100% sure of the structure, they can't ask for "street" or "person" but they'd still like to ask the question.. By talking about anonymous containers.. etc. it's possible..

> contains no other things" when this is *exactly* the reverse of their 
> normal English usage? Is there a perverse desire here to make things 
> (pun intended) as confusing as possible?

Not really.. Think about a shoe shop.. They sell pairs of shoes stored in boxes.. and that box of shoes is stored on a shelf and that shelf has an address. Is the container the room where these shelves are? The shelf? The level in the shelf? Or the shoe box?

In this model.. The elements are the shop, the room, the shelves, the specific shelf, the box.. we have a path of elements to define an address to find the container which contains a specific pair of shoes..  When we talk here of container we mean the box (which contains a pair of shoes).. even if the box is contained in the shelf, in the room, in the shop, in the house, ... 

> 
> Peter Noerr
> 
> > -----Original Message-----
> > From: SRU (Search and Retrieve Via URL) Implementors
> > [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann
> > Sent: Saturday, July 03, 2010 4:07 PM
> > To: [log in to unmask]
> > Subject: Re: CQL example for prox/unit=
> > 
> > Ray, I'm actually quite pleased. I feel almost as if someone has 
> > been listening! There are, however, just a few points still that 
> > need to be voiced..
> > 
> > 
> > On Sat, 3 Jul 2010 15:40:52 +0100, Mike Taylor wrote
> > > Surely "container" here means EXACTLY the same thing as "element", 
> > > i.e. whatever you want it to mean?
> > >
> > 
> > In SGML containers are called elements. When we want to distinguish 
> > between an element and the containers it contains we can talk about 
> > container thus they are, in semantic use, not always the same. See 
> > below.. (my argument is that they have things backwards).
> > 
> > 
> > On Fri, 2 Jul 2010 17:55:02 -0400, Ray Denenberg wrote
> > > Peter, that's an example of structured proximity searching and 
> > > that has been abandoned in the OASIS CQL spec, the most recent 
> > > draft of which is at 
> > > http://www.loc.gov/standards/sru/oasis/current/cql.doc
> > >  which I recommend that you look at, because it  clarifies (and
> > > simplifies) proximity.
> > 
> > On Sat, 3 Jul 2010 09:31:40 -0400, Ray Denenberg wrote
> > > Just want to add this.  The problem isn't so much in saying "find 
> > > A and B in the same element", the problem is when the distance is 
> > > greater than zero as in "find A and B separated by two elements".
> > >  In the OASIS spec, the notion of element is discarded and the 
> > > more abstract notion of "container" is defined, and you can "find 
> > > A and B
> > 
> > I would argue that the predicate "container" should be used as a 
> > name for a specific kind of element, namely atomic containers, viz. 
> > containers with no sub-containers (indexes). An element here would 
> > be a container when it has no sub-elements. In a typical flat format 
> > such as common to Z applications the fields are elements and containers..
> > 
> > > in the same container"  (with no notion of finding A and B 
> > > separated by, say, two containers).  --Ray
> > 
> > Correct. Since in XML and many other models there is no order one 
> > can't talk about distance. The only distance one can talk about is 
> > bytes according to how the information was marked-up but only if the information was marked-up.
> > 
> > The doc: "A container is a structure containing one or more indexes.  
> > For example the server may support a container whose name is ‘author’
> > that contains indexes ‘name’ and ‘date’.  In that case the server
> > would support a query  (see example) to find  an author with a
> > specific name and date.  (This is contrasted with a Boolean query
> > which may return undesired results because they have multiple
> > authors, some of which have the desired name but the wrong date and
> > others the specified date but the wrong name.) The server should list
> > supported containers in its Explain file, and for each container, the
> > indexes that it contains."
> > 
> > Exactly. One has here named sub-paths author/name and author/date but also
> > the anonymous path "in the same container". I, however, would swap 
> > around the semantics and view "container" as the lowest node and "element" as any node. Example:
> > 
> > <book>
> >  <title>Re: CQL example for prox/unit=</title>
> >   <author>
> >     <name>Edward C. Zimmermann</name>
> >     <date>11 Feb 2000</date>
> >    </author>
> >   <edition>2nd</edition>
> >   <date>3 July 2010</date>
> > </book>
> > 
> > The element book contains title, author, edition, date etc.
> > In the "same container" would however mean exclusively the lowest node (leaf).
> > Containers should have no sub-elements
> >   Edward and Zimmermann are in the same container (book/author/name)
> >   Edward and CQL are not in the same container even if, per your 
> > semantics, the container author has two sub-elements.. whence the same 
> > container would be the same "author" container.
> > 
> > In my IB engine the semantics for "in the same container" apply only 
> > to the lowest node.. e.g. book/author/name instance or book/author/date instance..
> > If I want to search for Edward and CQL in the same book then I 
> > search for it by name WITHIN:book  or explicitly in path 
> > WITHIN:\book  (I use \ for / as / is used to denote fields). That is: elements (and paths).
> > 
> > In my above example:   2000, Zimmermann are in the same author.. they are also
> > within the same book..  CQL and Zimmermann are within the same book 
> > but not within the same author. etc...  2010 and July are in the 
> > same date. July and 2000 are not in the same date but are in the same book.. etc.
> > 
> > http://www.ibu.de/RelationalHierarchicalIR
> > http://www.ibu.de/IB_Query_Fields
> > 
> > --
> > 
> > Edward C. Zimmermann, NONMONOTONIC LAB Basis Systeme netzwerk, 
> > Munich Ges. des buergerl. Rechts http://www.nonmonotonic.net
> > Umsatz-St-ID: DE130492967
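
The claims about the book example quoted above can be checked with a small Python sketch built on the standard xml.etree.ElementTree module. It is an illustration only, not the IB engine or any CQL syntax: same_leaf and within are hypothetical helper names, "in the same container" is taken to mean the same childless element, and terms are matched as plain substrings.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring("""
<book>
  <title>Re: CQL example for prox/unit=</title>
  <author>
    <name>Edward C. Zimmermann</name>
    <date>11 Feb 2000</date>
  </author>
  <edition>2nd</edition>
  <date>3 July 2010</date>
</book>
""")

def subtree_text(elem):
    """All text found in elem and its descendants, joined with spaces."""
    return " ".join(elem.itertext())

def same_leaf(root, a, b):
    """True if one childless element holds both terms in its own text."""
    return any(
        len(e) == 0 and a in (e.text or "") and b in (e.text or "")
        for e in root.iter()
    )

def within(root, tag, a, b):
    """True if a single instance of <tag> holds both terms in its subtree."""
    return any(
        a in subtree_text(e) and b in subtree_text(e)
        for e in root.iter(tag)  # iter() also considers root itself
    )

print(same_leaf(book, "Edward", "Zimmermann"))      # True
print(same_leaf(book, "Edward", "CQL"))             # False
print(within(book, "author", "2000", "Zimmermann")) # True
print(within(book, "author", "CQL", "Zimmermann"))  # False
print(within(book, "date", "July", "2000"))         # False
print(within(book, "book", "July", "2000"))         # True
```

Keeping the leaf test and the named-element test as separate functions mirrors the split between the anonymous "same container" relation and an explicit search within a named element or path.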


--

Edward C. Zimmermann, NONMONOTONIC LAB
Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts http://www.nonmonotonic.net