On Thu, 8 Jul 2010 07:52:29 -0400, Ray Denenberg, Library of Congress wrote
> Ed, although I take Peter's point and probably agree - that
> "container" is probably a more semantically appropriate term than
> "element" for its proposed use - I don't recall that anyone even
> brought up that point in the discussion, there was an entirely
> different reason altogether: that "element" is too suggestive of xml,
> and "container" is more neutral. And though I also take Mike's point

Container is not neutral. Nor is element. Element is about membership.
Container is about possession.

> that "element" can mean whatever you want it to, when someone says
> "name and date in the same element" that suggests xml to too many
> people. Your shoe example is more complicated than anything we are
> trying to represent in CQL. There won't be containers within
> containers. --Ray

Of course there are containers within containers--- unless you adopt MY
semantics for container as a specific kind of element, e.g. with no
children--- and we've had them for YEARS and YEARS (thinking CAPS, GILS,
DIF, Dublin Core, ISO 19115 etc.)...

We don't, I think, want to try to flatten the world as we did back in Z
(forced by OIDs to avoid horrible acrobatics)--- where we even had
"problems" with repeating elements.

With elements/containers like word, line, sentence, paragraph, section,
chapter, ... you have containers that contain containers--- even "worse"
you have overlap since a line can contain multiple sentences and a
sentence can span one or more lines. And we want to search these.. and
with all kinds of structure, don't we?

As soon as fields have sub-fields.. we have elements within elements..
containers within containers like Matryoshka dolls.

We can talk about paths or ways of describing elements, either generically,
"name", or specifically, "person/name" (the name of a person).. But we also
want to talk about terms being in the same ..... (insert the word
"container" or ???
for the leaves without sub-elements/containers or ..).

>
> -----Original Message-----
> From: SRU (Search and Retrieve Via URL) Implementors
> [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann
> Sent: Thursday, July 08, 2010 4:15 AM
> To: [log in to unmask]
> Subject: Re: CQL example for prox/unit=
>
> On Tue, 6 Jul 2010 16:52:26 +0000, Peter Noerr wrote
> > I have only one question:
> >
> > Why on earth would one prefer to define an "element" as a "thing which
> > contains other things", and "container" as a "thing which
>
> I would suggest that to ask about information "in the same
> container" if "container" means element makes no sense..
>
> Elements can be nested.. What's
> <parent><child><grandchild> .. </grandchild></child></parent> ?
>
> Is the value of grandchild also contained in child and parent?
>
> And to the other extreme we have empty elements.. <empty></empty>
> (or <empty /> )
>
> In SGML/XML an element can have many or no children.
>
> We can talk about atomic elements as elements that have no children
> but may or may not have content--- but we, of course, are only
> interested in elements with content..
>
> In SGML/XML we have "content" in the value of the element and its
> attribute values:
> <element attribute="attribute value">Element value</element>
>
> Now.. We should be interested in information structures even more
> abstract than SGML/XML (e.g. overlaps etc.) and so things can get
> even more twisted..
>
> How would I ask the question that gives me the answer "Edward Zimmermann"
> below:
>
> <person>
>   <name>Edward Zimmermann</name>
>   <address><street>Leopoldstrasse</street><city>Munich</city></address>
> </person>
>
> In the same name instance? But what if I don't know that the
> "container" is called "name"?
>
> In the above, are not Zimmermann and Leopoldstrasse contained in the same
> person element instance?
>
> How should I distinguish these two?
> Zimmermann is contained in name
> and Leopoldstrasse is contained in street, street is a child of
> address and both address and name are children of person..
>
> These are the kind of questions we ask all the time.. right? Give me
> the Zimmermanns on Leopoldstrasse in Munich. Give me cookbooks
> written by Schiller.. Tell me which books were edited by Johann
> Wolfgang Goethe?
>
> Sure, if we know the questions ahead of time we can structure (or
> transform) a database with the structure to make such queries easy
> (and flat).. But then the next questions which don't fit that
> structure?
>
> And when we start to talk about heterogeneous data.. that is when
> the user is not even 100% sure of the structure, they can't ask for
> "street" or "person" but they'd like to still ask the question.. By
> talking about anonymous containers.. etc. it's possible..
>
> > contains no other things" when this is *exactly* the reverse of their
> > normal English usage? Is there a perverse desire here to make things
> > (pun intended) as confusing as possible?
>
> Not really.. Think about a shoe shop.. They sell pairs of shoes
> stored in boxes.. and that box of shoes is stored on a shelf and
> that shelf has an address. Is the container the room where these
> shelves are? The shelf? The level in the shelf? Or the shoe box?
>
> In this model.. The elements are the shop, the room, the shelves,
> the specific shelf, the box.. we have a path of elements to define
> an address to find the container which contains a specific pair of
> shoes.. When we talk here of container we mean the box (which
> contains a pair of shoes).. even if the box is contained in the
> shelf, in the room, in the shop, in the house, ...
>
> > Peter Noerr
> >
> > > -----Original Message-----
> > > From: SRU (Search and Retrieve Via URL) Implementors
> > > [mailto:[log in to unmask]] On Behalf Of Edward C.
> > > Zimmermann
> > > Sent: Saturday, July 03, 2010 4:07 PM
> > > To: [log in to unmask]
> > > Subject: Re: CQL example for prox/unit=
> > >
> > > Ray, I'm actually quite pleased. I feel almost as if someone has
> > > been listening! There are, however, just a few points still that
> > > need to be voiced..
> > >
> > > On Sat, 3 Jul 2010 15:40:52 +0100, Mike Taylor wrote
> > > > Surely "container" here means EXACTLY the same thing as "element",
> > > > i.e. whatever you want it to mean?
> > >
> > > In SGML containers are called elements. When we want to distinguish
> > > between an element and the containers it contains we can talk about
> > > container; thus they are, in semantic use, not always the same. See
> > > below.. (my argument is that they have things backwards).
> > >
> > > On Fri, 2 Jul 2010 17:55:02 -0400, Ray Denenberg wrote
> > > > Peter, that's an example of structured proximity searching and
> > > > that has been abandoned in the OASIS CQL spec, the most recent
> > > > draft of which is at
> > > > http://www.loc.gov/standards/sru/oasis/current/cql.doc
> > > > which I recommend that you look at, because it clarifies (and
> > > > simplifies) proximity.
> > >
> > > On Sat, 3 Jul 2010 09:31:40 -0400, Ray Denenberg wrote
> > > > Just want to add this. The problem isn't so much in saying "find
> > > > A and B in the same element", the problem is when the distance is
> > > > greater than zero as in "find A and B separated by two elements".
> > > > In the OASIS spec, the notion of element is discarded and the
> > > > more abstract notion of "container" is defined, and you can "find
> > > > A and B
> > >
> > > I would argue that the predicate "container" should be used as a
> > > name for a specific kind of element, namely atomic containers, viz.
> > > containers with no sub-containers (indexes). An element here would
> > > be a container when it has no sub-elements.
> > > In a typical flat format
> > > such as is common in Z applications the fields are elements and
> > > containers..
> > >
> > > > in the same container" (with no notion of finding A and B
> > > > separated by, say, two containers). --Ray
> > >
> > > Correct. Since in XML and many other models there is no order one
> > > can't talk about distance. The only distance one can talk about is
> > > bytes according to how the information was marked-up but only if the
> > > information was marked-up.
> > >
> > > The doc: "A container is a structure containing one or more indexes.
> > > For example the server may support a container whose name is ‘author’
> > > that contains indexes ‘name’ and ‘date’. In that case the server
> > > would support a
> > > query (see example) to find an author with a specific name and
> > > date. (This is contrasted with a Boolean query which may return
> > > undesired results because they have multiple authors, some of which
> > > have the desired name but the wrong date and others the specified
> > > date but the wrong name.) The server should list supported
> > > containers in its Explain file, and for each container, the indexes
> > > that it contains."
> > >
> > > Exactly. One has here named sub-paths author/name and author/date but
> > > also the anonymous path "in the same container". I, however, would swap
> > > around the semantics and view "container" as the lowest node and
> > > element any .. example:
> > >
> > > <book>
> > >   <title>Re: CQL example for prox/unit=</title>
> > >   <author>
> > >     <name>Edward C. Zimmermann</name>
> > >     <date>11 Feb 2000</date>
> > >   </author>
> > >   <edition>2nd</edition>
> > >   <date>3 July 2010</date>
> > > </book>
> > >
> > > The element book contains title, author, edition, date etc.
> > > In the "same container" would however mean exclusively the lowest
> > > node (leaf).
> > > Containers should have no sub-elements.
> > > Edward and Zimmermann are in the same container (book/author/name).
> > > Edward and CQL are not in the same container even if, per your
> > > semantics, the container author has two sub-elements.. whence the same
> > > container would be the same "author" container.
> > >
> > > In my IB engine the semantics for "in the same container" only
> > > applies to the lowest.. e.g. book/author/name instance or
> > > book/author/date instance..
> > > If I want to search for Edward and CQL in the same book then I
> > > search for it by name WITHIN:book or explicitly in path
> > > WITHIN:\book (I use \ for / as / is used to denote fields). That is:
> > > elements (and paths).
> > >
> > > In my above example: 2000, Zimmermann are in the same author.. they
> > > are also within the same book.. CQL and Zimmermann are within the same
> > > book but not within the same author. etc... 2010 and July are in the
> > > same date. July and 2000 are not in the same date but are in the same
> > > book.. etc.
> > >
> > > http://www.ibu.de/RelationalHierarchicalIR
> > > http://www.ibu.de/IB_Query_Fields
> > >
> > > --
> > >
> > > Edward C. Zimmermann, NONMONOTONIC LAB
> > > Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
> > > http://www.nonmonotonic.net
> > > Umsatz-St-ID: DE130492967

--

Edward C. Zimmermann, NONMONOTONIC LAB
Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
Umsatz-St-ID: DE130492967
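
P.S. The difference between "in the same container" (lowest node only) and
WITHIN a named element, as drawn in the quoted book example, can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions, not how the IB engine or any CQL server actually works: the
helper names same_container and within are hypothetical, terms are matched
as case-insensitive whole words, and a "container" is taken to be a leaf
element with no sub-elements.

```python
import re
import xml.etree.ElementTree as ET

# The book record from the quoted example.
BOOK = """
<book>
  <title>Re: CQL example for prox/unit=</title>
  <author>
    <name>Edward C. Zimmermann</name>
    <date>11 Feb 2000</date>
  </author>
  <edition>2nd</edition>
  <date>3 July 2010</date>
</book>
"""

def words(text):
    """Lower-cased set of word tokens in a string (assumed tokenization)."""
    return {w.lower() for w in re.findall(r"\w+", text or "")}

def leaf_paths(node, prefix=""):
    """Yield (path, words) for every leaf, i.e. element without children."""
    path = f"{prefix}/{node.tag}" if prefix else node.tag
    children = list(node)
    if not children:
        yield path, words(node.text)
    for child in children:
        yield from leaf_paths(child, path)

def same_container(root, *terms):
    """Paths of leaf containers holding all terms (hypothetical helper)."""
    wanted = {t.lower() for t in terms}
    return [path for path, ws in leaf_paths(root) if wanted <= ws]

def within(root, tag, *terms):
    """Elements named `tag` whose whole subtree holds all terms."""
    wanted = {t.lower() for t in terms}
    return [el for el in root.iter(tag)
            if wanted <= words(" ".join(el.itertext()))]

root = ET.fromstring(BOOK)
print(same_container(root, "Edward", "Zimmermann"))  # ['book/author/name']
print(same_container(root, "Edward", "CQL"))         # []
print([e.tag for e in within(root, "author", "2000", "Zimmermann")])  # ['author']
print([e.tag for e in within(root, "author", "CQL", "Zimmermann")])   # []
```

Under these assumptions Edward and Zimmermann share the leaf
book/author/name, 2000 and Zimmermann meet only at the author level, and CQL
and Zimmermann meet only at the book level, which matches the cases listed
in the quoted message.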