On Thu, 8 Jul 2010 07:52:29 -0400, Ray Denenberg, Library of Congress wrote
> Ed, although I take Peter's point and probably agree  - that 
> "container" is probably a more semantically appropriate term than 
> "element" for its  proposed use - I don't recall that anyone even 
> brought up that point in the discussion, there was an entirely 
> different reason altogether: that "element" is too suggestive of xml,
>  and "container" is more neutral. And though I also take Mike's point 

Container is not neutral. Nor is element. Element is about membership.
Container is about possession.

> that "element" can mean whatever you want it to, when someone says 
> "name and date in the same element" that suggests xml to too many 
> people. Your shoe example is more complicated than anything we are 
> trying to represent in CQL. There won't be containers within 
> containers. --Ray

Of course there are containers within containers--- unless you adopt MY
semantics for container as a specific kind of element, e.g. with no
children--- and we've had them for YEARS and YEARS (thinking CAPS, GILS, DIF,
Dublin Core, ISO 19115 etc.)... We don't, I think, want to try to flatten the
world as we did back in Z (forced by OIDs to avoid horrible acrobatics)---
where we even had "problems" with repeating elements.

With elements/containers like word, line, sentence, paragraph, section,
chapter, ... you have containers that contain containers--- even "worse" you
have overlap since a line can contain multiple sentences and a sentence can
span one or more lines.
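
A minimal sketch of that overlap, assuming containers are modelled as
character-offset spans over the text (the sample text and the helper names
spans, positions and share_span are made up for illustration; they are not
from CQL or any engine):

```python
import re

# Hypothetical sample: the second sentence runs across the line break.
TEXT = "One two. Three\nfour five. Six."

def spans(text, pattern):
    """Return (start, end) offsets of every match of `pattern` in `text`."""
    return [m.span() for m in re.finditer(pattern, text)]

# Two kinds of container over the same text; neither nests cleanly in the other.
LINES = spans(TEXT, r"[^\n]+")
SENTENCES = spans(TEXT, r"[^.]+\.")

def positions(text, word):
    """Start offsets of every whole-word occurrence of `word`."""
    return [m.start() for m in re.finditer(r"\b%s\b" % re.escape(word), text)]

def share_span(text, container_spans, a, b):
    """True if some single span contains an occurrence of both words."""
    for start, end in container_spans:
        has_a = any(start <= p < end for p in positions(text, a))
        has_b = any(start <= p < end for p in positions(text, b))
        if has_a and has_b:
            return True
    return False

print(share_span(TEXT, SENTENCES, "Three", "four"))  # True: same sentence
print(share_span(TEXT, LINES, "Three", "four"))      # False: different lines
print(share_span(TEXT, LINES, "two", "Three"))       # True: same line
print(share_span(TEXT, SENTENCES, "two", "Three"))   # False: different sentences
```

With spans, "same line" and "same sentence" are independent questions about
the same words, which is all the overlap amounts to.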

And we want to search these.. and with all kinds of structure don't we?

As soon as fields have sub-fields.. we have elements within elements..
containers within containers like Matryoshka dolls.

We can talk about paths or ways of describing elements, either generically,
"name" or specific "person/name" (the name of a person).. But we also want to
talk about terms being in the same ..... (insert the word "container" or ???
for the leaves without sub-elements/containers or ..).
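
A rough sketch of the two questions, assuming "container" means a leaf
element with no children and "within" means anywhere beneath a named
element. The record is modelled on the book example quoted further down in
this thread; the helper names (words, leaf_containers, same_container,
within) are invented for illustration and are not the IB engine's API or
CQL syntax:

```python
import xml.etree.ElementTree as ET

# Hypothetical record, modelled on the book example quoted later in this thread.
RECORD = """
<book>
  <title>Re: CQL example for prox/unit=</title>
  <author>
    <name>Edward C. Zimmermann</name>
    <date>11 Feb 2000</date>
  </author>
  <edition>2nd</edition>
  <date>3 July 2010</date>
</book>
"""

def words(text):
    """Split text on whitespace into a set of terms (no stemming, no case folding)."""
    return set((text or "").split())

def leaf_containers(root):
    """Yield (path, terms) for every element that has no child elements."""
    def walk(node, path):
        here = f"{path}/{node.tag}" if path else node.tag
        children = list(node)
        if not children:
            yield here, words(node.text)
        for child in children:
            yield from walk(child, here)
    yield from walk(root, "")

def same_container(root, a, b):
    """Paths of the leaf containers that hold both terms."""
    return [p for p, w in leaf_containers(root) if a in w and b in w]

def within(root, tag, a, b):
    """True if one instance of element `tag` holds both terms anywhere beneath it."""
    for node in root.iter(tag):
        w = words(" ".join(node.itertext()))
        if a in w and b in w:
            return True
    return False

root = ET.fromstring(RECORD.strip())
print(same_container(root, "Edward", "Zimmermann"))  # ['book/author/name']
print(same_container(root, "Edward", "CQL"))         # []
print(within(root, "book", "Edward", "CQL"))         # True
print(within(root, "author", "Zimmermann", "2000"))  # True
print(within(root, "author", "CQL", "Zimmermann"))   # False
```

Note that same_container needs no element name at all, which is the
anonymous case; within needs a name (or path) to say which enclosing
element is meant.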


> 
> -----Original Message-----
> From: SRU (Search and Retrieve Via URL) Implementors 
> [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann 
> Sent: Thursday, July 08, 2010 4:15 AM To: [log in to unmask] 
> Subject: Re: CQL example for prox/unit=
> 
> On Tue, 6 Jul 2010 16:52:26 +0000, Peter Noerr wrote
> > I have only one question:
> > 
> > Why on earth would one prefer to define an "element" as a "thing which 
> > contains other things", and "container" as a "thing which
> 
> I would suggest that to ask about information "in the same 
> container" if "container" means element makes no sense..
> 
> Elements can be nested.. What's
>   <parent><child><grandchild> .. </grandchild></child></parent> ?
> 
> Is the value of grandchild also contained in child and parent?
> 
> And to the other extreme we have empty elements.. <empty></empty> 
> (or <empty /> )
> 
> In SGML/XML an element can have many or no children.
> 
> We can talk about atomic elements as an element that has no children 
> but may or may not have content--- but we, of course, are only 
> interested in elements with content..
> 
> In SGML/XML we have "content" in the value of the element and its attribute
> values:
>   <element attribute="attribute value">Element value</element>
> 
> Now.. We should be interested in information structures even more 
> abstract than SGML/XML (e.g. overlaps etc.) and so things can get 
> even more twisted..
> 
> How would I ask the question that gives me the answer "Edward Zimmermann"
> below:
> 
>   <person>
>     <name>Edward Zimmermann</name>
>     <address><street>Leopoldstrasse</street><city>Munich</city></address>
>   </person>
> 
> In the same name instance? But what if I don't know that the 
> "container" is called "name"?
> 
> In above are not Zimmermann and Leopoldstrasse contained in the same 
> person element instance?
> 
> How should I distinguish these two? Zimmermann is contained in name 
> and Leopoldstrasse is contained in street, street is a child of 
> address and both address and name are children of person..
> 
> These are the kind of questions we ask all the time.. right? Give me 
> the Zimmermanns on Leopoldstrasse in Munich. Give me cookbooks 
> written by Schiller.. Tell me which books were edited by Johann 
> Wolfgang Goethe?
> 
> Sure if we know the questions ahead of time we can structure (or 
> transform) a database with the structure to make such queries easy 
> (and flat).. But then the next questions which don't fit that 
> structure?
> 
> And when we start to talk about heterogeneous data.. that is when 
> the user is not even 100% sure of the structure they can't ask for 
> "street" or "person" but they'd like to still ask the question.. By 
> talking about anonymous containers.. etc. It's possible..
> 
> > contains no other things" when this is *exactly* the reverse of their 
> > normal English usage? Is there a perverse desire here to make things 
> > (pun intended) as confusing as possible?
> 
> Not really.. Think about a shoe shop.. They sell pairs of shoes 
> stored in boxes.. and that box of shoes is stored on a shelf and 
> that shelf has an address. Is the container the room where these 
> shelves are? The shelf? The level in the shelf? Or the shoe box?
> 
> In this model.. The elements are the shop, the room, the shelves,
>  the specific shelf, the box.. we have a path of elements to define 
> an address to find the container which contains a specific pair of 
> shoes..  When we talk here of container we mean the box (which 
> contains a pair of shoes).. even if the box is contained in the 
> shelf, in the room, in the shop, in the house, ...
> 
> > 
> > Peter Noerr
> > 
> > > -----Original Message-----
> > > From: SRU (Search and Retrieve Via URL) Implementors
> > > [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann
> > > Sent: Saturday, July 03, 2010 4:07 PM
> > > To: [log in to unmask]
> > > Subject: Re: CQL example for prox/unit=
> > > 
> > > Ray, I'm actually quite pleased. I feel almost as if someone has 
> > > been listening! There are, however, just a few points still that 
> > > need to be voiced..
> > > 
> > > 
> > > On Sat, 3 Jul 2010 15:40:52 +0100, Mike Taylor wrote
> > > > Surely "container" here means EXACTLY the same thing as "element", 
> > > > i.e. whatever you want it to mean?
> > > >
> > > 
> > > In SGML containers are called elements. When we want to distinguish 
> > > between an element and the containers it contains we can talk about 
> > > container thus they are, in semantic use, not always the same. See 
> > > below.. (my argument is that they have things backwards).
> > > 
> > > 
> > > On Fri, 2 Jul 2010 17:55:02 -0400, Ray Denenberg wrote
> > > > Peter, that's an example of structured proximity searching and 
> > > > that has been abandoned in the OASIS CQL spec, the most recent 
> > > > draft of which is at 
> > > > http://www.loc.gov/standards/sru/oasis/current/cql.doc
> > > >  which I recommend that you look at, because it  clarifies (and
> > > > simplifies) proximity.
> > > 
> > > On Sat, 3 Jul 2010 09:31:40 -0400, Ray Denenberg wrote
> > > > Just want to add this.  The problem isn't so much in saying "find 
> > > > A and B in the same element", the problem is when the distance is 
> > > > greater than zero as in "find A and B separated by two elements".
> > > >  In the OASIS spec, the notion of element is discarded and the 
> > > > more abstract notion of "container" is defined, and you can "find 
> > > > A and B
> > > 
> > > I would argue that the predicate "container" should be used as a 
> > > name for a specific kind of element, namely atomic containers, viz. 
> > > containers with no sub-containers (indexes). An element here would 
> > > be a container when it has no sub-elements. In a typical flat format 
> > > such as common to Z applications the fields are elements and containers..
> > > 
> > > > in the same container"  (with no notion of finding A and B 
> > > > separated by, say, two containers).  --Ray
> > > 
> > > Correct. Since in XML and many other models there is no order one 
> > > can't talk about distance. The only distance one can talk about is 
> > > bytes according to how the information was marked-up but only if the
> > > information was marked-up.
> > > 
> > > The doc: "A container is a structure containing one or more indexes.  
> > > For example the server may support a container whose name is ‘author’
> > > that contains indexes ‘name’ and ‘date’.  In that case the server
> > > would support a query  (see example) to find  an author with a
> > > specific name and date.  (This is contrasted with a Boolean query
> > > which may return undesired results because they have multiple
> > > authors, some of which have the desired name but the wrong date and
> > > others the specified date but the wrong name.) The server should list
> > > supported containers in its Explain file, and for each container, the
> > > indexes that it contains."
> > > 
> > > Exactly. One has here named sub-paths   author/name  and author/date
> > > but also the anonymous path "in the same container". I, however,
> > > would swap around the semantics and view "container" as the lowest
> > > node and element any .. example:
> > > 
> > > <book>
> > >  <title>Re: CQL example for prox/unit=</title>
> > >   <author>
> > >     <name>Edward C. Zimmermann</name>
> > >     <date>11 Feb 2000</date>
> > >    </author>
> > >   <edition>2nd</edition>
> > >   <date>3 July 2010</date>
> > > </book>
> > > 
> > > The element book contains title, author, edition, date etc.
> > > In the "same container" would however mean exclusively the lowest node
> > > (leaf). Containers should have no sub-elements
> > >   Edward and Zimmermann are in the same container (book/author/name)
> > >   Edward and CQL are not in the same container even if, per your 
> > > semantics the container author has two sub-elements.. whence the same 
> > > container would be the same "author" container.
> > > 
> > > In my IB engine the semantics for "in the same container" only 
> > > applies to the lowest.. e.g. book/author/name instance or
> > > book/author/date instance..
> > > If I want to search for Edward and CQL in the same book then I 
> > > search for it by name WITHIN:book  or explicitly in path 
> > > WITHIN:\book  (I use \ for / as / is used to denote fields). That is:
> > > elements (and paths).
> > > 
> > > In my above example:   2000, Zimmermann are in the same author.. they
> > > are also within the same book..  CQL and Zimmermann are within the same
> > > book but not within the same author. etc...  2010 and July are in the 
> > > same date. July and 2000 are not in the same date but are in the same
> > > book.. etc.
> > > 
> > > http://www.ibu.de/RelationalHierarchicalIR
> > > http://www.ibu.de/IB_Query_Fields
> > > 
> > > --
> > > 
> > > Edward C. Zimmermann, NONMONOTONIC LAB Basis Systeme netzwerk, 
> > > Munich Ges. des buergerl. Rechts http://www.nonmonotonic.net
> > > Umsatz-St-ID: DE130492967
> 
> --
> 
> Edward C. Zimmermann, NONMONOTONIC LAB
> Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
> http://www.nonmonotonic.net


--

Edward C. Zimmermann, NONMONOTONIC LAB
Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
Umsatz-St-ID: DE130492967