On Tue, 6 Jul 2010 16:52:26 +0000, Peter Noerr wrote
> I have only one question:
>
> Why on earth would one prefer to define an "element" as a "thing
> which contains other things", and "container" as a "thing which
I would suggest that asking about information "in the same container" makes
no sense if "container" means element..
Elements can be nested.. What's
<parent><child><grandchild> .. </grandchild></child></parent>
?
Is the value of grandchild also contained in child and parent?
And to the other extreme we have empty elements.. <empty></empty> (or <empty /> )
In SGML/XML an element can have many or no children.
We can talk about an atomic element as an element that has no children but may
or may not have content.. but we, of course, are only interested in elements
with content..
In SGML/XML we have "content" in the value of the element and its attribute
values:
<element attribute="attribute value">Element value</element>
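A minimal sketch of reading both kinds of content from the element above, assuming Python and its standard xml.etree.ElementTree module (the helper name content_of is made up for this illustration):

```python
import xml.etree.ElementTree as ET

def content_of(xml_text):
    """Return (element value, attribute values) for a single element."""
    elem = ET.fromstring(xml_text)
    # elem.text is None for an empty element, so fall back to "".
    return (elem.text or ""), dict(elem.attrib)

value, attributes = content_of(
    '<element attribute="attribute value">Element value</element>'
)
print(value)
print(attributes)
```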
Now.. We should be interested in information structures even more abstract
than SGML/XML (e.g. overlaps etc.) and so things can get even more twisted..
How would I ask the question that gives me the answer "Edward Zimmermann"
below:
<person><name>Edward Zimmermann</name><address><street>Leopoldstrasse
</street><city>Munich</city></address> </person>
In the same name instance? But what if I don't know that the "container" is
called "name"?
In the above, are not Zimmermann and Leopoldstrasse contained in the same
person element instance?
How should I distinguish these two? Zimmermann is contained in name and
Leopoldstrasse is contained in street, street is a child of address and both
address and name are children of person (and so siblings of each other)..
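To make the distinction concrete, here is a rough sketch, assuming Python's standard xml.etree.ElementTree. The helper names (leaf_paths, containers_of, common_path) are invented for this illustration and are not how any particular engine does it. It finds the lowest element holding a term without knowing that element's name, and then the deepest path the two terms share:

```python
import xml.etree.ElementTree as ET

RECORD = (
    "<person><name>Edward Zimmermann</name>"
    "<address><street>Leopoldstrasse</street><city>Munich</city></address>"
    "</person>"
)

def leaf_paths(root):
    """Yield (path, text) for every element that has no child elements."""
    def walk(elem, path):
        here = path + [elem.tag]
        children = list(elem)
        if not children:
            yield "/".join(here), (elem.text or "").strip()
        for child in children:
            yield from walk(child, here)
    yield from walk(root, [])

def containers_of(root, term):
    """Paths of the leaf elements whose text mentions the term."""
    return [path for path, text in leaf_paths(root) if term in text]

def common_path(a, b):
    """Longest shared leading part of two '/'-separated paths."""
    shared = []
    for x, y in zip(a.split("/"), b.split("/")):
        if x != y:
            break
        shared.append(x)
    return "/".join(shared)

root = ET.fromstring(RECORD)
name_path = containers_of(root, "Zimmermann")[0]
street_path = containers_of(root, "Leopoldstrasse")[0]
print(name_path)                            # person/name
print(street_path)                          # person/address/street
print(common_path(name_path, street_path))  # person
```

Note that these paths identify element names, not element instances; a record with several name or address elements would need instance identity as well.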
These are the kinds of questions we ask all the time.. right? Give me the
Zimmermanns on Leopoldstrasse in Munich. Give me cookbooks written by
Schiller.. Tell me which books were edited by Johann Wolfgang Goethe?
Sure, if we know the questions ahead of time we can structure (or transform) a
database to make such queries easy (and flat).. But what about the next
questions, the ones which don't fit that structure?
And when we start to talk about heterogeneous data.. that is, when the user is
not even 100% sure of the structure, they can't ask for "street" or "person"
but they'd still like to ask the question.. By talking about anonymous
containers etc. it's possible..
> contains no other things" when this is *exactly* the reverse of
> their normal English usage? Is there a perverse desire here to make
> things (pun intended) as confusing as possible?
Not really.. Think about a shoe shop.. They sell pairs of shoes stored in
boxes.. and that box of shoes is stored on a shelf and that shelf has an
address. Is the container the room where these shelves are? The shelf? The
level in the shelf? Or the shoe box?
In this model.. The elements are the shop, the room, the shelves, the specific
shelf, the box.. we have a path of elements to define an address to find the
container which contains a specific pair of shoes.. When we talk here of
container we mean the box (which contains a pair of shoes).. even if the box
is contained in the shelf, in the room, in the shop, in the house, ...
>
> Peter Noerr
>
> > -----Original Message-----
> > From: SRU (Search and Retrieve Via URL) Implementors [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann
> > Sent: Saturday, July 03, 2010 4:07 PM
> > To: [log in to unmask]
> > Subject: Re: CQL example for prox/unit=
> >
> > Ray, I'm actually quite pleased. I feel almost as if someone has been
> > listening! There are, however, just a few points still that need to be
> > voiced..
> >
> >
> > On Sat, 3 Jul 2010 15:40:52 +0100, Mike Taylor wrote
> > > Surely "container" here means EXACTLY the same thing as "element",
> > > i.e. whatever you want it to mean?
> > >
> >
> > In SGML containers are called elements. When we want to distinguish between an
> > element and the containers it contains we can talk about container thus they
> > are, in semantic use, not always the same. See below.. (my argument is that
> > they have things backwards).
> >
> >
> > On Fri, 2 Jul 2010 17:55:02 -0400, Ray Denenberg wrote
> > > Peter, that's an example of structured proximity searching and that
> > > has been abandoned in the OASIS CQL spec, the most recent draft of
> > > which is at http://www.loc.gov/standards/sru/oasis/current/cql.doc
> > > which I recommend that you look at, because it clarifies (and
> > > simplifies) proximity.
> >
> > On Sat, 3 Jul 2010 09:31:40 -0400, Ray Denenberg wrote
> > > Just want to add this. The problem isn't so much in saying "find A
> > > and B in the same element", the problem is when the distance is
> > > greater than zero as in "find A and B separated by two elements".
> > > In the OASIS spec, the notion of element is discarded and the more
> > > abstract notion of "container" is defined, and you can "find A and B
> >
> > I would argue that the predicate "container" should be used as a name for a
> > specific kind of element, namely atomic containers, viz. containers with no
> > sub-containers (indexes). An element here would be a container when it has no
> > sub-elements. In a typical flat format such as common to Z applications the
> > fields are elements and containers..
> >
> > > in the same container" (with no notion of finding A and B separated
> > > by, say, two containers). --Ray
> >
> > Correct. Since in XML and many other models there is no order one can't talk
> > about distance. The only distance one can talk about is bytes according to how
> > the information was marked-up but only if the information was marked-up.
> >
> > The doc: "A container is a structure containing one or more indexes. For
> > example the server may support a container whose name is ‘author’ that
> > contains indexes ‘name’ and ‘date’. In that case the server would support a
> > query (see example) to find an author with a specific name and date. (This
> > is contrasted with a Boolean query which may return undesired results because
> > they have multiple authors, some of which have the desired name but the wrong
> > date and others the specified date but the wrong name.) The server should list
> > supported containers in its Explain file, and for each container, the indexes
> > that it contains."
> >
> > Exactly. One has here named sub-paths author/name and author/date but also
> > the anonymous path "in the same container". I, however, would swap around the
> > semantics and view "container" as the lowest node and element any .. example:
> >
> > <book>
> > <title>Re: CQL example for prox/unit=</title>
> > <author>
> > <name>Edward C. Zimmermann</name>
> > <date>11 Feb 2000</date>
> > </author>
> > <edition>2nd</edition>
> > <date>3 July 2010</date>
> > </book>
> >
> > The element book contains title, author, edition, date etc.
> > In the "same container" would however mean exclusively the lowest node (leaf).
> > Containers should have no sub-elements.
> > Edward and Zimmermann are in the same container (book/author/name).
> > Edward and CQL are not in the same container even if, per your semantics, the
> > container author has two sub-elements.. whence the same container would be the
> > same "author" container.
> >
> > In my IB engine the semantics for "in the same container" only applies to the
> > lowest.. e.g. book/author/name instance or book/author/date instance..
> > If I want to search for Edward and CQL in the same book then I search for
> > it by name WITHIN:book or explicitly in path WITHIN:\book (I use \ for / as
> > / is used to denote fields). That is: elements (and paths).
> >
> > In my above example: 2000, Zimmermann are in the same author.. they are also
> > within the same book.. CQL and Zimmermann are within the same book but not
> > within the same author. etc... 2010 and July are in the same date. July and
> > 2000 are not in the same date but are in the same book.. etc.
> >
> > http://www.ibu.de/RelationalHierarchicalIR
> > http://www.ibu.de/IB_Query_Fields
> >
> > --
> >
> > Edward C. Zimmermann, NONMONOTONIC LAB
> > Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
> > http://www.nonmonotonic.net
> > Umsatz-St-ID: DE130492967
--
Edward C. Zimmermann, NONMONOTONIC LAB
Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
http://www.nonmonotonic.net