On Thu, 8 Jul 2010 07:52:29 -0400, Ray Denenberg, Library of Congress wrote
> Ed, although I take Peter's point and probably agree - that
> "container" is probably a more semantically appropriate term than
> "element" for its proposed use - I don't recall that anyone even
> brought up that point in the discussion, there was an entirely
> different reason altogether: that "element" is too suggestive of xml,
> and "container" is more neutral. And though I also take Mike's point
Container is not neutral. Nor is element. Element is about membership.
Container is about possession.
> that "element" can mean whatever you want it to, when someone says
> "name and date in the same element" that suggests xml to too many
> people. Your shoe example is more complicated than anything we are
> trying to represent in CQL. There won't be containers within
> containers. --Ray
Of course there are containers within containers--- unless you adopt MY
semantics for container as a specific kind of element, e.g. with no
children--- and we've had them for YEARS and YEARS (thinking CAPS, GILS, DIF,
Dublin Core, ISO 19115 etc.)... We don't, I think, want to try to flatten the
world as we did back in Z (forced by OIDs to avoid horrible acrobatics)---
where we even had "problems" with repeating elements.
With elements/containers like word, line, sentence, paragraph, section,
chapter, ... you have containers that contain containers--- even "worse" you
have overlap since a line can contain multiple sentences and a sentence can
span one or more lines.
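
A minimal sketch of that overlap, assuming character-offset spans over a made-up two-line text. The helper names and the naive "ends with a period" sentence rule are illustrative assumptions only, not anything from CQL or the OASIS draft:

```python
import re

# Hypothetical sample text: three sentences spread over two lines.
TEXT = "One two. Three\nfour five. Six."

def line_spans(text):
    """Return (start, end) offsets for each newline-separated line."""
    spans, start = [], 0
    for line in text.split("\n"):
        spans.append((start, start + len(line)))
        start += len(line) + 1  # step over the newline
    return spans

def sentence_spans(text):
    """Return (start, end) offsets for each sentence (naive: ends at a period)."""
    return [(m.start(), m.end()) for m in re.finditer(r"[^.\s][^.]*\.", text)]

def contains(outer, inner):
    """True if span `inner` lies entirely inside span `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def overlaps(a, b):
    """True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

lines = line_spans(TEXT)
sentences = sentence_spans(TEXT)

# The first line fully contains the first sentence ...
print(contains(lines[0], sentences[0]))
# ... but the second sentence straddles both lines without being nested in either.
print(overlaps(lines[0], sentences[1]), contains(lines[0], sentences[1]))
print(overlaps(lines[1], sentences[1]), contains(lines[1], sentences[1]))
```

The second sentence is neither inside a line nor around one, which is the case a strictly nested (tree) model cannot express.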
And we want to search these.. and with all kinds of structure don't we?
As soon as fields have sub-fields.. we have elements within elements..
containers within containers like Matryoshka dolls.
We can talk about paths or ways of describing elements, either generically,
"name", or specifically, "person/name" (the name of a person).. But we also
want to talk about terms being in the same ..... (insert the word "container"
or ??? for the leaves without sub-elements/containers or ..).
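
A minimal sketch of the two readings, assuming Python's standard xml.etree.ElementTree and the book record quoted further down this thread. The helper names same_leaf and same_element are hypothetical, matching is plain whitespace tokenisation, and this is not how IB or any CQL server implements it:

```python
import xml.etree.ElementTree as ET

# The book record from the quoted message below.
BOOK = """<book>
<title>Re: CQL example for prox/unit=</title>
<author>
<name>Edward C. Zimmermann</name>
<date>11 Feb 2000</date>
</author>
<edition>2nd</edition>
<date>3 July 2010</date>
</book>"""

def leaf_paths(elem, prefix=""):
    """Yield (path, text) for every element that has no child elements."""
    path = f"{prefix}/{elem.tag}" if prefix else elem.tag
    children = list(elem)
    if not children:
        yield path, (elem.text or "")
    for child in children:
        yield from leaf_paths(child, path)

def same_leaf(root, a, b):
    """Paths of leaf instances whose own text holds both terms."""
    hits = []
    for path, text in leaf_paths(root):
        words = text.split()
        if a in words and b in words:
            hits.append(path)
    return hits

def same_element(root, tag, a, b):
    """True if one instance of `tag`, at any depth, holds both terms below it."""
    for elem in root.iter(tag):
        words = " ".join(elem.itertext()).split()
        if a in words and b in words:
            return True
    return False

root = ET.fromstring(BOOK)
print(same_leaf(root, "Edward", "Zimmermann"))         # leaf reading
print(same_leaf(root, "Edward", "CQL"))                # no shared leaf
print(same_element(root, "book", "Edward", "CQL"))     # named ancestor reading
print(same_element(root, "author", "Zimmermann", "2000"))
```

The first helper needs no element name at all (the anonymous "same leaf" question); the second needs a name, and the answer changes with how far up the path that name sits.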
>
> -----Original Message-----
> From: SRU (Search and Retrieve Via URL) Implementors
> [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann
> Sent: Thursday, July 08, 2010 4:15 AM
> To: [log in to unmask]
> Subject: Re: CQL example for prox/unit=
>
> On Tue, 6 Jul 2010 16:52:26 +0000, Peter Noerr wrote
> > I have only one question:
> >
> > Why on earth would one prefer to define an "element" as a "thing which
> > contains other things", and "container" as a "thing which
>
> I would suggest that to ask about information "in the same
> container" if "container" means element makes no sense..
>
> Elements can be nested.. What's
> <parent><child><grandchild> .. </grandchild></child></parent> ?
>
> Is the value of grandchild also contained in child and parent?
>
> And to the other extreme we have empty elements.. <empty></empty>
> (or <empty /> )
>
> In SGML/XML an element can have many or no children.
>
> We can talk about atomic elements as an element that has no children
> but may or may not have content--- but we, of course, are only
> interested in elements with content..
>
> In SGML/XML we have "content" in the value of the element and its attribute
> values:
> <element attribute="attribute value">Element value</element>
>
> Now.. We should be interested in information structures even more
> abstract than SGML/XML (e.g. overlaps etc.) and so things can get
> even more twisted..
>
> How would I ask the question that gives me the answer "Edward Zimmermann"
> below:
>
> <person><name>Edward
> Zimmermann</name><address><street>Leopoldstrasse
> </street><city>Munich</city></address> </person>
>
> In the same name instance? But what if I don't know that the
> "container" is called "name"?
>
> In above are not Zimmermann and Leopoldstrasse contained in the same
> person element instance?
>
> How should I distinguish these two? Zimmermann is contained in name
> and Leopoldstrasse is contained in street, street is a child of
> address and both address and name are children of person..
>
> These are the kind of questions we ask all the time.. right? Give me
> the Zimmermanns on Leopoldstrasse in Munich. Give me cookbooks
> written by Schiller.. Tell me which books were edited by Johann
> Wolfgang Goethe?
>
> Sure if we know the questions ahead of time we can structure (or
> transform) a database with the structure to make such queries easy
> (and flat).. But then the next questions which don't fit that
> structure?
>
> And when we start to talk about heterogeneous data.. that is when
> the user is not even 100% sure of the structure, they can't ask for
> "street" or "person" but they'd like to still ask the question.. By
> talking about anonymous containers.. etc. It's possible..
>
> > contains no other things" when this is *exactly* the reverse of their
> > normal English usage? Is there a perverse desire here to make things
> > (pun intended) as confusing as possible?
>
> Not really.. Think about a shoe shop.. They sell pairs of shoes
> stored in boxes.. and that box of shoes is stored on a shelf and
> that shelf has an address. Is the container the room where these
> shelves are? The shelf? The level in the shelf? Or the shoe box?
>
> In this model.. The elements are the shop, the room, the shelves,
> the specific shelf, the box.. we have a path of elements to define
> an address to find the container which contains a specific pair of
> shoes.. When we talk here of container we mean the box (which
> contains a pair of shoes).. even if the box is contained in the
> shelf, in the room, in the shop, in the house, ...
>
> >
> > Peter Noerr
> >
> > > -----Original Message-----
> > > From: SRU (Search and Retrieve Via URL) Implementors
> > > [mailto:[log in to unmask]] On Behalf Of Edward C. Zimmermann
> > > Sent: Saturday, July 03, 2010 4:07 PM
> > > To: [log in to unmask]
> > > Subject: Re: CQL example for prox/unit=
> > >
> > > Ray, I'm actually quite pleased. I feel almost as if someone has
> > > been listening! There are, however, just a few points still that
> > > need to be voiced..
> > >
> > >
> > > On Sat, 3 Jul 2010 15:40:52 +0100, Mike Taylor wrote
> > > > Surely "container" here means EXACTLY the same thing as "element",
> > > > i.e. whatever you want it to mean?
> > > >
> > >
> > > In SGML containers are called elements. When we want to distinguish
> > > between an element and the containers it contains we can talk about
> > > containers; thus they are, in semantic use, not always the same. See
> > > below.. (my argument is that they have things backwards).
> > >
> > >
> > > On Fri, 2 Jul 2010 17:55:02 -0400, Ray Denenberg wrote
> > > > Peter, that's an example of structured proximity searching and
> > > > that has been abandoned in the OASIS CQL spec, the most recent
> > > > draft of which is at
> > > > http://www.loc.gov/standards/sru/oasis/current/cql.doc
> > > > which I recommend that you look at, because it clarifies (and
> > > > simplifies) proximity.
> > >
> > > On Sat, 3 Jul 2010 09:31:40 -0400, Ray Denenberg wrote
> > > > Just want to add this. The problem isn't so much in saying "find
> > > > A and B in the same element", the problem is when the distance is
> > > > greater than zero as in "find A and B separated by two elements".
> > > > In the OASIS spec, the notion of element is discarded and the
> > > > more abstract notion of "container" is defined, and you can "find
> > > > A and B
> > >
> > > I would argue that the predicate "container" should be used as a
> > > name for a specific kind of element, namely atomic containers, viz.
> > > containers with no sub-containers (indexes). An element here would
> > > be a container when it has no sub-elements. In a typical flat format
> > > such as common to Z applications the fields are elements and containers..
> > >
> > > > in the same container" (with no notion of finding A and B
> > > > separated by, say, two containers). --Ray
> > >
> > > Correct. Since in XML and many other models there is no order one
> > > can't talk about distance. The only distance one can talk about is
> > > bytes according to how the information was marked-up but only if the
> > > information was marked-up.
> > >
> > > The doc: "A container is a structure containing one or more indexes.
> > > For example the server may support a container whose name is
> > > ‘author’ that contains indexes ‘name’ and ‘date’. In that case
> > > the server would support a query (see example) to find an author
> > > with a specific name and date. (This is contrasted with a Boolean
> > > query which may return undesired results because they have multiple
> > > authors, some of which have the desired name but the wrong date and
> > > others the specified date but the wrong name.) The server should
> > > list supported containers in its Explain file, and for each
> > > container, the indexes that it contains."
> > >
> > > Exactly. One has here named sub-paths author/name and author/date but
> > > also the anonymous path "in the same container". I, however, would swap
> > > around the semantics and view "container" as the lowest node and element
> > > as any .. Example:
> > >
> > > <book>
> > > <title>Re: CQL example for prox/unit=</title>
> > > <author>
> > > <name>Edward C. Zimmermann</name>
> > > <date>11 Feb 2000</date>
> > > </author>
> > > <edition>2nd</edition>
> > > <date>3 July 2010</date>
> > > </book>
> > >
> > > The element book contains title, author, edition, date etc.
> > > In the "same container" would however mean exclusively the lowest node
> > > (leaf).
> > > Containers should have no sub-elements.
> > > Edward and Zimmermann are in the same container (book/author/name).
> > > Edward and CQL are not in the same container even if, per your
> > > semantics, the container author has two sub-elements.. whence the same
> > > container would be the same "author" container.
> > >
> > > In my IB engine the semantics for "in the same container" only
> > > applies to the lowest.. e.g. book/author/name instance or
> > > book/author/date instance..
> > > If I want to search for Edward and CQL in the same book then I
> > > search for it by name WITHIN:book or explicitly in path
> > > WITHIN:\book (I use \ for / as / is used to denote fields). That is:
> > > elements (and paths).
> > >
> > > In my above example: 2000, Zimmermann are in the same author.. they
> > > are also within the same book.. CQL and Zimmermann are within the same
> > > book but not within the same author. etc... 2010 and July are in the
> > > same date. July and 2000 are not in the same date but are in the same
> > > book.. etc.
> > >
> > > http://www.ibu.de/RelationalHierarchicalIR
> > > http://www.ibu.de/IB_Query_Fields
> > >
> > > --
> > >
> > > Edward C. Zimmermann, NONMONOTONIC LAB Basis Systeme netzwerk,
> > > Munich Ges. des buergerl. Rechts http://www.nonmonotonic.net
> > > Umsatz-St-ID: DE130492967
>
> --
>
> Edward C. Zimmermann, NONMONOTONIC LAB
> Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
> http://www.nonmonotonic.net
--
Edward C. Zimmermann, NONMONOTONIC LAB
Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
Umsatz-St-ID: DE130492967