> Date: Thu, 29 May 2003 15:39:49 -0400
> From: Ray Denenberg
> I don't have a capsule summary yet. Is this limited to a linear
> range search, as Ralph was inquiring about, or are we looking for a
> more general solution, that might require us to collaborate with our
> Geo friends? And for the latter, is it a bounding polygon problem
> or something more complicated? I don't think Rob's
> Great-Lakes/(partial)Enclosing-Countries example can be solved by
> bounding polygons.
I don't think we necessarily need to propose an all-purpose GEO
solution at this stage, but we do need to make sure that whatever we
do propose is not incompatible with the kinds of generalisations that
these people will need.
A simple (to describe) but potentially awkward (to implement) proposal
might go as follows:
    We introduce a new relation, "within", into CQL, to be used
    only with appropriate indexes. The search-term used with the
    relation indicates a range in n dimensions -- e.g. a linear
    range (perhaps between two dates), an area (perhaps bounded by
    latitude and longitude values, or by points specifying the
    outline of an arbitrary 2d polygon) or a volume. The
    interpretation of the search-term is dependent on the
    access-point.
The awkwardness of implementation here arises solely from the last
sentence, which also makes me nervous on CS grounds. It seems
fundamentally wrong to me (as well as imposing an unreasonable burden
on attribute-set developers) that the access-point should determine
the interpretation of the term. Consider the searches:
    foo.numericValue within "24 29"
    foo.geographicalPoint within "22n,78e 24n,82e"
A CQL-to-Type-1 converter would need to do fundamentally different
things with the RHS dependent on what index is used. Worse, the
people defining the "foo" index-set are required to define syntax for
So what's the alternative? I can think of two. One is that we define
a rigorous grammar for range-search terms, and the terms
themselves then make clear how many specifiers they contain (2, 3,
29), what kind of thing each is (a real number, a date, a point in
3-space) and what the relationship is between them (search along a
straight line between them, search in the n-space quadrilateral
defined by them, etc.). I think we can all agree that this approach is
frighteningly complex and very unlikely to reduce to something we can
live with in the simple and common cases such as date-range searching.
Plus the parser for search terms would quickly come to rival in
complexity that of CQL itself!
It seems to me that a better approach would be to specify most of this
information in relation modifiers -- a concept that we already have
and which fits very neatly, not least because we can apply several
such modifiers to each relation. So:
    foo.date within/linear/date "1968-03-12 1998-03-18"
    foo.age within/linear/integer "5 33"
    foo.coords within/rectangle/point "22n,78e 24n,82e"
    foo.coords within/polygon/point "10,5 12,7 14,3 13,7 9,8"
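To show why this keeps the converter simple, here is a rough Python sketch of how a term could be interpreted from the relation modifiers alone, without ever consulting the index. Everything in it is an assumption made for illustration: the names parse_point, SUBTERM_PARSERS and parse_within are hypothetical, the modifiers are assumed to arrive in the order shape/type, and the handling of the n/s/e/w hemisphere letters is just one guess at a point syntax that would still need to be defined.

```python
from datetime import date


def parse_point(text):
    """Parse "22n,78e" or "10,5" into a tuple of floats.

    Assumption: a trailing s or w makes the coordinate negative;
    the proposal itself does not define a point syntax.
    """
    coords = []
    for part in text.split(","):
        sign = 1.0
        if part[-1].lower() in "nsew":
            if part[-1].lower() in "sw":
                sign = -1.0
            part = part[:-1]
        coords.append(sign * float(part))
    return tuple(coords)


# One parser per sub-term type named in a relation modifier.
SUBTERM_PARSERS = {
    "date": date.fromisoformat,
    "integer": int,
    "point": parse_point,
}


def parse_within(relation, term):
    """Interpret a term using only the relation and its modifiers."""
    name, *modifiers = relation.split("/")
    if name != "within":
        raise ValueError("not a 'within' relation: " + relation)
    if len(modifiers) != 2:
        raise ValueError("this sketch expects exactly two modifiers")
    shape, kind = modifiers
    values = [SUBTERM_PARSERS[kind](sub) for sub in term.split()]
    # A line or a rectangle is given by two sub-terms, a polygon by 3+.
    if shape in ("linear", "rectangle") and len(values) != 2:
        raise ValueError(shape + " needs exactly two sub-terms")
    if shape == "polygon" and len(values) < 3:
        raise ValueError("polygon needs at least three points")
    return shape, values
```

The point is that parse_within takes no index argument at all: the same code handles foo.date and foo.coords, because the modifiers carry everything needed to read the right-hand side.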
And to make life more pleasant, we'd want to define the default
semantics, when no relation modifiers override them, as "linear" and
"the sub-terms are either ISO-format dates or integers". Which of
course means that the cases we really want to work will do so with
    foo.date within "1968-03-12 1998-03-18"
    foo.age within "5 33"
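As a sanity check on those defaults, here is a small Python sketch of the unmodified case: each sub-term is tried as an integer first and as an ISO-format date second, and the range is treated as linear. The function names are hypothetical, and two details are my own assumptions rather than part of the proposal: that both endpoints are included in the range, and that mixing a date with an integer is an error.

```python
from datetime import date


def parse_default_subterm(text):
    """Read a sub-term as an integer, falling back to an ISO date."""
    try:
        return int(text)
    except ValueError:
        # Raises ValueError again if the text is not an ISO date either.
        return date.fromisoformat(text)


def parse_default_within(term):
    """Parse an unmodified 'within' term into a (low, high) pair."""
    low, high = (parse_default_subterm(sub) for sub in term.split())
    if type(low) is not type(high):
        raise ValueError("sub-terms must both be dates or both integers")
    return low, high


def matches(value, term):
    """Assumption: the linear range includes both of its endpoints."""
    low, high = parse_default_within(term)
    return low <= value <= high
```

With this, matches(20, "5 33") is true and matches(40, "5 33") is false, and the date case works the same way with date objects as the values.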
How does that look?
/o ) \/ Mike Taylor http://www.miketaylor.org.uk
)_v__/\ "I was on an [email] list with Tom Clancy once. Mr. Clancy's
contribution to the list was, 'Write the damn book'." --