In response to Alan's questions:
1.
I do want a single concept of "title" and I encourage the use of qualifiers as in Dublin Core (with its "dumb down" rule), but qualifiers and namespace prefixes are different things!
2.
With respect to your 3 solutions I would like to add a 4th solution. Try to stay in line as much as possible with Dublin Core names (and available application profiles) and allow qualifiers (indicating refinements or encoding schemes) that sometimes - but not always - can be neglected (dumb down). Avoid namespace prefixes as they really are a pain. Use explain to relate index names to attribute sets (optional).
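To make the "dumb down" idea in this 4th solution a bit more concrete, here is a minimal sketch of how a server might resolve a qualified index name. The function name resolve_index, the "name.qualifier" notation and the strict_qualifiers set are assumptions made only for this illustration; nothing here is agreed CQL syntax.

```python
def resolve_index(requested, supported, strict_qualifiers=frozenset()):
    """Return the index name to search, or None if nothing suitable exists.

    requested         -- index name from the query, optionally "name.qualifier"
                         (hypothetical notation, assumed for this sketch)
    supported         -- set of index names the server offers
    strict_qualifiers -- qualifiers that must NOT be neglected
                         (for example an encoding scheme)
    """
    if requested in supported:
        return requested                  # exact match, nothing to dumb down
    base, _, qualifier = requested.partition(".")
    if not qualifier:
        return None                       # unqualified and unknown
    if qualifier in strict_qualifiers:
        return None                       # this qualifier cannot be dropped
    # Dumb down: neglect the qualifier and fall back to the plain name.
    return base if base in supported else None


supported = {"title", "date", "date.created"}
print(resolve_index("title.alternative", supported))   # title
print(resolve_index("date.created", supported))        # date.created
print(resolve_index("date.w3cdtf", supported, {"w3cdtf"}))  # None
```

The point of the sketch is that a qualifier refines a name and can often be neglected, while a namespace prefix would change which "title" is meant.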
3.
I will encourage both usage U1 and U2: 1) sending a query to multiple servers with agreed index names and 2) sending queries to local systems with specific locally known index names. I encourage the use of explain for advertising these local index names.
4.
I do not know the expectation of a user when he searches for "author" and I do not know an unambiguous way to find out. But I do know that the introduction of attribute sets will not solve this problem. Each server has to offer indexes that intuitively best match the index names (with optional qualifiers but without namespace prefixes).
5.
In general we should not mix up namespace prefixes, qualifiers and attribute sets. I always assumed that attribute sets were intended to agree on the minimum set of attributes that servers should support and clients could rely on. I never knew that they were intended to modify the meaning of attributes.
6.
With respect to the second point in your summary:
Local index names should not need Explain but may make use of Explain. In my view it is sufficient when applications can use the information in Explain to present a list of (extra) index names with their meaning. I do not know what the added value of a URI would be for that purpose.
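As an illustration of what I mean by presenting a list of (extra) index names, here is a small sketch. The shape of the Explain data (a plain mapping from index name to human readable description) and the function name index_menu are assumptions for this example only, not the actual Explain record format.

```python
def index_menu(explain_indexes, agreed_names):
    """Build (index name, label) pairs for a search form.

    explain_indexes -- mapping of index name to description, as a client
                       might extract it from a server's Explain information
                       (hypothetical structure)
    agreed_names    -- index names the client already knows by agreement
    """
    menu = []
    for name in sorted(explain_indexes):
        description = explain_indexes[name]
        # Mark names that are only known through this server's Explain.
        marker = "" if name in agreed_names else " (local)"
        menu.append((name, f"{name}{marker}: {description}"))
    return menu


explain_indexes = {
    "title": "Title of the resource",
    "shelfmark": "Local shelf location code",
}
for name, label in index_menu(explain_indexes, {"title", "creator"}):
    print(label)
```

The client does not need to understand "shelfmark"; it only shows the name with its meaning and the user decides.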
7.
With respect to the third point in your summary:
In general it may be good to have a flag indicating the different types of responses or behaviour, but we should only introduce such flags if the problem cannot be solved by flexibility of the client or the server. I mean for example: we should not use flags to prevent the server sending fields that could also easily be ignored by the client.
In your example I prefer to return a tag indicating that an index name does not exist or - when the error message is that important - the client should check Explain first.
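A rough sketch of the behaviour I prefer for a query such as "dc.title:abc or bath.title:abc" is given here. The response layout and the field name unknownIndexes are invented for this illustration and are not part of SRW; the query is simplified to a list of OR-ed (index, term) clauses.

```python
def search(clauses, indexes):
    """Evaluate OR-combined (index name, term) clauses.

    indexes -- mapping of index name to {term: set of record ids}
    Unknown index names contribute zero matches and are reported
    in a tag instead of failing the whole query.
    """
    hits = set()
    unknown = []
    for index_name, term in clauses:
        if index_name not in indexes:
            unknown.append(index_name)    # tag it, do not raise an error
            continue
        hits |= indexes[index_name].get(term, set())
    return {"records": sorted(hits), "unknownIndexes": unknown}


indexes = {"dc.title": {"abc": {1, 2}}}
result = search([("dc.title", "abc"), ("bath.title", "abc")], indexes)
print(result)  # {'records': [1, 2], 'unknownIndexes': ['bath.title']}
```

A client that cares about the unknown index can inspect the tag; a client that does not can simply ignore it.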
Theo
>On Wed, May 15, 2002 at 02:21:38AM +0200, Theo van Veen wrote:
>> First I have to say that I appreciate the work being done on CQL and
>> Explain. Nevertheless I think that we should make use of some new
>> opportunities now we are defining a new query language.
>
>I certainly agree it's worth arguing through issues - it is the best
>way to solve problems early on rather than later.
>
>> First as a reaction on Ralph:
>>
>> >So, if I support both dc.title and bath.title and you send me
>> >unqualified title, what do you expect me to do? It just so happens
>> >that I specified in my explain record that the default index set was
>> >bath, but what if you were expecting it to be dc?
>>
>> What should the client do in this case? Explain to the user that
>> there are different sorts of titles? Or just make an arbitrary choice
>> for the user?
>
>As I understand it, you (Theo) want a single concept of "title" in CQL.
>(Please correct me if I am wrong!) If there is a single concept of
>"title", then you don't need qualifiers.
>
>The problem that others (including me) have expressed is that there is
>not a single definition of "title". In Dublin Core there are defined
>semantics for "title" (the name of a book etc). However, in another
>application "title" might mean "Mr/Mrs/Ms/Dr/Sir" etc. Using prefixes
>is therefore proposed to qualify "title" (such as "dc.title") to
>disambiguate the meaning of "title".
>
>There *are* multiple solutions to the problem (and using a prefix is
>only one of them).
>
>(1) Come up with a global namespace for *all* concepts (without prefixes)
> and the first person to come up with a meaning of "title" gets to use
> that name, and any later meaning that comes along needs to use a
> different name (e.g. "formal_title").
>
>(2) Use explain on a server to work out what a particular server means
> by "title". That is, don't use the name "title" to work out the meaning.
> Instead have some other way of identifying the concept (such as a URI)
> then use explain to work out which index name a particular server
> uses for that concept (so I would look for "http://dc.org/title..."
> and find it was mapped to "title", but "http://human-name.org/title"
> was mapped to "formal_title").
>
>(3) Introduce a prefix so a prefix is allocated to a semantic area where
> names must be unique in that area. Such as "dc" for Dublin Core.
> This could be viewed as a variation of (1) above - the names have just
> got longer "dc.title" instead of just "title". But there is a formalism
> to it - you must be allocated a unique prefix, then your group can
> define names under that group. (This is in effect what XML namespaces
> do by the way - but we are using a short prefix instead of a long
> URI to identify the namespace).
>
>There are lots more variations I am sure.
>
>Now there are also different usages of CQL.
>
>(U1) A person has a single server they talk to all the time, and want to
> express queries using the full capabilities of that server.
>
>(U2) A person wants to write a single query and send it to multiple servers.
>
>I want to support both nicely.
>
>Theo, question 1: do you think I have captured the different alternatives
>correctly (with no comment on which is best - I just want to make sure I
>am understanding the conceptual model that you want, and the models that
>you do not want).
>
>Of the above, I am against proposal (2) because I want to write a
>single query that has a chance to work against multiple servers. If I
>have to use explain, then I have to rewrite the CQL query per server.
>
>I dislike (1) (a global namespace) because that is against the trend
>of what Dublin Core etc are doing. I think it's important to be able
>to segregate the namespace of indexes.
>
>However, to support usage U1, using qualifiers all the time *is* a pain.
>I like being able to define local index names and not have to define
>a public table and register a prefix etc. These names are frequently
>not intended for cross collection searching. So I like the mix of
>being able to define indexes with standard prefix names (with standard
>semantics) and local unqualified names for which I can define my own
>semantics as best suits the database I am building.
>
>> dc is defined for description and not for searching.
>
>I am sorry, I don't follow your point here. I would have thought that
>describing/categorizing data is directly relevant to searching.
>
>> But if it is supposed that a user will have a general
>> understanding that dc.author means author, because he has an
>> understanding of author, the prefix is not relevant and even
>> misleading.
>
>I think what people (including me) are saying is that "author" is
>ambiguous unless you come up with a single definition of what "author"
>means. Going back to the "title" example above, I think it's clear there
>is not an intuitive single definition of what "title" means to all
>people. It would be a matter of specifying for CQL what "title" or "author"
>means. So I disagree with the assertion that a simple index name
>such as "title" or "author" is a clear definition of what the semantics
>of the index are. I think the Dublin Core activities have demonstrated
>this well. They started with 15 core elements, but soon realised that
>life is not that simple, and the simple names they first came up with
>were not enough. So they introduced "qualified Dublin Core" with more
>names.
>
>> > > In my point of view not supporting Ralph's premises means
>> > > not supporting prefixes. Or did I misunderstand previous
>> > > discussions and
>> > > is everyone already on this track?
>> > Yes, I think you misunderstood. I believe the consensus is
>> > this:
>> > 1.There will be some well-known prefixes, e.g., bath and dc, and
>> > you won't have to use Explain to discover a server-specific
>> > definition for these.
>>
>> In this case a client has to know the prefix exactly. Searching for
>> "dc.title:abc or bath.title:abc" will return an error message if one of
>> both is not supported.
>
>Exactly. I think it's better to report an error if a query has specified
>something that a server does not know than return an incorrect result
>because the server has misinterpreted the query due to different semantics.
>
>> > 2.A server is free to define server-specific prefixes (as
>> > long as they don't clash with the well-known prefixes) and you
>> > might have to use explain to discover those.
>>
>> In distributed searching I do not think any client will search for
>> prefixes or indexes that it doesn't know.
>
>Of course. If it does not know the prefix, it can't use it by definition!
>But Explain gives a mechanism of learning about prefixes and indexes
>the client did not know before. The simplest illustration is a client
>that does an Explain query on a server then displays all returned values
>to the user in a drop down list. Each index name has a human readable
>description along with it. The client application does not "understand"
>the different index names in this situation - the human does though.
>
>> > 3.You can send an index name
>> > without a prefix, but in that case the server applies the default
>> > prefix, and you'll need to use explain to find out what that is for
>> > a given server (there won't be any global-default).
>>
>> This is all I want: reasonable defaults. But I am not able to write
>> clients that are intelligent enough to find out whether the server's
>> default corresponds to the user's expectations.
>
>Ahhhhh! Does this mean then that you are not opposed to prefixes, but
>rather all you want to ensure is that a database can be defined without
>them. That is not all index names *must* be qualified? I certainly
>agree with this. I think a database should be able to support a set
>of qualified index names (with standard prefixes) AND a set of unqualified
>names.
>
>Is the challenge therefore in your eyes working out what these unqualified
>names mean? (Eg: does "title" mean title of a book versus Mr/Mrs/Dr etc).
>Is a human readable description enough? Or a URI? Or the Z39.50 attribute
>list it binds on to? Or put another way, what unambiguous way can you
>think of that defines what a user expectation is? This is an important
>question to answer.
>
>> > 4.Distributed searching is theoretically possible, but all indexes
>> > should have well-known prefixes. (Or, you could send non-
>> > prefixed indexes to different servers but you cannot assume
>> > that they mean the same thing to different servers.)
>> > --Ray
>>
>> What (default) prefixes should be used in distributed searching?
>
>I think Ray's point above is that if you want to write a query and
>have it sent to multiple servers and guarantee those servers use
>the same meaning as you intend, that there is no default prefix that
>can be used.
>
>If a database can support both qualified (formal, standard definitions)
>and unqualified (locally defined) index names, a distributed query using
>only unqualified names can still work *if* the query is being sent off
>to multiple servers that are known to support the same locally defined
>names. I think the argument is that in the case where you want to send
>a query off to lots of servers where they do not share the same locally
>defined names (because they are locally defined), then using prefixes
>avoids a server misinterpreting a query.
>
>> Ralph will return an error message if I try "dc.title:abc or
>> bath.title:abc".
>
>Me too. I would never write a query using both though. I would write
>a query using only one of them.
>
>But an alternative here is to add a flag when a CQL query is submitted
>saying "report error on unknown index names" versus "ignore unknown
>index names". (By ignore, I mean return zero matches for that term - sort
>of like NULL in relational databases.) I can see the merit in this.
>Or even introduce a new symbol or something in CQL indicating for
>a index name the behaviour to take (zero matches or error) so the
>person writing the query has control - but I think a boolean flag
>being sent along with the query is better.
>
>> I have the strong feeling that we are currently on the wrong track.
>> We are mixing up Z39.50 attribute sets with dc name spaces, while
>> the solution is quite simple: use user understandable names for
>> search indexes. It is possible in Dublin Core for description, why is it
>> not possible in CQL for searching?
>
>Dublin Core only gives one semantics of "title". I think if you asked
>them Dublin Core would agree that their semantics *is not* the only
>definition, or even the best. It's just a definition they have agreed
>with. This is why they use XML namespaces to qualify their elements
>in XML encodings. They do not, for example, claim their interpretation
>of "title" is the best one so their one does not need qualification.
>
>So I think we should support qualifiers in part *because* Dublin Core
>do it too.
>
>> The abstract Z39.50 attributes were useful in case of MARC
>> descriptions, but in line with Dublin Core I think we should map the
>> Z39.50 search attributes to user understandable names instead of
>> sticking to the attributes.
>
>I think we are all in agreement here. We want textual names in queries.
>The question regards to unambiguous agreement to what a textual name means.
>
>> Theo
>
>I think there are some very interesting issues to have come out of this.
>In summary:
>
>* I think a database should support both prefix qualified index names
> (with globally defined and agreed to semantics) and unqualified
> index names (locally defined semantics).
>
>* For a locally defined index name, how to unambiguously define its
> semantics? Human description? URI? Z39.50 attribute list?
> ZeeRex records I think would allow a human description and an
> attribute list.
>
>* Should SRW have a flag to be sent in a query to define the behaviour
> for unknown index names? (Ignore versus report error versus server
> can do whatever it feels like etc.) I can see the logic in this for
> distributed queries. If SRW picks a single semantic however, I think
> it should be to report an error.
>
>Alan
>
>--
>Alan Kent (mailto:[log in to unmask], http://www.mds.rmit.edu.au/~ajk/)
>Project: TeraText Technical Director, InQuirion Pty Ltd (www.inquirion.com)
>Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
>Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC Australia.
>Phone: +61 3 9925 4114 Reception: +61 3 9925 4099 Fax: +61 3 9925 4098
>
>>