## [email protected]

Subject:

cql relation proposal

From:

Z39.50 Next-Generation Initiative

Date:

Mon, 30 Sep 2002 13:58:48 -0400

```
I propose the following for the cql relation. Having read everyone's
position, I know this won't completely satisfy anyone, but it seems a
reasonable compromise to me.

dc.title matches (string)            means exact match
dc.title = (word1 word2 word3)       means adjacent words
dc.title ~ (word1 word2 word3)       means similar words
dc.title * (word1 word2 word3)       means all words
dc.title + (word1 word2 word3)       means any words
dc.title stem (word1 word2 word3)    means stem/any words
dc.title fuzzy (word1 word2 word3)   means fuzzy/any words

Points

1. There's no reason to say, e.g., "dc.title =~ (word1 word2)" when
   "dc.title ~ (word1 word2)" will do just fine. That should solve the
   problem of what ">~" means.

2. Further on the equal sign: the current cql definition says:

     relation ::= numeric-relation|"fuzzy"| "stem"|"relevance"
     numeric-relation::="<"|">"|"<="|">="|"="|"<>"

   By this definition "=" is a numeric relation. I strongly feel that
   it should either be a numeric relation or be used for strings/words,
   but not both. Above it's the latter, which would mean taking "=" out
   of the mathematical relation list. Or can someone give an example
   where we need mathematical equality? If so, then we should come up
   with an alternative symbol to "=" for word adjacency.

3. Implicit in this proposal is that we don't have separate (abstract)
   word and string index names, i.e. just dc.title, not dc.titleWord
   and dc.titleString. This would apply to all dc elements. Bath would
   remain as defined.

4. For stem and fuzzy, I don't know whether it should be any or all
   (pick one). If it's any, then if you want to do all you have to
   construct booleans.

5. Robert Sanderson wrote:

   > Designing CQL such that it maps onto attributes
   > rather than being easy to
   > understand and construct is counter productive
   > IMO.

   I see no reason why we can't do both, giving priority to the latter,
   and modifying AA if we find it necessary. For example, I don't
   believe we need to tokenize an anchor (i.e. take it out of the term)
   to be able to cite how it maps to Z39.50. We simply need to have
   rules for turning a cql query into a Z39.50 query. One of those
   rules could be "if there is a left-anchor character at the beginning
   of the field, turn that into first-in-field".

--Ray
```
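As a rough illustration of the proposal, the relation table and the left-anchor rule from point 5 could be sketched as a small lookup plus one rewrite rule. This is only a sketch under assumptions that are not in the message: the clause shape `index relation (terms)`, the `parse_clause` function, the `^` anchor character, and the `any-position` label are all hypothetical. The message names only `first-in-field` and does not say which character marks an anchor.

```python
import re

# Relation symbols and their meanings, copied from the proposal above.
RELATIONS = {
    "matches": "exact match",
    "=": "adjacent words",
    "~": "similar words",
    "*": "all words",
    "+": "any words",
    "stem": "stem/any words",
    "fuzzy": "fuzzy/any words",
}

# Assumed clause shape: index name, relation, parenthesised term list.
CLAUSE = re.compile(r"^\s*(\S+)\s+(\S+)\s+\((.*)\)\s*$")

# Assumption: the proposal does not name the left-anchor character.
ANCHOR = "^"


def parse_clause(text):
    """Split one clause and apply the left-anchor rule from point 5."""
    m = CLAUSE.match(text)
    if m is None:
        raise ValueError("not of the form: index relation (terms)")
    index, relation, body = m.groups()
    if relation not in RELATIONS:
        # Compound forms such as "=~" are rejected, as point 1 suggests.
        raise ValueError("unknown relation: " + relation)
    body = body.strip()
    anchored = body.startswith(ANCHOR)
    if anchored:
        # The anchor becomes a position hint instead of staying in the term.
        body = body[len(ANCHOR):]
    return {
        "index": index,
        "relation": relation,
        "meaning": RELATIONS[relation],
        "terms": body.split(),
        "position": "first-in-field" if anchored else "any-position",
    }
```

For example, `parse_clause("dc.title * (^gone with wind)")` would report the relation as "all words" with position `first-in-field`, while `dc.title =~ (word1 word2)` is rejected because only the single-symbol relations are in the table.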