> Date: Mon, 13 Dec 2004 14:45:40 +0100
> From: Hedzer Westra <[log in to unmask]>
>
> Then, is there any difference between
> a. idx = term1, and
> b. idx =/cql.string term1
> where term1 only contains 1 word

Yes, absolutely!  The former would find records where idx has the
value "term0 term1 term2", but the latter wouldn't.

Or did you mean to ask about the situation where the _record_ only
contains one word (in the field that's indexed by idx)?  If so, then,
roughly, no: there's no difference.  Although one could imagine
servers that treat them differently if, for example, the one-word
field has whitespace at the start or the end, or funny punctuation, or
something like that.

> [Is there any difference between]
> a. idx = term2, and
> b. idx =/cql.word term2
> where term2 contains multiple words.

None at all: recall that the "word" structure is the default for the
relation "=", so specifying "/cql.word" on this relation is redundant.

The same applies to "exact/cql.string": the modifier there is also
redundant, and for the same reason.

To say the same thing another way:
        "=" means "=/cql.word"
        "exact" means "exact/cql.string"
which means that you could (if you were perverse) use:
        "exact/cql.word" when you mean "="
        "=/cql.string" when you mean "exact"

(Actually, the latter of these is not particularly perverse.  Another
way to think about this is that CQL has only one core equality
relation, "=", which does word matching, but also provides "exact" as
a convenient shorthand for "=/cql.string".)

And remember: all of this is entirely to do with the interpretation of
the term _structure_.  It's orthogonal to the issue of whether pattern
matching is done (and, if so, what kind).

>> A cql.string is an opaque set of characters that the server should
>> not try to interpret.
>
> Does this *only* refer to word separation, and nothing else?

Yes.

> The context set currently defines five 'data types' (word, string,
> number, isoDate, uri). Should all terms be assigned exactly one of
> those?

Hoo, that's a tricky one!  I offer the following "answer" for
discussion, not as a definitive statement:

I think the way to think about this is that "string" and "word"
structures are fundamentally different from each other in that the
former should not be broken into words, and the latter should.  The
others seem to me to be either subtypes of these two, or orthogonal to
this key dichotomy.  More likely the latter: one can imagine
situations where you'd want to search only for an exact, complete URI,
and others where you want to do keyword searching on the URI (e.g. to
discover all the URIs from a specified domain).

> Is there a distinction between terms that are *not* assigned any
> type (either in the search query or by the server), and terms that
> are typed 'string' (except for multi-word '=' searches without any
> modifiers?)

"=" induces the "word" structure (unless overridden by an explicit
relation modifier).  Similarly, "exact" induces the "string" structure
(unless overridden by an explicit relation modifier).

A better question would be this: what structure should "<" and the
other inequality relations induce on their terms?

>>> Too bad there isn't a separate spec for sorting on context set
>>> indexes.
>>
>> That way, Z39.50 lies :-)
>
> You mean SRW being Z39.50 all over - something like a bulky,
> difficult to implement protocol?

Well, I reject that description of Z39.50.  But what I meant was that
one of the ways in which Z39.50 is perceived to have failed is in
allowing lots of different ways to express things.  SRW and CQL try on
the whole to give you just One True Way.  At present, for specifying
sort keys, that's XPath; but for whatever it may be worth, I share
your disquiet about that choice.  (It's stupid that you can find
records matching "author=lewis" without needing to know where the
"author" field is in the XML records, but you can't then sort that set
on title without knowing where the "title" field is.)
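
For what it's worth, the default-structure rule discussed earlier in
this message ("=" implies word, "exact" implies string, and an explicit
relation modifier overrides either) can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: the
names DEFAULT_STRUCTURE, effective_structure and matches are made up
for the example, and treating a multi-word "word" term as an adjacent,
in-order run of words is an assumption of the sketch, not something the
CQL documents or any particular server are claimed to do.

```python
# Hypothetical sketch: how a relation and its modifiers pick a term structure.
DEFAULT_STRUCTURE = {"=": "word", "exact": "string"}


def effective_structure(relation, modifiers=()):
    """Return "word" or "string" for a relation plus optional modifiers."""
    for modifier in modifiers:
        if modifier in ("cql.word", "cql.string"):
            # An explicit structure modifier overrides the relation's default.
            return modifier.split(".", 1)[1]
    return DEFAULT_STRUCTURE[relation]


def matches(field_value, relation, term, modifiers=()):
    """Toy matcher; no pattern matching, case folding or punctuation handling."""
    if effective_structure(relation, modifiers) == "string":
        # String structure: the term is opaque and compared to the whole field.
        return field_value == term
    # Word structure (assumption): the term's words must occur in the field
    # as an adjacent, in-order run.
    field_words = field_value.split()
    term_words = term.split()
    n = len(term_words)
    return any(
        field_words[i:i + n] == term_words
        for i in range(len(field_words) - n + 1)
    )
```

With this sketch, matches("term0 term1 term2", "=", "term1") is true,
while the same call with modifiers=("cql.string",) is false, which is
the difference described for the first question above.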

>> To expand upon Mike's typical one-liner, the problem is that then
>> you have to include the entire search clause
>
> What do you mean by that? I hardly know anything 'bout Z39.50, maybe
> that doesn't help here..

Then just forget it -- really.  That comment was really just a
throwaway for other Z39.50 holdovers such as myself.  If you're new to
this stuff, do yourself a favour and just think about SRW and CQL.

> > (or attribute combination for Z) and the only thing that you can
> > search by are indexes, rather than relatively arbitrary data. I'm
> > not (personally) averse to reworking the sort definition for 1.2,
> > so if you have any concrete ideas, put them forwards :)
>
> The only thing that comes to my mind right now is starting the
> sortXPath with an escaping character (preferably an XPath-illegal
> char, making the distinction clear) and then follow with an index
> name.

What you're trying to do here makes sense, but the _way_ you're
suggesting seems unnecessarily hacky.  I think we can do better.  I
have a thought on this, but will float it in a separate message so as
to avoid thread-congestion.

>>> I retrieved the msg from the archive and got CQLJava which
>>> contained a set of XSLs which turn IE into a SRU browser.
>>
>> CQL-Java contains that? Really? I would actually like to find
>> these XSLTs. Where did you get them?
>
> I retrieved a ZIP from the OCLC website (don't know the location
> anymore) with a lot of JAR and Java files, and some XSLs in the
> basedir.

Ha!  Ralph, is OCLC distributing a derived work of CQL-Java now?
(It's perfectly entitled to do so, of course, but it would have been
nice to know.)

> See the attachment for my updated XSLs.

Thanks!

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  <[log in to unmask]>  http://www.miketaylor.org.uk
)_v__/\  "``user-friendliness'' is over-rated.  If you're not willing
         to learn anything new, you can never use the computer to its
         full potential" -- attributed to Douglas Engelbart, inventor
         of the GUI.

--
Listen to free demos of soundtrack music for film, TV and radio
        http://www.pipedreaming.org.uk/soundtrack/