From: "Robert Sanderson" <[log in to unmask]>
> I might have a large database of full text documents. They're kept as
> PDFs, but as I can't return PDF via SRW, I can only expose dublin core
> records automatically generated from the documents. However, as I can
> index the full text, you can search it.
I think we're overcomplicating. The original suggestion was simply this: if
you search on dc.title, and if you return <dc:title>, then they should have
the same semantics. If all you can serve is pdf, then you cannot serve
<dc:title> and the point is moot.
If you search on dc.title and in the response there is a (hypothetical)
element <bib1.title>, there is nothing in the suggestion above that covers
this case; the example is out-of-scope of the discussion. Nobody is
suggesting (at least I'm not) that you cannot return an element
<bib1.title>, or that you have to return an element dc.title, even though
you searched on dc.title.
But if you return <dc:title> the client should assume the semantics to be:
"The name given to the resource, usually by the creator or publisher." If
you return <bib1.title> (again, pardon the hypothetical example) the
semantics should be "A word, phrase, character, or group of characters,
normally appearing in an item, that names the item or the work contained in
it."
Similarly if a server receives a search on dc.title it should assume that
the client meant "The name given to the resource, usually by the creator or
publisher." If a server receives a search on bib1.title it should assume "A
word, phrase, character, or group of characters, normally appearing in an
item, that names the item or the work contained in it."
(Please don't interpret my example as a suggestion that we should adopt
bib-1 semantics for SRW. It's only an example.)
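
To make the rule concrete, here is a minimal sketch in Python. The table and
function names are hypothetical, invented only for illustration, and
bib1.title remains just an example. The point it shows is that a qualified
name carries one definition, whether it appears as a search index or as an
element in a returned record.

```python
# Hypothetical table: one definition per qualified name, shared by the
# search index and the returned element of that name.
SEMANTICS = {
    "dc.title": (
        "The name given to the resource, usually by the creator or "
        "publisher."
    ),
    "bib1.title": (
        "A word, phrase, character, or group of characters, normally "
        "appearing in an item, that names the item or the work contained "
        "in it."
    ),
}


def index_semantics(index_name):
    """What a server should assume a client meant by searching this index."""
    return SEMANTICS[index_name]


def element_semantics(element_name):
    """What a client should assume about an element in a returned record.

    Accepts the XML prefix form (dc:title) and normalizes it to the dotted
    form used for index names (dc.title).
    """
    return SEMANTICS[element_name.replace(":", ".", 1)]
```

Nothing in the sketch requires a server to return dc:title just because the
search was on dc.title. It only says that if both appear, they mean the same
thing.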
From: "LeVan,Ralph" <[log in to unmask]>
> Right now, SRW (and Z39.50) servers support the simplifying assumption that
> complex title searching can be represented as DC.title searching. If this
> simplification is confusing, then we should simply stop doing it.
The simplification is confusing and we should stop doing it, for SRW.
From: "Eliot Christian" <[log in to unmask]>
> It is my belief that the root problem with DC elements is that they try
> at once to be abstract (as indexes) and concrete (as record schema
> elements).
Yes, and DC is a special case. There isn't another case where there is a
presumed (or an attempted presumption of a) tight relationship between a set
of access points and corresponding retrieval elements with the same names.
The Dublin Core community has never been willing to talk about the
relationship between a dc element in a record and that same element as an
access point for searching. So we've taken on that responsibility. By "we" I
mean the Z39.50 community first, and now SRW. We didn't do a good job of
this in Z39.50. Please let's get it right this time.
> Obviously, it is useful to have a small set of named concepts (title,
> author, subject, date...) as abstract search access points to be inherited
> into other context sets. I don't see why we should expect the DC set to
> serve that purpose.
I hope I understand you correctly to be saying that dc should *not* take on
that role, and I strongly agree. That was the mistake we made with Z39.50.
Hence my suggestion for a utility set (and not one similar to the Z39.50
utility set).
--Ray