> I tried using the index "dc.title" with record schema "dc".
> But, I get disconcerting results--most of the retrieved
> records do not have my search term value in the XML element
> called "dc.title". On the surface, this looks badly broken.
When I was playing (remotely) with the new Aleph server at the BL, I did a
search for title all "xml language" and got back a record that had neither
word anywhere, let alone in the title.
The peril of data transformation is that information gets lost or
translated badly.
> Larry Dixson explained to me:
> "Our title index is pretty broad. Below are the MARC
> ...
> I guess that a "correct" solution here might be for LoC
> to concatenate the values of all fields searched when
> constructing the value of the XML element "dc.title".
> Of course, that would be quite an ugly record and not
> really following the spirit of the "dc" record schema.
IMO, if those fields are all titles, then they could be included in the DC
record as such. No reason there can't be multiple titles per record.
If you had searched for 'dc.title any fish' with a schema of MarcXML and
gotten back the full record rather than the dumbed-down DC version, it
would have been more obvious why that particular record matched. The real
issue, though, is that if you map a field to a dc index, then you should
also map it to the corresponding dc schema element, as the two have the
same semantics.
Should this be a recommendation somewhere? If only in the FAQ?
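
To make that concrete, here is a rough Python sketch of the idea: one
mapping drives both the index and the DC record, so every field that can
produce a dc.title hit also shows up as a dc:title element. The tag list
and the helper names (TITLE_INDEX_TAGS, dc_titles, matches_title) are made
up for illustration; this is not the LoC's actual field list, and a real
server would build its records differently.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical list of MARC tags feeding the dc.title index.
# Illustrative only; the real list is whatever the server indexes.
TITLE_INDEX_TAGS = ("245", "246", "740")


def dc_titles(marc_fields):
    """Build a record with one dc:title per field that the title index covers.

    marc_fields is a list of (tag, value) pairs, a simplification of MARC.
    """
    record = ET.Element("record")
    for tag, value in marc_fields:
        if tag in TITLE_INDEX_TAGS:
            title = ET.SubElement(record, "{%s}title" % DC_NS)
            title.text = value
    return record


def matches_title(record, term):
    """True if the term appears in any dc:title element of the record."""
    titles = record.findall("{%s}title" % DC_NS)
    return any(term.lower() in (t.text or "").lower() for t in titles)
```

With fields like ("245", "Introduction to cataloguing") and
("246", "Fish and chips"), the record carries two dc:title elements, and a
search term that matched via the 246 field is visible in the DC output.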
> My feeling is that the problem lies more with the abstract
> access point. Given the mapping that Larry has for the
> LoC MaRC fields, perhaps the index ought to be named
> simply "title" rather than "dc.title". That way, there
> is no expectation that the abstract search index should
> be an exact match to the record schema element.
> In conversation with Sebastian, it turns out that the
> IndexData search actually _does_ behave that way: one
> can search on "title" as though that index name were
> a CQL built-in index. Very Cool!
I assume that it follows the CQL rule that if the index has no context set
prefix, the default context set is used (which I then assume is dc). So
while the search looks slightly different, the semantics are still that
it's searching dc.title.
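
In other words, something like the following sketch, where resolve_index
is a made-up helper and "dc" as the default is my assumption about this
particular server rather than anything I have confirmed:

```python
def resolve_index(index, default_set="dc"):
    """Split a CQL index name into (context set, index name).

    An index with no prefix falls back to the default context set.
    The "dc" default is an assumption; a server declares its own.
    """
    prefix, sep, name = index.partition(".")
    if not sep:
        # No context set prefix: use the default.
        return (default_set, index)
    return (prefix, name)


# Both spellings resolve to the same index under this assumption.
assert resolve_index("title") == resolve_index("dc.title")
```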
Rob
,'/:. Dr Robert Sanderson
,'-/::::. http://www.o-r-g.org/~azaroth/
,'--/::(@)::. Special Collections and Archives, extension 3142
,'---/::::::::::. University of Liverpool
____/:::::::::::::. L5R Shop: http://www.cardsnotwords.com/
I L L U M I N A T I