> >It's not just EAD though.  There's no restriction in SRW that says you can
> >only use it for short metadata records... Imagine someone with (say) full
> >text journal articles.  Or full texts of entire books.  Or SVG even?
> >Search through complex structures and return instances of circles. (or
> >whatever SVG primitives are like)
> But is this a real application? Would someone really put up an SRW
> server which when I search for Shakespeare would return the full-text of
> books which mention Shakespeare? As a user I would expect the server to

Amazon are working on providing full text of their books. I would expect
that eventually they'll want to have it searchable.   But your argument is
also in favour of my position... you likely /don't/ want to return the
full text of the book, you want to return a chapter or a paragraph or ...
meaning you want to be able to extract these from the record.

Here's another full text example:

Search for medieval manuscripts that have X, Y and Z in the rubrics.  That
means that we need to have the entire manuscript's text in one 'record'.
But who knows what metadata or text the application wants to display.
Some might want the location it was written, others might want the incipit
or explicit, others might want the full text of a chapter, of a folio or
of a quire.  This is certainly a real application, as I do it now with
MSS using Z39.50.
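
To make that concrete, here is a minimal sketch in Python, using only the
standard library's xml.etree.ElementTree (which supports a subset of XPath).
The element names and the record itself are invented for illustration; they
are not from any real schema or from my actual MSS data.  The point is only
that one record can serve several applications, each supplying its own path.

```python
import xml.etree.ElementTree as ET

# Hypothetical manuscript record; all element names are assumptions.
RECORD = """<manuscript>
  <origin>Canterbury</origin>
  <quire n="1">
    <folio n="1r">
      <rubric>Incipit liber primus</rubric>
      <incipit>In principio</incipit>
      <text>First folio text.</text>
    </folio>
    <folio n="1v">
      <rubric>De secundo</rubric>
      <text>Second folio text.</text>
    </folio>
  </quire>
</manuscript>"""


def extract(record_xml, xpath):
    """Return the text of every element in the record that matches xpath."""
    root = ET.fromstring(record_xml)
    return ["".join(el.itertext()).strip() for el in root.findall(xpath)]


# Three applications, one record, three different extracts.
origin = extract(RECORD, "./origin")
rubrics = extract(RECORD, ".//rubric")
folio_1v = extract(RECORD, ".//folio[@n='1v']/text")
```

The record is searched as a whole (so all the rubrics are in one place), but
each client gets back only the piece it asked for.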

Here's another one:  X3D.  Search the 3d models for (whatever) and return
only the section that matches xpath (xpath).  Perhaps you want to display
column capitals from Egyptian temples (which people do want to do now, as
there was a paper about it at JCDL) but your models are of entire temples
(which they were in the paper IIRC).

> >you just use
> >XPath at the server... the server that is supposed to know its own data.
>  But in the example you quote the data I really need to use some
> free-text searching in the XPath specification i.e. the XPath might be

That's one use if you want to find the paragraph that matched your term
(which may or may not exist, given stemming etc, so I don't think this is
a very sensible use of XPath anyway).  But there are a bazillion other uses
for it, a very few of which I've given as examples to date.

Another solution would be to limit the searching via XPath (or other), e.g.
treat a /paragraph as an individual record.  Then you could return only
the matching /paragraphs.  But that implies a search schema as opposed to
a retrieval schema, which is a concept that we don't have in SRW.  But I

> >That's another question, certainly, but the current one remains and
> >applies to non EAD-like structures as well.  You wouldn't expect a book to
> >be broken down arbitrarily into paragraphs just to make it easier to
> >return them individually...

> If I'm interested in sections which mention Shakespeare then yes!
> otherwise SRW is useless for my search requirement! If I'm interested in
> books about Shakespeare then I'd probably expect more traditional OPAC

So you'd take a book, split it up into paragraphs, put it in a big
database with all the paragraphs from other books? So you then couldn't
retrieve books that have the search terms in different paragraphs.  You
couldn't search by page number.  Unless you duplicate lots of metadata,
you couldn't even search by author/title/publisher/etc.  You couldn't
reconstruct the books without even more metadata attached, nor drill up
or down to chapters and sections without more metadata... at which
point why are you splitting up the nicely structured XML that enables all
of this without any issues?
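
A rough sketch of what keeping the book as one structured record buys you,
again in Python with ElementTree.  The book markup and the helper names
(book_matches, matching_paragraphs) are hypothetical, and the "search" is a
naive substring match rather than a real index.

```python
import xml.etree.ElementTree as ET

# Hypothetical book record; the markup is an assumption for illustration.
BOOK = """<book>
  <title>Example Plays</title>
  <author>A. N. Author</author>
  <chapter n="1">
    <p>Shakespeare wrote many plays.</p>
    <p>Nothing relevant here.</p>
  </chapter>
  <chapter n="2">
    <p>The Globe theatre burned down.</p>
  </chapter>
</book>"""


def book_matches(root, terms):
    """True if every term occurs somewhere in the book, in any paragraph."""
    full_text = " ".join(root.itertext()).lower()
    return all(term.lower() in full_text for term in terms)


def matching_paragraphs(root, term):
    """Yield (chapter number, paragraph text) for paragraphs containing term."""
    for chapter in root.iter("chapter"):
        for p in chapter.iter("p"):
            if term.lower() in (p.text or "").lower():
                yield chapter.get("n"), p.text


root = ET.fromstring(BOOK)
hit = book_matches(root, ["Shakespeare", "Globe"])  # terms in different paragraphs
paras = list(matching_paragraphs(root, "globe"))    # drill down, chapter kept
title = root.findtext("title")                      # metadata stored once
```

The two terms live in different paragraphs, yet the book still matches; the
paragraph-level hit still knows its chapter; and the title is stored once
rather than copied into every slice.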

You've come up with one situation where XPath is not suited to your search
requirement and given an alternative (the status quo of small metadata
records). I can come up with other situations where I wouldn't want to use
XPath as well. You need to come up with an alternative way in which I can
fulfil my example search requirements which is better than XPath.

Currently the contenders are awful:

a)  Split up each record N different ways where N is the number of
different elements multiplied by their occurrence within the record (eg
quire/folio/page/paragraph/rubric/heading) and then duplicate lots of
metadata into these new sliced up records.  Didn't work for the PRO's EAD
implementation, won't work in SRW either.

b)  Create an uncountably large number of proprietary schemas just so each
application can retrieve the data they're interested in and then get all
of the servers to support all of these schemas.


c)  Allow an optional XPath to be supplied and return the element(s) which
match in the record.  This can be done without any effort on the content
provider's part nor the application writer's part and only a modicum of
extra effort on the server developer's part to ensure that functions are
limited if security is an issue.

idx = xpath.find("(")
if idx > -1 and xpath.find(")", idx) > idx:
    # Probably a function call, so deny the request.
    raise ValueError("XPath functions are not permitted")
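
For option (c) as a whole, a slightly fuller sketch in Python might look like
the following.  The function names (contains_function_call, retrieve) and the
sample record are my own inventions, not part of SRW, and ElementTree only
understands a subset of XPath (relative paths, simple predicates), so treat
this as an illustration of the idea rather than a server implementation.

```python
import xml.etree.ElementTree as ET


def contains_function_call(xpath):
    """Crude check: a '(' followed later by ')' is treated as a function."""
    idx = xpath.find("(")
    return idx > -1 and xpath.find(")", idx) > idx


def retrieve(record_xml, xpath=None):
    """Return the whole record, or only the elements matching xpath."""
    if xpath is None:
        # No XPath supplied: behave exactly as the server does today.
        return [record_xml]
    if contains_function_call(xpath):
        raise ValueError("XPath functions are not permitted on this server")
    root = ET.fromstring(record_xml)
    return [
        ET.tostring(el, encoding="unicode").strip()
        for el in root.findall(xpath)
    ]


# Hypothetical record for demonstration.
REC = "<book><title>T</title><p>one</p><p>two</p></book>"
whole = retrieve(REC)
parts = retrieve(REC, "./p")
```

The XPath stays optional, so clients that never send one see no change, and
the parenthesis check is deliberately conservative: it also refuses harmless
things like text(), which a real server could whitelist.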

I'm sorry, but I just don't see how we can -not- allow XPath, given the


      ,'/:.          Rob Sanderson
  ,'--/::(@)::.      Special Collections and Archives, extension 3142
,'---/::::::::::.    Nebmedes:  telnet: 7777
____/:::::::::::::.                WWW: