Thanks, I think I'm starting to get it. The two things I didn't
understand that I think I get now:
1. scan returns a list of up to maximumTerms terms (fewer if it reaches the
end of the index) whether or not the scanClause finds a match. scan just
locates the position in the sorted list of terms where the searchTerm
falls, or would fall if it isn't in the database. If it's there, great;
if not, here's the list of terms around that spot. I've seen this work in
OPAC searches but didn't really get the concept.
2. Use an empty term (or 0 or !) as the searchTerm to make sure the list
that's returned starts at the beginning of the list of all terms in the db.
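To check my understanding of both points, here is a toy model of scan in
Python. It is only a sketch under my own assumptions: the index is a plain
sorted list, comparison is raw string order (a real server will normalize
terms its own way), and the function and parameter names are mine, loosely
mirroring maximumTerms and responsePosition.

```python
from bisect import bisect_left

def scan(terms, search_term, maximum_terms=5, response_position=1):
    """Toy model of scan over a sorted list of index terms."""
    # Find where search_term sits, or would sit, in the sorted list.
    start = bisect_left(terms, search_term)
    # response_position=1 puts that spot first in the response.
    first = max(start - (response_position - 1), 0)
    # Return up to maximum_terms terms, fewer at the end of the list.
    return terms[first:first + maximum_terms]

subjects = sorted([
    "Architecture -- North Carolina -- Durham",
    "Durham (N.C.)",
    "Tobacco industry",
    "Universities and colleges",
])

# The term exists: the list starts at it.
print(scan(subjects, "Durham (N.C.)", maximum_terms=2))
# -> ['Durham (N.C.)', 'Tobacco industry']

# The term does not exist: the list starts where it would have been.
print(scan(subjects, "Education", maximum_terms=2))
# -> ['Tobacco industry', 'Universities and colleges']

# An empty term sorts before everything, so the list starts at the top.
print(scan(subjects, "", maximum_terms=10) == subjects)
# -> True
```

In this model "!" behaves like the empty term because it sorts ahead of
digits and letters in ASCII; whether that holds on a given server depends on
how it sorts its terms.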
Thanks!
Will
On Wed, 19 Apr 2006, Robert Sanderson wrote:
> >To meet this use case, I want to extract all the headings indexed as
> >dc.subject and display them at once (or in a paginated form). I'm getting
> >only the full dc.subject headings, not the separate terms that make up the
> >headings.
>
> Yep.
>
> You should get whatever you would get if you did the equivalent search.
>
> For example, scanning on 'dc.description any/stem a' will give you
> stemmed keywords, whereas 'dc.identifier exact a' will give you exact
> identifiers.
>
> >The SRU scan operation requires me to submit a scanClause consisting of a
> >complete CQL search clause: index, relation, and searchTerm.
>
> >I'm assuming that use of the "exact" relation in this context is intended
> >to return full subject headings containing the searchTerm, so that
> >"dc.subject exact durham" would return both "Durham (N.C.)" and
> >"Architecture -- North Carolina -- Durham". Please correct me if I've
> >misread this point.
>
> Exact is a complete string type of search. If the database has
> "Architecture -- North Carolina -- Durham" as a string which it thinks
> is a subject, then yes it should be in there.
>
> However scanning with a term doesn't return only terms that match, it
> simply sets a location within the full list of terms to start at.
>
> >But here's my real question: is there no way within the SRW/U
> >specification to extract all or an arbitrary portion of the dc.subject (or
> >other) headings in a database without first sending a term to scan for?
>
> The term in a scan clause is where to start from so that you can page
> through all of the terms. You can send an empty term, which should sort
> to the top of the list, but scanning from 0 or ! is also a good bet.
>
> There isn't a way to specify 'all terms' because for most indexes this
> is a ridiculously long list. You could however specify 10000000 terms
> and see how many the server will actually return to you. As a server
> developer, you could simply not have a limit on your upper count of
> terms that you'll return, and then query for some insanely large number.
>
> Hope that helps :)
>
> Rob
>
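For the paging you describe, this is roughly what I have in mind, again as a
sketch rather than anything tested against a real server. The parameter
names (operation, version, scanClause, maximumTerms) are the SRU ones; the
base URL, the helper names, and the fetch callable that would actually send
the request and pull the terms out of the response are all hypothetical.

```python
from urllib.parse import urlencode

def scan_url(base_url, index, relation, term, maximum_terms=50):
    """Build an SRU scan request URL (no request is sent here)."""
    # Quote the term so that an empty term is still a complete clause.
    # Assumption: escaping embedded double quotes is all the term needs.
    escaped = term.replace('"', '\\"')
    params = {
        "operation": "scan",
        "version": "1.1",
        "scanClause": f'{index} {relation} "{escaped}"',
        "maximumTerms": maximum_terms,
    }
    return base_url + "?" + urlencode(params)

def all_terms(fetch, page_size=50):
    """Yield every term in an index, one scan page at a time.

    fetch(term, n) is a hypothetical callable that runs the scan and
    returns up to n terms, starting at term, as a list of strings.
    """
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    term = ""   # an empty term should sort to the top of the list
    last = None
    while True:
        page = fetch(term, page_size)
        # Each later page starts on the term we already have; drop it.
        if last is not None and page and page[0] == last:
            page = page[1:]
        if not page:
            return
        yield from page
        # Start the next scan from the last term seen.
        last = term = page[-1]

# Hypothetical endpoint, shown only to illustrate the URL shape.
print(scan_url("http://example.org/sru", "dc.subject", "exact", ""))
```

Re-scanning from the last term returned means no single request has to ask
for an insanely large number of terms.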
--
Will Sexton
Metadata Architect / Programmer
Duke University Libraries