On 12/27/05, Ray Denenberg, Library of Congress <[log in to unmask]> wrote:
>  From: "Mike Rylander" <[log in to unmask]>
>
> > title -> mods:mods/mods:titleInfo (also separated by title types)
>
> So 'title' maps to the entire titleInfo element, and then there is another
> index, say:
> 'title-abbreviated' that maps to titleInfo where type="abbreviated"   and so
> on?

Conceptually, that's exactly it.  Physically, all titleInfo metadata
is stored in the same table, and there is a field-type identifier that
specifies which XPath expression was used to extract the data.  In
practice, though (for normal public library patrons), we just search
the collected set of all titleInfo elements.
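
A minimal sketch of that layout, for illustration only.  It assumes the
MODS v3 namespace, keeps the "table" as a plain list of (field type,
text) rows, and reads the type attribute directly instead of running one
configured XPath per type, because Python's ElementTree supports only a
subset of XPath.  The names extract_title_rows, search, and the
"title-proper" label for untyped titles are hypothetical.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo><mods:title>Moby Dick</mods:title></mods:titleInfo>
  <mods:titleInfo type="abbreviated"><mods:title>M. Dick</mods:title></mods:titleInfo>
</mods:mods>"""


def extract_title_rows(mods_xml):
    """Return one (field_type, text) row per titleInfo element."""
    root = ET.fromstring(mods_xml)  # root is the mods:mods element
    rows = []
    for node in root.findall("mods:titleInfo", MODS_NS):
        # Untyped titleInfo gets the hypothetical label "title-proper".
        field_type = "title-" + node.get("type", "proper")
        # Collapse all descendant text into one whitespace-normalized string.
        text = " ".join("".join(node.itertext()).split())
        rows.append((field_type, text))
    return rows


def search(rows, term, field_type=None):
    """Case-insensitive substring match; field_type=None searches every row."""
    term = term.lower()
    return [
        text
        for ft, text in rows
        if (field_type is None or ft == field_type) and term in text.lower()
    ]


rows = extract_title_rows(SAMPLE)
all_hits = search(rows, "dick")                              # plain 'title' index
abbreviated_hits = search(rows, "dick", "title-abbreviated")  # one field type
```

Leaving field_type unset corresponds to the plain 'title' index; passing
a field type narrows the search to rows extracted for that type.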

> How many of these do you actually support? (That's really the most useful
> information in this effort, as we are trying to come up with a realistic set
> of indexes that seem to be supported/supportable.)

The XPath is configurable, so we can support any title type encoded in
a MODS document that we know we care about.

I'll just go ahead and attach the /actual/ XPath we're using.

>
> > author -> mods:mods/mods:name[mods:role/mods:text[text()="creator"]]/namePart
> > (separated by name type)
>
> An index 'author', which maps to namePart when role = "creator". (By the way
> I assume you are assuming role terms from
> http://www.loc.gov/marc/sourcecode/relator/relatorlist.html , which also has
> 'author', in addition to 'creator' but no need for that bag-of-worms
> discussion right now.)
>

Right ;).  Although we're not currently looking for anything other
than role="creator", that is the intention.

> But what do you mean by "separated by name type"? Does this mean there are,
> in addition, indexes for
>  author - personal name
> author - corporate
> author - conference
> and that 'author' (alone) would combine them all?
>

Exactly.
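
As a rough sketch of how the per-type author indexes and the combined
'author' index relate: the code below assumes the MODS v3 namespace and
assumes the role text lives in mods:role/mods:roleTerm (the expression
quoted above writes it as mods:role/mods:text).  The role filter is done
in Python rather than in XPath, and the function name and the "untyped"
label are hypothetical.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:name type="personal">
    <mods:namePart>Melville, Herman</mods:namePart>
    <mods:role><mods:roleTerm>creator</mods:roleTerm></mods:role>
  </mods:name>
  <mods:name type="corporate">
    <mods:namePart>Library of Congress</mods:namePart>
    <mods:role><mods:roleTerm>creator</mods:roleTerm></mods:role>
  </mods:name>
  <mods:name type="personal">
    <mods:namePart>Kent, Rockwell</mods:namePart>
    <mods:role><mods:roleTerm>illustrator</mods:roleTerm></mods:role>
  </mods:name>
</mods:mods>"""


def creator_names_by_type(mods_xml):
    """Group namePart text by the name's type, keeping only role 'creator'."""
    root = ET.fromstring(mods_xml)
    by_type = {}
    for name in root.findall("mods:name", MODS_NS):
        roles = [
            (term.text or "").strip()
            for term in name.findall("mods:role/mods:roleTerm", MODS_NS)
        ]
        if "creator" not in roles:
            continue
        parts = [
            (part.text or "").strip()
            for part in name.findall("mods:namePart", MODS_NS)
        ]
        by_type.setdefault(name.get("type", "untyped"), []).append(" ".join(parts))
    return by_type


by_type = creator_names_by_type(SAMPLE)
# The plain 'author' index is the union of every per-type list.
all_authors = [n for names in by_type.values() for n in names]
```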

>
> > subject -> mods:mods/mods:subject/* (separated by node name
> > (geographic, name, temporal, topic))
> And similarly could you list the specific subject search points you support?
>
>

No reason we couldn't support any arbitrary "subject" type, though the
four listed there are the ones that have XPath expressions at the
moment.
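
Because the subject index is split by node name, an arbitrary subject
type falls out of the same loop.  A small sketch, assuming the MODS v3
namespace; the function name is hypothetical, and nothing here is
limited to the four node names listed above.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:subject>
    <mods:topic>Whaling</mods:topic>
    <mods:geographic>Atlantic Ocean</mods:geographic>
  </mods:subject>
  <mods:subject>
    <mods:topic>Sea stories</mods:topic>
  </mods:subject>
</mods:mods>"""


def subjects_by_node_name(mods_xml):
    """Group the text of each mods:subject child by its local element name."""
    root = ET.fromstring(mods_xml)
    grouped = {}
    for node in root.findall("mods:subject/*", MODS_NS):
        local_name = node.tag.split("}")[-1]  # strip the "{namespace}" prefix
        text = " ".join("".join(node.itertext()).split())
        grouped.setdefault(local_name, []).append(text)
    return grouped


subjects = subjects_by_node_name(SAMPLE)
```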

> > series -> mods:mods/mods:relatedItem[@type="series"]/mods:titleInfo
>
> Sounds good.
>
> >
> > keyword -> mods:mods/*[not(local-name()='originInfo')] (everything
> > except originInfo and subnodes)
>
> keyword maps to anything in the record excluding information under
> originInfo? I'm afraid I don't understand this one.  Could you elaborate?
>

The only things we don't want to search on for keyword searches are
publisher information and physical format stuff.  Including those
elements will, at least with our data, only increase (by the tens or
hundreds of thousands) the number of hits for a string that exists in
those elements.  Their contents are not unique enough to be useful,
but nearly all other elements' contents are.
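
The keyword expression quoted above can be approximated as follows.
This is only a sketch: ElementTree has no local-name() function, so the
originInfo filter is applied in Python, and only originInfo is dropped,
exactly as in that expression.  The function name and sample record are
hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo><mods:title>Moby Dick</mods:title></mods:titleInfo>
  <mods:originInfo>
    <mods:publisher>Harper</mods:publisher>
  </mods:originInfo>
  <mods:subject><mods:topic>Whaling</mods:topic></mods:subject>
</mods:mods>"""


def keyword_text(mods_xml):
    """Concatenate the text of every top-level child except originInfo."""
    root = ET.fromstring(mods_xml)
    chunks = []
    for child in root:
        local_name = child.tag.split("}")[-1]  # strip the "{namespace}" prefix
        if local_name == "originInfo":
            continue  # skip the element and all of its subnodes
        chunks.extend(child.itertext())
    # Normalize whitespace so the result is one searchable string.
    return " ".join(" ".join(chunks).split())


keywords = keyword_text(SAMPLE)
```

With this sample record, a keyword search for the publisher name finds
nothing, while the title and subject text remain searchable.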

> Thanks, Mike.
>

Glad to help, if it does. ;)

> --Ray
>

--
Mike Rylander
[log in to unmask]
GPLS -- PINES Development
Database Developer
http://open-ils.org