Hi Mark,

I have done this once, for a not very sophisticated but quite large EAD set,
with Drupal as the interface. These were the steps:

1) created a flat XML file from the original EAD, conforming to the Solr input format
important sub-steps:
a) preserving the parent-child relationships with a record ID and a "parent" field
(c01...c12 levels)
b) preserving the full path, built with XPath expressions
(rootID/childID/grandchildID/.../currentDocID)
c) converting dates to the Solr date format

2) loaded it into Solr
3) wrote simple methods to handle
a) navigation across the hierarchy
b) searching dates (and other fields, but those are trivial)
c) showing the full path
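
A minimal sketch of steps 1 and 3 in Python, not the original code. It assumes EAD without namespaces, an "id" attribute on every component, and a "normal" attribute on unitdate; the field names (id, parent, path, level, title, date_start) and all function names are hypothetical and would have to match your own Solr schema.

```python
import re
import xml.etree.ElementTree as ET

# Matches the numbered component tags c01 ... c12.
COMPONENT = re.compile(r"^c(0[1-9]|1[0-2])$")


def solr_date(year):
    """Turn a four-digit year into the ISO 8601 form Solr date fields expect."""
    return f"{int(year):04d}-01-01T00:00:00Z"


def walk(node, parent_id, path, docs):
    """Collect one flat record per component, keeping parent and full path."""
    for child in node:
        if not COMPONENT.match(child.tag):
            # Wrapper such as <dsc>: keep looking, same parent and path.
            walk(child, parent_id, path, docs)
            continue
        doc_id = child.get("id")  # assumption: every component has an id
        doc = {
            "id": doc_id,
            "parent": parent_id,
            "path": f"{path}/{doc_id}",
            "level": child.tag,
            "title": child.findtext("did/unittitle", default="").strip(),
        }
        unitdate = child.find("did/unitdate")
        normal = unitdate.get("normal", "") if unitdate is not None else ""
        year = re.match(r"(\d{4})", normal)
        if year:
            doc["date_start"] = solr_date(year.group(1))
        docs.append(doc)
        walk(child, doc_id, doc["path"], docs)


def flatten_ead(xml_text):
    """Return the flat list of records for one EAD document."""
    archdesc = ET.fromstring(xml_text).find("archdesc")
    root_id = archdesc.get("id")
    docs = []
    walk(archdesc, root_id, root_id, docs)
    return docs


def to_solr_xml(docs):
    """Serialize records as a Solr XML update message (<add><doc>...)."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            ET.SubElement(doc_el, "field", name=name).text = value
    return ET.tostring(add, encoding="unicode")


def children_query(doc_id):
    """Solr query string for the direct children of a record."""
    return f'parent:"{doc_id}"'


def ancestors(path):
    """IDs from the root down to the direct parent, for showing the full path."""
    return path.split("/")[:-1]


SAMPLE = """<ead><archdesc id="root"><dsc>
  <c01 id="s1"><did><unittitle>Series 1</unittitle>
    <unitdate normal="1920/1935">1920-1935</unitdate></did>
    <c02 id="f1"><did><unittitle>Folder 1</unittitle></did></c02>
  </c01>
</dsc></archdesc></ead>"""

if __name__ == "__main__":
    print(to_solr_xml(flatten_ead(SAMPLE)))
```

Storing the full path in each record means the breadcrumb can be shown without extra queries, while the "parent" field makes listing the children of a record a single Solr query.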

That was all I did.

Péter

----- Original Message ----- 
From: "Mark A. Matienzo" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Thursday, April 15, 2010 5:32 PM
Subject: Indexing EAD using Solr


>I know there has been some discussion related to this about making EAD
> available as part of the discovery layer, but I'm interested in
> getting a sense of which institutions are using Solr [0] to index EAD.
> At this point, I'm more interested in discussing the different
> indexing strategies from a technical standpoint rather than focusing
> too much on the discovery layer. For what it's worth, this discussion
> began [1] when some folks were talking about incorporating EAD into a
> Solr index to be used by Blacklight [2], an open source discovery
> layer.
>
> If your institution is using Solr to index EAD, can you briefly
> describe your indexing process? I would be interested in coordinating
> future work, or potentially developing a set of recommendations/best
> practices to share with the community.
>
> [0] http://lucene.apache.org/solr
> [1] 
> http://groups.google.com/group/blacklight-development/browse_thread/thread/848bae32b11a8501
> [2] http://projectblacklight.org/
>
> Mark A. Matienzo
> Digital Archivist, Manuscripts and Archives
> Yale University Library
>