Hi Mark,


The website for the RIAMCO consortium is not yet live, but to answer your
questions:

- We're using Solr to index the EAD records.

- Advanced search will offer researchers the option of limiting by date,
  but this field will only search collection dates (not dates in the
  <dsc>).

- We're not planning on using bulk dates in the searches.
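
To make that choice concrete, here is a rough Python sketch of pulling only the collection-level date range out of an EAD record before it is handed to an indexer. It is an illustration, not the RIAMCO code: the sample record, the function names, and the absence of an XML namespace are all assumptions, and it reads only the year from each normalized value.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record (no namespace, EAD 2002 style element names).
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example family papers</unittitle>
      <unitdate type="inclusive" normal="1890/1945">1890-1945</unitdate>
      <unitdate type="bulk" normal="1910/1920">bulk 1910-1920</unitdate>
    </did>
    <dsc>
      <c01 level="series">
        <did>
          <unittitle>Correspondence</unittitle>
          <unitdate type="inclusive" normal="1912/1914">1912-1914</unitdate>
        </did>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def collection_date_ranges(ead_xml):
    """Return (start_year, end_year) pairs for collection-level dates only."""
    root = ET.fromstring(ead_xml)
    ranges = []
    # This path matches only the <did> directly under <archdesc>,
    # so dates inside <dsc> components are never visited.
    for unitdate in root.findall("archdesc/did/unitdate"):
        if unitdate.get("type") == "bulk":
            continue  # bulk dates are left out of the search field
        normal = unitdate.get("normal")
        if not normal:
            continue  # no normalized value to work with
        start, _, end = normal.partition("/")
        # A single date such as "1912" has no "/", so reuse the start.
        ranges.append((int(start[:4]), int((end or start)[:4])))
    return ranges


def collection_covers_year(ead_xml, year):
    """True if any collection-level range contains the given year."""
    return any(s <= year <= e for s, e in collection_date_ranges(ead_xml))


print(collection_date_ranges(SAMPLE_EAD))
print(collection_covers_year(SAMPLE_EAD, 1912))
```

The start and end years returned here could then be sent to the index as two numeric fields for a date limit, though how the fields are actually defined depends on the local Solr schema.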


In thinking about normalized dates, we started questioning why we were
using normalized dates in the components, since we knew that, with our
current search capabilities, we would not be searching those dates.  We
decided to keep normalized dates in the components for that "someday" when
we would be able to offer researchers enhanced search and retrieval functions in the


 - Jennifer


* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *  

Jennifer J. Betts

RIAMCO Project Manager
John Hay Library, Box A

Brown University

Providence, RI  02912

E-MAIL:  [log in to unmask]

TEL:  (401) 863-2148

CELL: (401) 480-1173

FAX:  (401) 863-2093 

RIAMCO wiki:



From: Encoded Archival Description List [mailto:[log in to unmask]] On Behalf
Of Custer, Mark
Sent: Thursday, January 15, 2009 11:47 AM
To: [log in to unmask]
Subject: back-end systems for EAD, and other questions

Yesterday's post about "normalized dates" has me thinking once again
about how dates are used (or not used) in EAD records.  As far as I can
tell, RLG's ArchiveGrid doesn't permit searching by date.  (I could be
wrong on this, as I don't have full access to it.  It does use Lucene to
index its records, though I suppose that most of those records are just
MARC records?)  ProQuest's Archive Finder does permit searching by date,
but it doesn't really allow you to do very much (i.e., there's no way to
rank your results by "relevancy").

This leads me to a question:  what sort of back-end systems are archives
using for their EAD records? (Are there any surveys out there that have
this information, or should we start one???)

At ECU, we're using an XML database only, but we aren't doing any
advanced searching by date (primarily because, at this time, if you did
search for something like "1912", it wouldn't limit your results very
much; and then, really, you're just back at the whole "browse by
collection name" situation).  However, you can do a keyword search for
"1912", and the results that are returned to you will be ordered by the
number of hits in each document, which, in my mind, is only a small
difference in functionality, but perhaps more useful (on most occasions)
than simply limiting your results to any and all collection date ranges
that contain the year "1912".
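
The difference between the two approaches can be sketched in a few lines of Python. This is only a toy illustration with made-up collections and hypothetical function names, not a description of the ECU system: one function limits by collection date range, the other orders documents by how often the search term appears.

```python
# Made-up collections: a title, a collection date range, and some text.
COLLECTIONS = [
    {"title": "Adams papers", "dates": (1850, 1950),
     "text": "Letters, 1850-1950. Scrapbook, 1912."},
    {"title": "Baker diaries", "dates": (1910, 1915),
     "text": "Diary 1912. Diary 1912-1913. Photographs, 1912."},
    {"title": "Clark records", "dates": (1920, 1960),
     "text": "Minutes, 1920-1960."},
]


def limit_by_year(collections, year):
    """Date limit: every collection whose range contains the year, unranked."""
    return [c["title"] for c in collections
            if c["dates"][0] <= year <= c["dates"][1]]


def rank_by_hits(collections, term):
    """Keyword search: collections mentioning the term, most hits first."""
    scored = [(c["text"].count(term), c["title"]) for c in collections]
    hits = [(n, title) for n, title in scored if n > 0]
    hits.sort(key=lambda pair: -pair[0])  # stable sort, highest count first
    return [title for _, title in hits]


print(limit_by_year(COLLECTIONS, 1912))
print(rank_by_hits(COLLECTIONS, "1912"))
```

With this sample data both searches return the same two collections, but only the keyword search puts the one with the most mentions of "1912" first.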

This leads me to another set of questions:  is anyone out there using
the "bulk" attribute as part of your information retrieval process?...
is anyone using dates beyond the collection range (those dates
associated with a series, folder, even an item) in the information
retrieval process?...  has anyone attempted to test their corpus of EAD
records with their current search operations vs. indexing and searching
those records by means of different models of IR, such as Nutch, INDRI,
Solr, or even just Google Custom Search???

I think it's great that we're encoding our documents so well, but I keep
wondering if we're harnessing that information in the best possible ways
yet (and perhaps the best solutions won't be tied to our encoding
practices at all).


Mark Custer

Text & Markup Coordinator

ECU Digital Collections