Your question about how we plan to create the finding aid is a good one.
We have your standard "Finding Aid" site at the University of
Maryland. <http://www.lib.umd.edu/archivesum>. For that photograph
collection, we definitely wanted a record in that system. Originally,
we were thinking "traditional" finding aid. Other options available to
us *right now* would be something like putting in a basic "abstract"
finding aid and linking out to a PDF or some other form of the Access database.
However, the bigger question we've been asking (perhaps just to
procrastinate? Although I like to think it's because we are trying to be
thorough... ;)) is when is the finding aid not enough? We have asked
this question, as well as "when is the finding aid appropriate?" We
have done a good job at UM getting people to understand that ArchivesUM
is where you go for archival finding aids, but what about our rare book
collections? People don't always understand that those are in the
catalog, and the question has come up asking if we couldn't put some of
our non-archival special collections into an EAD and include them in
ArchivesUM for discovery.
With the photograph collection, we have also asked ourselves if it might
not make sense instead to put the metadata for the folder descriptions
into our Fedora digital repository as discrete items. That would boost
our repository's size from a modest 10,000 or so records to about 75,000
records. The problem there is that we obviously don't have the entire
collection digitized, so would that be confusing to people? It seems
with this type of photograph collection, a true database, rather than an
XML file, might be a better form of discovery.
I think I would like both. With links between and levels of discovery
all over the place. And I don't think we're too far away from that, in
the scheme of things. All we need is some technical support and a will
Some other comments - I agree (I forget who mentioned this), the
creation of the EAD is not so difficult. With this particular photograph
collection, the information is already in a database, and we create our
finding aids by starting from a database, so making the actual XML file
is trivial. We could mount it online tomorrow. And, as I type this, I
am wondering why we haven't just gone ahead with a stop-gap measure and
used the abstract/PDF model to get started, instead of waiting for
everything to get perfect. The presentation is always the challenge.
Our system works great for 95% of our finding aids. It's just the
oddballs that keep us on our toes.
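To show how little is involved in the database-to-XML step, here is a minimal sketch in Python. It is not our actual workflow: the in-memory SQLite database stands in for the Access database, and the table name `folders`, its columns, and the sample rows are all hypothetical. It only builds the `<dsc>` portion of an EAD 2002 file, one `<c01>` per folder row.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_dsc(rows):
    """Build an EAD <dsc> element with one <c01> per (box, folder, title) row."""
    dsc = ET.Element("dsc")
    for box, folder, title in rows:
        c01 = ET.SubElement(dsc, "c01", level="file")
        did = ET.SubElement(c01, "did")
        ET.SubElement(did, "container", type="box").text = str(box)
        ET.SubElement(did, "container", type="folder").text = str(folder)
        ET.SubElement(did, "unittitle").text = title
    return dsc


# Hypothetical stand-in for the real database: table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE folders (box INTEGER, folder INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO folders VALUES (?, ?, ?)",
    [(1, 1, "Campus views"), (1, 2, "Commencement")],
)
rows = conn.execute(
    "SELECT box, folder, title FROM folders ORDER BY box, folder"
).fetchall()

dsc = rows_to_dsc(rows)
xml_text = ET.tostring(dsc, encoding="unicode")
print(xml_text)
```

The header, `<archdesc>`, and collection-level description would still have to be wrapped around this, and that part, like the presentation, is where the real work is.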
Also, another comment/question - we use Lucene to index our finding aids.
I forget what the limit is, but there apparently is a size limit.
We've known this since the beginning. So, with our very large finding
aids, a search from within our site is going to miss some of that stuff
in the depths. Maybe breaking down things into separate files, as Ethan
suggested, would be a way to get around this. Will have to experiment...
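As a starting point for that experiment, here is a rough sketch in Python of the split-and-reassemble idea. It is only my reading of the approach, not Ethan's code: it assumes a non-namespaced EAD 2002 file whose components are tagged `<c01>`, and the function names and sample XML are made up for illustration. Each piece comes out as a small standalone XML string that could be indexed or processed on its own.

```python
import xml.etree.ElementTree as ET


def split_components(ead_xml, tag="c01"):
    """Return one standalone XML string per component found in the finding aid."""
    root = ET.fromstring(ead_xml)
    # strip() drops the whitespace that followed each element in the source file
    return [ET.tostring(c, encoding="unicode").strip() for c in root.iter(tag)]


def reassemble(pieces):
    """Rebuild a single <dsc> element from the standalone component strings."""
    dsc = ET.Element("dsc")
    for piece in pieces:
        dsc.append(ET.fromstring(piece))
    return dsc


# Made-up sample data; a real finding aid would have far more components.
SAMPLE = """<ead><archdesc level="collection"><dsc>
<c01 level="file"><did><unittitle>Folder 1</unittitle></did></c01>
<c01 level="file"><did><unittitle>Folder 2</unittitle></did></c01>
</dsc></archdesc></ead>"""

pieces = split_components(SAMPLE)
rebuilt = reassemble(pieces)
print(len(pieces), "components")
```

Whether small per-component files actually stay under the indexing limit is something we would still have to test.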
Jennie Levine Knies
Manager, Digital Collections
2216 Hornbake Library
University of Maryland
College Park, MD 20742
(301)314-2558 TEL (301)314-2709 FAX
Ethan Gruber wrote, On 2/8/2010 3:30 PM:
> I have found that Saxon processes anything that is 5 MB or under fairly
> efficiently, and load times aren't so bad as long as you're not on dialup.
> Your photograph collection in an Access database--do you plan on making
> a traditional type of EAD finding aid that will go into a collection of
> other finding aids and be served through a typical type of finding aid
> website, or do you want to create a site that puts emphasis on the item
> level? I have done work on several projects where the focus is on
> item-level information. I have gotten around the issue of having a 10 MB
> finding aid by making each item as a standalone XML file that contains
> only a <c>. The <c>'s can be reassembled into a full finding aid, if
> necessary, but processing is only done on the small, singular XML file
> that has only several kilobytes of information that describes an item.
> I think dealing with massive finding aids is not such a big deal if you
> put aside the notion that all the data must reside in the same XML file
> at processing time. As long as you can extract all the data into a
> single XML file at the time of migration, it doesn't really matter how
> you store the files under normal circumstances.
> On Mon, Feb 8, 2010 at 3:10 PM, Wick, Ryan wrote:
> Our finding aid for the Ava Helen and Linus Pauling Papers is
> currently at 13.8MB of XML.
> From very early on I put each series into its own XML file. They
> weren't intended to stand on their own so there was nothing "above"
> <c01>. There wasn't a specific link to them, and I just modified our
> stylesheet to pull them in where appropriate. Last year we switched
> to using XML's external entities referencing local files to "link"
> to the series and are happy with the results. See
> for more information on XML's entities.
> For web delivery, we have always split the display of the finding
> aid into smaller pieces. We generate static HTML files and divide
> the series and box listings into smaller chunks for ease of
> navigation and retrieval. There is also an option to view the entire
> series in one file. (The 17 series pages total about 16.4 MB of
> HTML. The hundreds of smaller pages combined would have a greater
> total, but most of that is overhead of duplicate navigation). The
> majority of our traffic comes from search engines, so we've tried
> our best to make our content easily indexable.
> On another note, in 2006 we published a print version of the Pauling
> Papers. This included some additional content but the entire package
> ended up being 1800 pages in 6 volumes. http://paulingcatalogue.org/
> Mark, thanks for posting about UNC's Hugh Morton collection, I
> wasn't aware of it before.
> Ryan Wick
> Information Technology Consultant
> Special Collections
> Oregon State University Libraries