What a job Jodi Allison-Bunnell did on her summary report of EAD search & retrieval systems. I, for one, appreciate all your work on this and can readily use the information you provided. I can't thank you enough.
Anna Quan Leon
Scottsdale Public Libraries
Scottsdale, Arizona
________________________________
From: Encoded Archival Description List on behalf of Jodi Allison-Bunnell
Sent: Thu 6/24/2010 3:12 PM
To: [log in to unmask]
Subject: Summary: EAD search and retrieval systems
Thanks very much to the many people who responded to my June 10 query to this list on search and retrieval systems for EAD. Based on those responses, my previous survey of other EAD consortia, and Lisa Spiro's wiki on archival software, here's what I found. The solutions are listed more or less in order of how often they are used in the US for EAD sites.
I hope that others find this helpful and that I have represented the replies I received accurately; please let me know if I am in error. Thanks again for all the responses. If anyone wants to weigh in with further evaluative comments, that would be most welcome.
Best, Jodi
Ixiasoft's TEXTML
Type: Commercial
Annual cost: About $2500/year based on current size of NWDA database (5,500 documents)
Underlying: Native XML database
Who uses it: NWDA (current), RMOA. A2A used to use it. It has been a fine product; support is good when we need it. Used by a number of libraries, mostly for other uses than EAD databases. Otherwise widely used by many other industries.
Notes: Native XML database with search engine. Does not offer cloud hosting.
URL: http://www.ixiasoft.com/
Mark Logic XML Database
Type: Commercial
Annual cost: Unknown. A free version limits how much data you can put in the database. Comments suggest that the commercial version is quite expensive.
Underlying: Native XML, MarkLogic. Customizing the code requires a knowledge of programming in XQuery.
Who uses it: University of Chicago, American Institute of Physics (for publishing, not EAD), Elsevier, Greenwood Publishing, McGraw-Hill Education, Oxford University Press, Princeton Theological Seminary (for digital library), University of Toronto Library (for digital library), Library of Congress' new system for delivering its EAD finding aids (in development; coming in a few months; will contain EADs, MARC records, and American Memory descriptions converted to XML)
Notes: MarkLogic uses XQuery, which supports a feature called "collection." Through the collection tag, different collections and archives can be defined, thus enabling the creation of a multi-institutional repository. Users can search the whole database or particular collections. The front end can be built on any platform and can be displayed in any way the archives want. The University of Chicago took this approach because their UNCAP project is multi-institutional and could be multiconsortial. Such an architecture will give participants the flexibility to create unique interfaces for different collections and projects. Chicago's code will be available to anyone who asks. Archives that want to use the software will need MarkLogic. MarkLogic offers cloud hosting as part of its services. Very active user group mailing list, and a user-group conference once a year.
URL: http://www.marklogic.com/
XTF
Type: Open source (developed by CDL)
Underlying: Java and XSLT 2.0 that indexes, queries, and displays digital objects, runs in Apache Tomcat.
Who uses it: CDL, Arizona Archives Online, Ohio Archives Online, SUNY Buffalo/Music Library
Notes: Supports searching across collections of heterogeneous data, very configurable. Has an active users group with 144 members (http://xtf.sourceforge.net/). A number of people have commented that it requires the least amount of work to get it running. One person comments that it is quick to get up and running, less powerful than Cocoon/Solr, and more able to be customized for search/display.
URL: http://www.cdlib.org/services/publishing/tools/xtf/index.html
DLXS
Type: Commercial
Cost: One-time cost of $15,000 for XPAT license, annual membership $5000. Only sold to other educational institutions and non-profits.
Underlying: UNIX, Apache web server, Perl and other open-source utilities
Who uses it: National Library of Medicine/History of Medicine Division, University of Michigan/Special Collections-Bentley HL-Clements Library, others listed at http://www.dlxs.org/about/contacts.html.
Notes: Several note that it's not easy to set up, but they have great customer support. Developed by UMichigan's Digital Library Production Services but available for purchase by others. Offers hosting services, support, user group, listserv.
URL: http://www.dlxs.org/
Cocoon and Solr
Type: Open source
Underlying: Solr is the search platform of the Apache Lucene project. Written in Java, runs in Apache Tomcat.
Who uses it: RIAMCO, University of Virginia/coin website, Kittredge Collection
Notes: Ethan Gruber (UVA) says: Cocoon is a pipelining system that pairs a datastream, id1234.xml, with an XSLT stylesheet, and renders an HTML file (among other things it serializes), and Cocoon can interact with files on the filesystem or accessed through a web service. You should be able to use your existing XSLT stylesheets with little or no modification in order to render an HTML file from one of your NWDA finding aids in Cocoon. It's a different application architecture than the TEXTML system, but users would not notice any difference. Solr is also a web service that accepts query parameters passed by the user and returns results in XML. Cocoon can take these search results, pair them with a stylesheet, and render a search results page into HTML. Solr is essentially the glue that binds your individual finding aids together into a cohesive collection that can be searched, browsed, or sorted in various ways.
URL: http://lucene.apache.org/solr/
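For anyone who wants to see the Solr half of this in concrete terms, here is a minimal Python sketch of the exchange described above: a query goes to Solr as URL parameters, and results come back as XML that another layer (Cocoon and a stylesheet, in Ethan's description) would turn into HTML. The endpoint URL, the field names "id" and "title", and the sample response are all assumptions made up for illustration; a real installation would use its own host, path, and schema.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical local Solr endpoint; a real site would use its own host and path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query_url(query, rows=10, base_url=SOLR_SELECT_URL):
    """Return a Solr select URL that asks for results in XML."""
    params = {"q": query, "rows": rows, "wt": "xml"}
    return base_url + "?" + urlencode(params)


# Hand-written sample in the shape of a Solr XML response (not real data).
# The "id" and "title" fields are assumed; they depend on the Solr schema.
SAMPLE_RESPONSE = """
<response>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">id1234</str>
      <str name="title">Example finding aid A</str>
    </doc>
    <doc>
      <str name="id">id5678</str>
      <str name="title">Example finding aid B</str>
    </doc>
  </result>
</response>
"""


def parse_results(xml_text):
    """Return (total hit count, list of {field name: value} dicts)."""
    root = ET.fromstring(xml_text.strip())
    result = root.find("result")
    docs = [
        {field.get("name"): field.text for field in doc}
        for doc in result.findall("doc")
    ]
    return int(result.get("numFound")), docs


if __name__ == "__main__":
    print(build_query_url("logging"))
    total, docs = parse_results(SAMPLE_RESPONSE)
    for doc in docs:
        print(doc["id"], doc["title"])
```

The sketch parses a canned response rather than calling a server, so it runs without a Solr installation. In the Cocoon setup described above, the XML would be handed to an XSLT stylesheet instead of being parsed in Python.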
Verity
Type: Commercial
Annual price: Unknown.
Underlying: Search engine only
Who uses it: American Institute of Physics, others listed at http://www.ultraseek.com/case_studies/powered.html. Educational/library users, but also high tech, government, business.
Notes: Seems to be most commonly implemented as a metasearch tool
URL: http://www.ultraseek.com/
Cheshire
Type: Open source
Underlying: XML search engine, written in Python and C. Addresses most standards, including SRW, SRU, CQL, Z39.50 and OAI.
Who uses it: British Library ISTC, Archives Hub, SHAMAN Digital Preservation Project
Notes: Cheshire3 for Archives available at http://www.cheshire3.org/download/ead/. User forum available. Developed in partnership between UC Berkeley and the University of Liverpool.
URL: http://www.cheshire3.org/
ExLibris DigiTool
Type: Commercial
Price: Unknown
Underlying:
Who uses it: Florida statewide EAD group
Notes: Digital asset management system, used for special collections, theses and dissertations, course materials. Support center, etc.
URL: http://www.exlibrisgroup.com/category/DigiToolOverview
Cocoon and Lucene
Type: Open source
Underlying: Apache; Cocoon is for publishing XML, Lucene is the search engine
Who uses it: Five College Archives & Manuscript Collections
Notes: Active support groups.
URL: http://www.apache.org/
PLEADE
Type: Open source
Underlying: Based on the SDX platform
Who uses it: French National Archives (as of 2003)
Notes: Article available at http://www.digicult.info/downloads/dc_info_issue6_december_20031.pdf
URL: http://www.pleade.org/en/index.html
eXist
Type: Open source
Underlying: Java, Native XML, uses XQuery processing. A web server and Cocoon included in the distribution, but can run without them.
Who uses it: Columbia University Libraries
Notes:
URL: http://exist-db.org/
Archon
Type: Open source
Underlying:
Who uses it: list at http://www.archon.org/implementors.php (how up to date?)
Notes: Records descriptive information about collections and digital objects and provides a way to view, search, and browse that information. Low-cost, easily implemented turnkey solution for archives. Search and retrieval is very basic. Not native EAD/XML, but EAD is an output option. In process of merging with the Archivist's Toolkit.
URL: http://www.archon.org/
ARCHI-LOG
Type: Commercial
Cost: Unknown.
Underlying: Multi-user MS Windows environment, uses Visual Foxpro database engine.
Who uses it: Canadian archivists.
Notes: This works with the Canadian RAD standard and can export its data as an EAD/XML file. The software can also use the EAD 2002 Cookbook to output a choice of HTML files from the XML export. It can output a variety of finding aids in MS Word or in HTML. There is also a fast search engine. Designed for use by archives, historical societies, other heritage organizations.
URL: https://www.infoka.com/archilog/en/archilog-e.htm
Jodi Allison-Bunnell
Program Manager, Northwest Digital Archives
Orbis Cascade Alliance
418 Woodford
Missoula, MT 59801
(406) 829-6528
fax (860) 540-8281
Researcher website: http://nwda.wsulibs.wsu.edu/
Member website: http://orbiscascade.org/index/northwest-digital-archives