Thanks Michele and Mark for your kind words about our Large Scale Digitization program!
The text below is from the announcement of the program and provides a basic outline of its workflow and infrastructure. Our goal has been to keep both as simple as possible and to make use of existing metadata, equipment, staff, and expertise. Some additional details: our finding aids are encoded with locally developed scripts and templates. Links to the digital images are added manually to already existing guides. (There is a link to the XML in the upper right corner of each finding aid.) The finding aids are delivered via a system developed in-house and based on MarkLogic.
The Large Scale Digitization Project is a collaborative effort involving the Special Collections Research Center (SCRC), the Preservation Department, and the Digital Library Development Center (DLDC). The project has been guided by definitions of, and requirements for, mass digitization provided by funding agencies such as the National Historical Publications and Records Commission. These guidelines stress expedited scanning workflows without sacrifice of image quality, close attention to preservation concerns, and the use of existing descriptive metadata, such as that provided by a finding aid.
The collections are scanned by Preservation staff. The documents are scanned in color, in the order in which they are filed in each folder, and a TIFF file is created for each page image. The files follow a naming scheme (incorporating a unique collection identifier, box number, and folder number) that can be extended to other collections scanned as part of the project. The TIFF images from each folder in the physical collection are combined into PDFs for delivery. PDF was chosen as a delivery format because of its simplicity, stability, and ubiquity. It is expected that the vast majority of users will have PDF viewers on their computers and will be able to use them to enlarge, reduce, rotate, print, and otherwise easily view the images. Although the images are delivered as PDFs, the TIFFs of each page will be stored in the digital repository and will be available if needed for other purposes.
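For readers curious how a scheme like this might look in practice, here is a minimal sketch in Python. The exact pattern we use is not spelled out above, so the layout shown (collection identifier, zero-padded box, folder, and page numbers separated by hyphens) and the names `page_filename` and `group_by_folder` are assumptions made purely for illustration.

```python
import re
from collections import defaultdict

# Hypothetical pattern: <collection>-<box>-<folder>-<page>.tif
# This is an illustrative assumption, not the project's actual scheme.
NAME_RE = re.compile(
    r"^(?P<coll>[a-z0-9]+)-(?P<box>\d{4})-(?P<folder>\d{3})-(?P<page>\d{4})\.tif$"
)


def page_filename(collection_id, box, folder, page):
    """Build a TIFF filename for one page image (hypothetical layout)."""
    return f"{collection_id}-{box:04d}-{folder:03d}-{page:04d}.tif"


def group_by_folder(filenames):
    """Map each folder-level PDF name to its page TIFFs, in page order."""
    groups = defaultdict(list)
    # Zero-padded numbers mean a plain sort keeps pages in filing order.
    for name in sorted(filenames):
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the scheme
        pdf_name = f"{match['coll']}-{match['box']}-{match['folder']}.pdf"
        groups[pdf_name].append(name)
    return dict(groups)
```

Because the collection identifier leads the filename, the same functions would work unchanged for another collection. Combining each group of TIFFs into a PDF is a separate step that would need an imaging tool, which is not shown here.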
Links to the digital files are added to the online finding aid by SCRC staff. The Encoded Archival Description (EAD) tags chosen allow links to be created at any level of description in the finding aid, from series, to folder, to item, and allow multiple links to be attached to a single description. DLDC staff updated the style sheets to allow display of the links in the finding aids database. DLDC is also hosting the digital files, which will be retained in and delivered from the digital repository.
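As a rough illustration of linking at different levels, the sketch below attaches links to a small EAD fragment using Python's standard library. It assumes the EAD 2002 `<dao>` element with `href` and `title` attributes; the announcement does not say which tags were actually chosen, and in practice the links are added manually. The sample fragment, the `add_link` helper, and the example.org URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for part of a finding aid: one series containing one folder.
SAMPLE = """<c01 level="series">
  <did><unittitle>Correspondence</unittitle></did>
  <c02 level="file">
    <did>
      <container type="box">1</container>
      <container type="folder">2</container>
      <unittitle>Letters, 1901-1905</unittitle>
    </did>
  </c02>
</c01>"""


def add_link(component, href, title):
    """Append a <dao> link to a component's <did> (assumed tag choice)."""
    did = component.find("did")
    dao = ET.SubElement(did, "dao")
    dao.set("href", href)
    dao.set("title", title)
    return dao


series = ET.fromstring(SAMPLE)
folder = series.find("c02")

# A link at the series level, and two links on the same folder description.
add_link(series, "https://example.org/abc-series1.pdf", "Series overview")
add_link(folder, "https://example.org/abc-0001-002.pdf", "Folder 2, part 1")
add_link(folder, "https://example.org/abc-0001-002b.pdf", "Folder 2, part 2")

updated_xml = ET.tostring(series, encoding="unicode")
```

The same helper works at any component level because each level carries its own `<did>`, which is what makes linking at series, folder, or item level uniform in this sketch.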
Archivist for Processing and Digital Access
Special Collections Research Center
University of Chicago Library
Chicago is great. Would like to know what the technical underpinnings are for this.