I would love to store all this stuff with the open-oni github organization:

https://github.com/open-oni/

We could start another repo to collect scripts -- I'd be happy to set one up
that others could push to. Any thoughts on organization? If we create a repo
for scripts, it will get its own wiki, and you could use that to start
documentation, or just to link to documentation elsewhere.

(For those wondering what open-oni is, see Karen Estlund's June 30 email to
this group about the meeting where we created this new GitHub organization
and fork of the chronam software.)


----------------------------------------------------------
Karin Dalziel
Digital Design/Development Specialist
Center for Digital Research in the Humanities, University of
Nebraska-Lincoln
[log in to unmask]
402-472-4547

On Tue, Aug 4, 2015 at 9:38 AM, Michael Bolton <[log in to unmask]>
wrote:

> Stephanie,
>
> Thanks for the update!
>
> I now know of two institutions using locally developed scripts and
> procedures for ingesting local content.  I think we may have an
> opportunity here.  I am thinking we can build on these local scripts and
> come up with a package, of sorts, that helps prepare batches for ingest.
>
> I have been researching this for a while and the information on the
> Guidelines and Resources page at LOC has been very helpful (
> http://www.loc.gov/ndnp/guidelines/ ).  I also have been reading
> "Guidelines for Digital Newspaper Preservation Readiness" by Katherine
> Skinner and Matt Schultz ( downloaded from Educopia Institute
> http://educopia.org/publications/gdnpr ).  I like the way the document
> lays out the digitization process from identifying and inventorying content
> all the way to packaging.  The authors make a number of recommendations and
> suggestions and point out tools that would help at each stage of the
> process.  As a starting point, I would suggest we use the paper as a guide
> for developing a workflow.
>
> Stephanie, I would be interested in seeing a sample of the spreadsheet you
> use to prepare the batches. Using spreadsheets seems to be a common way of
> collecting metadata. I would also be interested in seeing how you convert
> that to METS files.  I have a copy of the process used at UO and am
> reviewing it now.  I think they use XML files to prepare the batches.  I
> will follow up on that.
>
> If we think it will help, I will also see about starting a Google Doc to
> keep up with what we find.
>
> And thanks again for volunteering.  I think this is going to be a fun
> project.
>
>
>
> On Mon, Aug 3, 2015 at 10:27 AM, Williams, Stephanie <
> [log in to unmask]> wrote:
>
>> Hi, Michael!
>>
>> We're in the same boat here in NC, I think--we're not NDNP awardees, but
>> we're creating batches on our own, according to NDNP standards.
>>
>> We do have some scripts to help us with this process, but (fortunately or
>> unfortunately, depending on your perspective*) it all starts with
>> batch-level spreadsheets. These serve as the base for generating METS
>> files, issue-level directories, and batch manifests.  We've never done any
>> updating of the Chronam MySQL database by hand, because we're still letting
>> Chronam pull in MARC data and populate its own lists.  This isn't without
>> problems, but it's OK. The one major change we've made there is to use a
>> WorldCat API as the source of MARC data instead of
>> chroniclingamerica.loc.gov--most of our newspapers fall outside of NDNP
>> selection guidelines and aren't represented there. For items without LCCNs
>> (student newspapers, small community papers, corporate papers), we assign
>> them: we're lucky to have a very helpful cataloging department one floor up.
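>>
>> To give a flavor of the spreadsheet step, the heart of it is roughly the
>> sketch below (Python with the csv and lxml libraries; the column names,
>> output paths, and METS details are placeholders rather than our actual
>> scripts, and a real NDNP issue METS needs the full MODS, fileSec, and
>> structMap sections, which an issue XML from a sample LC batch shows best).
>>
>> import csv
>> from lxml import etree
>>
>> METS_NS = "http://www.loc.gov/METS/"
>>
>> def issue_mets(row):
>>     """Build a bare-bones METS skeleton for one spreadsheet row (one issue).
>>     The required MODS, fileSec, and structMap content is omitted here;
>>     model it on an issue XML file from a sample LC batch."""
>>     mets = etree.Element("{%s}mets" % METS_NS, nsmap={"mets": METS_NS},
>>                          LABEL="%s, %s" % (row["title"], row["issue_date"]))
>>     etree.SubElement(mets, "{%s}dmdSec" % METS_NS, ID="issueModsBib")
>>     # ...per-page dmdSec/fileSec/structMap entries would be appended here...
>>     return mets
>>
>> with open("batch_sheet.csv") as f:       # batch-level sheet, exported to CSV
>>     for row in csv.DictReader(f):
>>         out = row["issue_xml_path"]      # path column mirrors the batch layout
>>         etree.ElementTree(issue_mets(row)).write(
>>             out, xml_declaration=True, encoding="UTF-8", pretty_print=True)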
>>
>> If hearing any more about our process sounds like it might be helpful,
>> please contact me--we'd be happy to talk.
>>
>> Thanks, and good luck,
>>
>> Stephanie Williams
>> North Carolina Digital Heritage Center
>> http://www.digitalnc.org
>> [log in to unmask]
>>
>> *It works for us. It's time-intensive, but we've been experimenting with
>> tools to help us generate page-level data while we scan, which is a huge
>> help. We preserve the spreadsheets alongside the end-result batches; when
>> changes are made, we make them in the spreadsheets and regenerate the
>> METS/manifests rather than edit by hand.
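>>
>> The manifest regeneration is the same idea: loop over the sheet and rewrite
>> the batch XML from scratch rather than touch it by hand. Roughly the sketch
>> below -- the element and attribute names here are approximate, so check them
>> against the batch.xml in a sample LC batch before borrowing any of this.
>>
>> from lxml import etree
>>
>> NDNP_NS = "http://www.loc.gov/ndnp"      # verify against a sample batch.xml
>>
>> def write_batch_xml(batch_name, rows, out_path="batch.xml"):
>>     """Rebuild the batch manifest from spreadsheet rows."""
>>     batch = etree.Element("{%s}batch" % NDNP_NS, nsmap={"ndnp": NDNP_NS},
>>                           name=batch_name)
>>     for row in rows:
>>         issue = etree.SubElement(batch, "{%s}issue" % NDNP_NS,
>>                                  lccn=row["lccn"],
>>                                  issueDate=row["issue_date"],
>>                                  editionOrder=row["edition_order"])
>>         issue.text = row["issue_xml_path"]  # relative path to the issue METS
>>     etree.ElementTree(batch).write(out_path, xml_declaration=True,
>>                                    encoding="UTF-8", pretty_print=True)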
>>
>> ------------------------------
>> *From:* Data, API, website, and code of the Chronicling America website [
>> [log in to unmask]] on behalf of Michael Bolton [
>> [log in to unmask]]
>> *Sent:* Monday, August 03, 2015 10:49 AM
>> *To:* [log in to unmask]
>> *Subject:* Deploying Chronam for local holdings
>>
>> Hello All,
>>
>> The Texas A&M University Libraries is working on a project to digitize
>> our campus newspapers and we believe Chronam would be a great system for
>> viewing and managing the collection.  We have the viewer installed and have
>> ingested a couple of sample batches and the system appears to be working
>> very well.  We would now like to start adding our local content.
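>>
>> (For what it's worth, the sample-batch ingests so far have just used
>> chronam's load_batch management command, roughly as in the snippet below;
>> the batch path is a placeholder for our own layout, and the exact
>> invocation -- manage.py versus django-admin.py with a --settings flag --
>> will depend on how your instance is set up.)
>>
>> import subprocess
>>
>> # Batches we have staged locally after downloading them from LOC.
>> SAMPLE_BATCHES = [
>>     "/opt/chronam/data/batches/batch_sample_ver01/",   # placeholder path
>> ]
>>
>> for path in SAMPLE_BATCHES:
>>     # Same as running "python manage.py load_batch <path>" from the shell.
>>     subprocess.check_call(["python", "manage.py", "load_batch", path])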
>>
>> We are looking for some guidance on how to prepare batches for a local
>> ingest, that is, a non-NDNP submission, as I have learned it's called.  I
>> would be interested in hearing how other institutions prepare their batches
>> and just what is required for an ingest of a batch.  All our experience has
>> been with sample batches downloaded from LOC.  We have been using the
>> technical guidelines for the NDNP project as a roadmap and those have been
>> very helpful.
>>
>> We are starting with TIFFs and based on the information from the
>> guidelines, we are creating the compressed JPEG2000 files as well as the
>> OCR files.  If there are scripts or programs that help with this process,
>> such as appending the metadata to the JP2 files or creating the METS files,
>> I would be happy to hear about them.  I also believe we probably need to
>> update the MySQL database with information for our site, possibly the
>> "titles" table.  The folks at the University of Oregon Libraries have been
>> very helpful and they suggested I post to this list for any additional
>> information.
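>>
>> (In case it helps frame the question, our JP2 step right now is just a batch
>> conversion along the lines of the sketch below: Python driving OpenJPEG's
>> opj_compress. The paths and the -r compression setting are placeholders, not
>> the exact NDNP JP2 profile, and the metadata the guidelines want embedded in
>> the JP2s still has to be added in a separate pass.)
>>
>> import pathlib
>> import subprocess
>>
>> SRC = pathlib.Path("/data/tiff")    # staging area for the master TIFFs
>> DST = pathlib.Path("/data/jp2")     # derivatives land here, mirroring SRC
>>
>> for tif in sorted(SRC.glob("**/*.tif")):
>>     jp2 = DST / tif.relative_to(SRC).with_suffix(".jp2")
>>     jp2.parent.mkdir(parents=True, exist_ok=True)
>>     subprocess.check_call([
>>         "opj_compress",
>>         "-i", str(tif),
>>         "-o", str(jp2),
>>         "-r", "8",      # single layer at roughly 8:1; tune against the specs
>>     ])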
>>
>> Thanks.
>>
>> --
>> Michael W. Bolton  |  Assistant Dean, Digital Initiatives
>> Sterling C. Evans Library  |  Texas A&M University
>> 5000 TAMU  |  College Station, TX  77843-5000
>> Ph: 979-845-5751  |  [log in to unmask]
>> http://library.tamu.edu
>>
>
>
>
> --
> Michael W. Bolton  |  Assistant Dean, Digital Initiatives
> Sterling C. Evans Library  |  Texas A&M University
> 5000 TAMU  |  College Station, TX  77843-5000
> Ph: 979-845-5751  |  [log in to unmask]
> http://library.tamu.edu
>