Hello METS community!

 

I’ve been pondering how to shape our METS structMaps when we want to give our users access to multiple representations of the same work.  I’m focused on the problem of delivering files that represent parts of the work (e.g. page images) vs. files that represent the entire work (e.g. a PDF of the entire work).  For example, we might have the following file formats/representations for a single work:

1.       Master TIFFs at the page level

2.       JPEG 2000 representation at the page level

3.       JPG at the page level

4.       PDF at the page level

5.       PDF of the entire work

6.       EPUB version

 

It seems to make perfect sense to handle all of the files/representations that have parallel content in a single structMap.  Referring to the numbered list above, that would mean that for each page we group the files for 1-4 together in a single div.

 

We are currently doing this in our Digitool implementation using structMap divs that look like this:

    <mets:div TYPE="image" LABEL="bh007462" ORDER="4">
      <mets:fptr FILEID="tif00002"/>
      <mets:fptr FILEID="jp200002"/>
      <mets:fptr FILEID="jpg00002"/>
    </mets:div>

 

The presentation layer handles this well, letting us hide the master tiff and providing “view option” icons for the jpg and j2k above the METS viewer: http://dcollections.bc.edu:80/webclient/DeliveryManager?application=DIGITOOL-3&pid=125467 .  We offer two “views” because the jpeg2000 is of higher quality but is protected from downloading by the j2k viewer, while the jpg is lower res and can be downloaded.

 

My concern is how to structure the METS to handle representations that are not exactly parallel to the page images, such as an epub or a pdf of the entire work.

 

It seems to me there are three options:

1.       Include the pdf for the entire work in the same structMap as the page images.  If you look at Brown’s “Brown University "Page Turner" Periodical Issue with PDF” example on the LOC METS site, a single structMap presents one second-level div that holds the pdf for the entire work and another second-level div that holds the individual pages.  See http://dl.lib.brown.edu/metsrecords/1165311452312500.xml This approach certainly works, and we have used it in our system.  I do see some drawbacks.  I think it is cleaner from a data structure standpoint to have a structMap proceed logically through an entire work once and only once.  Here’s an example from our implementation where we use a single structMap that holds (1) pdfs for each individual article in an issue as well as (2) a pdf for the entire issue: http://dcollections.bc.edu:80/webclient/DeliveryManager?application=DIGITOOL-3&pid=66263

 

2.       Exclude the pdf for the entire work from the METS container that represents the page image version.  We’ve done this several times, using our system to link separate METS objects for pdfs that represent (1) entire issues and (2) individual pages.  I’m not crazy about this approach, as I’d like to use the METS to wrap up all representations of a work.  Here is how this looks in our system when we have separate METS for the full issues and the “pages” representation: http://dcollections.bc.edu/R/?func=collections-result&collection_id=1764

 

 

3.       Represent the pdf for the entire work and the pdf for the “pages” version in separate physical structMaps.  This is the approach that seems most appealing to me at this point.  You get all representations into a single METS, and each structMap allows the user to navigate logically through an entire work.  It is somewhat like the Internet Archive’s approach to presenting multiple formats of a single book.  Here is an example of this approach in action: http://dcollections.bc.edu:80/webclient/DeliveryManager?application=DIGITOOL-3&pid=156297 .  The “current view” drop down list will take the user to a second physical structMap which contains the entire pdf.  It is a little complicated to think of the pages structMap containing multiple manifestations of each page (jpg, tif, j2k, tei, etc.) while the entire work structMap contains an entirely different manifestation.  I guess it’s not so bad if you think of each structMap as including representations with parallel physical structures, provided the presentation software works with this approach.
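
To make option 3 concrete, here is a rough sketch of what a METS with two physical structMaps might look like.  It is an illustration under assumptions, not a copy of our actual Digitool records: the fileGrp USE values, the file ID pdf00001, the LABEL values, the div TYPE values and the example.org URLs are all made up for the example.  Only the tif/jp2/jpg FILEID pattern comes from the div shown earlier.

```xml
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Files for one page plus one whole-work PDF (locations are hypothetical) -->
  <mets:fileSec>
    <mets:fileGrp USE="master">
      <mets:file ID="tif00002" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/files/tif00002.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="view">
      <mets:file ID="jp200002" MIMETYPE="image/jp2">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/files/jp200002.jp2"/>
      </mets:file>
      <mets:file ID="jpg00002" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/files/jpg00002.jpg"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="entire work">
      <!-- pdf00001 is an assumed ID for the PDF of the whole work -->
      <mets:file ID="pdf00001" MIMETYPE="application/pdf">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/files/pdf00001.pdf"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>

  <!-- structMap 1: walks through the work page by page; each page div
       points to every parallel representation of that page -->
  <mets:structMap TYPE="physical" LABEL="Page images">
    <mets:div TYPE="issue" LABEL="Entire issue">
      <mets:div TYPE="image" LABEL="bh007462" ORDER="4">
        <mets:fptr FILEID="tif00002"/>
        <mets:fptr FILEID="jp200002"/>
        <mets:fptr FILEID="jpg00002"/>
      </mets:div>
      <!-- further page divs would follow here -->
    </mets:div>
  </mets:structMap>

  <!-- structMap 2: walks through the same work once, as a single PDF -->
  <mets:structMap TYPE="physical" LABEL="PDF of entire work">
    <mets:div TYPE="issue" LABEL="Entire issue">
      <mets:fptr FILEID="pdf00001"/>
    </mets:div>
  </mets:structMap>

</mets:mets>
```

Option 1 would differ only in the structural part: both the whole-work PDF div and the page divs would sit under the root div of one structMap, so that structMap would pass through the work twice.  Option 2 would split the two structMaps above into two separate METS documents.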

 

I’m wondering how others have handled this situation and which of the three approaches seems the most rational.

 

Thanks,

 

Betsy

 

 

Betsy McKelvey

Digital Collections Librarian

617-552-1989