Reply from Joachim forwarded due to mail server problem. - Jukka Kervinen / National Library of Finland.
----------------------------
From: Joachim Bauer [mailto:[log in to unmask]]
Sent: 27 July 2016 15:42
To: Metadata Encoding and Transmission Standard
<[log in to unmask]>
Subject: RE: structMap and complex items
Hi Stuart,
In the METAe project, a METS profile was worked out that includes a physical structMap to represent the page order and a logical structMap to represent the logical sequence (reading order, footnote handling, content spread over more than one "physical" page).
Rather than further structMaps, we use multiple file groups to point to the referenced files of the object (such as links to EPUB, PDF and other derivatives, which may even have overlapping or identical content). The IDs of these files can then be used within the structMaps at the desired levels to point to parts of the document, or even to a file representing the full document, such as a multi-page PDF containing the full content of the item.
Here is a sample snippet showing the referencing from the parallel structMaps. Look for the PDF ID "PDF00001" as an example of the different reference levels: the whole document for the PDF versus the page level for the other files.
<mets>
....
<fileSec>
<fileGrp ID="IMGGRP" USE="Images">
<file ID="IMG00001" CREATED="2010-08-13T10:59:02" ADMID="IMGPARAM00001" MIMETYPE="image/jp2" SEQ="1" CHECKSUM="d901e7736284261fde3cb0af2594af23" CHECKSUMTYPE="MD5" SIZE="17460979">
<FLocat LOCTYPE="URL" xlink:href="file://./master/DDD_010759254_001.jp2"/>
</file>
<file ID="IMG00002" CREATED="2010-08-13T11:00:45" ADMID="IMGPARAM00002" MIMETYPE="image/jp2" SEQ="2" CHECKSUM="0d2e4a090ba376d542b7f50c49a0a049" CHECKSUMTYPE="MD5" SIZE="16454853">
<FLocat LOCTYPE="URL" xlink:href="file://./master/DDD_010759254_002.jp2"/>
</file>
</fileGrp>
<fileGrp ID="ALTOGRP" USE="Text">
<file ID="ALTO0001" CREATED="2010-08-13T11:00:57" MIMETYPE="text/xml" CHECKSUM="d89476e739b0a6d2c759e2c9c965d043" CHECKSUMTYPE="MD5" SIZE="1213513">
<FLocat LOCTYPE="URL" xlink:href="file://./alto/DDD_010759254_001_alto.xml"/>
</file>
<file ID="ALTO0002" CREATED="2010-08-13T11:00:58" MIMETYPE="text/xml" CHECKSUM="a1fc7665de584b78fcab8ef9c6a47492" CHECKSUMTYPE="MD5" SIZE="1295178">
<FLocat LOCTYPE="URL" xlink:href="file://./alto/DDD_010759254_002_alto.xml"/>
</file>
</fileGrp>
<fileGrp ID="TECHMDGRP" USE="Technical Metadata">
<file ID="TMD00001" CREATED="2010-08-13T11:02:01" MIMETYPE="text/xml" SEQ="1" CHECKSUM="8029823c6b35ea840b67eca03e9b788a" CHECKSUMTYPE="MD5" SIZE="3666">
<FLocat LOCTYPE="URL" xlink:href="file://./technicalmetadata/DDD_010759254_001_techmeta.xml"/>
</file>
<file ID="TMD00002" CREATED="2010-08-13T11:02:01" MIMETYPE="text/xml" SEQ="2" CHECKSUM="d7a2c1334cba540ed68a57b4561fc536" CHECKSUMTYPE="MD5" SIZE="3666">
<FLocat LOCTYPE="URL" xlink:href="file://./technicalmetadata/DDD_010759254_002_techmeta.xml"/>
</file>
</fileGrp>
<fileGrp ID="PDFGRP" USE="access">
<file ID="PDF00001" CREATED="2010-08-13T11:02:00" MIMETYPE="application/pdf" SEQ="1" CHECKSUM="f60cae4377f4c81d34c2878a174e12a8" CHECKSUMTYPE="MD5" SIZE="1666394">
<FLocat LOCTYPE="URL" xlink:href="file://./pdf/DDD_010759254.pdf"/>
</file>
</fileGrp>
</fileSec>
<structMap LABEL="Physical Structure" TYPE="PHYSICAL">
<div ID="DIVP1" DMDID="DCMD_ELEC DCMD_PRINT" LABEL="Unknown" TYPE="Newspaper">
<fptr>
<area BETYPE="IDREF" FILEID="PDF00001"/>
</fptr>
<div ID="DIVP2" ORDER="1" ORDERLABEL="1" TYPE="PAGE">
<fptr>
<par>
<area FILEID="IMG00001"/>
<area FILEID="ALTO0001" BETYPE="IDREF" BEGIN="P1"/>
</par>
</fptr>
</div>
<div ID="DIVP3" ORDER="2" ORDERLABEL="2" TYPE="PAGE">
<fptr>
<par>
<area FILEID="IMG00002"/>
<area FILEID="ALTO0002" BETYPE="IDREF" BEGIN="P2"/>
</par>
</fptr>
</div>
</div>
</structMap>
<structMap LABEL="Logical Structure" TYPE="LOGICAL">
<div ID="DIVL1" TYPE="Newspaper" LABEL="Nieuwsblad van Friesland : Hepkema's courant no. 3 11.01.1943">
<div ID="DIVL2" TYPE="VOLUME" DMDID="DCMD_ELEC DCMD_PRINT" LABEL="Nieuwsblad van Friesland : Hepkema's courant no. 3 11.01.1943">
<div ID="DIVL3" TYPE="ISSUE" DMDID="DCMD_ISSUE1" LABEL="Nieuwsblad van Friesland : Hepkema's courant no. 3 11.01.1943">
<fptr>
<area BETYPE="IDREF" FILEID="PDF00001"/>
</fptr>
<div ID="DIVL4" TYPE="TITLE_SECTION">
<div ID="DIVL5" TYPE="TEXTBLOCK" ORDER="1">
<fptr>
<area BETYPE="IDREF" FILEID="ALTO0001" BEGIN="P1_TB00001"/>
</fptr>
</div>
<div ID="DIVL6" TYPE="TEXTBLOCK" ORDER="2">
<fptr>
<area BETYPE="IDREF" FILEID="ALTO0001" BEGIN="P1_TB00002"/>
</fptr>
...
</mets>
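As a rough illustration of how these references resolve (a sketch only, not part of docWorks or Rosetta), the following Python walks one structMap and looks up each FILEID in the fileSec. The sample document is a trimmed stand-in for the snippet above with namespace declarations added, which is an assumption since the snippet omits them; the function names file_locations and div_references are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"mets": METS_NS}

# Trimmed stand-in for the snippet above. The namespace declarations are an
# assumption: the original snippet leaves them out.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec>
    <fileGrp ID="IMGGRP" USE="Images">
      <file ID="IMG00001">
        <FLocat LOCTYPE="URL" xlink:href="file://./master/DDD_010759254_001.jp2"/>
      </file>
    </fileGrp>
    <fileGrp ID="PDFGRP" USE="access">
      <file ID="PDF00001">
        <FLocat LOCTYPE="URL" xlink:href="file://./pdf/DDD_010759254.pdf"/>
      </file>
    </fileGrp>
  </fileSec>
  <structMap TYPE="PHYSICAL">
    <div ID="DIVP1" TYPE="Newspaper">
      <fptr><area FILEID="PDF00001"/></fptr>
      <div ID="DIVP2" ORDER="1" TYPE="PAGE">
        <fptr><par><area FILEID="IMG00001"/></par></fptr>
      </div>
    </div>
  </structMap>
</mets>"""


def file_locations(root):
    """Map each file ID in the document to the xlink:href of its FLocat."""
    locations = {}
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        flocat = file_el.find("mets:FLocat", NS)
        if flocat is not None:
            locations[file_el.get("ID")] = flocat.get(f"{{{XLINK_NS}}}href")
    return locations


def div_references(root, struct_type):
    """Return {div ID: [hrefs]} for the fptr children of each div in the
    structMap whose TYPE equals struct_type."""
    locations = file_locations(root)
    result = {}
    for struct_map in root.iterfind("mets:structMap", NS):
        if struct_map.get("TYPE") != struct_type:
            continue
        for div in struct_map.iter(f"{{{METS_NS}}}div"):
            hrefs = []
            # Only fptr elements that are direct children belong to this div.
            for fptr in div.findall("mets:fptr", NS):
                for area in fptr.iter(f"{{{METS_NS}}}area"):
                    hrefs.append(locations.get(area.get("FILEID")))
            result[div.get("ID")] = hrefs
    return result


refs = div_references(ET.fromstring(SAMPLE), "PHYSICAL")
for div_id, hrefs in refs.items():
    print(div_id, hrefs)
```

In this stand-in, the top-level div resolves to the whole-document PDF while the page-level div resolves to a single page image, which mirrors the two reference levels in the snippet.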
More information about how we handle physical and logical structmaps in docWorks can be found here:
http://www.loc.gov/standards/mets/presentations/METS-Workshop_dWProfile_2014-09-Jo.pptx
Finally:
We implemented a METS profile for Rosetta for the National Library of Israel in 2014 and created a conversion for them from their former METS profile.
We might be able to provide you with a sample METS file but we have to check with NLI first if it is fine with them.
Kind regards,
Jo
Joachim Bauer
Senior System Engineer, CCS Content Conversion Specialists GmbH
> -----Original Message-----
> From: Metadata Encoding and Transmission Standard
> [mailto:[log in to unmask]] On Behalf Of Stuart Yeates
> Sent: 18 July 2016 1:20
> To: [log in to unmask]
> Subject: [METS] structMap and complex items
>
> We are building some complex SIPs for ingestion in a METS-based repository
> and I have some questions about what the structMap should look like. We
> are archiving both digitised print works and a long-running website on which
> they have been previously hosted.
>
> The items include multiple representations of the content, monolithic
> representations (ePub, PDF and TEI/XML), piece-wise logical representations
> (HTML) and piece-wise physical representations (page images). Generating
> most of the METS seems pretty straight-forward, but I'm struggling with the
> structMap.
>
> In particular I'm struggling with the question of whether I should:
>
> * use a single structMap with each of the monolithic representations plus
> detailed breakdowns of the piece-wise representations (in which case the
> question is whether this is a physical or a logical structMap?);
>
> * use a structMap per representation (in which case the question is how I
> indicate that these are parallel representations of the same intellectual
> content); or
>
> * use a top-level structMap to describe each of the representations and then
> separate structMaps for each piece-wise representation (in which case the
> question is whether I need to completely enumerate the structMap or can I
> include them by reference)?
>
> Thoughts?
>
> I would really appreciate pointers to sample METS files with multiple
> representations of the same content such as these.
>
> The target repository is Rosetta, but we're at least as interested in
> correctness as compatibility with the current solution.
>
> The website is http://nzetc.victoria.ac.nz/ / formerly http://www.nzetc.org/
>
> cheers
> stuart
>
> --
> I have a new phone number: 04 463 5692
> https://www.facebook.com/VUWLibrary /
> https://www.facebook.com/TKMPC