Hi Morgan

Yes it is clear, thanks for explaining. I agree that ID/IDREFs, filenames
etc. shouldn't carry any semantic meaning. An alternative to using child
divs in structMap to indicate the form of digitized content might be to
use the fileGrp USE attribute in fileSec, since the files are grouped
that way anyway and fptr points to them. But I haven't really thought
this through or discussed it with the rest of the team yet; I'm still
gathering information.
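
A minimal sketch of the fileGrp USE alternative, for illustration only. It
assumes a METS document whose fileSec groups files under fileGrp elements
with USE values of "image" and "alto" (hypothetical values, not taken from
the draft profile), and whose structMap has fptr elements directly under
the page div. The function name fptrs_by_use and the sample document are
likewise made up for this example.

```python
import xml.etree.ElementTree as ET

# The METS namespace; the "mets" prefix matches the examples in this thread.
NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical sample: the USE values "image" and "alto" are assumptions.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp USE="image">
      <mets:file ID="IMG00001_ex10"/>
    </mets:fileGrp>
    <mets:fileGrp USE="alto">
      <mets:file ID="ALT00001_ex10"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="news:issue" DMDID="DMD_issue_ex10">
      <mets:div TYPE="news:page">
        <mets:fptr FILEID="IMG00001_ex10"/>
        <mets:fptr FILEID="ALT00001_ex10"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def fptrs_by_use(mets_xml, use):
    """Return the FILEIDs of structMap fptrs whose file sits in a fileGrp with the given USE."""
    root = ET.fromstring(mets_xml)

    # Map each file ID to the USE of its enclosing fileGrp.
    use_of = {}
    for grp in root.iterfind(".//mets:fileGrp", NS):
        for f in grp.iterfind("mets:file", NS):
            use_of[f.get("ID")] = grp.get("USE")

    # Keep only the fptrs that resolve to a file with the requested USE.
    return [
        p.get("FILEID")
        for p in root.iterfind(".//mets:structMap//mets:fptr", NS)
        if use_of.get(p.get("FILEID")) == use
    ]


print(fptrs_by_use(SAMPLE, "alto"))
```

With this approach the content type is read from fileSec rather than from
the FILEID string itself, at the cost of one extra lookup per fptr.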

Bronwyn Lee
Business Analyst, Newspaper Digitisation Project
National Library of Australia
[log in to unmask]
 

-----Original Message-----
From: Metadata Encoding and Transmission Standard [mailto:[log in to unmask]]
On Behalf Of Morgan V. Cundiff
Sent: Tuesday, 6 February 2007 3:45 AM
To: [log in to unmask]
Subject: Re: [METS] LC METS Profile for Historical Newspapers [Draft]
structMap question

Hi Bronwyn,

I am glad you are taking a close look at our draft profile.

I appreciate your question (i.e. "In other words only have divs for
issue, page and pageRegion?"), as it is something I wrestled with while
creating the draft.

First, I will say that I am glad that you accept (as far as I can
tell) the most important feature of the profile, i.e. that a newspaper
can be modeled with three physical entities, which are: issue, page and
pageRegion.

Second, I will say that yes, you probably could, in a given project, do
as you are suggesting, which is to do away with the other div types
(news:image and news:alto). However, the rationale for the other div
types is this: basically, they create div subelements by type of content
for each of the three physical entity div types already named (issue,
page and pageRegion). By "type of content" I mean either the image
and/or alto files for a page, or the alto file for a page region. This
approach is a bit more verbose, as you point out, but I think the
advantage is that it makes it possible to read (and manipulate) the
structMap a little more easily. For instance, if one wanted to process
only the alto files at the pageRegion level, the suggested approach
makes that very easy to do. We try to avoid relying on ID/IDREF values,
filenames, etc. to carry any semantic meaning (e.g. identifying alto
files by looking at FILEID="ALT00001_ex10").
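
To illustrate the point about processing only the alto files at the
pageRegion level, here is a minimal sketch using Python's standard
xml.etree.ElementTree. It selects by div TYPE alone and never inspects
the FILEID strings. The sample structMap follows the nested-div example
quoted below, with the METS namespace declaration added so that it
parses; the function name region_alto_areas is hypothetical.

```python
import xml.etree.ElementTree as ET

# The METS namespace; the "mets" prefix matches the examples in this thread.
NS = {"mets": "http://www.loc.gov/METS/"}

# structMap modeled on the nested-div example from the thread.
SAMPLE = """<mets:structMap xmlns:mets="http://www.loc.gov/METS/">
  <mets:div TYPE="news:issue" DMDID="DMD_issue_ex10">
    <mets:div TYPE="news:page">
      <mets:div TYPE="news:image">
        <mets:fptr FILEID="IMG00001_ex10"/>
      </mets:div>
      <mets:div TYPE="news:alto">
        <mets:fptr FILEID="ALT00001_ex10"/>
      </mets:div>
      <mets:div TYPE="news:pageRegion" DMDID="DMD_article01_ex10">
        <mets:div TYPE="news:alto">
          <mets:fptr>
            <mets:area FILEID="ALT00001_ex11" BEGIN="P1_TB00005"/>
          </mets:fptr>
        </mets:div>
      </mets:div>
      <mets:div TYPE="news:pageRegion" DMDID="DMD_article02_ex10">
        <mets:div TYPE="news:alto">
          <mets:fptr>
            <mets:area FILEID="ALT00001_ex11" BEGIN="P1_TB00024"/>
          </mets:fptr>
        </mets:div>
      </mets:div>
    </mets:div>
  </mets:div>
</mets:structMap>"""


def region_alto_areas(struct_map_xml):
    """Return (FILEID, BEGIN) pairs for alto areas that sit under pageRegion divs."""
    root = ET.fromstring(struct_map_xml)
    # Select purely on the TYPE of the enclosing divs.
    path = (
        ".//mets:div[@TYPE='news:pageRegion']"
        "/mets:div[@TYPE='news:alto']/mets:fptr/mets:area"
    )
    return [(a.get("FILEID"), a.get("BEGIN")) for a in root.findall(path, NS)]


for file_id, begin in region_alto_areas(SAMPLE):
    print(file_id, begin)
```

The page-level alto file (ALT00001_ex10) is not returned, because its
news:alto div is a child of the page div rather than of a pageRegion div.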

Hope this is clear.

Morgan Cundiff


On Mon, 5 Feb 2007, Bronwyn Lee wrote:

> Hello all
> 
> The National Library of Australia is embarking on its Newspaper 
> Digitisation Project and I have been looking at the LC METS Profile 
> for Historical Newspapers [Draft] 
> (http://www.loc.gov/standards/mets/test/ndnp/00000010.html).
> 
> I have a question about part of example 10 under structMap Requirement
> 2:
> 
> <mets:structMap>
> -<mets:div TYPE="news:issue" DMDID="DMD_issue_ex10">
> --<mets:div TYPE="news:page">
> ---<mets:div TYPE="news:image">
> ----<mets:fptr FILEID="IMG00001_ex10" />
> ---</mets:div>
> ---<mets:div TYPE="news:alto">
> ----<mets:fptr FILEID="ALT00001_ex10" />
> ---</mets:div>
> --</mets:div>
> -</mets:div>
> </mets:structMap>
> 
> Are the nested divs for news:image and news:alto necessary? Is there a
> reason why you could not have:
> 
> <mets:structMap>
> -<mets:div TYPE="news:issue" DMDID="DMD_issue_ex10">
> --<mets:div TYPE="news:page">
> ---<mets:fptr FILEID="IMG00001_ex10" />
> ---<mets:fptr FILEID="ALT00001_ex10" />
> --</mets:div>
> -</mets:div>
> </mets:structMap>
> 
> The reason I ask is that it seems a bit awkward when you add pageRegion.
> For example the page might have two page regions and there are image 
> and alto files for the page as a whole as well as alto files for each 
> of the page regions, as below:
> 
> <mets:structMap>
> -<mets:div TYPE="news:issue" DMDID="DMD_issue_ex10">
> --<mets:div TYPE="news:page">
> 
> ---<mets:div TYPE="news:image">
> ----<mets:fptr FILEID="IMG00001_ex10" />
> ---</mets:div>
> ---<mets:div TYPE="news:alto">
> ----<mets:fptr FILEID="ALT00001_ex10" />
> ---</mets:div>
> ---<mets:div TYPE="news:pageRegion" DMDID="DMD_article01_ex10">
> ----<mets:div TYPE="news:alto">
> -----<mets:fptr>
> ------<mets:area FILEID="ALT00001_ex11" BEGIN="P1_TB00005" />
> -----</mets:fptr>
> ----</mets:div>
> ---</mets:div>
> ---<mets:div TYPE="news:pageRegion" DMDID="DMD_article02_ex10">
> ----<mets:div TYPE="news:alto">
> -----<mets:fptr>
> ------<mets:area FILEID="ALT00001_ex11" BEGIN="P1_TB00024" />
> -----</mets:fptr>
> ----</mets:div>
> ---</mets:div>
> 
> --</mets:div>
> -</mets:div>
> </mets:structMap>
> 
> Here, news:image div for the page is at the same level in the 
> hierarchy as the news:pageRegion divs, which doesn't seem quite right.
> You could give the pageRegion divs an order attribute value of 1 and 2
> but the image div at the same level in the hierarchy wouldn't have an 
> order attribute. Wouldn't it be better and shorter to have:
> 
> <mets:structMap>
> -<mets:div TYPE="news:issue" DMDID="DMD_issue_ex10">
> --<mets:div TYPE="news:page">
> ---<mets:fptr FILEID="IMG00001_ex10" />
> ---<mets:fptr FILEID="ALT00001_ex10" />
> 
> ---<mets:div TYPE="news:pageRegion" DMDID="DMD_article01_ex10">
> ----<mets:fptr>
> -----<mets:area FILEID="ALT00001_ex11" BEGIN="P1_TB00005" />
> ----</mets:fptr>
> ---</mets:div>
> 
> ---<mets:div TYPE="news:pageRegion" DMDID="DMD_article02_ex10">
> ----<mets:fptr>
> -----<mets:area FILEID="ALT00001_ex11" BEGIN="P1_TB00024" />
> ----</mets:fptr>
> ---</mets:div>
> 
> --</mets:div>
> -</mets:div>
> </mets:structMap>
> 
> In other words only have divs for issue, page and pageRegion?
> 
> 
> Bronwyn Lee
> Business Analyst, Newspaper Digitisation Project
> National Library of Australia
> [log in to unmask]
> 
>