PIG@LISTSERV.LOC.GOV


Subject:

Re: How to document large image sequences in Premis v3

From:

Kieran O Leary <[log in to unmask]>

Reply-To:

PREMIS Implementors Group Forum <[log in to unmask]>

Date:

Fri, 16 Sep 2016 18:46:39 +0100


Hello,

Some updates since my last email. We are finding PREMIS to be
incredibly helpful with documenting Preservation Metadata. It is also
a very helpful guide in terms of what should be documented, and how
objects link to one another.

I'd like to provide our current use case, followed by some
observations on my previous questions in my initial post.

Firstly, here are two basic events involved in making a digital
representation of a film:

Event 1. 35mm combined optical print is scanned to 16-bit TIFF,
overscanned to include the combined optical track and perforations.
Event 2. Using AEO-Light, a PCM/WAV file is extracted from the TIFFS.
Some scanners such as the Blackmagic Cintel use DaVinci Resolve to
perform this action.

As our preferred post production software only uses DPX, the following
basic events are involved in creating a restored version:

Event A. The TIFFS created in Event 1 are transcoded losslessly to a
new 16-bit DPX sequence via ffmpeg. AVID only allows for a lossy
import of the TIFFS, so working with DPX is preferable.
Event B. PCM/WAV created in Event 2 is restored in RX5/ProTools
creating a new PCM/WAV file.
Event C. Colour correction and cropping occurs in AVID/Baselight.
Event D. A new DPX sequence and separate WAV are exported, and the
16-bit DPX created in Event A is deleted.
Event E. In the future, the DPX+WAV may be converted into a single
FFV1 image stream in a Matroska container.
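
As a rough illustration of Event A, here is a minimal Python sketch that
builds and runs an ffmpeg call reading a numbered TIFF sequence and
writing a numbered DPX sequence. The filename patterns are hypothetical,
and the sketch only assumes ffmpeg is on the PATH. It does not prove the
transcode is lossless: the bit depth of the output would still need to
be checked, for example by comparing per-frame checksums of the source
and the result.

```python
import subprocess


def tiff_to_dpx_command(tiff_pattern, dpx_pattern):
    """Build an ffmpeg argument list for a TIFF-to-DPX sequence transcode.

    Both patterns are printf-style numbered filenames (hypothetical here),
    e.g. "scan_%06d.tiff". The output format is inferred by ffmpeg from
    the .dpx extension.
    """
    return ["ffmpeg", "-i", tiff_pattern, dpx_pattern]


def transcode(tiff_pattern, dpx_pattern):
    """Run the transcode; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(tiff_to_dpx_command(tiff_pattern, dpx_pattern), check=True)


# Building the command does not run ffmpeg, so it can be logged or
# recorded as event detail before anything is executed.
cmd = tiff_to_dpx_command("scan_%06d.tiff", "mezzanine_%06d.dpx")
```

Keeping the command as a list means the exact arguments can be stored
alongside the event that documents the transcode.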

It's an awkward workflow, but our hands are tied. Our 12-bit scanner
only has 10-bit options for DPX, so in order to get all 12 bits, we
need to use the 16-bit TIFF option.

So we create two Representations of the one Intellectual Entity that
we intend to preserve:
Representation 1 (Events 1 and 2) = Untouched TIFFS straight from the
scanner, and the separate AEO-Light WAV file, which may be converted
to FFV1/Matroska.
Representation 2 (Events A to E) = Graded/Corrected DPX sequence and a
separate corrected PCM/WAV file, which may be converted to
FFV1/Matroska.

Here are some updates on each of my 4 questions in the previous email,
along with three new questions:

1. After looking more at the definitions of Representations in the
PREMIS data dictionary, there seems to be no question that the
Representation must be the TIFFS and the WAVS, as both the audio and
the image are required for a complete rendition of the Intellectual
Entity. In my current work-in-progress PREMIS generation script, there
is a single Representation object followed by individual file objects
for each TIFF and the WAV.
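
The layout described in point 1 could be sketched roughly as below,
using Python's standard xml.etree.ElementTree. This is not the actual
generation script. The identifier values and the "local" identifier
type are placeholders, and the output is deliberately incomplete: a
real file object also needs objectCharacteristics, so this will not
validate against the PREMIS v3 schema as it stands.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "http://www.loc.gov/premis/v3"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("premis", PREMIS_NS)
ET.register_namespace("xsi", XSI_NS)


def q(tag):
    """Return a PREMIS tag name qualified with its namespace."""
    return "{%s}%s" % (PREMIS_NS, tag)


def add_object(parent, object_type, identifier):
    """Append a premis:object of the given type with one identifier."""
    obj = ET.SubElement(parent, q("object"))
    obj.set("{%s}type" % XSI_NS, "premis:" + object_type)
    ident = ET.SubElement(obj, q("objectIdentifier"))
    # "local" is a placeholder identifier type, not a recommendation.
    ET.SubElement(ident, q("objectIdentifierType")).text = "local"
    ET.SubElement(ident, q("objectIdentifierValue")).text = identifier
    return obj


def build_premis(representation_id, filenames):
    """One Representation object, then one file object per filename."""
    root = ET.Element(q("premis"), {"version": "3.0"})
    add_object(root, "representation", representation_id)
    for name in filenames:
        add_object(root, "file", name)
    return root


root = build_premis(
    "representation-001",
    ["frame_000001.tiff", "frame_000002.tiff", "audio.wav"],
)
xml_text = ET.tostring(root, encoding="unicode")
```

With 150,000 frames the same loop produces 150,001 file objects, which
is where the concern about document size comes from.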


2. This issue still seems relevant to me. It would seem that some
objectCharacteristics information for a representation would be
valuable to have, especially overall file size. The note on fixity
information with regard to Representations on page 59 is interesting.
It says that this information should be recorded on a file level, as
the information relates to individual files. Storage information,
however, is applicable to representations in PREMIS. Should it also be
said that storage relates to files, rather than representations?
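
Whatever the answer on where to record it, the overall size itself is
cheap to compute from the files. A minimal sketch follows, in which the
function name is made up for illustration and the list of paths is
assumed to be everything that makes up the representation:

```python
import os


def representation_size(paths):
    """Return the combined size in bytes of every file in a representation."""
    return sum(os.path.getsize(path) for path in paths)
```

Where that number should then live in the PREMIS document is the open
question above; the sketch only produces the value.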

3. In our PREMIS implementation meetings in the IFI, we haven't yet
discussed Events and Agents in as much detail as Objects. I am still
curious about the best way to link agents and events to objects. Some
events will include: Creation, Message Digest Calculation, Fixity,
Deletion, Compression etc. It would seem to be most convenient for the
linkingObjectIdentifier to link to the Representation. There are some
occasions where an event only relates to either the image sequence or
the separate WAV, but not both. In this case, it would appear best to
link on a File level, rather than a Representation level. Transcoding
only the image sequence, but not the WAV, would also appear to require
file-level documentation, so would this require one event, with
linking identifiers to each file (anywhere from 500 to 150,000 files)?
I'm not sure how else to do it. Multiple fixity checks over time would
generate a massive amount of documentation if recorded on a file
level.
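
To make the difference in scale concrete, here is a rough sketch of a
single event built with either one Representation-level link or one
link per file. The event identifiers, the "local" identifier type, the
event type string and the timestamp are all placeholders, and the
element is not a complete, schema-valid PREMIS event.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "http://www.loc.gov/premis/v3"
ET.register_namespace("premis", PREMIS_NS)


def q(tag):
    """Return a PREMIS tag name qualified with its namespace."""
    return "{%s}%s" % (PREMIS_NS, tag)


def build_event(event_id, event_type, date_time, object_ids):
    """Build one event with a linkingObjectIdentifier per object id."""
    event = ET.Element(q("event"))
    ident = ET.SubElement(event, q("eventIdentifier"))
    ET.SubElement(ident, q("eventIdentifierType")).text = "local"
    ET.SubElement(ident, q("eventIdentifierValue")).text = event_id
    ET.SubElement(event, q("eventType")).text = event_type
    ET.SubElement(event, q("eventDateTime")).text = date_time
    for object_id in object_ids:
        link = ET.SubElement(event, q("linkingObjectIdentifier"))
        ET.SubElement(link, q("linkingObjectIdentifierType")).text = "local"
        ET.SubElement(link, q("linkingObjectIdentifierValue")).text = object_id
    return event


# Representation-level: one link, regardless of sequence length.
rep_level = build_event(
    "event-001", "fixity check", "2016-01-01T00:00:00", ["representation-001"]
)

# File-level: one link per frame (500 frames here, the low end quoted above).
frames = ["frame_%06d.tiff" % number for number in range(1, 501)]
file_level = build_event("event-002", "fixity check", "2016-01-01T00:00:00", frames)
```

The file-level event carries 500 links at the low end and would carry
150,000 at the high end, repeated for every fixity check over time.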

4. I think that some sort of fixityExtension could be helpful, but I
currently just record checksums for each item in the 1.5.2 fixity
semantic unit and keep a separate manifest file. Perhaps this is
already covered in eventOutcomeDetailExtension.
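
For reference, a separate manifest of this kind can be produced with a
few lines of Python. The sketch below assumes MD5 and a "digest, two
spaces, path" line layout in the style of md5sum output; both are
assumptions for illustration, as the algorithm and manifest format are
not stated above.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(paths, manifest_path):
    """Write one 'digest  path' line per file, sorted by path."""
    with open(manifest_path, "w", encoding="utf-8") as manifest:
        for path in sorted(paths):
            manifest.write("%s  %s\n" % (file_md5(path), path))
```

The same per-file digests could feed both the manifest and the fixity
semantic unit, so the two never disagree.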

5. I have a new question with regard to the PREMIS v3 documentation.
There is a very useful map on page 9 displaying the relationships that
objects can have with each other. Looking at that map, there is no
arrow pointing from Representation to File. However, in the example of
relationshipSubType (1.13.2) on page 120, it looks like a
Representation can have a 'has root' relationship with the first file
in a sequence. I would assume that the first file could have a
reciprocal 'is root of' relationship. Am I correct in thinking that
there is a contradiction, or is that map just a visualisation tool
showing some of the possible relationships?

6. Another question with regards to documenting complex process
histories. In our case, if we want to document every step, we will
need to record information about objects that will not actually be
preserved. For example, the graded/restored file that ultimately gets
sent to preservation storage has been through several events that
result in the creation and deletion of new objects. Going back to the
previously described workflow involved in restoring the captured
TIFFS, the objects created in events A and D are deleted, and their
derivatives make their way to preservation storage instead. I am
thinking that it makes sense to retain information about these deleted
objects on our database. The preserved objects can link back to them
so that we can get an unbroken sense of the process history involved
in going from film to restored DPX/FFV1. Does this make sense or is
there some other way to document this?
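
The idea in point 6 can be illustrated with a small sketch in which
deleted objects stay in the database and each record points at the
object it was derived from. The identifiers, the "source" and "status"
field names and the record layout are all hypothetical; this is not our
database schema, only a way to show the unbroken chain.

```python
# Hypothetical records keyed by identifier. "source" names the object each
# one was derived from; "status" notes whether the object still exists.
OBJECTS = {
    "film-35mm": {"source": None, "status": "physical"},
    "tiff-scan": {"source": "film-35mm", "status": "preserved"},
    "dpx-16bit": {"source": "tiff-scan", "status": "deleted"},
    "dpx-graded": {"source": "dpx-16bit", "status": "deleted"},
    "ffv1-restored": {"source": "dpx-graded", "status": "preserved"},
}


def process_history(object_id, objects):
    """Walk the source links from an object back to the original film."""
    chain = []
    current = object_id
    while current is not None:
        chain.append((current, objects[current]["status"]))
        current = objects[current]["source"]
    return chain


history = process_history("ffv1-restored", OBJECTS)
```

If the two deleted DPX records were dropped, the walk from the restored
FFV1 would stop before reaching the TIFF scan, which is the gap that
keeping those records avoids.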

7. A question on relationships: The WAV is created from the TIFF
sequence, so it has a structural relationship to the TIFFS, and
possibly a derivation relationship. It is difficult to map the
structural relationship of the WAV to the TIFF when using the
recommended LOC relationshipSubTypes. In a sense, they are siblings,
as they are both ultimately derived from the same source film. The WAV
is actually created from the TIFF, so in this sense, the WAV also
appears to have a 'hasSource' relationship to the TIFFs. There are
several ways to document all this, but I wonder if some of them end up
contradicting the concept of a representation as defined by PREMIS. I
know that this is a niche use case, but perhaps there are similar
examples in other fields that could shed some light on our issue?

I look forward to discussing all this with the PREMIS community. It
would be great if anyone who is already documenting image sequences
via PREMIS could post to the thread as well.

Best regards,

Kieran O'Leary
IFI Irish Film Archive

P.S. I thought I'd include my initial email underneath...

>
Hello,

I am investigating/experimenting with PREMIS and I am trying to
automatically generate XML documents as items pass through our
workflows via Python scripts. I work in the Irish Film Archive, so we
generally handle self-digitised moving image material as well as
born-digital files. I hope that you can help me with some questions.

As we handle very large image sequences (approx 150,000 TIFF/DPX files
per film), I'm curious as to how to document these in PREMIS.

1. My main question is whether these sequences need to be documented
as a representation object for the whole image sequence, and then
perhaps each image in the sequence requires its own file object? This
would lead to a gigantic XML file, but I see the value in recording
this information on a file level. I notice that something similar
happens in your 'Animal Antics' example in the v3 documentation. Are
there any examples available of an image sequence documented like
this? On our regular database, we would view the whole sequence as one
object/package, and it would have one database record per sequence.

2. I also notice that objectCharacteristics is not applicable to
Representations, so I'm not sure how to document the overall file size
of the image sequence.

3. As for events, environments and agents, it would seem to make sense
to link all these to the single Representation object. I'd hate the
thought of having linking identifiers for all 150k files to a
'capture' event, or even multiple fixity check events over time.
Hopefully linking such events to the Representation object is
sufficient?

4. Initially I was wondering how to document fixity, as it makes most
sense to me to just include a separate checksum manifest within the
SIP/AIP. There does not appear to be a method within PREMIS to point
to an external file like this for fixity, such as 'fixityExtension'? I
suppose that this is only an issue when documenting a representation
object that contains multiple files, rather than documenting fixity
for a single file.

Any help on one or all of the questions would be greatly appreciated.

Kindest regards,

Kieran O'Leary.
