This is a response to Robin Wendler's and Dave Ackerman's email of May 16
regarding audio materials and the use of AES-developed audio decision
lists (ADLs) to add a layer of metadata when the Harvard team manages,
say, multi-track audio in "multi-file" form.
My short response to Harvard's interesting and complex proposal
is: whew! We are doing nothing like that and do not contemplate anything
like that in the near term. Our reformatting has been focused on
materials like folk music tapes, early radio broadcasts from transcription
discs, and modern LPs. That is, simple items that are very nicely
represented in a single file, mono or stereo, and for which time offsets
to the "cut" level work like a champ. In our structmaps, each cut gets a
labeled <div>, followed by a trio of fptrs to the master, service-high,
and service-low files. Each fptr uses the "area" tag to indicate the
begin and extent of the time and, in our viewers, this time information is
handed off to instruct the audio player to jump to the start point.
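For concreteness, here is a stripped-down sketch of one such structMap
entry; the file IDs, label, and times are invented for illustration, and
the fileSec that the FILEIDs would point to is omitted:

<structMap TYPE="logical">
  <div TYPE="side" LABEL="Folk music tape, side A">
    <div ORDER="3" TYPE="cut" LABEL="Cut 3: Pretty Polly">
      <!-- one fptr per version: master, service-high, service-low -->
      <fptr>
        <area FILEID="MASTER01" BETYPE="TIME" BEGIN="00:07:32" EXTENT="00:03:45" EXTTYPE="TIME"/>
      </fptr>
      <fptr>
        <area FILEID="SERVHI01" BETYPE="TIME" BEGIN="00:07:32" EXTENT="00:03:45" EXTTYPE="TIME"/>
      </fptr>
      <fptr>
        <area FILEID="SERVLO01" BETYPE="TIME" BEGIN="00:07:32" EXTENT="00:03:45" EXTTYPE="TIME"/>
      </fptr>
    </div>
  </div>
</structMap>

The same BEGIN/EXTENT pair is repeated under each fptr because the three
derivative files share a common timeline; the viewer can choose whichever
file suits the user and seek to the same offset.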
Will we need to do what Harvard discusses? I can imagine a hypothetical
instance: what if we acquired the collection of a modern recording artist
that included a 24-track tape from one of her recording sessions? To
preserve it, we might be forced to make 24 parallel files, together with
additional metadata (AES-ADL? SMIL? MPEG-7?). (Would we need playback
devices like digital audio workstations to actually play the multiple
files?) And we might wrap this all in a METS document. But preserving
this content in this way would be to take on a daunting long-term
management burden, beyond the normal management required of a simpler METS
instance. In considering this added layer of complexity, I am reminded of
my earlier puzzlement regarding certain specialized data elements in audio
techMD, e.g., first_sample_offset, first_valid_byte_block,
audio_block_size, and others. These are part of the proposed AES audio
administrative metadata and have been echoed in our provisional audioMD
extension schema.
I know that I would have difficulty choosing a course of action if we
encountered either of these hypotheticals in our shop, i.e., examples
of multi-track audio or of files in which we need to know where the valid
bytes are. We would ask: Shall we normalize the bitstreams in some way to
reduce the long-term management overhead? Or shall we keep exactly what
exists and rise to the challenge of tracking the extra layer or layers of
metadata, and the "meta-metadata" needed to render these extra layers
intelligible?
In considering long-term preservation, we have pondered and discussed (but
have not acted on) other instances in which normalization might be
considered. For example, the Prints and Photographs Division receives
architectural drawings, in the old days as ink-on-paper, today as CAD-CAM
files. Should we normalize and "fix" these drawings by making a
raster-TIFF version? Keep the proprietary CAD-CAM file? Both?
Tough calls, and more a matter of preservation policy than
technology. But I sure am glad someone is wrestling with these questions
on our collective behalf, and look forward to hearing about Harvard's
actual experience over the next few years.
Carl Fleischhauer
Library of Congress
PS: We have had a small test activity in which we have tried to implement
digiprov data capture in our evolving database software. The data is
harvested with an eye toward output into the digiprovMD schema on our web
site. This schema employs the elements and attributes identified in the
AES "process" document, although we cooked up the schema structure
ourselves (and no doubt our XML work can be improved upon). But our
implementation experience--in the database--suggested to us that this data
set is more complex than needed in our particular setting. We encountered
resistance from folks who were experimenting with inputting data.
Therefore, our team is working on a similarly conceived but streamlined
digiprov approach, together with creating some tools to aid data entry.
Once we get this cooked up well enough to share, we will provide
information to the METS community.