My not-so-brief thoughts on Thomas's message:
> b. The Representation Information
> In the Reference Model, representation information
> is what is needed to
> make the data object comprehensible to
> a member of the target user group
> (or Designated Community). It is implicit
> in the discussion of representation
> information in section 2.2 (and discussed
> at length later in the Reference Model)
> that this includes both rules for
> structuring the bit stream and rules for
> (semantically) interpreting the
> structured bit stream. While maintaining
> software-as-representation is
> cited as an acceptable solution, it is
> viewed as inferior to preserving access to
> all information necessary to manually
> restructure and understand a data stream
> (if necessary).
>
> I interpret this to mean that the specs of
> every system that contributes structure
> to a given bit stream, including its native
> architecture, OS, and application, should be
> contained in or referenced by an IP's representation
> information. We can rely on application
> software for convenience or if the specs
> are unavailable.
>
I'd like to restate your interpretation slightly in
the name of preserving the sanity of those
having to create the metadata. You need
to record the specs of every system that
contributes structure to a given bit stream
*only* to the extent necessary to make the
data object comprehensible. If there is
no difference between file format A as it is
produced on a Mac running OS X and as
it is produced on an Intel box running Win2K,
then you don't have to record the information.
You might *choose* to, in order to satisfy
the historical curiosity of digital paleographers,
but you don't have to. On the other hand,
if you're trying to produce an information
package for a video game, you might
very well have to record detail down to and
including the specifications of video processing
units for which the game had been optimized.
This rather wide range of detail that an OAIS
might capture with regard to Representation
Information is actually noted in the OAIS spec:
"Since a key purpose of an OAIS is to preserve
information for a Designated Community, the
OAIS must understand the Knowledge Base of
its Designated Community to understand the
minimum Representation Information that must
be maintained. The OAIS should then make a decision
between maintaining the minimum
Representation Information needed for its
Designated Community, or maintaining a
larger amount of Representation Information
that may allow understanding by a larger
Consumer community with a less specialized
Knowledge Base."
> METS does not seem to contain an
> explicit concept of representation
> information,
Well, as you noted, METS does provide,
through the behaviors section, the ability
to link to software necessary to render the
object and/or its parts for the user, and so
it does provide that one specific mechanism for
including Representation Information. Beyond
that, no, METS isn't explicit; it's implicit, and
that was a deliberate design choice. Because
there is and will be variation in the degree of
detail of Representation Information that any
OAIS chooses to record, the administrative
metadata sections are, as you put it, 'sockets'
into which any OAIS is free to plug the structures
it requires to record the Representation Information
it deems adequate. As you also noted, METS
does provide facilities for recording some
very minimal pieces of Rep. Info., such as
MIME type; these constitute
'least-common-denominator' Representation
Information that all of the early participants in
METS agreed should be recorded for most
any object. Anything beyond that needs to
be decided upon by the OAIS and slotted
into the 'socket' portions of METS.
> This does not read like
> a requirement to provide the information
> necessary to completely restructure the
> bit stream, and I'm not sure that the extension
> metadata sets provided by the Library of Congress
> give the level of detail necessary to do so.
The Library of Congress has defined the level
of Representation Information that they think
*they* need. Whether your OAIS agrees with
that is a matter of local choice.
> A home for information needed
> to interpret data in a METS document
> is not apparent.
If by 'data in a METS document' you mean the
bit streams within an FContent or referenced
by an FLocat section, some (minimal) information
is on the <file> element's attributes, more can
be included by referencing extension-defined
information in the technical metadata section,
and the <behavior> section can be used to
identify software needed to present the information
to a particular designated community.
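To make the first two of those places concrete, here is a rough Python sketch, not a definitive implementation. It builds a tiny METS-like document in which the <file> element carries a MIMETYPE attribute and an ADMID pointer to a technical metadata section, and that section wraps locally defined XML. The "rep" namespace, the formatSpec element, the OTHERMDTYPE value, the IDs, and the example URL are all hypothetical names invented for illustration; a real OAIS would plug in whatever extension schema it has chosen. The behavior section is left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
# Hypothetical local extension schema, invented for this illustration.
LOCAL_NS = "http://example.org/local-repinfo"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)
ET.register_namespace("rep", LOCAL_NS)

NS = {"mets": METS_NS}


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return f"{{{ns}}}{tag}"


def build_mets():
    """Build a minimal METS-like tree with one file and one techMD 'socket'."""
    mets = ET.Element(q(METS_NS, "mets"))

    # The 'socket': a technical metadata section wrapping locally defined XML.
    amd = ET.SubElement(mets, q(METS_NS, "amdSec"))
    tech = ET.SubElement(amd, q(METS_NS, "techMD"), {"ID": "TECH1"})
    wrap = ET.SubElement(
        tech,
        q(METS_NS, "mdWrap"),
        {"MDTYPE": "OTHER", "OTHERMDTYPE": "LOCAL-REPINFO"},  # assumed label
    )
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    fmt = ET.SubElement(xml_data, q(LOCAL_NS, "formatSpec"))  # hypothetical element
    fmt.text = "TIFF 6.0, uncompressed, little-endian"

    # Minimal Representation Information sits on the <file> attributes;
    # ADMID points at the richer, locally defined detail above.
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"))
    f = ET.SubElement(
        grp,
        q(METS_NS, "file"),
        {"ID": "FILE1", "MIMETYPE": "image/tiff", "ADMID": "TECH1"},
    )
    ET.SubElement(
        f,
        q(METS_NS, "FLocat"),
        {"LOCTYPE": "URL", q(XLINK_NS, "href"): "http://example.org/page1.tif"},
    )
    return mets


def representation_info(mets, file_id):
    """Return (MIME type, list of locally defined detail strings) for a file."""
    f = mets.find(f".//mets:file[@ID='{file_id}']", NS)
    tech_ids = f.get("ADMID", "").split()  # ADMID may list several IDs
    details = []
    for tech in mets.iterfind(".//mets:techMD", NS):
        if tech.get("ID") in tech_ids:
            xml_data = tech.find("mets:mdWrap/mets:xmlData", NS)
            details.extend(child.text for child in xml_data)
    return f.get("MIMETYPE"), details


if __name__ == "__main__":
    doc = build_mets()
    print(representation_info(doc, "FILE1"))
    print(ET.tostring(doc, encoding="unicode"))
```

The point of the split is the one made above: the MIME type is the least-common-denominator information every implementer records, while everything inside xmlData is a matter of local choice.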
> b. Context Information
> "how the Content Information relates to other information
> outside the information package" (2-6)
>
> Context Information seems to be a superset of METS' rights
> metadata. METS does not seem to have a convenient place to
> store information
> about how a document relates to other documents.
>
No, I would not class Context Information as
a superset of rights metadata. The OAIS
reference model places Context Information
as a subcomponent of Preservation Description
Information (PDI), stating that Context Information
"would describe why the Content Information
was produced, and it may include a
description of how it relates
to another Content Information object that
is available." This clearly places Context Information
within the realm of the Digital Provenance portion
of a METS document.
> c. Provenance Information
> "describes the source of the Content Information, who has had
> custody of it since its origination, and its history (including
> processing history)" (2-6).
>
> The concept of Provenance information accommodates both METS
> source metadata and digiprov metadata. Together, source
> metadata and
> digiprov metadata are equivalent to Provenance information.
>
Agreed.
> IV. Why this is important
> If METS and OAIS are not
> congruent standards, then the extra intellectual
> step of translation is required for work based
> on one standard to be used
> with the other. Additionally, incongruence will
> complicate and perhaps seriously hamper
> interoperability between archives that are
> based on "true OAIS" and archives
> based on METS.
While I don't disagree with the idea that METS
needs to be implemented in such a way that
it can support archives wishing to implement
OAIS-compliant systems, I think you're making
the mistake of assuming not only that there
is a 'true OAIS,' but that there is *one* true OAIS.
An OAIS is tasked with preserving information and
making it available for a *particular* designated
community. My designated community is NYU's
students, staff and faculty; Library of Congress
obviously has a somewhat different and larger
designated community, which in turn differs
quite a bit from an organization like ICPSR at
Univ. of Michigan. The types of Representation
Information we'll each record may vary widely
as we are serving different communities *and*
serving them different information. In my
opinion, one of the overlooked facts of the
OAIS reference model is that it actually says
little or nothing about interoperability at all.
I would define METS as a first step towards
interoperability between archives that wish
to operate in compliance with the OAIS
reference model. If you look at section 2.2.3
of the OAIS reference model, you find this
interesting discussion: "It is necessary to distinguish
between an Information Package that is preserved
by an OAIS and the Information Packages that are
submitted to, and disseminated from, an OAIS.
These variant packages are needed to reflect
the reality that some submissions to an OAIS
will have insufficient Representation Information
or PDI to meet final OAIS preservation requirements.
In addition, these may be organized very differently
from the way the OAIS organizes the information
it is preserving." Further along we find this: "The
Submission Information Package (SIP) is that
package that is sent to an OAIS by a Producer.
Its form and detailed content are typically negotiated
between the Producer and the OAIS." Implicit
in this discussion is the recognition that 1. there
is no standard for what a SIP will look like, and
perhaps more importantly 2. due to the differing
needs of organizations (in this case a Producer
and the OAIS), someone delivering a SIP to
an OAIS may not be able to supply information
that the OAIS requires. The OAIS will, therefore,
have to engage in an "extra intellectual
step of translation" to make the SIP useful.
METS does not completely eliminate this problem;
it can't. The problem is the result of the varying
social conditions and constraints under which
different OAIS will operate. I would argue that
it does help *alleviate* the problem, in that it
provides a common format for minimal information
needed in a SIP that we can all agree on, and a
structure to plug in the results of negotiations
between Producers and OAIS as to what a SIP
should look like in a specific instance. This in
turn simplifies the whole negotiation process
for any given OAIS, and by defining a single base
format for what SIPs/DIPs look like, allows us
as a community to share the cost of development
of tools to work with that format. To the extent
we can reach further agreement on what information
needs to be included in a SIP, we'll further
reduce the amount of one-off negotiation required
every time someone submits information to an
OAIS.
To sum up, while I agree that it is important
that METS support OAIS work, I think it does
that quite adequately at the moment. I think
the real issue you're concerned with is
interoperability of METS between OAIS archives,
and the OAIS Reference Model is silent on how
to achieve that, leaving it to negotiation between
OAIS and Producers. This is not actually a
deficiency in the Reference Model; it has to
leave that space for negotiation in recognition
of the differing types of information that will be
stored in OAIS, the differing communities that will
be served, and the disparate sources that will be
contributing information to OAIS. However, it
means that furthering the goal of interoperability
of METS documents will be an on-going process
of discussion and negotiation between the users
of METS regarding what we agree is
essential information that we should all support,
and what needs to be left for local definition. The
current METS format defines what we've all
agreed on to date (including agreements regarding
what we can't currently agree on). As time goes on
and institutions using METS get more experience
with the format and in identifying their own local
needs, I suspect we'll identify further areas for
agreement and can start making some of the 'fuzzier'
sections of METS better defined.
So, the conversation on how to further interoperability
in METS is indeed crucial, and the main reason why
the METS initiative needs to continue. But we should
all be clear that interoperability and OAIS support
are two different things. At the moment, I think METS
provides quite good support for any institution wishing
to implement OAIS, but it does so by being rather
remarkably loose in defining what information needs
to be recorded. The biggest challenge will actually
be promoting interoperability by further defining what
information should be included in an Information
Package while retaining the flexibility that will be
needed if METS is to be used in a variety
of OAIS contexts.
Jerry