I would suggest that PreservationLevel is distinct from
significantProperties and should be kept as separate as possible. Also
PreservationLevel is not the same thing as PreservationIntention. This
new breakdown of PreservationLevel is problematic.
Here's the wording on the wiki:
However, it is not clear whether the semantic unit describes the
intended level of preservation support (as per 'expected to be applied',
in the definition) or the current level of preservation capability (as
per the '"preservability" of the format' in the rationale).
Repositories have reported using this semantic unit in varying senses,
and the ambiguity may pose issues for interpretation among repositories
or between repositories and depositors.
It seems to me that this needs even more refinement. Is
preservationCapability what the archive was capable of at the time that
the PreservationIntention was assigned? If so, this is a time-based
property and needs a date/time as well. If not, if this is the current
capability of the archive, then it should not be distributed out to
every item in the collection.
I think what is really needed here is not Preservation Capability but
rationale for assigning the Preservation Level.
Some use cases might help:
1) Archive ingests a file in format ABC version 1.5. At the time of
ingest, the archive had no tools for ABC 1.5 so it assigned a
preservation level of "Byte Preserve". The capability at the time was
only byte preserve (nothing else) and the promise made was byte
preserve.
2) A year later, Archive has tools for format ABC version 1.5 and can
validate and migrate to format XYZ 1.0. The current capability is now
to support and migrate; the promise made was still byte preserve but can
now be upgraded to support and migrate. But that is a time-sensitive
fact.
3) Archive ingests a file in format DEF version 1.0. The file is
defective. A preservation level of "byte preserve" is therefore
assigned. The capability of the archive is to support and migrate DEF
1.0; the reason that the preservation level or intention was assigned to
this file is the validity of the file, not the capability of the
archive.
In thinking about use cases, it is essential to consider what you do
with bad data; there is lots of bad data out there.
Portico considered a preservation level of "byte-preserve pending" which
meant that we are byte preserving it now because we don't have any tools
yet. We discarded that because it rolled two facts together that are
distinct.
I suggest that "PreservationCapability" is something that belongs as
global information about the archive, possibly in the format registry or
at a system level.
Preservation Level is a true property of each individual object. What
you need to go with it is information about why and when you assigned
that preservation level. That could be because that was the best the
archive could do at the time, or it could be because the file itself was
defective.
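The separation suggested above could be sketched roughly as follows, in
Python. This is only an illustration under stated assumptions: every
name here (LevelAssignment, ArchivedObject, capabilities, assign_level)
is hypothetical and not taken from the PREMIS data dictionary or from
any real system, and the dates are arbitrary. The preservation level,
its rationale and its date live on the object; the capability lives in
one archive-wide table keyed by format.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical level names, borrowed from the use cases above.
BYTE_PRESERVE = "byte preserve"
SUPPORT_AND_MIGRATE = "support and migrate"


@dataclass(frozen=True)
class LevelAssignment:
    level: str          # the promise made for this object
    rationale: str      # why that level was assigned
    assigned_on: date   # when it was assigned


@dataclass
class ArchivedObject:
    name: str
    file_format: Tuple[str, str]   # (format name, version)
    is_valid: bool
    history: List[LevelAssignment] = field(default_factory=list)

    @property
    def current_level(self) -> str:
        return self.history[-1].level


# Archive-wide capability, keyed by format. It is global information
# and is deliberately not copied onto every object.
capabilities: Dict[Tuple[str, str], str] = {}


def assign_level(obj: ArchivedObject, today: date) -> LevelAssignment:
    """Record a level on the object together with its rationale and date."""
    capability = capabilities.get(obj.file_format, BYTE_PRESERVE)
    if not obj.is_valid:
        # Use case 3: the file is the problem, not the archive.
        assignment = LevelAssignment(BYTE_PRESERVE, "file is defective", today)
    elif capability == BYTE_PRESERVE:
        # Use case 1: the archive has no tools for this format yet.
        assignment = LevelAssignment(
            BYTE_PRESERVE, "no tools for this format yet", today)
    else:
        # Use case 2: capability has caught up, so the level can be upgraded.
        assignment = LevelAssignment(capability, "format is supported", today)
    obj.history.append(assignment)
    return assignment


# Use case 1: ABC 1.5 ingested before any tools exist (arbitrary date).
abc = ArchivedObject("report.abc", ("ABC", "1.5"), is_valid=True)
assign_level(abc, date(2008, 1, 15))

# Use case 2: a year later the archive gains tools for ABC 1.5.
capabilities[("ABC", "1.5")] = SUPPORT_AND_MIGRATE
assign_level(abc, date(2009, 1, 15))

# Use case 3: a defective DEF 1.0 file, although DEF 1.0 is supported.
capabilities[("DEF", "1.0")] = SUPPORT_AND_MIGRATE
bad = ArchivedObject("broken.def", ("DEF", "1.0"), is_valid=False)
assign_level(bad, date(2009, 1, 15))
```

Both objects in use cases 1 and 3 end up at "byte preserve", but the
recorded rationale tells them apart, which is the distinction a combined
"byte-preserve pending" level would have blurred.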
Evan Owens, Chief Technology Officer
(609) 986 2224
100 Campus Drive, Suite 100
Princeton, NJ 08540