> However, this looks more like a workaround to me, as I understand a file
> cannot contain other files. Is this understanding correct? Or am I
> missing the actual point here?
A file can contain other files if they can be
"transformed into standalone files without adding any additional
information, although a transformation process such as decompression,
decoding may have to be performed on the bitstream in the extraction
process. Examples of these bitstreams include a TIFF within a tar file,
or an encoded EPS within an [...]
In the PREMIS data model these bitstreams are defined as “filestreams,”
that is, true files embedded within larger files. Filestreams have all
of the properties of files, while bitstreams do not. In the Data
Dictionary, the column for “File” applies to both files and filestreams.
The column for “Bitstream” applies to the subset of bitstreams that are
not filestreams and that adhere to the stricter PREMIS definition of
bitstream. The location (contentLocation in the Data Dictionary) of a
file would normally be a location in storage; while the location of a
filestream or bitstream would normally be the starting offset within the
embedding file." (p. 1-3, Data Dictionary)
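To make the file/filestream distinction a bit more concrete, here is a
minimal Python sketch, assuming the embedding file is a zip archive. It
treats every archive member as a candidate filestream and uses the
position of the member's local header as a stand-in for the "starting
offset within the embedding file" from the quote. The function names and
dictionary keys are my own illustrative choices, not PREMIS semantic
units, and the two member files are made-up sample data.

```python
import io
import zipfile


def list_filestreams(zip_bytes):
    """Return one record per zip member, treating each as a candidate filestream.

    'offset' is the position of the member's local header inside the
    embedding file. Using it as the filestream location is an assumption
    made for illustration only.
    """
    records = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            records.append({
                "name": info.filename,
                "offset": info.header_offset,
                "stored_size": info.compress_size,
                "extracted_size": info.file_size,
            })
    return records


def build_example_zip():
    """Build a small in-memory zip with two hypothetical member files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("data/table1.txt", "id;name\n1;example\n")
        archive.writestr("model/datamodel.txt", "table1(id, name)\n")
    return buffer.getvalue()


if __name__ == "__main__":
    for record in list_filestreams(build_example_zip()):
        print(record)
```

Each record could then be described like a file, with the offset
recorded in place of a storage location.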
If this is not interpreted as a compression (as mentioned in the onion
model), filestreams can be described as files. Their relationships
should, in my opinion, be described, as you mentioned, through the
relationship semantic unit at the representation level.
But that is not a real answer to your question. The crucial point now,
if I understand you correctly, is where and how to embed this information.
The easiest way to embed the metadata in a structured form is in the
SIP/zip file. That is how it is handled in the kopal project. One would
not have any problems pointing to files in that case.
With the possibility of pointing into files (as with the METS fptr, see:
http://www.loc.gov/standards/mets/docs/mets.v1-5.html#fptr), one can keep
the metadata outside the SIP/zip. That could also be done by defining a
(somewhat flatter) structure in a custom XML schema to identify the
different sections of separated PREMIS XML data.
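As a rough illustration of pointing into a file with METS, the following
Python sketch builds an fptr element whose area child addresses a byte
range of a referenced file. FILEID, BEGIN, END and BETYPE are attributes
of the METS area element; the helper name `build_fptr`, the file ID
"FILE_ZIP_001" and the byte range are hypothetical values chosen only for
this example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def build_fptr(file_id, begin, end):
    """Build a METS fptr that points at a byte range inside a file.

    file_id must match the ID of a file element in the METS fileSec;
    the value used below is a hypothetical placeholder.
    """
    ET.register_namespace("mets", METS_NS)
    fptr = ET.Element(f"{{{METS_NS}}}fptr")
    ET.SubElement(
        fptr,
        f"{{{METS_NS}}}area",
        {
            "FILEID": file_id,
            "BEGIN": str(begin),
            "END": str(end),
            "BETYPE": "BYTE",  # BEGIN and END are byte offsets
        },
    )
    return fptr


if __name__ == "__main__":
    element = build_fptr("FILE_ZIP_001", 0, 1023)
    print(ET.tostring(element, encoding="unicode"))
```

The PREMIS description of the addressed range could then live outside
the SIP/zip and refer to the same file ID.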
Another possibility might be the solution brought up by Youn. But I do
not know how to model that at this point in time.
That solution might even lead to problems, because it is unclear how
relationships would correlate. If you have an undefined (or even cyclic)
hierarchy of relationships (e.g. some structural and derivative
relationships), how would one express the relations between these
relationships (and their meaning)? This is not yet defined in the PREMIS
Data Dictionary but may become relevant for the exchange of
information. My impression is that PREMIS was conceived in a somewhat
flatter and simpler way with regard to relationships. But maybe I am wrong.
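To show what a cyclic hierarchy of relationships would look like in
practice, here is a small Python sketch that checks a set of
relationships for cycles with a depth-first search. The tuple layout
(source, relationship type, target) and the object names are assumptions
made for this example; PREMIS itself does not prescribe such a check.

```python
def find_cycle(relationships):
    """Return a list of object IDs forming a cycle, or None if there is none.

    relationships is a list of (source, relationship_type, target) tuples.
    The relationship type is ignored here: any mix of types can form a cycle.
    """
    graph = {}
    for source, _rel_type, target in relationships:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}
    path = []

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for successor in graph[node]:
            if state[successor] == IN_PROGRESS:
                # Reached a node that is still on the current path: a cycle.
                return path[path.index(successor):] + [successor]
            if state[successor] == UNVISITED:
                found = visit(successor)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNVISITED:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    # Hypothetical objects: a zip file and two files related to it.
    relationships = [
        ("zip", "structural", "file_a"),
        ("file_a", "derivation", "file_b"),
        ("file_b", "structural", "zip"),
    ]
    print(find_cycle(relationships))
```

Detecting the cycle is the easy part; what such a cycle would mean
across different relationship types is the open question.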
I hope I touched the right issue.
Georg Buechler wrote:
> We've just been discussing a similar question, so thank you Youn for bringing this up!
> We decided to use PREMIS to record metadata in a digital preservation project with state archives in Switzerland. I should point out that this is really a very basic, pilot project, and that we are not using METS, which may be the cause of some of the problems we encountered. Namely, we are archiving data from a database-driven application. We plan to combine several plain-text data files, information about the data model, and other files into a zip-file. So we have a representation (the AIP in the current form) consisting of one file (the zip-file) which itself consists of a couple of other files - there clearly is a hierarchy of files that is difficult to model in PREMIS. Example 3 in the PREMIS Data Dictionary (p. 3-34 sqq.) gives a hint of how to achieve this, namely through the use of the "relationship" semantic unit. However, this looks more like a workaround to me, as I understand a file cannot contain other files. Is this understanding correct? Or am I missing the actual point here?
> Thanks for any kind of clarification.
Goettingen State and University Library