Dear Steve, Olaf and Zhiwu,
many thanks for your clarifications. I realise I should have done some more reading before bothering the list, as my question seems to have been discussed by the PREMIS Working Group a long time ago...
From pp. 1-3 and 4-4ff. of the Data Dictionary I gather, then, that we should describe both the zip-file and the packaged files as separate objects. As to the right place to embed this information, we are simply not advanced enough to have more than tentative opinions. We are currently thinking about embedding the metadata in the zip-file itself (and are looking to kopal and to other research for ways to achieve this), but will have to delay the implementation until we know much more about our actual repositories than we do now. (We're still at the handicraft stage right now.)
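To make the "separate objects" idea concrete, here is a minimal Python sketch of how it might look, not anything we have implemented. It describes the zip-file and each packaged file as its own object and links them in both directions. The dictionary layout, the function names, the identifier scheme and the "has part" / "is part of" values are assumptions for illustration only; they are not taken from the Data Dictionary or from kopal. Following the passage on p. 1-3, the location of each packaged file is recorded as its starting offset within the zip-file.

```python
import io
import zipfile


def describe_package(zip_bytes, package_id):
    """Return one description for the zip-file plus one per packaged file.

    The dictionary keys borrow PREMIS semantic unit names, but the layout
    itself is a hypothetical illustration, not a PREMIS serialisation.
    """
    package = {
        "objectIdentifier": package_id,
        "objectCategory": "file",
        "relationships": [],
    }
    members = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Hypothetical identifier scheme: package id plus member path.
            member_id = f"{package_id}/{info.filename}"
            members.append({
                "objectIdentifier": member_id,
                # A filestream is described with the "File" column.
                "objectCategory": "file",
                # Byte offset of the member's header inside the zip-file.
                "contentLocation": info.header_offset,
                # Uncompressed size of the packaged file in bytes.
                "size": info.file_size,
                "relationships": [("structural", "is part of", package_id)],
            })
            package["relationships"].append(
                ("structural", "has part", member_id)
            )
    return [package] + members


def build_sample_zip():
    """Build a small in-memory zip-file so the sketch runs on its own."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("data/table1.txt", "id;name\n1;Bern\n")
        archive.writestr("datamodel.txt", "table1(id, name)\n")
    return buffer.getvalue()


if __name__ == "__main__":
    for obj in describe_package(build_sample_zip(), "aip-0001"):
        print(obj["objectIdentifier"], obj["relationships"])
```

Whether such descriptions end up inside the zip-file or outside it is exactly the question we have not settled; the sketch only shows that both levels can be described side by side.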
Many thanks again,
Georg
> -----Original Message-----
> From: PREMIS Implementors Group Forum [mailto:[log in to unmask]] On
> behalf of brandt
> Sent: Tuesday, 10 October 2006 18:17
> To: [log in to unmask]
> Subject: Re: [PIG] Upcoming revision of the PREMIS Data
> Dictionary - call for participation
>
> Hi Georg,
>
> you wrote:
> However, this looks more like a workaround to me, as I
> understand a file cannot contain other files. Is this
> understanding correct? Or am I missing the actual point here?
>
> A file can contain other files. If they can be "transformed
> into standalone files without adding any additional
> information, although a transformation process such as
> decompression, decryption, or decoding may have to be
> performed on the bitstream in the extraction process.
> Examples of these bitstreams include a TIFF within a tar
> file, or an encoded EPS within an XML file.
> In the PREMIS data model these bitstreams are defined as
> "filestreams," that is, true files
> embedded within larger files. Filestreams have all of the
> properties of files, while bitstreams do not. In the Data
> Dictionary, the column for "File" applies to both files and
> filestreams. The column for "Bitstream" applies to the subset
> of bitstreams that are not filestreams and that adhere to the
> stricter PREMIS definition of bitstream. The location
> (contentLocation in the Data Dictionary) of a file would
> normally be a location in
> storage; while the location of a filestream or bitstream
> would normally be the starting offset within the embedding
> file." (p. 1-3, Data Dictionary)
>
> If this is not interpreted as compression (as mentioned
> in the onion model), filestreams can be described as files.
> And their relationships should, in my opinion, be described,
> as you mentioned, through the semantic unit relationship on
> the representation level.
> But that is no real answer to your question. The crucial
> point now is, if I understand you correctly, where and how to
> embed this information.
> The easiest way to embed the metadata in a structured form is
> in the SIP/zip file. That is how it is handled in the kopal
> project. One would not have any problems pointing to files in
> that case.
>
> With the possibility to point into files (like METS fptr, see:
> http://www.loc.gov/standards/mets/docs/mets.v1-5.html#fptr)
> one can keep metadata outside the SIP/zip. That
> could also be done by defining a (somewhat flatter)
> structure in a dedicated XML schema to identify the different
> sections of separated PREMIS XML data.
>
> Another possibility might be the solution brought up by Youn.
> But I do not know how to model that at this point in time.
> That solution might even lead to problems, because it is an
> open question how relationships relate to one another. If you
> have an undefined (or even cyclic) hierarchy of relationships
> (e.g. some structural and derivative relationships), how would
> one express the relations between these relationships (and
> their meaning)? This is not yet defined in the PREMIS Data
> Dictionary but may then become relevant for the exchange of
> information. My impression is that PREMIS is conceived in a
> somewhat flatter and simpler way with regard to this
> relationship issue. But maybe I am wrong.
>
> I hope I touched the right issue.
>
> Olaf
>
> Georg Buechler wrote:
> > We've just been discussing a similar question, so thank you
> Youn for bringing this up!
> >
> > We decided to use PREMIS to record metadata in a digital
> preservation project with state archives in Switzerland. I
> should point out that this is really a very basic, pilot
> project, and that we are not using METS, which may be the
> cause of some of the problems we encountered. Namely, we are
> archiving data from a database-driven application. We plan to
> combine several plain-text data files, information about the
> data model, and other files into a zip-file. So we have a
> representation (the AIP in the current form) consisting of
> one file (the zip-file) which itself consists of a couple of
> other files - there clearly is a hierarchy of files that is
> difficult to model in PREMIS. Example 3 in the PREMIS Data
> Dictionary (p. 3-34 sqq.) gives a hint of how to achieve
> this, namely through the use of the "relationship" semantic
> unit. However, this looks more like a workaround to me, as I
> understand a file cannot contain other files. Is this
> understanding correct? Or am I missing the actual point here?
> >
> > Thanks for any kind of clarification.
> >
> > Best,
> > Georg
> >
> >
>
> --
> ---------------------------------------------------------
> Goettingen State and University Library
> Olaf Brandt
> Project kopal
> Tel.: +49-551-39-7805
> Email: [log in to unmask]
> http://www.sub.uni-goettingen.de/
> http://kopal.langzeitarchivierung.de/
>