Hi Rebecca,
Rebecca S. Guenther wrote:
>>- I would not assume that the <premis:object> section contains mostly
>>technical data; therefore I would not vote for putting the
>><premis_object> under techMD. This might probably be the case for files
>>and bitstreams, but for "representations" the situation is different.
>>Storing just relationship information between DocStructs (e.g. a
>>MetadataUpdate event occurred) doesn't really involve technical information.
>>
>>
>
>I don't understand what you mean in the last sentence.
>
Sorry, I probably did not express what I meant very well. So let me
give you a (realistic) scenario:
We would like to preserve an article, which is something premis would
assume to be on the "representation" layer. There is descriptive metadata
attached to this article. The article is represented by a <div> in a
mets-file. Besides the descriptive metadata, there is a premis section
as well. The <premis:object> identifier identifies this special version
of the article.
Over time, the descriptive metadata of this article will change: a typo
is corrected, metadata is converted to a new metadata format / schema
(something I think we'll do more often than converting images from one
format to the other ;-) - so what we do is: we create a new mets file,
containing a new <div> element representing the new version of the
article in the preservation system.
Of course I don't want to lose information about the metadata-update
event - and I don't want to lose information about the old version of
the article. A relationship in premis:object will link both versions and
will attach an appropriate event.
Used this way, I do not really see a lot of technical information in
premis:object at all. Therefore I would put it into mets:digiprovMD.
In the case of files and bitstreams the situation might be different. I can
see that techMD is a sensible solution there.
>>- I just wonder, if we shouldn't really talk about, how we may avoid
>>storing metadata redundantly. Especially in more dynamic scenarios where
>>content gets updated by different tools, by different workflows, it
>>might become cumbersome to keep data consistent.
>>I'm not talking about changing the premis data model. One approach could
>>e.g. be to allow XPath expressions, stored in separate
>>premis attributes, to point to different elements or attributes within
>>the same xml-file.
>>
>>
>
>Sure, I think that we will see METS as a transfer format that gets
>ingested or disseminated and then the metadata stored various ways. Could
>you supply an example of how you might use Xpath expressions?
>
>
I would extend the premis schema so that every element which may
contain a text value (a text node) can carry two attributes, called
e.g. valuePointer and valuePointerType.
Example:
<premis:fixity valuePointerType="xpath"
valuePointer="mets:mets/mets:fileSec/mets:fileGrp/mets:file/@CHECKSUM"/>
(note: the element itself is empty).
Of course these XPath expressions can become more complex - getting the
5th file in the 2nd fileGrp or e.g. the checksum of a file with a
specific ID (note the single quotes inside the attribute value, so that
the XML stays well-formed):
<premis:fixity valuePointerType="xpath"
valuePointer="mets:mets/mets:fileSec/mets:fileGrp/mets:file[@ID='file01']/@CHECKSUM"/>
If you like, you would even be able to use a relative path (something I
would not recommend).
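
To make the idea more concrete, here is a minimal sketch of how a tool
could resolve such a pointer. It rests on several assumptions:
valuePointer and valuePointerType are only the attributes proposed in
this mail and are not part of the premis schema; the premis namespace
URI, the sample file IDs and the checksum values are made up for
illustration; and resolve_pointer is a hypothetical helper. Python's
standard ElementTree only supports a subset of XPath and cannot select
attributes, so the trailing "/@NAME" step is handled by hand.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "http://www.loc.gov/premis/v3",  # assumption: any premis namespace would do here
}

# Illustrative document; IDs and checksums are invented.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:premis="http://www.loc.gov/premis/v3">
  <mets:amdSec>
    <mets:digiprovMD ID="dp1">
      <mets:mdWrap MDTYPE="PREMIS">
        <mets:xmlData>
          <premis:fixity valuePointerType="xpath"
            valuePointer="mets:mets/mets:fileSec/mets:fileGrp/mets:file[@ID='file01']/@CHECKSUM"/>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="file01" CHECKSUM="abc123"/>
      <mets:file ID="file02" CHECKSUM="def456"/>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def resolve_pointer(root, pointer):
    """Resolve a pointer of the form 'path' or 'path/@attribute' (hypothetical helper)."""
    # Split off a trailing attribute step, which ElementTree cannot evaluate itself.
    path, sep, attr = pointer.rpartition("/@")
    if not sep:
        path, attr = pointer, None
    # The pointer starts at the document node, so wrap the root element
    # in a temporary parent to make "mets:mets/..." match.
    holder = ET.Element("document")
    holder.append(root)
    target = holder.find(path, NS)
    if target is None:
        return None
    return target.get(attr) if attr else target.text


root = ET.fromstring(SAMPLE)
fixity = root.find(".//premis:fixity", NS)
checksum = resolve_pointer(root, fixity.get("valuePointer"))
print(checksum)
```

With a full XPath engine the attribute step would not need special
treatment; the point is only that the fixity element stays empty and
the value is read from the mets:file element.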
>So I think you are speaking about the suggestion of using the PREMIS
>IDs with the METS IDrefs?
>
Right - or in general: we are talking about links from premis to METS.
If you plan to do this, you can do it in a more general way using XPath
as well: //*[@ID="fi1"] would point you to the element with ID="fi1".
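
A small sketch of that generic ID lookup, again with Python's standard
ElementTree. The sample document and the helper name find_by_id are
assumptions for illustration only. ElementTree wants the relative form
".//*[@ID='...']", which does not test the root element itself, so the
root is checked separately.

```python
import xml.etree.ElementTree as ET

# Illustrative document; IDs and checksum are invented.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp ID="fg1">
      <mets:file ID="fi1" CHECKSUM="abc123"/>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def find_by_id(root, xml_id):
    """Return the first element whose ID attribute equals xml_id, or None."""
    if root.get("ID") == xml_id:
        return root
    return root.find(f".//*[@ID='{xml_id}']")


root = ET.fromstring(SAMPLE)
target = find_by_id(root, "fi1")
print(target.tag, target.get("CHECKSUM"))
```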
> I think it may be a concern that the only
>way things are tied together are with these that only make sense within a
>METS document. We might want to look at how understandable these are if
>you store the PREMIS outside the METS document.
>
I feel we should set up some general criteria for when it is useful to
tie an extension schema to a container format in such a way that neither
makes sense once they are pulled apart from each other.
This really depends on the scenarios and on the workflows: how METS
files are created and how they are treated / stored afterwards.
>What does this mean for these best practices (or guidelines, or whatever
>you want to call them)? What would you change?
>
In the case of your example (linking from one premis section to another,
event to agent), I would use the premis internal IDs. In the end links
between sections are stored redundantly: once in METS (pointing from one
<div> to different amdSecs) and once in premis pointing from the event
to the agent.
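
As a rough sketch of following the premis-internal link without any
help from METS: the element names below are the usual premis identifier
elements, but the sample is simplified and not meant to be
schema-valid, and the namespace URI, the identifier values and the
function name agents_for_events are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"premis": "http://www.loc.gov/premis/v3"}  # assumption: namespace chosen for illustration

# Simplified, invented sample: one event linked to one agent.
SAMPLE = """<premis:premis xmlns:premis="http://www.loc.gov/premis/v3">
  <premis:event>
    <premis:eventIdentifier>
      <premis:eventIdentifierType>local</premis:eventIdentifierType>
      <premis:eventIdentifierValue>ev1</premis:eventIdentifierValue>
    </premis:eventIdentifier>
    <premis:eventType>metadata update</premis:eventType>
    <premis:linkingAgentIdentifier>
      <premis:linkingAgentIdentifierType>local</premis:linkingAgentIdentifierType>
      <premis:linkingAgentIdentifierValue>ag1</premis:linkingAgentIdentifierValue>
    </premis:linkingAgentIdentifier>
  </premis:event>
  <premis:agent>
    <premis:agentIdentifier>
      <premis:agentIdentifierType>local</premis:agentIdentifierType>
      <premis:agentIdentifierValue>ag1</premis:agentIdentifierValue>
    </premis:agentIdentifier>
    <premis:agentName>Metadata converter</premis:agentName>
  </premis:agent>
</premis:premis>"""


def agents_for_events(root):
    """Map each event identifier to the agent element it links to (None if unresolved)."""
    agents = {}
    for agent in root.iterfind(".//premis:agent", NS):
        value = agent.findtext(
            "premis:agentIdentifier/premis:agentIdentifierValue", namespaces=NS)
        agents[value] = agent
    links = {}
    for event in root.iterfind(".//premis:event", NS):
        event_id = event.findtext(
            "premis:eventIdentifier/premis:eventIdentifierValue", namespaces=NS)
        agent_id = event.findtext(
            "premis:linkingAgentIdentifier/premis:linkingAgentIdentifierValue", namespaces=NS)
        links[event_id] = agents.get(agent_id)
    return links


root = ET.fromstring(SAMPLE)
links = agents_for_events(root)
agent_name = links["ev1"].findtext("premis:agentName", namespaces=NS)
print(agent_name)
```

Because nothing here touches METS IDs, the same lookup still works if
the premis data is stored outside the METS document.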
Of course this recommendation is based on the scenarios I have in mind:
producing METS/premis files in several steps (different programs are
used to extract data and add data from several sources to the METS file)
and store the data elsewhere (database) afterwards. METS files never get
changed; instead a new version of the mets file is created with a link
to the old one. In this case, redundancy is less of an issue. Keeping data
consistent during the creation process is more important.
But we have to keep in mind that METS files can be used in totally
different ways. If they get updated frequently, you might find it more
useful to keep redundancy low (using XPath, ID->IDREF linking).
I doubt that we can have general best practices on this point.
Ciao
Markus