Hi all,
This is somewhat related to the spirit of what Markus has stated about metadata being independent of the container format. A few weeks ago on this list, I requested [1] the addition of an mdRef or mdWrap child element that could appear directly under mets:amdSec, without having to use techMD, digiprovMD, sourceMD, or rightsMD. The goal was to make it easier to point to external premis-as-root instances, since none of the four existing children of amdSec has the proper semantics to address what PREMIS expresses as a whole. Understandably, as Jerry pointed out, this approach of adding elements could result in less interoperability. (Although I rather liked the mention of concepts like RDFa to address future problems.)
At the time that Markus sent this mail, I was working on an example instance [2] that both achieves what I was looking for and adheres to Rebecca's best practices document with respect to the separation of PREMIS concepts across techMD and digiprovMD. Semantically, it seems to me that my instance and Rebecca's are the same document with respect to PREMIS. The only difference between my instance and the one Rebecca provided in the best practices example [3] is that I use mdRef where she uses mdWrap. I took Rebecca's MODS, MIX, and PREMIS content verbatim (except for adding namespace and schemaLocation info) and placed it into external files. For the PREMIS content I combined everything into one premis-as-root instance, but one could instead split the externally referenced content across numerous files, each using object-as-root, etc.
In my instance, mdRef with @xlink:href and @XPTR points to the relevant nodes in the external documents. My METS instance does not validate because of an error in the @DMDID value in structMap, where Rebecca's instance uses an IDREF to an ID inside the MODS document. Because I made my MODS external, I have no construct for getting at that ID as far as I know. See my documentation inside my instance, where I call for the addition of attributes such as @DMDXPTR and @AMDXPTR on any element that also allows @DMDID and @AMDID. The mere addition of @xlink:href and @XPTR on these elements would work, too.
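To make the mdRef pattern concrete, here is a minimal Python sketch of how a consumer might list the external pointers in a METS instance. The fragment is not copied from my instance [2]: the file name premis.xml, the ID values, and the XPTR expressions are made up for illustration, and list_md_refs is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical METS fragment: the file name, IDs, and XPTR values are
# invented for illustration and are not taken from the instance in [2].
METS_FRAGMENT = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:amdSec>
    <mets:techMD ID="TMD1">
      <mets:mdRef LOCTYPE="URL" MDTYPE="PREMIS"
                  xlink:href="premis.xml" XPTR="xpointer(id('object1'))"/>
    </mets:techMD>
    <mets:digiprovMD ID="DP1">
      <mets:mdRef LOCTYPE="URL" MDTYPE="PREMIS"
                  xlink:href="premis.xml" XPTR="xpointer(id('event1'))"/>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>
"""


def list_md_refs(mets_xml):
    """Return (section ID, xlink:href, XPTR) for every mdRef, in document order."""
    root = ET.fromstring(mets_xml)
    refs = []
    for section in root.iter():
        # Only direct mdRef children, so each pointer is paired with the
        # techMD/digiprovMD/etc. section that wraps it.
        for md_ref in section.findall(f"{{{METS_NS}}}mdRef"):
            refs.append(
                (
                    section.get("ID"),
                    md_ref.get(f"{{{XLINK_NS}}}href"),
                    md_ref.get("XPTR"),
                )
            )
    return refs


if __name__ == "__main__":
    for section_id, href, pointer in list_md_refs(METS_FRAGMENT):
        print(section_id, href, pointer)
```

The sketch only reads the pointers; it does not evaluate the XPointer expressions, which would need an XPointer-aware processor.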
An added benefit of separating the metadata from the METS container was getting around the lax validation of mdWrap. I found that the PREMIS content in the Louis examples that Rebecca and I provided doesn't validate when used in an external file with the premis-as-root construct, because of the issue that Olaf Brandt pointed out a few days ago regarding premis:eventOutcomeDetail not allowing text as a child (which I assume is an xs:any with lax validation problem). In my instances I've always preferred strict validation, which is one reason I've always tried to opt for referencing external files.
Admittedly, my motivation here has more to do with the current state of METS best practices than with PREMIS best practices. With the existence of tools like XQuery and the document() function in XSLT/XPath, I personally see no reason to make instances so verbose, but I wouldn't wish to enforce this belief on anyone. Of course the best practices document is right to show all content being placed in mdWrap, since a good number (if not the majority) of METS implementers opt for mdWrap. But it should also allow for mdRef. A best practices document should remain somewhat neutral about the approach, as long as the approaches semantically and intellectually achieve the same goal. In my opinion we should now begin to reflect on how those who implement XQuery, native XML DBs, and the document() function might choose to approach PREMIS implementation with METS containers.
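For those not working in XSLT or XQuery, the same idea as document() can be sketched in a few lines of Python: follow @xlink:href to the external file, then use @XPTR to pick out a node. This is only a sketch under several assumptions. It handles nothing but a bare-name (shorthand) pointer, it matches that name against an attribute called ID, and the external file is a stand-in vocabulary rather than real PREMIS. The names resolve_md_ref and demo are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def resolve_md_ref(md_ref, base_dir, id_attr="ID"):
    """Load the file named by xlink:href and return the element whose
    id_attr equals the XPTR value, or the root if there is no XPTR.

    Assumption: XPTR holds a bare name, not a full xpointer() expression.
    """
    path = os.path.join(base_dir, md_ref.get(XLINK_HREF))
    external_root = ET.parse(path).getroot()
    pointer = md_ref.get("XPTR")
    if pointer is None:
        return external_root
    for element in external_root.iter():
        if element.get(id_attr) == pointer:
            return element
    return None  # pointer did not match anything in the external file


def demo():
    """Write a stand-in external file, then resolve an mdRef against it."""
    with tempfile.TemporaryDirectory() as base_dir:
        external_path = os.path.join(base_dir, "external.xml")
        with open(external_path, "w", encoding="utf-8") as handle:
            # Stand-in content; a real file would hold PREMIS, MODS, or MIX.
            handle.write(
                '<records>'
                '<record ID="object1">file</record>'
                '<record ID="event1">ingest</record>'
                '</records>'
            )
        md_ref = ET.fromstring(
            '<mets:mdRef xmlns:mets="http://www.loc.gov/METS/" '
            'xmlns:xlink="http://www.w3.org/1999/xlink" '
            'LOCTYPE="URL" MDTYPE="OTHER" '
            'xlink:href="external.xml" XPTR="event1"/>'
        )
        target = resolve_md_ref(md_ref, base_dir)
        return target.text


if __name__ == "__main__":
    print(demo())
```

A native XML database or an XSLT processor would do the equivalent lookup internally, which is why the METS instance itself can stay small.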
[1] http://listserv.loc.gov/cgi-bin/wa?A2=ind0707&L=pig&T=0&P=772
[2] http://lcweb2.loc.gov/natlib/cred/premis/louis.xml
[3] http://www.loc.gov/standards/premis/louis.xml
Clay
>>> Markus Enders 08/17/07 11:28 AM >>>
Hi Rebecca,
thanks a lot for compiling the draft document. Just some comments from
me - just before the weekend starts ;-)
- I would not assume that the <premis:object> section contains mostly
technical data; therefore I would not vote for putting the
<premis:object> under techMD. This might probably be the case for files
and bitstreams, but for "representations" the situation is different.
Storing just relationship information between DocStructs (e.g.
MetadataUpdate event occurred) doesn't really contain technical information.
Anyhow: it seems that the Object.xsd contains several different kinds of
information. Did you (the premis community) consider extracting the
relationship section and creating an xsd of its own for it?
- I just wonder if we shouldn't really talk about how we may avoid
storing metadata redundantly. Especially in more dynamic scenarios where
content gets updated by different tools, by different workflows, it
might become cumbersome to keep data consistent.
I'm not talking about changing the premis data model. One approach could
e.g. be to allow Xpath expressions stored in a separate
premis-attributes to point to different elements or attributes within
the same xml-file.
- I doubt we can describe best practices in a general way without
specifying the context. E.g. the idea of linking METS and premis using
IDREF (somehow nothing else than a very special XPath) always leads to a
mix of a container format (METS) and a metadata format (premis), which
might not be appropriate. I think, in general, the usage of a metadata
format should be independent of a container format. I can imagine a lot
of scenarios in which METS is used as an ingest format - but internally
the data is split up: the metadata gets indexed, stored in different
database fields, gets embedded into other container formats etc - in
other words: gets separated from the container format. Mixing the
container format and the metadata format makes it hard to pull data out.
For that reason, the decision of how strongly the container format and a
metadata format should be tied together is really specific to the usage
scenario / application. In theory they shouldn't be tied together at all.
Ciao
Markus