Sorry for the confusion. Let me rephrase my question:
IMO we should not allow any container whose contained semantic units
are all optional. My reasons are:
1. According to the data dictionary p. 2-2,
"If a container unit is optional, but a semantic component within that
container is mandatory, the semantic component must be supplied if and
only if the container unit exists."
My understanding is that if we allow all of the semantic units
contained within a container to be optional, we are basically saying
that even if the container exists (meaning we do want to preserve this
piece of information, otherwise we would not include the container in
the XML in the first place), all of the contained semantic units are
still allowed to be empty (meaning we don't have to preserve anything).
2. I agree that the semantics cannot be infinitely accurate. At a certain
point we must resort to common sense. However, in some important aspects
we may want to be as clear as possible. The schema defined here will
most likely be used for automation, during which all kinds of silly
machine errors may be introduced, most of them arising from the less
obvious logical and semantic obligations that programmers overlook. If
the schema catches these errors, we can keep them from being introduced
into the system. Actually, codifying the obligations for
creatingApplication is not too difficult in XML Schema. Once we all
agree on the obligations I can give a possible schema excerpt.
3. Obligation is an important part of the data dictionary. I can see that
a lot of discussion has been dedicated to it in order to make the data
dictionary clearer. The idea of clarifying obligations in containers
whose elements are all optional actually comes from the "format"
element, whose usage notes clearly define the obligations of its two
optional subelements, p. 2-22:
"Either formatDesignation or at least one instance of formatRegistry is
required."
I think we may want to put similar effort into the three elements I
mentioned before, by answering this question: if we do want to preserve
information about that container, which semantic unit(s) must we
include?
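
A minimal sketch of such an "at least one of" check, written in Python
rather than XML Schema so it can be run directly. The "format" rule is
the one quoted from the usage notes above; the creatingApplication rule
is only the suggestion from my earlier message and has not been agreed
on. The helper names (AT_LEAST_ONE_OF, has_content, check_containers)
are hypothetical, and the sketch assumes element names without an XML
namespace.

```python
import xml.etree.ElementTree as ET

# For each container, at least one of the listed children must be
# present and non-empty.
AT_LEAST_ONE_OF = {
    # From the usage notes for "format", p. 2-22.
    "format": ("formatDesignation", "formatRegistry"),
    # Assumption: a proposal under discussion, not a stated obligation.
    "creatingApplication": ("creatingApplicationName",
                            "dateCreatedByApplication"),
}


def has_content(elem):
    """True if the element has non-blank text or any child element."""
    return bool((elem.text or "").strip()) or len(elem) > 0


def check_containers(root):
    """Return one message per container that breaks its rule."""
    problems = []
    for container in root.iter():
        required = AT_LEAST_ONE_OF.get(container.tag)
        if required is None:
            continue  # no rule defined for this element
        if not any(has_content(c) for c in container if c.tag in required):
            problems.append(
                f"{container.tag}: needs one of {', '.join(required)}"
            )
    return problems


# A record that carries only a version number, with no name and no date.
record = ET.fromstring(
    "<object><creatingApplication>"
    "<creatingApplicationVersion>2.0</creatingApplicationVersion>"
    "</creatingApplication></object>"
)
print(check_containers(record))
```

The record above is reported as a problem because it contains neither a
name nor a date. Once the obligations are agreed on, the same rule could
be moved into the schema itself.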
Thanks,
Zhiwu Xie
Graduate Research Assistant
Research Library
Los Alamos National Lab
On Thu, 2006-02-09 at 07:15, Charles Blair wrote:
> On Wed, Feb 08, 2006 at 01:32:32PM -0700, Zhiwu Xie wrote:
> > The following elements in PREMIS are allowed to be empty:
> >
> > - creatingApplication
> > - environment
> > - dependency
> >
> > From the schema point of view this is no big deal, it allows an empty
> > element in the record. But from the semantic view this basically says:
> > you don't have to preserve A, but if you do want to preserve it, you
> > still can just give me an empty record.
>
> i don't understand this.
>
> > Perhaps a bit more elaboration on this part can help the semantics
> > to be clearer.
>
> could you clarify the question?
>
> > For example, if we do want to preserve the creatingApplication,
>
> the intent of the element wasn't to preserve the creating
> application. it was to provide information about the creating
> application.
>
> for example, if i'm preserving document A, which is of a particular
> document type, it's still possible that creating application B
> introduces some quirks into A. optionally knowing that B created A
> might help us better understand the format of A when it comes time to
> migrate it, for example.
>
> if, however, you want to preserve the creating application, then
> either creating application can't be applied, because you don't know
> it (you're dealing with commercial products), or, if it's something
> you wrote, then the creating application might be, say, a particular
> version of a C compiler.
>
> > perhaps at least we must preserve either the creatingApplicationName
> > or the dateCreatedByApplication, or both, otherwise some record may
> > only contain a creatingApplicationVersion number without specifying
> > who and when.
>
> i think that this is an interesting point. can we codify common sense
> here? if i say Name and Version and DateCreatedBy are all optional,
> does it make sense only to record Version? no. what's the best way to
> address this?
>
> > In case of dependency, I think perhaps dependencyIdentifier needs to be
> > mandatory, because the dependencyName is just a hint.
> >
> > What do you think?
>
> i don't think so. you may have identified a dependency on something
> that isn't in your repository. also, the rationale for DependencyName
> is: "It may not be self-evident from the dependencyIdentifier what the
> name of the object actually is." so clearly it was thought that the
> name served as more than just a hint.