RE: [METS] Checksum

Carl,

In my opinion, checksums at the essence file level should be a secondary tool for file integrity in a digital repository.  I believe the primary tool should be some form of error detection and correction (EDAC) applied to the AIP or its components.  Checksums tell you something is broken, but they can't fix it.  An EDAC scheme such as Reed-Solomon not only tells you when something is amiss but can also repair single-bit and burst errors, up to the correction capacity of the code.
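
To make the detect-versus-correct distinction concrete, here is a minimal sketch in Python. It does not implement Reed-Solomon; it swaps in the much simpler Hamming(7,4) code, which corrects only a single flipped bit per 7-bit codeword, purely as an illustration. The function names are made up for this example.

```python
import hashlib


def checksum(bits):
    """MD5 over the bit list: can reveal a change, but not where it is."""
    return hashlib.md5(bytes(bits)).hexdigest()


def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] as a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position).

    error_position is 0 when no error was seen, otherwise the 1-based
    position of the single bit that was flipped back.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the bad position
    if position:
        c[position - 1] ^= 1  # repair the flipped bit
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    stored = hamming74_encode(data)
    damaged = list(stored)
    damaged[4] ^= 1  # simulate one bit of storage damage

    # The checksum only says that something changed.
    print("checksum mismatch:", checksum(damaged) != checksum(stored))

    # The error-correcting code locates the damage and recovers the data.
    recovered, position = hamming74_decode(damaged)
    print("repaired position:", position, "data intact:", recovered == data)
```

A production system would use a stronger code than this one; the point is only that the redundancy carries enough information to repair the data, which a bare checksum does not.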

Have a great weekend,
Neil

-----Original Message-----
From: Carl Fleischhauer [mailto:[log in to unmask]]
Sent: Friday, March 15, 2002 8:38 AM
To: [log in to unmask]
Subject: Re: [METS] Checksum


Thanks for the dialog, MacKenzie.  In our theorizing here at LC, we have
wished for a file integrity monitoring tool, and pictured a system that
checks and rechecks a file/object over time.  We seek reassurance about
the integrity of our objects over the long haul.

Now: how does that relate to METS?  Ummm.  If METS were the metadata for an
OAIS AIP, then there might be an argument that the "original" checksum (or
equivalent) is parked there, with the object.  The job of the monitoring
system would be to run comparisons and alert the owner when a change is
noticed.  In addition, you would probably want the system to keep a log of
when the comparisons were made.  Now, in such a system, is it useful or
needful to know the date when the original checksum (or equivalent) was
created?  I'd be curious to hear your thoughts.
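
The monitoring idea described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names, the in-memory list used as the log, and the choice of MD5 as the default are all hypothetical, not part of METS or of any existing LC tool.

```python
import hashlib
from datetime import datetime, timezone


def file_digest(path, algorithm="md5"):
    """Compute the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def check_fixity(path, recorded_digest, log, algorithm="md5"):
    """Compare the file against its recorded digest and log the comparison.

    `log` is a plain list here (an assumption for this sketch); a real
    system would write to durable storage and notify the owner on failure.
    """
    ok = file_digest(path, algorithm) == recorded_digest.lower()
    log.append({
        "path": path,
        "algorithm": algorithm,
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "ok": ok,
    })
    return ok


if __name__ == "__main__":
    import os
    import tempfile

    audit_log = []
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"archival object")
    original = file_digest(tmp.name)  # the "original" checksum kept with the object

    print("first check ok:", check_fixity(tmp.name, original, audit_log))
    with open(tmp.name, "ab") as f:
        f.write(b"!")  # simulate an unnoticed change
    print("second check ok:", check_fixity(tmp.name, original, audit_log))
    print("comparisons logged:", len(audit_log))
    os.remove(tmp.name)
```

In this sketch the log records when each comparison was made; the date the original checksum was created is not needed for the comparison itself.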

Carl Fleischhauer
Library of Congress

On Thu, 14 Mar 2002, MacKenzie Smith wrote:

> In the better-late-than-never category: I've tried to think of any problems
> with this proposal and can't, except the general caveat that it *is* possible
> to take generalization to an absurd extreme. In this case, however, it makes
> sense to me to go for a general solution over a specific one, since I agree
> that there will be other checksum algorithms and we shouldn't make invalid
> presumptions.
>
> Could we, perhaps, have an optional attribute of checksumtype, and if that
> attribute is missing, but there is a checksum, assume it's an MD5?
>
> And what on earth would the benefit of a checksum create date be?
>
> MacKenzie/
>
> At 09:57 AM 3/5/2002 -0500, Robin Wendler wrote:
> >One of the general questions that came out of the recent
> >(and soon to be documented, I swear) meeting on technical metadata for
> >audio was about the METS <checksum> attribute of the <file> element.
> >Right now, this is defined explicitly as MD5, but there are now and
> >undoubtedly will continue to be other checksum algorithms in use. We were
> >wondering whether it would be better/possible to generalize this in METS,
> >providing for the checksum type, value, and create date. It seems better
> >to raise this now, rather than after we hit the big Version 1.0.
> >
> >What do others think about this?
> >
> >-- Robin
> >
> >Robin Wendler  ........................     work  (617) 495-3724
> >Office for Information Systems  .......     fax   (617) 495-0491
> >Harvard University Library  ...........     [log in to unmask]
> >Cambridge, MA, USA 02138  .............
>
> MacKenzie Smith
> Associate Director for Technology
> MIT Libraries
> Building 14S-208
> 77 Massachusetts Avenue
> Cambridge, MA  02139
> (617)253-8184
> [log in to unmask]
>
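
The fallback rule suggested in the thread above (an optional checksum type, with MD5 assumed when a checksum is present but no type is given) might be read by software along these lines. This is a hedged sketch in Python: the lowercase attribute names `checksum` and `checksumtype` follow the wording of the messages and are assumptions for illustration, not the attribute names of any released METS schema.

```python
import hashlib
import xml.etree.ElementTree as ET

DEFAULT_ALGORITHM = "MD5"  # assumed when a checksum has no declared type


def read_checksum(file_element):
    """Return (algorithm, value), or None if the element has no checksum."""
    value = file_element.get("checksum")
    if value is None:
        return None
    algorithm = file_element.get("checksumtype", DEFAULT_ALGORITHM)
    return algorithm, value


def verify(file_element, content):
    """Check `content` (bytes) against the element's recorded checksum.

    Returns None when there is nothing to check against.
    """
    info = read_checksum(file_element)
    if info is None:
        return None
    algorithm, value = info
    # Map names such as "SHA-1" to hashlib's spelling ("sha1").
    name = algorithm.lower().replace("-", "")
    return hashlib.new(name, content).hexdigest() == value.lower()


if __name__ == "__main__":
    content = b"hello"
    md5_hex = hashlib.md5(content).hexdigest()
    sha1_hex = hashlib.sha1(content).hexdigest()

    untyped = ET.fromstring(f'<file checksum="{md5_hex}"/>')
    typed = ET.fromstring(f'<file checksum="{sha1_hex}" checksumtype="SHA-1"/>')
    bare = ET.fromstring("<file/>")

    print(read_checksum(untyped)[0], verify(untyped, content))
    print(read_checksum(typed)[0], verify(typed, content))
    print(verify(bare, content))
```

A create date, if one were recorded, would simply be another optional attribute; nothing in the verification step depends on it.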