At 06:49 AM 6/17/2002 -0700, Merrilee Proffitt wrote:
>Well, so here's a question that reveals my ignorance.  What does TAR'ing
>and ZIP'ing up files do in terms of altering the bits?  Otherwise, I'm
>impressed by the simplicity of this proposal.

TAR'ing does nothing to the bits.  It creates a single file from a group
of files, where each file is stored unaltered as a sequence of 512-byte
blocks (the last one padded with NULs), preceded by a header block in
ASCII with information about that file.  See
http://www.mkssoftware.com/docs/man4/tar.4.asp for details.

GZIP'ing, on the other hand, like all compression algorithms, radically
alters the bit stream; it just does so in a way that still allows you to
recreate the original bit stream exactly.  GZIP uses a variation of the
LZ77 algorithm (Ziv J., Lempel A., "A Universal Algorithm for Sequential
Data Compression," IEEE Transactions on Information Theory, Vol. 23,
No. 3, pp. 337-343, 1977).  A good, relatively clear explanation of the
algorithm can be found at http://www.gzip.org/deflate.html
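
For anyone who wants to see the difference for themselves, here is a
minimal Python sketch using only the standard-library tarfile and gzip
modules.  The file name "object.xml" and the payload are made up for
illustration, and the archive is written in the plain ustar format so
that the header layout is the simple one described above.

```python
import gzip
import io
import tarfile


def make_tar(name: str, payload: bytes) -> bytes:
    """Return an uncompressed ustar archive holding a single file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Hypothetical content standing in for a real file.
payload = b"METS wrapper test data. " * 40
archive = make_tar("object.xml", payload)

# The first 512-byte block is the header; the file name sits at offset 0
# as NUL-padded ASCII.
header = archive[:512]
name_field = header[:100].rstrip(b"\0")

# The file's own bytes follow the header, byte for byte.
stored = archive[512:512 + len(payload)]
tar_preserves_bytes = stored == payload
archive_is_block_aligned = len(archive) % 512 == 0

# gzip rewrites the byte stream, but decompression restores it exactly.
compressed = gzip.compress(archive)
gzip_changes_bytes = payload not in compressed
round_trip_ok = gzip.decompress(compressed) == archive

print(name_field, tar_preserves_bytes, archive_is_block_aligned)
print(gzip_changes_bytes, round_trip_ok)
```

The point of the two checks: the original bytes can be read straight out
of the tar file at a known offset, while in the gzip'ed file they only
reappear after decompression.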


>Thanks,
>
>Merrilee
>
>At 08:32 AM 6/17/2002 -0400, you wrote:
>>On Fri, 14 Jun 2002, Andrew Hubbertz wrote:
>>
>> > I have a second Mets question here.
>> >
>> > In the OAIS Reference Model and the literature derived
>> > from it, there is frequent reference to 'information packages',
>> > which comprise content information and preservation
>> > description information, 'encapsulated and identifiable'
>> > by the packaging information.  In more recent literature,
>> > one hears of 'wrapping' digital objects.
>> >
>>At the Library of Congress, in the Audio-Visual Prototyping Project, we
>>have talked about literally wrapping metadata and bitstreams together,
>>usually thinking about TAR or ZIP or something, but not exactly the Base
>>64 stuff that Jerry wrote about.  We weren't at all sure about "keeping"
>>the encapsulated content that way for the long haul, but saw that it might
>>make sense (application depending) as an interface specification between
>>OAIS modules.  For example, our audio-visual group might have the
>>responsibility thru pre-ingestion, produce a SIP, and send it to a
>>different part of the Library to ingest and manage.  At one point, we
>>thought we might be the ingestors ourselves and actually make AIPs, which
>>we would send to the OAIS "storage and manage" team.  It was in this mood,
>>that our contractor wrote the paper:
>>http://lcweb.loc.gov/rr/mopic/avprot/AIP-Study_v19.pdf (July 2001,
>>somewhat out of date now).  But so far--as Jerry correctly reported--we
>>store the bitstream/essences/files in our UNIX filesystems and the METS
>>data includes pointers to them.  (Which has provoked its own discussion:
>>shall they be located by means of persistent names [like a URN]?)
>>
>>(If you are interested, the menu for my project's family of documents is
>>here:  http://lcweb.loc.gov/rr/mopic/avprot/avlcdocs.html)
>>
>>Best from Carl Fleischhauer
>>Library of Congress

Jerome McDonough
Digital Library Development Team Leader
Elmer Bobst Library, New York University
70 Washington Square South, 8th Floor
New York, NY 10012
(212) 998-2425