Tom,

I disagree to some extent. I could not know what I would need from the 
ARSC List 10 years before I needed it. It takes time to curate and cull, 
and a lot of it, unless you are culling, for example, technically very 
flawed images. Even technically weak images of my grandmother with me 
on a trip I still remember are ones I want to keep.

And, looking at my mass ingest of images, I could never have curated the 
60,000 easily prior to scanning. Had I reduced them by half, I would 
have spent much more time--and made more errors--than by having very 
low-cost labour scan the images in. I can then sort them and at some 
point may decide to delete some, but at this point, it would be a small 
percentage. Granted, few are magazine covers, but many illustrate 
details and are of interest (at least to me).

It is a tough call. I'm not disagreeing with you about the need. I'm 
saying that, for me, the time constraints vs. the storage costs weigh 
heavily in favour of not pre-culling.

Cheers,

Richard



On 1/28/2016 1:11 PM, Tom Fine wrote:
> Hi Richard:
>
> All of your points are making my argument for the need for human
> curation and culling. It's simply not wise or possible to keep the whole
> haystack, of anything. And, if human judgements are made, then human
> craftsmen can have the necessary resources to do excellent transfers and
> preservation, which is the best use of finite funding and time resources.
>
> -- Tom Fine
>
> ----- Original Message ----- From: "Richard L. Hess"
> <[log in to unmask]>
> To: <[log in to unmask]>
> Sent: Thursday, January 28, 2016 11:41 AM
> Subject: Re: [ARSCLIST] Why save all that data? (was Re: [ARSCLIST] LTO
> vs HDD)
>
>
>> Hi, Tom,
>>
>> This is a very complex question and probably has as many answers as
>> questioners.
>>
>> I was reading an article from Scientific American about medical data
>> security. It appears that medical data without personal identifiers
>> has been widely circulated for years. It is useful for long term
>> (longitudinal) studies to determine things like long term effects of
>> drugs, radiation exposure, etc. While the person is anonymous, the
>> point was made that through today's data mining techniques, it is
>> relatively easy to point back to the individual by comparing
>> identified records with anonymous records. To be effective, the
>> randomized data over time needs to be tied to an individual. You can't
>> compare Joe's 1957 and Jim's 2003 data and achieve useful results. You
>> have to compare Joe's 1957 and Joe's 2003 data to achieve useful
>> results and with tracking that data over time, even though you do not
>> know who Joe is, by comparing with other data, you can figure it out.
>>
>> So, that was the article thesis, but data mining on both sides of
>> this question (anonymous and named) can create useful societal results.
>>
>> Part of the challenge in archives, as I understand it, is learning
>> what is significant and what is chaff. It is difficult to predict what
>> will be useful in the future.
>>
>> I will provide an example from my life of tape archiving. When I
>> started to get heavily into this, I subscribed to many of the same
>> mailing lists that you did. I found I didn't always know what I would
>> need to know in the future, so I made a point of archiving all the
>> posts rather than trying to figure out what was of interest and what
>> was not. I then decided that there were several lists that had high
>> enough traffic that I couldn't / shouldn't keep up, but I still kept
>> all the posts. There have been several instances where some search
>> terms have given me a post from one of those lists that helped answer
>> a question I was having perhaps half a decade after the original post.
>>
>> It is easier and cheaper for me to keep it all (at least in regards to
>> email) than to spend the time sorting. I spend enough time just
>> sorting my general inbox every few months to keep it down to under
>> 1000 messages.
>>
>> On the other hand, I do not keep every project I've done.
>>
>> From another perspective, my bank used to keep online data for me for
>> six to eighteen months, depending on account type. They really, really
>> wanted to stop mailing me paper statements, so one of the perks is
>> that they keep the data now for seven years--conveniently the time I
>> need to maintain records for the tax man. If I think of it, I will
>> download my year's credit card and main checking account information,
>> not so much to preserve it, I trust the bank to do that better than I
>> can, but rather now at tax time, I can review all the charges, then
>> search for the emailed invoice for many and make certain I have all
>> the correct items to deduct as business expenses. Much faster, and
>> saves space in file drawers and ultimately on storage shelves. Up
>> until this year, we had been saving a "book box" or 10-ream paper box
>> worth of paper data for accounting each year.
>>
>> Digital images the same: I generally keep most of what I shoot as I
>> don't always shoot with a specific purpose other than, I LIKE THIS.
>> So, different versions of the same image may pertain better than
>> others to a later desire to create.
>>
>> While still in the 1-2 TB class, we are starting to see our local
>> historical society's storage needs increase (10 years worth of data
>> was kept on a 320 GB HDD along with the computer's OS and program
>> files). I now have a 4 TB RAID-6 NAS unit there as we are adding
>> video interviews that are part of our historic collection -- and
>> retrospectively digitizing audio and video.
>>
>> Cheers,
>>
>> Richard
>>
>>
>>
>>
>> On 1/28/2016 6:46 AM, Tom Fine wrote:
>>> Which brings up the bigger issue -- do they need to keep all that data?
>>> We've just seen what happens when the government "security" forces build
>>> up a huge haystack -- they missed the needle in San Bernardino. I'm not
>>> convinced about capturing huge amounts of any data, just to keep it.
>>> This has been my argument about accumulating vs. collecting and
>>> archiving. Just because something was put to media doesn't mean it's
>>> worth preserving. In a world of limited resources, decisions need to be
>>> made by humans as to what is worthy of the efforts and money involved in
>>> digital preservation and storage. I would say the same is true of all
>>> data. In our world, those decisions are by nature aesthetic and
>>> sometimes political. It's how human culture works, it's constantly
>>> curating the past and making value judgements, and media archives are
>>> really just cultural archives. I think all of this will be even harder
>>> in a generation or so, because so much media is being created and thrown
>>> online every day. There's no effort involved in the "releasing"
>>> mechanism anymore -- you just throw your production online and see if
>>> it sticks. Your "production" can be an artistic work such as music,
>>> fictional video, fictional writing, an artfully crafted documentary, or
>>> it can be pure opinion or noise or something in between. Any of it can
>>> be "released" through the same filterless, no-cost mechanisms. In the
>>> pre-Internet days, when mass media was manufactured physical media,
>>> humans had to decide what projects to fund through the releasing
>>> mechanisms, so some curating was taking place from the get-go.
>>>
>>> I know, I took this in a whole new direction ...I changed the subject
>>> line. I'm interested if this line of thought is being addressed in
>>> archiving circles and in schools where archivists learn their craft. My
>>> thesis is that creation of data is easier and cheaper than ever, but
>>> there are still major costs to archiving and storage, and thus more than
>>> ever we need skilled curators to cut through the noise and garbage and
>>> preserve what will matter in 100 years. For pre-digital stuff, there are
>>> even greater preservation costs in time and money, so I think more
>>> curating is necessary.
>>>
>>> -- Tom Fine
>>>
>>> ----- Original Message ----- From: "Corey Bailey"
>>> <[log in to unmask]>
>>> To: <[log in to unmask]>
>>> Sent: Wednesday, January 27, 2016 8:57 PM
>>> Subject: Re: [ARSCLIST] LTO vs HDD
>>>
>>>
>>>> Hi Tom,
>>>>
>>>> The answer is relatively simple: Money
>>>> You and I think about storage in terms of a Terabyte or two. General
>>>> Motors and corporations of that size have to think in terms of
>>>> multiple Peta-bytes. LTO becomes the least expensive method. After the
>>>> data is on the tape, verification and migration are done robotically.
>>>>
>>>> Those that are considering LTO need to know that the format (drives,
>>>> etc.) is only backward compatible for two generations and LTO-7 is on
>>>> the horizon.
>>>>
>>>> Cheers!
>>>>
>>>> Corey
>>>> Corey Bailey Audio Engineering
>>>> www.baileyzone.net
>>>>
>>>> On 1/27/2016 4:36 PM, Tom Fine wrote:
>>>>> <SNIP>
>>>>> Could someone explain why a somewhat antiquated magnetic tape-based
>>>>> storage system is preferable to several copies across several hard
>>>>> drives? I just can't see any sense in using tape systems anymore for
>>>>> data security, but I'm not a computer-storage expert, just a guy who
>>>>> stores a lot of data.
>>>>>
>>>>> -- Tom Fine
>>>>>
>>>>> ----- Original Message ----- From: "Hood, Mark" <[log in to unmask]>
>>>>> To: <[log in to unmask]>
>>>>> Sent: Wednesday, January 27, 2016 6:41 PM
>>>>> Subject: Re: [ARSCLIST] LTO vs HDD
>>>>>
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Thanks as always for sharing your experience and insights on all of
>>>>> these topics.
>>>>>
>>>>> Would you be comfortable sharing the make and model of the RAID-6 NAS
>>>>> units you are using, and any comments about how well they have
>>>>> performed to your expectations?
>>>>>
>>>>> Thanks,
>>>>> Mark
>>>>>
>>>>> Mark Hood
>>>>> Associate Professor of Music
>>>>> Department of Recording Arts
>>>>> IU Jacobs School of Music
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 1/27/16, 3:36 PM, "Association for Recorded Sound Discussion
>>>>> List on
>>>>> behalf of Richard L. Hess" <[log in to unmask] on behalf of
>>>>> [log in to unmask]> wrote:
>>>>>
>>>>>> Hi, All,
>>>>>>
>>>>>> I saw this thread and was going to ignore it, but decided not to
>>>>>> once I found out that RDX was HDD-in-an-otterbox. Merci, Henri, and
>>>>>> thanks for the image, Lou. Otters are wonderful--see "Ring of Bright
>>>>>> Water" (the book) and Point Lobos State Park.
>>>>>>
>>>>>> LTO was around while I was still doing broadcast consulting and,
>>>>>> at the
>>>>>> time (late 1990s, early 2000s).
>>>>>>
>>>>>> I struggled long and hard about how to store things and realized if I
>>>>>> were going to become involved with LTO, I would need two drives (how
>>>>>> else can you be even remotely certain that your tapes are readable
>>>>>> once your single drive dies?--I certainly saw that in the early days
>>>>>> of PC tape backup). At that point, the cost becomes excessive.
>>>>>>
>>>>>> My philosophy now is: Any data I want to keep does not live solely
>>>>>> on a PC.
>>>>>>
>>>>>> I have two in-house RAID-6 NAS units, one backing up the other; an
>>>>>> ammo case of 2.5-inch HDDs off-site (2 TB 2.5-inch USB 3.0 drives are
>>>>>> pretty economical these days and are USB-powered).
>>>>>>
>>>>>> One son has been migrated to the cloud where Dropbox backs up and
>>>>>> mirrors his two on-site laptops. Here, I harvest all new files (but
>>>>>> not updates, to prevent pollution of existing files) and store them
>>>>>> on my RAID-6 NAS units to protect against a Dropbox failure or
>>>>>> hacking. The other son will do it soon, but the first one is
>>>>>> potentially going far away to school next fall for his Masters
>>>>>> (Wichita and Edmonton are on the list) so I wanted to get some
>>>>>> closer-in history with the system.
>>>>>>
>>>>>> RAID-6 allows the failure of any two disks without losing data, and
>>>>>> the data does not have to be chopped up into 1 or 2 TB chunks as it
>>>>>> does with individual HDDs.
>>>>>>
>>>>>> I do not keep CF/SD cards; I copy, verify the copy, and then recycle
>>>>>> them.
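
[Illustrative sketch, not part of the original message.] The
copy-then-verify step for CF/SD cards described above could look roughly
like the following. The thread does not say which tool or checksum is
actually used, so the SHA-256 comparison and the names sha256_of,
copy_and_verify, card_dir and archive_dir are all assumptions made for
this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(card_dir: Path, archive_dir: Path) -> list[Path]:
    """Copy every file under card_dir into archive_dir, then re-read both
    sides and compare checksums.

    Returns the source files whose copies did NOT verify. An empty list
    means every copy matched its original.
    """
    failures = []
    for src in sorted(p for p in card_dir.rglob("*") if p.is_file()):
        dst = archive_dir / src.relative_to(card_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies data and file timestamps
        if sha256_of(src) != sha256_of(dst):
            failures.append(src)
    return failures
```

A card would only be reformatted and reused when copy_and_verify returns
an empty list. One caveat: reading a file back immediately after writing
it may be served from the operating system's cache rather than the disk,
so a cautious workflow might repeat the check later or after remounting
the destination.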
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Richard
>>>>>>
>>>>>> --
>>>>>> Richard L. Hess                   email: [log in to unmask]
>>>>>> Aurora, Ontario, Canada                             647 479 2800
>>>>>> http://www.richardhess.com/tape/contact.htm
>>>>>> Quality tape transfers -- even from hard-to-play tapes.
>>>>>
>>>>
>>>>
>>>
>> --
>> Richard L. Hess                   email: [log in to unmask]
>> Aurora, Ontario, Canada                             647 479 2800
>> http://www.richardhess.com/tape/contact.htm
>> Quality tape transfers -- even from hard-to-play tapes.
>>
>>
>
-- 
Richard L. Hess                   email: [log in to unmask]
Aurora, Ontario, Canada                             647 479 2800
http://www.richardhess.com/tape/contact.htm
Quality tape transfers -- even from hard-to-play tapes.