Hi Richard:
We'll have to agree to disagree, and hopefully no flaming ensues! ;)
I stick by my point that resources are limited and therefore the only practical solution is
cold-eyed curation by knowledgeable humans.
-- Tom Fine
----- Original Message -----
From: "Richard L. Hess" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Thursday, January 28, 2016 1:34 PM
Subject: Re: [ARSCLIST] Why save all that data? (was Re: [ARSCLIST] LTO vs HDD)
> Tom,
>
> I disagree to some extent. I could not know what I would need from the ARSC List 10 years before I
> needed it. It takes time to curate and cull: a lot of time, unless you are culling, for example,
> technically very flawed images. But even technically mediocre images of my grandmother with me on
> a trip I still remember are ones I want to keep.
>
> And, looking at my mass ingest of images, I could never have easily curated the 60,000 prior to
> scanning. If I had reduced them by half, I would have spent much more time--and made more
> errors--than by having very low-cost labour scan the images in. I could then sort them and at some
> point may decide to delete some, but at this point, it would be a small percentage. Granted, few
> are magazine covers, but many illustrate details and are of interest (at least to me).
>
> It is a tough call. I'm not disagreeing with you about the need. I'm saying for me, the time
> constraints vs. the storage costs weigh heavily in favour of not pre-culling.
>
> Cheers,
>
> Richard
>
>
>
> On 1/28/2016 1:11 PM, Tom Fine wrote:
>> Hi Richard:
>>
>> All of your points are making my argument for the need for human
>> curation and culling. It's simply not wise or possible to keep the whole
>> haystack, of anything. And, if human judgements are made, then human
>> craftsmen can have the necessary resources to do excellent transfers and
>> preservation, which is the best use of finite funding and time resources.
>>
>> -- Tom Fine
>>
>> ----- Original Message ----- From: "Richard L. Hess"
>> <[log in to unmask]>
>> To: <[log in to unmask]>
>> Sent: Thursday, January 28, 2016 11:41 AM
>> Subject: Re: [ARSCLIST] Why save all that data? (was Re: [ARSCLIST] LTO
>> vs HDD)
>>
>>
>>> Hi, Tom,
>>>
>>> This is a very complex question and probably has as many answers as
>>> questioners.
>>>
>>> I was reading an article from Scientific American about medical data
>>> security. It appears that medical data without personal identifiers
>>> has been widely circulated for years. It is useful for long term
>>> (longitudinal) studies to determine things like long term effects of
>>> drugs, radiation exposure, etc. While the person is anonymous, the
>>> point was made that through today's data mining techniques, it is
>>> relatively easy to point back to the individual by comparing
>>> identified records with anonymous records. To be effective, the
>>> randomized data over time needs to be tied to an individual. You can't
>>> compare Joe's 1957 and Jim's 2003 data and achieve useful results. You
>>> have to compare Joe's 1957 and Joe's 2003 data to achieve useful
>>> results. And once that data is tracked over time, even though you do
>>> not know who Joe is, you can figure it out by comparing with other data.
>>>
>>> So, that was the article thesis, but data mining on both sides of
>>> this question (anonymous and named) can create useful societal results.
>>>
>>> Part of the challenge in archives, as I understand it, is learning
>>> what is significant and what is chaff. It is difficult to predict what
>>> will be useful in the future.
>>>
>>> I will provide an example from my life of tape archiving. When I
>>> started to get heavily into this, I subscribed to many of the same
>>> mailing lists that you did. I found I didn't always know what I would
>>> need to know in the future, so I made a point of archiving all the
>>> posts rather than trying to figure out what was of interest and what
>>> was not. I then decided that there were several lists that had high
>>> enough traffic that I couldn't / shouldn't keep up, but I still kept
>>> all the posts. There have been several instances where some search
>>> terms have given me a post from one of those lists that helped answer
>>> a question I was having perhaps half a decade after the original post.
>>>
>>> It is easier and cheaper for me to keep it all (at least with regard to
>>> email) than to spend the time sorting. I spend enough time just
>>> sorting my general inbox every few months to keep it down to under
>>> 1000 messages.
>>>
>>> On the other hand, I do not keep every project I've done.
>>>
>>> From another perspective, my bank used to keep online data for me for
>>> six to eighteen months, depending on account type. They really, really
>>> wanted to stop mailing me paper statements, so one of the perks is
>>> that they keep the data now for seven years--conveniently the time I
>>> need to maintain records for the tax man. If I think of it, I will
>>> download my year's credit card and main checking account information,
>>> not so much to preserve it (I trust the bank to do that better than I
>>> can), but rather so that at tax time I can review all the charges, then
>>> search for the emailed invoice for many and make certain I have all
>>> the correct items to deduct as business expenses. Much faster, and
>>> saves space in file drawers and ultimately on storage shelves. Up
>>> until this year, we had been saving a "book box" or 10-ream paper box
>>> worth of paper data for accounting each year.
>>>
>>> Digital images the same: I generally keep most of what I shoot as I
>>> don't always shoot with a specific purpose other than, I LIKE THIS.
>>> So, different versions of the same image may pertain better than
>>> others to a later desire to create.
>>>
>>> While still in the 1-2 TB class, we are starting to see our local
>>> historical society's storage needs increase (10 years worth of data
>>> was kept on a 320 GB HDD along with the computer's OS and program
>>> files). I now have a 4 TB RAID-6 NAS unit there as we are adding
>>> video interviews that are part of our historic collection -- and
>>> retrospectively digitizing audio and video.
>>>
>>> Cheers,
>>>
>>> Richard
>>>
>>>
>>>
>>>
>>> On 1/28/2016 6:46 AM, Tom Fine wrote:
>>>> Which brings up the bigger issue -- do they need to keep all that data?
>>>> We've just seen what happens when the government "security" forces build
>>>> up a huge haystack -- they missed the needle in San Bernardino. I'm not
>>>> convinced about capturing huge amounts of any data, just to keep it.
>>>> This has been my argument about accumulating vs. collecting and
>>>> archiving. Just because something was put to media doesn't mean it's
>>>> worth preserving. In a world of limited resources, decisions need to be
>>>> made by humans as to what is worthy of the efforts and money involved in
>>>> digital preservation and storage. I would say the same is true of all
>>>> data. In our world, those decisions are by nature aesthetic and
>>>> sometimes political. It's how human culture works, it's constantly
>>>> curating the past and making value judgements, and media archives are
>>>> really just cultural archives. I think all of this will be even harder
>>>> in a generation or so, because so much media is being created and thrown
>>>> online every day. There's no effort involved in the "releasing"
>>>> mechanism anymore -- you just throw your production online and see if
>>>> it sticks. Your "production" can be an artistic work such as music,
>>>> fictional video, fictional writing, an artfully crafted documentary, or
>>>> it can be pure opinion or noise or something in between. Any of it can
>>>> be "released" through the same filterless, no-cost mechanisms. In the
>>>> pre-Internet days, when mass media was manufactured physical media,
>>>> humans had to decide what projects to fund through the releasing
>>>> mechanisms, so some curating was taking place from the get-go.
>>>>
>>>> I know, I took this in a whole new direction ...I changed the subject
>>>> line. I'm interested if this line of thought is being addressed in
>>>> archiving circles and in schools where archivists learn their craft. My
>>>> thesis is that creation of data is easier and cheaper than ever, but
>>>> there are still major costs to archiving and storage, and thus more than
>>>> ever we need skilled curators to cut through the noise and garbage and
>>>> preserve what will matter in 100 years. For pre-digital stuff, there are
>>>> even greater preservation costs in time and money, so I think more
>>>> curating is necessary.
>>>>
>>>> -- Tom Fine
>>>>
>>>> ----- Original Message ----- From: "Corey Bailey"
>>>> <[log in to unmask]>
>>>> To: <[log in to unmask]>
>>>> Sent: Wednesday, January 27, 2016 8:57 PM
>>>> Subject: Re: [ARSCLIST] LTO vs HDD
>>>>
>>>>
>>>>> Hi Tom,
>>>>>
>>>>> The answer is relatively simple: Money
>>>>> You and I think about storage in terms of a terabyte or two. General
>>>>> Motors and corporations of that size have to think in terms of
>>>>> multiple petabytes. LTO becomes the least expensive method. After the
>>>>> data is on the tape, verification and migration are done robotically.
>>>>>
>>>>> Those that are considering LTO need to know that the format (drives,
>>>>> etc.) is only backward compatible for two generations and LTO-7 is on
>>>>> the horizon.
>>>>>
>>>>> Cheers!
>>>>>
>>>>> Corey
>>>>> Corey Bailey Audio Engineering
>>>>> www.baileyzone.net
>>>>>
>>>>> On 1/27/2016 4:36 PM, Tom Fine wrote:
>>>>>> <SNIP>
>>>>>> Could someone explain why a somewhat antiquated magnetic tape-based
>>>>>> storage system is preferable to several copies across several hard
>>>>>> drives? I just can't see any sense in using tape systems anymore for
>>>>>> data security, but I'm not a computer-storage expert, just a guy who
>>>>>> stores a lot of data.
>>>>>>
>>>>>> -- Tom Fine
>>>>>>
>>>>>> ----- Original Message ----- From: "Hood, Mark" <[log in to unmask]>
>>>>>> To: <[log in to unmask]>
>>>>>> Sent: Wednesday, January 27, 2016 6:41 PM
>>>>>> Subject: Re: [ARSCLIST] LTO vs HDD
>>>>>>
>>>>>>
>>>>>> Hi Richard,
>>>>>>
>>>>>> Thanks as always for sharing your experience and insights on all of
>>>>>> these topics.
>>>>>>
>>>>>> Would you be comfortable sharing the make and model of the RAID-6 NAS
>>>>>> units you are using, and any comments about how well they have
>>>>>> performed to your expectations?
>>>>>>
>>>>>> Thanks,
>>>>>> Mark
>>>>>>
>>>>>> Mark Hood
>>>>>> Associate Professor of Music
>>>>>> Department of Recording Arts
>>>>>> IU Jacobs School of Music
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 1/27/16, 3:36 PM, "Association for Recorded Sound Discussion List
>>>>>> on behalf of Richard L. Hess" <[log in to unmask] on behalf of
>>>>>> [log in to unmask]> wrote:
>>>>>>
>>>>>>> Hi, All,
>>>>>>>
>>>>>>> I saw this thread and was going to ignore it, but decided not to once
>>>>>>> I found out that RDX was HDD-in-an-otterbox--merci, Henri, and thanks
>>>>>>> for the image, Lou. Otters are wonderful--see "Ring of Bright Water"
>>>>>>> (the book) and Point Lobos State Park.
>>>>>>>
>>>>>>> LTO was around while I was still doing broadcast consulting and, at
>>>>>>> the time (late 1990s, early 2000s).
>>>>>>>
>>>>>>> I struggled long and hard about how to store things and realized if I
>>>>>>> were going to become involved with LTO, I would need two drives (how
>>>>>>> else can you be even remotely certain that your tapes are readable
>>>>>>> once your single drive dies? I certainly saw that in the early days of
>>>>>>> PC tape backup). At that point, the cost becomes excessive.
>>>>>>>
>>>>>>> My philosophy now is: Any data I want to keep does not live solely on
>>>>>>> a PC.
>>>>>>>
>>>>>>> I have two in-house RAID-6 NAS units, one backing up the other, and an
>>>>>>> ammo case of 2.5-inch HDDs off-site (2 TB 2.5-inch USB 3.0 drives are
>>>>>>> pretty economical these days and are USB-powered).
>>>>>>>
>>>>>>> One son has been migrated to the cloud where Dropbox backs up and
>>>>>>> mirrors his two on-site laptops. Here, I harvest all new files (but
>>>>>>> not updates, to prevent pollution of existing files) and store them on
>>>>>>> my RAID-6 NAS units to protect against a Dropbox failure or hacking.
>>>>>>> The other son will do it soon, but the first one is potentially going
>>>>>>> far away to school next fall for his Masters (Wichita and Edmonton are
>>>>>>> on the list) so I wanted to get some closer-in history with the
>>>>>>> system.
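
The "harvest all new files, but not updates" idea in the paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tooling actually used in the post: the function name `harvest_new_files` and the two directory arguments are hypothetical, and the source folder is assumed to be a locally synced copy of the cloud storage.

```python
import shutil
from pathlib import Path


def harvest_new_files(source: Path, archive: Path) -> list[Path]:
    """Copy files that exist under source but not yet under archive.

    Files already present in the archive are never overwritten, so a
    damaged or tampered update on the source side cannot replace the
    archived copy. (Hypothetical helper, for illustration only.)
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dest = archive / src.relative_to(source)
        if dest.exists():
            continue  # skip updates: keep the archived version as-is
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

The trade-off of this design is that legitimate edits to an existing file are never picked up; the archive keeps the first version it saw.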
>>>>>>>
>>>>>>> RAID-6 allows the failure of any two disks without losing data and
>>>>>>> the data does not have to be chopped up into 1 or 2 TB chunks as it
>>>>>>> does with individual HDDs.
>>>>>>>
>>>>>>> I do not keep CF/SD cards; I copy and verify the copy and then
>>>>>>> recycle them.
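
One common way to "copy and verify the copy" before reusing a card is to compare checksums of the original and the copy. The sketch below assumes SHA-256 and uses hypothetical names (`sha256_of`, `copy_and_verify`); the post does not say which verification method was actually used.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src: Path, dest: Path) -> bool:
    """Copy src to dest, then confirm both files hash identically."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)
    return sha256_of(src) == sha256_of(dest)
```

Only when `copy_and_verify` returns True would the card be considered safe to reuse. Note that a read-back immediately after copying may be served from the operating system's cache rather than from the destination disk, so a later re-check of the stored copy is a reasonable extra precaution.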
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Richard
>>>>>>>
>>>>>>> --
>>>>>>> Richard L. Hess email: [log in to unmask]
>>>>>>> Aurora, Ontario, Canada 647 479 2800
>>>>>>> http://www.richardhess.com/tape/contact.htm
>>>>>>> Quality tape transfers -- even from hard-to-play tapes.
>>>>>>
>>>>>
>>>>>
>>>>