I think Steven Barr's comments and my own earlier ones were both
addressing mainly the issue of assigning "call numbers" to physical
items, in this case tapes. Susan has now clarified that she and her
colleague are dealing with computer-based file naming issues rather
than tape numbering and labeling, so I apologize if I misunderstood
the original question and started drifting off on a tangent. I continue
to think that simplicity is best in any numbering or filing scheme,
whether on the shelf or on a computer, but I'll let someone else take a
stab at offering specific suggestions regarding the digital ID numbers
and keeping the audio files linked to the transcripts.
Susan, thanks for explaining further what you are working on there.
Good luck and best wishes,
Western Folklife Center
On Nov 3, 2004, at 1:03 PM, Susan Hooyenga wrote:
> Hi Steve,
> Thank you for your help, and I'm sorry we weren't clear about what sort
> of items we were naming. The tapes already have call numbers assigned by
> the Native Language Center; when they were digitized, they were assigned
> ID numbers, which may or may not be kept in the next phase of the
> project. Right now Andrea is working on digital transcripts and
> time-alignment files, so we're trying to figure out the best scheme for
> identifying all of the files.
> Thanks again,
> Susan Hooyenga
> Quoting Steve Green:
>> Some thoughts on file naming for audio materials.
>> But first, can you clarify whether you are speaking of a numbering
>> system for physical recordings such as cassettes, or a file naming
>> system for computer-based digital sound files? It sounds like you are
>> dealing with a collection of tapes, probably cassettes?
>> In practice, it is easiest and most efficient to store physically
>> tangible recordings (we can refer to them as sound "carriers" as
>> distinguished from what might be called the "sonic content") in simple
>> numerical sequences. It seems to work best for cassettes, DATs, CDs,
>> and open reel tapes to have their own separate format-based sequences.
>> Additions to a collection or series simply get the next highest number
>> in the sequence and are added at the end. A finding aid (database,
>> collection inventory, etc) can enumerate the actual carrier numbers
>> associated with a given collection or series. This is important and
>> helps alert users when a collection has recordings in several formats.
>> What you want to avoid is having to write complex identifying numbers
>> on carrier items and their containers. For one thing, most cassettes
>> and DATs have very little room for writing on shells and j-cards.
>> Writing on CDs and CD-Rs should be kept to a minimum because of
>> potential problems associated with writing directly on CD surfaces. In
>> theory, a simple unique number is all that should be needed to locate
>> and re-file recording carriers. The number is made unique by the
>> addition of a format code that can be either a prefix or suffix.
>> For example
>> CT003, etc.
>> DT003, etc.
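
As a rough illustration of the scheme described above (a format code plus a simple running number, with new items taking the next number in their own format's sequence), here is a minimal Python sketch. The codes CT and DT and the three-digit padding are taken from the examples in the message; the function name, the two-letter pattern, and the sample shelf list are assumptions made up for illustration, not an established tool.

```python
import re

# Matches IDs such as CT003: a two-letter format code, then a running number.
# The two-letter/digits pattern is an assumption based on the examples above.
CARRIER_ID = re.compile(r"^([A-Z]{2})(\d+)$")


def next_carrier_id(existing, format_code, width=3):
    """Return the next ID in the sequence for one format code (hypothetical helper)."""
    highest = 0
    for carrier_id in existing:
        match = CARRIER_ID.match(carrier_id)
        # Each format keeps its own sequence, so other codes are ignored.
        if match and match.group(1) == format_code:
            highest = max(highest, int(match.group(2)))
    return f"{format_code}{highest + 1:0{width}d}"


# Made-up shelf list for illustration only.
shelf = ["CT001", "CT002", "CT003", "DT001"]
print(next_carrier_id(shelf, "CT"))
print(next_carrier_id(shelf, "DT"))
```

With the made-up shelf list above, a new item in the CT sequence would be numbered after CT003, and the DT sequence advances independently of it.
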
>> There is a strong temptation to include additional clues to the content
>> by incorporating initials, dates, locations, project names, and so
>> forth. But these quickly can become unwieldy when dealing with all the
>> different format types and dimensions out there. One school of thought
>> suggests that you want all these indicators labeled on your carrier
>> items because if somehow the recordings were separated from an index
>> database, there are still clues as to what the recording is and how to
>> link it back to other documentation that may exist. While that is, in
>> theory, a good argument for using a more complex compound numbering
>> system, I believe that in a library, archives, or other relatively
>> stable curatorial situation, the likelihood of recordings becoming
>> irrevocably separated from the master shelflist is rather slim--
>> assuming that databases and other support files are backed up and
>> stored offsite as is the recommended practice.
>> They say recordings collections are only as accessible as the
>> documentation that exists about them. I feel that a well-maintained
>> database can contain a wealth of information about the physical
>> carriers as well as the provenance and content and can point users and
>> curators easily and quickly to a unique, specific shelf location, so
>> that complex, compound numbering systems are unnecessary. Even with
>> those extra initials, project code abbreviations, dates, etc. written
>> on the carrier, someone still has to be able to decode what it all
>> means, and that still falls back on external documentation that is
>> maintained in a file somewhere.
>> As for file naming of digital audio files on a computer down to the
>> track or segment level: Assuming you start with a physical carrier
>> to begin with, and assuming that the carrier has a unique number like
>> DT541 or CT229, it is then easy enough to add on a track or sequential
>> item number to the file name, for instance: DT541.01. Again, you need
>> an external database or other type of computer file in which to
>> maintain information (metadata) about the individual track or segment.
>> It seems to me that long, compound file names on a computer simply
>> increase the likelihood of error in naming or searching for files, and
>> there may be limitations on the syntax of the filename as dictated by
>> the operating system.
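
One possible way to picture the approach above, with a short file name built from the carrier number plus a track number and everything else held in a separate index, is the following Python sketch. The DT541.01 pattern comes from the message; the .wav and .txt extensions, the column names, the "XX" speaker placeholder, and the helper names are all assumptions chosen for illustration. The topics are the two examples given in the original question.

```python
import csv
import io


def track_stem(carrier_id, track_number):
    """Build a short ID such as DT541.01 from a carrier number and track number."""
    return f"{carrier_id}.{track_number:02d}"


def index_row(carrier_id, track_number, speaker, topic):
    """One index record; the audio and transcript files share the same stem."""
    stem = track_stem(carrier_id, track_number)
    return {
        "id": stem,
        "audio": stem + ".wav",        # extension is an assumption
        "transcript": stem + ".txt",   # extension is an assumption
        "speaker": speaker,            # kept in the index, not in the file name
        "topic": topic,                # kept in the index, not in the file name
    }


# Placeholder records for illustration only ("XX" stands in for real initials).
rows = [
    index_row("DT541", 1, "XX", "tools"),
    index_row("DT541", 2, "XX", "hunting moose"),
]

# Write the index as CSV text; a real project might use a database instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Because the audio file and its transcript share one stem, they stay linked by name alone, while speaker and topic can be corrected in the index without renaming any files.
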
>> When all is said and done, I have found that, when possible, keeping
>> things simple in the numbering, naming and labeling department makes
>> things that much easier to track and manage.
>> Hope this helps, and naturally I would be interested to hear other
>> ideas and points of view as well.
>> Best wishes,
>> Steve Green
>> Western Folklife Center
>> Elko, Nevada
>> On Nov 3, 2004, at 10:49 AM, Susan Hooyenga wrote:
>>> I'm posting this for a colleague on a linguistic project in Alaska:
>>> My question concerns file naming conventions. We are working to create
>>> an archive of the Dena'ina (Athabascan) Audio Collection, which
>>> contains a few hundred tapes and associated transcription and
>>> alignment files. We need a file naming system for individual audio
>>> tracks (narratives) that addresses identification information without
>>> being too unwieldy or too brief. Some of this information includes:
>>> -the ID number of the original tape in the collection
>>> -the name (or initials) of the speaker
>>> -the content of the narrative (e.g. "tools" or "hunting moose")
>>> Our main problem at the moment is deciding which bits of information
>>> should be part of the file name and which should be included in an
>>> index or some kind of metadata file.
>>> We'd very much appreciate any input or direction to any sources of
>>> information on file naming conventions and audio archiving.
>>> Andrea Berez
>>> I'll pass the answers on to Andrea - thanks!
>>> Susan Hooyenga
>>> This message was sent using IMP, the Internet Messaging Program.