----- Original Message -----
From: "Steve Green" <[log in to unmask]>
> Some thoughts on file naming for audio materials.
> But first, can you clarify whether you are speaking of a numbering
> system for physical recordings such as cassettes, or a file naming
> system for computer-based digital sound files? It sounds like you are
> dealing with a collection of tapes, probably cassettes?
> In practice, it is easiest and most efficient to store physically
> tangible recordings (we can refer to them as sound "carriers" as
> distinguished from what might be called the "sonic content") in simple
> numerical sequences. It seems to work best for cassettes, DATs, CDs,
> and open reel tapes to have their own separate format-based sequences.
> Additions to a collection or series simply get the next highest number
> in the sequence and are added at the end. A finding aid (database,
> collection inventory, etc) can enumerate the actual carrier numbers
> associated with a given collection or series. This is important and
> helps alert users when a collection has recordings in several different
> formats.
> What you want to avoid is having to write complex identifying numbers
> on carrier items and their containers. For one thing, most cassettes
> and DATs have very little room for writing on shells and j-cards.
> Writing on CDs and CD-Rs should be kept to a minimum because of
> potential problems associated with writing directly on CD surfaces. In
> theory, a simple unique number is all that should be needed to retrieve
> and re-file recording carriers. The number is made unique by the
> addition of a format code that can be either a prefix or suffix.
> For example
> CT003, etc.
> DT003, etc.
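
To make the format-coded sequence above concrete, here is a minimal Python sketch of "take the next highest number in the sequence." The CT and DT prefixes come from the examples above; the function name next_carrier_id and the three-digit zero padding are assumptions made for illustration, not part of any established scheme.

```python
import re

def next_carrier_id(existing_ids, prefix, width=3):
    """Return the next ID in one format's sequence, e.g. after CT003 comes CT004.

    existing_ids may mix formats; only IDs that start with `prefix`
    followed by digits are counted toward that format's sequence.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    numbers = []
    for carrier_id in existing_ids:
        match = pattern.match(carrier_id)
        if match:
            numbers.append(int(match.group(1)))
    # An empty sequence starts at 1.
    next_number = max(numbers, default=0) + 1
    return f"{prefix}{next_number:0{width}d}"
```

Each format keeps its own counter, so adding a cassette never disturbs the DAT numbering.
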
> There is a strong temptation to include additional clues to the content
> by incorporating initials, dates, locations, project names, and so
> forth. But these quickly can become unwieldy when dealing with all the
> different format types and dimensions out there. One school of thought
> suggests that you want all these indicators labeled on your carrier
> items because if somehow the recordings were separated from an index or
> database, there are still clues as to what the recording is and how to
> link it back to other documentation that may exist. While that is, in
> theory, a good argument for using a more complex compound numbering
> system, I believe that in a library, archives, or other relatively
> stable curatorial situation, the likelihood of recordings becoming
> irrevocably separated from the master shelflist is rather slim--
> assuming that databases and other support files are backed up and
> stored offsite as is the recommended practice.
> They say recordings collections are only as accessible as the
> documentation that exists about them. I feel that a well-maintained
> database can contain a wealth of information about the physical
> carriers as well as the provenance and content and can point users and
> curators easily and quickly to a unique, specific shelf location, so
> that complex, compound numbering systems are unnecessary. Even with all
> those extra initials, project code abbreviations, dates, etc. written
> on the carrier, someone still has to be able to decode what it all
> means, and that still falls back on external documentation that is
> maintained in a file somewhere.
> As for file naming of digital audio files on a computer down to the
> track or segment level: Assuming you start with a physical carrier item
> to begin with, and assuming that the carrier has a unique number like
> DT541 or CT229, it is then easy enough to add on a track or sequential
> item number to the file name, for instance: DT541.01. Again, you need
> an external database or other type of computer file in which to
> maintain information (metadata) about the individual track or segment.
> It seems to me that long, compound file names on a computer simply
> increase the likelihood of error in naming or searching for files, and
> there may be limitations on the syntax of the filename as dictated by
> the operating system.
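
The carrier-plus-track naming described above (DT541.01) could be sketched in Python roughly as follows. The helper names and the two-digit track padding are assumptions for illustration; a real sound file would presumably also carry an audio extension, which this sketch leaves out.

```python
import re

# Carrier prefix (letters), carrier number, then a dot and the track number.
FILE_ID = re.compile(r"^([A-Z]+)(\d+)\.(\d+)$")

def track_file_id(carrier_id, track, width=2):
    """Build a track-level identifier such as DT541.01."""
    return f"{carrier_id}.{track:0{width}d}"

def parse_track_file_id(file_id):
    """Split an identifier back into (format prefix, carrier number, track)."""
    match = FILE_ID.match(file_id)
    if match is None:
        raise ValueError(f"not a carrier/track identifier: {file_id!r}")
    prefix, number, track = match.groups()
    return prefix, int(number), int(track)
```

Because the name carries only the carrier and track numbers, everything else about the segment stays in the external database, keyed on the same identifier.
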
> When all is said and done, I have found that, when possible, keeping
> things simple in the numbering, naming and labeling department makes
> things that much easier to track and manage.
> Hope this helps, and naturally I would be interested to hear other
> ideas and points of view as well.
> Steve Green
> Western Folklife Center
> Elko, Nevada
> On Nov 3, 2004, at 10:49 AM, Susan Hooyenga wrote:
> > I'm posting this for a colleague on a linguistic project in Alaska:
> > My question concerns file naming conventions. We are working to create an
> > archive of the Dena'ina (Athabascan) Audio Collection, which contains a few
> > hundred tapes, and associated transcription and alignment files. We need a
> > file naming system for individual audio tracks (narratives) that addresses
> > key identification information without being too unwieldy or too brief.
> > Some of this information includes:
> > -the ID number of the original tape in the collection
> > -the name (or initials) of the speaker
> > -the content of the narrative (e.g. "tools" or "hunting moose")
> > Our main problem at the moment is deciding which bits of information
> > should be part of the file name and which should be included in an index
> > or some kind of metadata file.
> > We'd very much appreciate any input or direction to any sources of
> > information on file naming conventions and audio archiving.
> > Andrea Berez
I deduce that these are not commercial tapes, but rather tapes recorded
privately by the previous owner of the collection.
If this is the case, you can assign each tape a unique number, which can
then serve as the primary key field in a database containing complete
(or as complete as possible) data on the related tape. This can include
data on when, where and why the tape was recorded, as well as by whom
if more than one source is involved; it can also include physical
data about the tape, as well as information on where the tape is
stored in your facility. This would be similar to a call number on
a book. The unique number should be affixed to both the tape itself and to
the case or other storage container.
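
As a rough sketch of that idea, the table below uses the tape number as the primary key and keeps the descriptive, physical and shelf data in ordinary columns. It uses Python's built-in sqlite3 module; the table name, column names and sample values are all hypothetical placeholders, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database for the sketch
conn.execute("""
    CREATE TABLE tapes (
        tape_no     INTEGER PRIMARY KEY,  -- the number written on tape and case
        recorded_on TEXT,                 -- when
        place       TEXT,                 -- where
        recorded_by TEXT,                 -- by whom
        physical    TEXT,                 -- physical data about the tape
        shelf       TEXT                  -- where it is stored in the facility
    )
""")

# Placeholder record; unknown fields are simply left empty (NULL).
conn.execute(
    "INSERT INTO tapes VALUES (?, ?, ?, ?, ?, ?)",
    (1001, None, None, None, "cassette", "placeholder shelf"),
)

def shelf_for(conn, tape_no):
    """Look up the shelf location for a tape by its unique number."""
    row = conn.execute(
        "SELECT shelf FROM tapes WHERE tape_no = ?", (tape_no,)
    ).fetchone()
    return row[0] if row else None
```

The number on the tape carries no meaning of its own; it only has to match the key in the table.
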
The description suggests these are "spoken word" tapes...in which case the
date and place recorded, the speaker(s) and a brief description of the
content would be the most important data. If a single spoken event (i.e.
one interview or speech) occupies more than one tape, this is when some
sort of "subsystem" (decimal, suffix letter, etc.) should be used. The
event will have its own number, and the sections identified by a
All of this should be tracked in a database file covering all tapes in
the collection (in fact, all tapes in any collection in your holdings).
Thus, if three tapes contain a long interview with one person on one
subject, they could be designated as 1001.01; 1001.02; 1001.03. Since
the tape number is used only to identify it, and to connect tables
in a relational database, it need not use any identification codes;
those would be accessible through viewing that table on the computer.
You can assign a coded number (as is done with call numbers) but
this is not necessary!
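
To illustrate how the 1001.01 style of number can connect tables, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names, the two-digit part suffix and the placeholder speaker and subject values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        event_no INTEGER PRIMARY KEY,   -- e.g. 1001
        speaker  TEXT,
        subject  TEXT
    );
    CREATE TABLE event_tapes (
        event_no INTEGER NOT NULL REFERENCES events(event_no),
        part     INTEGER NOT NULL,      -- 1, 2, 3 for a three-tape interview
        PRIMARY KEY (event_no, part)
    );
""")

conn.execute(
    "INSERT INTO events VALUES (?, ?, ?)",
    (1001, "speaker (placeholder)", "subject (placeholder)"),
)
conn.executemany(
    "INSERT INTO event_tapes VALUES (?, ?)",
    [(1001, 1), (1001, 2), (1001, 3)],
)

def tape_labels(conn, event_no):
    """Return the labels to write on each tape of one event, in order."""
    rows = conn.execute(
        "SELECT event_no, part FROM event_tapes WHERE event_no = ? ORDER BY part",
        (event_no,),
    )
    return [f"{event}.{part:02d}" for event, part in rows]
```

The speaker and subject live only in the events table; the label on each tape is just the key that leads back to them.
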
When you refer to "naming files," do you mean the database files
containing the information, or digital sound files made from the
tapes? In either case, the only information contained in the actual
filename should be that necessary to identify it; this could be
entirely arbitrary, as long as it related to an identifying
Steven C. Barr