Some thoughts on file naming for audio materials.

But first, can you clarify whether you are speaking of a numbering
system for physical recordings such as cassettes, or a file naming
system for computer-based digital sound files? It sounds like you are
dealing with a collection of tapes, probably cassettes?

In practice, it is easiest and most efficient to store physically
tangible recordings (we can refer to them as sound "carriers" as
distinguished from what might be called the "sonic content") in simple
numerical sequences. It seems to work best for cassettes, DATs, CDs,
and open reel tapes to have their own separate format-based sequences.
Additions to a collection or series simply get the next highest number
in the sequence and are added at the end. A finding aid (database,
collection inventory, etc.) can enumerate the actual carrier numbers
associated with a given collection or series. This is important and
helps alert users when a collection has recordings in several different
formats.

What you want to avoid is having to write complex identifying numbers
on carrier items and their containers. For one thing, most cassettes
and DATs have very little room for writing on shells and j-cards.
Writing on CDs and CD-Rs should be kept to a minimum because of
potential problems associated with writing directly on CD surfaces. In
theory, a simple unique number is all that should be needed to retrieve
and re-file recording carriers. The number is made unique by the
addition of a format code that can be either a prefix or suffix.

For example:

CT003, etc.

DT003, etc.
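
If the shelflist is kept on a computer, handing out the next number in a
format-based sequence could be sketched roughly as follows in Python. The
CT and DT codes come from the examples above; the three-digit zero
padding and the function names are only assumptions for illustration,
not an established convention.

```python
import re

# A carrier ID is a format-code prefix (letters) followed by a number.
CARRIER_ID = re.compile(r"^([A-Z]+)(\d+)$")


def parse_carrier_id(carrier_id):
    """Split an ID such as 'CT003' into its format code and number."""
    match = CARRIER_ID.match(carrier_id)
    if match is None:
        raise ValueError(f"not a carrier ID: {carrier_id!r}")
    return match.group(1), int(match.group(2))


def next_carrier_id(existing_ids, format_code, width=3):
    """Return the next ID at the end of one format's sequence.

    Each format code has its own sequence, so only IDs carrying the
    requested code are considered. The padding width is an assumption.
    """
    numbers = [
        number
        for code, number in map(parse_carrier_id, existing_ids)
        if code == format_code
    ]
    next_number = max(numbers, default=0) + 1
    return f"{format_code}{next_number:0{width}d}"
```

A new cassette added to a shelflist holding CT001, CT002 and DT001 would
be assigned CT003, while the DT sequence is left untouched.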

There is a strong temptation to include additional clues to the content
by incorporating initials, dates, locations, project names, and so
forth. But these quickly can become unwieldy when dealing with all the
different format types and dimensions out there. One school of thought
suggests that you want all these indicators labeled on your carrier
items because if somehow the recordings were separated from an index or
database, there are still clues as to what the recording is and how to
link it back to other documentation that may exist. While that is, in
theory, a good argument for using a more complex compound numbering
system, I believe that in a library, archives, or other relatively
stable curatorial situation, the likelihood of recordings becoming
irrevocably separated from the master shelflist is rather slim--
assuming that databases and other support files are backed up and
stored offsite as is the recommended practice.

They say recordings collections are only as accessible as the
documentation that exists about them. I feel that a well-maintained
database can contain a wealth of information about the physical
carriers as well as the provenance and content and can point users and
curators easily and quickly to a unique, specific shelf location, so
that complex, compound numbering systems are unnecessary. Even with all
those extra initials, project code abbreviations, dates, etc. written
on the carrier, someone still has to be able to decode what it all
means, and that still falls back on external documentation that is
maintained in a file somewhere.

As for file naming of digital audio files on a computer down to the
track or segment level: Assuming you start with a physical carrier item
to begin with, and assuming that the carrier has a unique number like
DT541 or CT229, it is then easy enough to add on a track or sequential
item number to the file name, for instance: DT541.01. Again, you need
an external database or other type of computer file in which to
maintain information (metadata) about the individual track or segment.
It seems to me that long, compound file names on a computer simply
increase the likelihood of error in naming or searching for files, and
there may be limitations on the syntax of the filename as dictated by
the operating system.
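
To make the idea concrete, here is a minimal Python sketch of short track
names combined with an external metadata record. The two-digit track
padding, the function names, and the dictionary standing in for a
database are all assumptions for illustration; the sample fields simply
echo the kind of information mentioned in the original question.

```python
import re

# A track name is a carrier ID, a dot, and a sequential track number.
TRACK_NAME = re.compile(r"^([A-Z]+\d+)\.(\d+)$")


def track_name(carrier_id, track, width=2):
    """Build a name such as 'DT541.01' from a carrier ID and track number."""
    return f"{carrier_id}.{track:0{width}d}"


def parse_track_name(name):
    """Split 'DT541.01' back into its carrier ID and track number."""
    match = TRACK_NAME.match(name)
    if match is None:
        raise ValueError(f"not a track name: {name!r}")
    return match.group(1), int(match.group(2))


# Hypothetical stand-in for the external database: descriptive details
# live here, keyed by the short track name, rather than in the name itself.
metadata = {
    "DT541.01": {"speaker": "(speaker name)", "content": "tools"},
    "DT541.02": {"speaker": "(speaker name)", "content": "hunting moose"},
}


def describe(name):
    """Look up the metadata record for one track name."""
    parse_track_name(name)  # reject malformed names before the lookup
    return metadata[name]
```

The file name stays short and predictable, and anything a user might
want to search on (speaker, topic, date) is kept in the record rather
than encoded in the name.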

When all is said and done, I have found that, when possible, keeping
things simple in the numbering, naming and labeling department makes
things that much easier to track and manage.

Hope this helps, and naturally I would be interested to hear other
ideas and points of view as well.

Best wishes,

Steve Green
Western Folklife Center
Elko, Nevada


On Nov 3, 2004, at 10:49 AM, Susan Hooyenga wrote:

> I'm posting this for a colleague on a linguistic project in Alaska:
> ------------------
> My question concerns file naming conventions. We are working to create
> an archive of the Dena'ina (Athabascan) Audio Collection, which
> contains a few hundred tapes, and associated transcription and
> alignment files. We need a file naming system for individual audio
> tracks (narratives) that addresses key identification information
> without being too unwieldy or too brief. Some of this information
> includes:
> -the ID number of the original tape in the collection
> -the name (or initials) of the speaker
> -the content of the narrative (ie ''tools'' or ''hunting moose'')
> Our main problem at the moment is deciding which bits of information
> should be part of the file name and which should be included in an
> index or some kind of metadata file.
> We'd very much appreciate any input or direction to any sources of
> information on file naming conventions and audio archiving.
> Andrea Berez
> ------------------
> I'll pass the answers on to Andrea - thanks!
> Susan Hooyenga