MSU response to 'channels'

We keep the notions of "channels" and "tracks" separate. We have not really addressed the issue of "tracks" in our audio extension. We define "tracks" as individual portions of a recording (analog or digital) prior to mixing (mastering), while "channels" refers to a signal that has already been processed, either as simple mono/stereo or with other psychoacoustic processing, such as HRTF (head-related transfer function). The resulting digital audio file will have a certain number and configuration of channels (mono, stereo, 5.1, Dolby Digital (AC-3), Dolby ProLogic, etc.). For the sake of simplicity, we initially decided to use just one attribute, "channels", and to limit its possible values with a closed set of controlled vocabulary. We agree, however, that it may be good to provide a finer-grained description of "channels"; for instance, by using descriptions such as "5.1 AC-3", one can collapse the LC channel-track-quantity and sound-field attributes.

As far as multi-track recordings are concerned, we think of them as consisting of individual digital files. This is, indeed, how they are represented in the digital domain. With multi-track digital recordings, we are dealing with individual files (one for each track) and a metadata file that groups them together (synchronizes them) and contains various processing information, such as volume, pan, effect automation, etc. We have not accounted for that in our proposed extension. However, we agree that there is a need to do so. The LC proposed attributes do not seem to represent the multi-track system adequately. Perhaps we should come up with a new set of fields for such recordings? I can see, for instance, many situations in which we will have to digitize a 16-track analog recording and digitally represent the mixing information. Have you dealt with such recordings yet?
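To make the multi-track idea concrete, here is a minimal Python sketch of the kind of grouping metadata described above: one record per track file, plus a session object that ties them together with mix information. All class and field names here are hypothetical illustrations, not part of any proposed extension.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Track:
    """One digitized track of a multi-track recording (hypothetical)."""
    filename: str          # individual digital file, one per track
    volume: float = 1.0    # linear gain applied at mix time
    pan: float = 0.0       # -1.0 (full left) .. 1.0 (full right)

@dataclass
class MultiTrackSession:
    """Metadata record that groups and synchronizes the track files."""
    title: str
    tracks: List[Track] = field(default_factory=list)

# e.g. digitizing a 16-track analog original: sixteen files, one manifest
session = MultiTrackSession("Field recording, reel 7")
for n in range(16):
    session.tracks.append(Track(f"reel7_track{n + 1:02d}.wav"))
```

A fuller version would also need synchronization offsets and effect-automation data, which is exactly the information the current attribute set has no place for.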
MSU response to 'bitrate'

We think that "variable bitrate" is a possible value of the "bitrate" attribute, and may not require a separate field.

MSU response to 'filetype' vs. 'fileformat'

This is a very interesting question. Indeed, the file type can be encoded as a MIME type, though we'd much rather keep it in the audio extension. File type is not the same as "internet_media_type". What we understand by "filetype" is, basically, the way in which data is stored. It is a broader concept than the MIME type. It is often referred to in the literature as "file format", hence, probably, the confusion. The label is, of course, not important, and can be changed to something less ambiguous.

Let me give an example of what we mean by "filetype". In addition to raw audio data (individual sample values), audio file types also contain control data. For example, a file can contain an edit decision list with timecode and crossfade information, as well as some processing data (e.g., equalization). Many such types use an introductory header that contains information such as sample rate, bit depth, number of channels, compression, etc. Mac files, for instance, use a two-part structure with a data fork and a resource fork; audio can be stored in either fork. The raw type that you asked about contains only audio data, with no header. It is a very popular format among audio engineers and speech scientists. Its header information must be stored in a separate metadata file.

What we mean by "fileformat" is the possible ways in which a particular file type can be encoded. For example, the AIFF type supports many formats of compressed and uncompressed data; the AIFF-C file format is a version of AIFF that allows for compressed data, and several different types of compression can be used, including MACE and u-law. The WAV file type is very similar, as it can comprise different formats. For instance, a WAV file can contain data encoded as PCM, MP3, or ATRAC (MiniDisc compression).
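The header point can be demonstrated with Python's standard `wave` module: a WAV file describes itself (sample rate, bit depth, channel count) in its header, whereas a headerless raw file holding the same sample bytes would need that information supplied in a separate metadata record. This is just an illustrative sketch, not part of the proposed extension.

```python
import io
import struct
import wave

buf = io.BytesIO()

# Write four 16-bit mono samples; the wave module builds the header for us.
with wave.open(buf, "wb") as w:
    w.setnchannels(1)        # mono
    w.setsampwidth(2)        # 2 bytes per sample = 16-bit
    w.setframerate(44100)    # 44.1 kHz
    w.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

# Read the header back: the file carries its own description.
buf.seek(0)
with wave.open(buf, "rb") as r:
    params = r.getparams()
# A raw file with only the 8 sample bytes would carry none of this,
# so sample rate, bit depth, and channels must live in external metadata.
```

Here `params` reports one channel, 2-byte samples, a 44100 Hz frame rate, and four frames, all recovered from the header alone.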
Our intention was to separate type from format in a simple, generalized way, without having to break the distinctions down into attributes such as byte order, header size, etc. This detailed information can usually be inferred from the basic type/format relationship. We thought we would limit the possible type/format pairs with controlled vocabulary. This type/format distinction also accounts for the problem with codecs that you mentioned. The codec is, in fact, a characteristic of the file format. Your QuickTime example is a good one: we could have "QuickTime" as the file type and "Qualcomm PureVoice" as the file format. This is NOT the finest-grain classification, but we believe it is sufficient. Hope this helps!

(MSU group: Bartek Plichta)
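P.S. As a sketch of the controlled type/format pairing, a simple lookup table would suffice; the vocabulary below only restates the examples from this message and is illustrative, not a proposed final list.

```python
# Hypothetical controlled vocabulary: each file type maps to the
# file formats (encodings) it may legitimately contain.
CONTROLLED_PAIRS = {
    "AIFF": {"uncompressed", "AIFF-C/MACE", "AIFF-C/u-law"},
    "WAV": {"PCM", "MP3", "ATRAC"},
    "QuickTime": {"Qualcomm PureVoice"},
}

def valid_pair(filetype: str, fileformat: str) -> bool:
    """True if the type/format pair is in the controlled vocabulary."""
    return fileformat in CONTROLLED_PAIRS.get(filetype, set())
```

A cataloguing tool could reject any record whose pair fails this check: `valid_pair("WAV", "PCM")` holds, while `valid_pair("WAV", "AIFF-C/MACE")` does not.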