ARSCLIST Archives

ARSCLIST  March 2005


Subject: Re: .wav file content information - chunks
From: dave nolan <[log in to unmask]>
Reply-To: Association for Recorded Sound Discussion List <[log in to unmask]>
Date: Thu, 24 Mar 2005 12:09:48 -0500


Hello all - 

I am not enough of a "computer bits" whiz to know the real bit-by-bit blood
and guts of BWF metadata, but...

As best I can understand from the EBU BWF standards, there are numerous
"chunks" of metadata associated with BWF files, chiefly the "bext" chunk,
which contains the internationally agreed-upon minimum metadata (files
carrying MPEG audio also get a "mext" chunk).

Additionally there appears to be a way to define chunks that are unique to
the needs of the originating client/institution:

******************************************

(from 
http://www.ebu.ch/en/technical/publications/userguides/bwf_user_guide.php):

What is a "chunk"?
A "chunk" is a self contained collection of data in a RIFF file. It contains
a header, which gives its type and length, followed by data arranged in
fixed or variable length fields.

What is the extra chunk in the BWF?
The "Broadcast extension" chunk, coded "bext", is contained in all BWF
files. It contains the minimum information expected to be needed by all
applications in broadcast production. It contains information on the title,
origination, date, time, etc. of the audio content. A BWF file containing
MPEG audio data also includes a further extra chunk "mext".

Should it be BEXT or bext, MEXT or mext?
Fairly late in the development, it was decided that the lower case was
correct i.e. <bext> and <mext> not <BEXT> and <MEXT>. It appears that
IBM/Microsoft originally intended to use upper case for registered chunks.
However it seems that there have been no new chunks registered since 1994,
according to the documentation on the Microsoft website.

******************************************
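
To make the chunk layout above concrete, here is a minimal Python sketch
(mine, not from the EBU guide). It assumes the usual RIFF conventions: a
4-byte chunk ID, a 4-byte little-endian length, then the data, padded to an
even number of bytes. The "arsc" chunk ID is invented purely for
illustration and is not a registered chunk.

```python
import struct

def make_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Build one RIFF chunk: 4-byte ID, 4-byte little-endian size, data, pad byte if odd."""
    if len(chunk_id) != 4:
        raise ValueError("chunk ID must be exactly 4 bytes")
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

def iter_chunks(riff: bytes):
    """Yield (chunk_id, payload) for each top-level chunk in a RIFF/WAVE byte string."""
    if riff[:4] != b"RIFF" or riff[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # skip "RIFF", the overall size, and "WAVE"
    while pos + 8 <= len(riff):
        chunk_id = riff[pos:pos + 4]
        (size,) = struct.unpack_from("<I", riff, pos + 4)
        yield chunk_id, riff[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # step over the pad byte after odd-sized chunks

# Usage: a tiny in-memory file with a hypothetical institution-specific chunk.
body = (b"WAVE"
        + make_chunk(b"fmt ", bytes(16))
        + make_chunk(b"arsc", b"tape 42")   # "arsc" is a made-up chunk ID
        + make_chunk(b"data", bytes(4)))
riff = b"RIFF" + struct.pack("<I", len(body)) + body
found = [(cid, len(data)) for cid, data in iter_chunks(riff)]
print(found)  # [(b'fmt ', 16), (b'arsc', 7), (b'data', 4)]
```

A reader that does not know "arsc" can simply step over it using the length
field, which is why unknown chunks can be ignored safely.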

Can I add extra information or metadata about my programme to a BWF file?
You can add any valid chunk to a BWF file. However, the extra chunks will
only be interpreted by applications which are programmed to do so. Other
applications will ignore the contents of these chunks. The EBU intends to
register a limited number of extra chunks for specific application in
broadcasting. The current types are given below.

What information do I put in the "bext" and "mext" chunks, and what form
should it have?
The "bext" and the "mext" chunks contain various fields of data. Many of
these fields are fully defined in the specification. For example, the Date
field in the "bext" chunk and the SoundInformation field in the "mext" chunk
have defined formats. For other fields, such as OriginatorReference, the
specification only covers the type of data and the length. However the EBU
Members have developed recommended formats for the data in some of these
fields. See below.

When a BWF file is imported from another software system, how should the
contents of the "bext" chunk be treated?
This depends on whether the audio software works with a file structure, or
is a system connected to a database. Different designs must be used in each
case.

If the receiving system is based on a file structure, but with no database,
selected fields from the "bext" chunk of a file incoming from another system
should be displayed in a pop-up window.

In an audio software system that works with a database, the audio files are
indexed and the metadata contained in the "bext" chunk is stored and
retrieved from the database. For an incoming file, the content of the
various fields of the "bext" chunk are interpreted and the corresponding
fields of the database in the receiving system, are updated accordingly.

How should the metadata from the "bext" chunk be stored in a user's
database?
Below are some examples of how manufacturers have inserted the information
from the "bext" chunk into the fields of their databases. The examples are
mainly taken from network based radio on-air systems with simple two-track
editing facilities:

Field
Comments

 Description
The 256 characters can be named "Title" or similar in the database. This
field holds the working name of the file. For instance "Summitbriefing".
This field should not be the name of the file.

 Originator
This field can, for instance, be the name of the reporter or the producer of
the file, or the artist or orchestra, if it is a music recording. This field
can have the name "Reporter", "Producer", "Client" or "Artist" or similar in
the database, depending on the most common use for a majority of files in
the production area where the database is used.

 OriginatorReference
A format for this field is described below. This long string could be kept
as a separate Unique ID field in the database; it should not be the name of
the file.

 OriginationDate
This field should be the date of the creation of the audio file. When the
actual original recording is being made, the date is retrieved from the
computer's clock and stored in this field.

 OriginationTime
This field should be treated similarly to the date. The time is that
retrieved from the real-time clock in the computer exactly when the
recording begins. This will make it possible to seek files based on date and
time-of-day. The accuracy depends on the stability of the real-time clock in
the computer used for recording the file. Date and time should be inserted
into the corresponding fields in the database. OriginationTime is not
necessarily the same information as the "time" field in the file directory.

 TimeReference
This field is a count from midnight in samples to the first sample of the
audio sequence. This number can be used for time code generation if the
audio section of the computer being used for recording the file has a very
accurate and time-stable sampling frequency. This feature might be omitted
from BWF files that are not used with accompanying video. If used, this
number can be transferred to and from the database.

 CodingHistory
A format for this field has been agreed but is not compulsory. The strings
for each stage in the coding history can be extracted and kept in field(s)
of the database. When the next copy of the file is being made, these fields
can be retrieved from the database and used to generate the coding history
field of the new "bext" chunk.

**************************************
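
For anyone who wants to see those fields in bytes, here is a hedged Python
sketch of reading a "bext" payload. The field sizes other than the 256
characters for Description come from my reading of EBU Tech 3285, not from
the guide quoted above, so please check them against the spec. The sample
values (originator name, reference string, coding history) are made up.

```python
import struct

# Fixed-size part of the "bext" chunk, as I understand EBU Tech 3285:
# Description[256], Originator[32], OriginatorReference[32],
# OriginationDate[10], OriginationTime[8], TimeReference (low, high uint32),
# Version (uint16), then 254 bytes of UMID/reserved space (not decoded here).
BEXT_FIXED = struct.Struct("<256s32s32s10s8sIIH254s")

def _text(raw: bytes) -> str:
    """Decode a NUL-padded ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")

def parse_bext(payload: bytes) -> dict:
    if len(payload) < BEXT_FIXED.size:
        raise ValueError("bext payload too short")
    desc, orig, ref, odate, otime, lo, hi, version, _ = BEXT_FIXED.unpack_from(payload)
    return {
        "Description": _text(desc),
        "Originator": _text(orig),
        "OriginatorReference": _text(ref),
        "OriginationDate": _text(odate),
        "OriginationTime": _text(otime),
        "TimeReference": (hi << 32) | lo,   # samples since midnight
        "Version": version,
        "CodingHistory": _text(payload[BEXT_FIXED.size:]),
    }

def time_reference_to_clock(samples: int, sample_rate: int) -> str:
    """Convert a samples-since-midnight count to hh:mm:ss (fraction dropped)."""
    seconds = samples // sample_rate
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

# Usage with invented sample values at 48 kHz.
samples = (12 * 3600 + 9 * 60 + 48) * 48000
payload = BEXT_FIXED.pack(
    b"Summitbriefing", b"Example Reporter", b"EXAMPLE-REF-0001",
    b"2005-03-24", b"12:09:48",
    samples & 0xFFFFFFFF, samples >> 32, 1, bytes(254),
) + b"A=PCM,F=48000,W=16,M=stereo\r\n"
fields = parse_bext(payload)
print(fields["Description"], time_reference_to_clock(fields["TimeReference"], 48000))
```

The dictionary keys match the field names in the table above, so each one
could map onto a database column of the same name.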

I think that even small archives should be able to move to a "sensible
data migration plan" as the cost of hard drives continues to plummet,
leaving audio CDs or data CDs/DVDs as access copies only.  Tape-based and
writable CD/DVD copies are turning out to have SO many problems with
longevity past 10 years that it seems that they are NOT good long-term
archival storage media.

I believe the best solution will be a decent XML-based system that allows
the entry of bext/mext metadata, user definition of institution-specific
metadata chunks, a decent GUI to parse the XML for metadata
entry/modification, and automatic export/import of the metadata to and
from common database packages such as Microsoft Office or FileMaker Pro.
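
A rough Python sketch of what I have in mind, using only the standard
library. The element names ("bwfMetadata", "bext") and the CSV hand-off to a
database are my own assumptions for illustration, not any agreed schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

def bext_to_xml(fields: dict) -> str:
    """Serialize bext-style fields to a small XML document (hypothetical schema)."""
    root = ET.Element("bwfMetadata")
    bext = ET.SubElement(root, "bext")
    for name, value in fields.items():
        ET.SubElement(bext, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_row(xml_text: str) -> dict:
    """Read the XML back into a flat dict, one key per database column."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root.find("bext")}

def rows_to_csv(rows: list) -> str:
    """Write rows as CSV, a format most desktop databases can import."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Usage with invented values.
record = {
    "Description": "Summitbriefing",
    "Originator": "Example Reporter",
    "OriginationDate": "2005-03-24",
}
xml_text = bext_to_xml(record)
row = xml_to_row(xml_text)
csv_text = rows_to_csv([row])
print(xml_text)
print(csv_text)
```

Keeping the XML as the master copy and treating the database import as a
derived export would avoid re-keying the same information by hand.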

I have a friend who is a high muckity-muck in the web design/XML/library
finding aids world - it seems there is enough of a need for this software to
be written that his firm might just take it on...  Or, we could go the
SourceForge open-source route...

dave nolan
Nyc

p.s. - everyone have fun in Austin...

 
> Date:    Wed, 23 Mar 2005 18:10:15 -0500
> From:    "Richard L. Hess" <[log in to unmask]>
> Subject: Re: .wav file content information
> 
> Hello, John,
> 
> Yes, I hope we're not boring the majority of the list!
> 
> I'm afraid I'm going to have to cut this dialogue short at this point as
> I'm headed out the door to go to the ARSC conference -- and see some family
> and friends as well as do some errands along the way both going and coming.
> The most important one is Friday, seeing my 89 year old Dad in
> Pennsylvania. On the return, I pick up 24 channels of Dolby A, some logging
> recorders, and a Sony DASH digital player. I'll be at the mercy of dialup
> hotel networks for the next two weeks. A few have wireless which should be
> better.
> 
> At 03:57 PM 3/23/2005, John Spencer wrote:
>> I agree that this is a useful (and hopefully not too boring!) dialogue.
>> 
>> Let me hurl a few softballs back, and please, do understand that I agree
>> fundamentally with what you are saying.  As they say, "the devil is in the
>> details".
> 
> Oh yes! Definitely in the details. I was trying to provide a broad overview
> rather than get into devilish details.
> 
> 
>> I truly believe this is a "crisis" for small archives, as the lack of
>> funding means that structured metadata gets pushed to the back of the bus
>> (or worse, OFF the bus).
> 
> That is definitely the case. The CD-R preservation route is the only thing
> that they can afford. The minute I start talking to archives about managed
> data storage, many (not all) archivists' eyes seem to glaze over. One of
> the things I'm looking for in these cases is an IT department that the
> archive can piggy-back onto. It's imperative that we get the mindset away
> from CDs on the shelf or hard drives on the shelf. Overall, when you
> include administrative (IT services) costs, it is far more cost effective
> to dump 2 TB of data into a 20 TB IT department than try to manage a
> separate 2 TB store (numbers are semi-random, but 2 TB of oral history is a
> fair amount).
> 
>>> At 12:40 PM 3/23/2005, John Spencer wrote:
>>>> Also, we've built these tools for our internal use, it's
>>>> certainly not that hard.
>>> 
>>> Right, but I think Scott addressed that and what we're trying to do here.
>>> Mounting heads and aligning tape machines isn't that hard for me, but lots
>>> of people don't do it themselves. Writing the software would be harder
>>> for me.
>> 
>> Understood and agreed.  We have a number of data projects underway where the
>> archive is doing the "real work" (the actual transfers) and we're helping
>> out with the IT issues.
> 
> This might be useful to learn more about--if you're coming to ARSC we
> should try and sit down and talk about your services in this regard.
> 
>>> I think these tools are intended for smaller archives and people like me.
>>> Larger operations will require you to use the rigorous tools that they
>>> develop internally or purchase with rights management.
>> 
>> Here I must disagree.  If I were to share my collection of files with
>> another institution (small or large), I would have a problem if all present
>> metadata were modifiable.  DRM or not, the core information should not be
>> easily changeable.
> 
> This is all a matter of degree. The essence is modifiable unless we
> completely lock the file. If the essence is modifiable, then the metadata
> will be as well. So, now we talk about degree. I do not see the
> modification as something that is done on a regular basis. I see these
> tools used much more for the creation and reading of metadata than
> modifying. The metadata I see embedded in files is not the type that should
> be modified.
> 
> 
>> This is another area of concern for me.  How can we assume that SANiP has
>> their metadata fields laid out in the same manner as ACHCN (Aboriginal
>> Cultural Heritage Centre of Nowhere in Particular)?  Sounds like there might
>> be some re-keying (or re-mapping, or crosswalks) of data, which is not my
>> favorite scenario.  The more times we re-type the same information, the
>> greater chance for error.  Are we talking about MARC records, DC metadata,
>> etc.?  The use of XML should remove many of these obstacles, but the same
>> cannot be said for those using Excel 95 to collect metadata!
> 
> No, I was always assuming that there was a structure that would be mappable
> either via field names (as used in Excel 95 etc.) or to be more modern, via
> XML.
> 
> I don't know what structured metadata system makes the most sense. I've
> been specifically avoiding that area of study for the moment. Yes in the
> generic sense of MARC that is what I had in mind, but the specific LoC MARC
> fields leave something to be desired for audio -- at least what I've seen.
> 
> 
>> I do agree that some metadata should reside in the header, as you could
>> always open the file up in hexadecimal and read it.  At our office, we call
>> this "catastrophic metadata" (or "CYA" metadata).  However, I'm somewhat
>> unsure of your meaning of "tied together".  Are you referring to
>> 1) a wrapper that can be opened automatically (like MXF), or
>> 2) the metadata and audio files reside on the same physical carrier, or
>> 3) all of the metadata would be in the BWF header?
> 
> "Tied together" means that there is one entity that is passed from A to B
> with essence and metadata.
> 
> (1) Yes MXF is an approach, but so is BWF as I understand it. Other than
> the semantic difference of wrapper vs. file, isn't what we're talking about
> with BWF and MXF very similar? Actually, I've been a fan of AAF for a long
> time--I wish it gained more traction.
> 
> (2) is an invitation to trouble IMHO.
> (3) Yes, that is what I'm talking about -- using BWF in a way similar to MXF.
> 
> 
>> Also, I was under the impression that many smaller archives don't have
>> "digital storage systems", hence the transitional migration to Gold CD-R (as
>> evidenced by various discussions on this list).
> 
> See above - yes, but it has to change.
> 
> Note, some snippage happened. Presumably anyone interested has the earlier
> posts as well.
> 
> Cheers,
> 
> Richard
> 
> Richard L. Hess                 email: [log in to unmask]
> Vignettes Media                 web:   http://www.richardhess.com/tape/
> Aurora, Ontario, Canada         (905) 713 6733     1-877-TAPE-FIX
> 
> ------------------------------
> 
> Date:    Wed, 23 Mar 2005 18:11:55 -0600
> From:    John Spencer <[log in to unmask]>
> Subject: Re: .wav file content information
> 
> Richard,
> 
> I will be at the ARSC Conference, and I am presenting a topic Sat. afternoon
> regarding digital archives.
> 
> I'll try and include some slides about a project currently underway.
> 
> John
> --
> John Spencer
> http://www.bridgemediasolutions.com/
> 
> 
>> From: "Richard L. Hess" <[log in to unmask]>
>> Reply-To: Association for Recorded Sound Discussion List <[log in to unmask]>
>> Date: Wed, 23 Mar 2005 18:10:15 -0500
>> To: [log in to unmask]
>> Subject: Re: [ARSCLIST] .wav file content information
>> 
>> Yes, I hope we're not boring the majority of the list!
>> 
>> I'm afraid I'm going to have to cut this dialogue short at this point as
>> I'm headed out the door to go to the ARSC conference
> 
> ------------------------------
> 
> End of ARSCLIST Digest - 21 Mar 2005 to 23 Mar 2005 (#2005-70)
> **************************************************************
