Paul's point about ISRC codes is spot-on.

Plus, I'm not sure what Wired is advocating. Hopefully not another group cluster-f like Gracenote or 
CDDB. These databases were cheap to create but are full of errors and are generally of little use to 
anyone wanting uniform naming, punctuation, etc. in his music library.

Observations based on my experiences bringing a large CD collection into Catraxx software, which 
uses Gracenote:

1. the whole format of the Gracenote and Catraxx database was not a great fit for classical music. 
So how data was entered by the army of volunteers (who obviously have few or no paid editors) was 
almost at random. Similar works on the same label by the same orchestra and conductor would have 
very different field structures. All of this, of course, needed to be cleaned up by hand (line 
edited), which very much defeats the purpose of an "automated" database.

2. for jazz and rock, the database format is more comfortable, since these are the genres around 
which it was created. But the data entry in Gracenote is so poor and non-uniform that almost every 
CD inserted required some level of line-editing. Only the most mainstream/popular/heavy-selling 
titles had database entries of good quality, I'm guessing because many people copy-edited and 
re-submitted the entry until it was correct.

3. such conventions as capitalization and punctuation are a Wild West of anything goes with 
Gracenote. You insert a CD and you takes your chances. It's telling that the first thing that 
Catraxx does after a CD is inserted is go to the line-editing interface.

Ironically, CD Text was available from the beginning, I think. If not, soon after CDs hit the mass 
market. Yet almost no record companies made use of the technology. These traditionally 
control-obsessed organizations missed an opportunity to have absolute control of their metadata. 
They should have standardized on conventions like capitalization and punctuation and standardized 
how they were going to fit classical works into a data-field system optimized for rock and pop works 
(i.e. disc artist, disc title, song title, song composer, etc.). Instead, almost no major labels or 
artists utilized CD Text, hence the need for online databases that match CD technical parameters 
to volunteer-entered data. The other chance the record companies had to make this right would have 
been funding the database creation so they could control the quality of the data entry, but in those 
days, to them anything online was the enemy and a potential loss of revenue.

As in all things involving data and facts, the quality of Gracenote and CDDB is exactly akin to the 
cost of collecting the data (i.e. low in both cases).

-- Tom Fine

----- Original Message ----- 
From: "Paul Turney, Sirensound Digital UK" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Tuesday, December 15, 2009 3:10 PM
Subject: Re: [ARSCLIST] Wired on the need for a single comprehensive music database



Paul Turney
Sirensound Digital UK
Somerford House
22 Somerford Road
 ++44 (0) 1285 642289

-----Original Message-----
From: Michael Biel [mailto:[log in to unmask]]
Sent: Tuesday, December 15, 2009 07:28 PM
To: [log in to unmask]
Subject: Re: [ARSCLIST] Wired on the need for a single comprehensive music database

By reinventing the wheel I mean that the geeks at Wired seem to think that there have never been any
listings of recordings before on-line music services, and that they probably have never heard the
word discography or have ever seen one. The line about making a list of songs and using a "unique
identifier" is what really set me off. "Music services already apply their own unique identifiers to
songs in their catalogs" and that these geeks think that unifying these numbers is THE way to go, is
frightening. The records already have their own unique identifiers -- they are the matrix numbers and
the company catalog numbers, not numbers made up by the music services. I also doubt that they
realize that there were recordings before CDs. I also wonder if they know that before computers
there were things known as books. And while most of the great discographies of the world are not on
databases that can be seen on computer screens, I sometimes find it much easier to do research with
eight or ten discographies spread out open on a table than having to click from window to window to
window on a computer screen.

I agree that Rigler-Deutsch is a rats nest, because although my signature is on the title page of
the final report (I just happened to be ARSC president when it was completed -- I didn't have
anything substantial to do with compiling it) I couldn't convince the librarians of the AAA who
controlled the RDRI, that record collectors without library science degrees might be capable of
cleaning up the files created by the minimally trained data entry personnel. I had proposed
distributing the label films to expert collectors who could clean up the computer entries, but they
didn't think that people who have spent their entire lives with records could do things like
"attempt a controlled vocabulary of songs, artists" etc. Yet that is what collectors have done for
more years than there have been library catalogs of records. I had a group of experienced collectors
just salivating at the thought of getting a couple of films for a year or so and correcting the
database -- for free, I might add -- but there was an aversion among the professionals to let the
amateurs correct their work.

The wheel was already invented by Brian Rust, Tom Lord, John Bruninx, Charles Delauney, Michel
Ruppli, Frank Andrews, Pekka Gronow, Tim Brooks, Ross Laird, Julian Morton Moses, Carl Kendzioria,
Walt Allen, George Blacker, Malcolm Rockwell, Allen Koenigsburg, Steve Barr, Robert Dixon, John
Godrich, Tony Russel, Bill Moran, Ted Fagen, John Bolig, Reiner Lotz, and yes, Dick Spotswood among
many, many others. And I might add that most of these names are people who were from the collectors
community. Tom Lord's Jazz Discography is available on line or CD-Rom, and Rockwell's and some of
Ruppli's are on CD-Rom. There are all sorts of on-line discography projects including the Victor
Project, Brian,, national discography projects in Sweden and several other countries, and many
others that slip my mind right now. (Steve Barr can probably name them all.)

There had been an ARSC computerized project about 10 years ago that fell through, utilizing a
program which made data entry unified and was going to be based on existing discographies like those
mentioned above. It is when "outsiders" like the Wired geeks recommend starting over with the
sources being the on-line music service lists without knowing what has already been compiled by
experts, that is most problematic.

Mike Biel [log in to unmask]

-------- Original Message --------
From: Joel Bresler

Hi, Mike, could you please amplify a bit on your answer? I thought the Wired article was thought
provoking.

WorldCat, perhaps the largest repository of discographic information, is not a database. It grows
willy nilly, and there is no attempt at a controlled vocabulary of songs, artists, and so forth. No
tying together of 78s with their re-release on LP, CD, etc. The Rigler-Deutsch database is a worthy
try, but the contents are a rats nest. Dick Spottswood's wonderful EMOR is in print form only, and
is not a database per se.

If they are talking about reinventing the wheel, please point us to the wheel!!

Thanks,
Joel

Joel Bresler,

-----Original Message-----
From: Michael Biel

Reinventing the wheel. What is needed is a knowledge of DISCOGRAPHY among these computer geeks who
think that nothing has happened outside their little world.

Mike Biel [log in to unmask]

-------- Original Message --------
From: "Schooley, John"

Ways One Big Database Would Help Music Fans, Industry

"The solution to this and other problems dogging the music industry could be forehead-slappingly
simple: one big, free, public database with, at the very least, song titles in one column and unique
identifiers in another. When online and mobile music services build their own content databases out
of the labels' catalogs, they would have incentives to use the same numbers to identify each song,
for the reasons laid out below. Music services already apply their own unique identifiers to songs
in their catalogs, so the use of numbers is not the issue - they just need to be the same numbers.

This database would have to be free, readily available and totally transparent, visible to music
fans and industry people alike, because the barrier to entry for startups to use the system would
have to be zero. Open source software making use of the data set, available on the same website,
might encourage services to use the numbers."