----- Original Message -----
From: "Jon Noring"
> When I spent some time back in 2003 or so looking at discographical
> data (most of the controversy Francesco refers to is session data,
> not "artifact" data), it was clear that any type of discographical
> database has to allow alternative interpretations. That is, the
> ontology should allow different interpretations to sit side-by-side,
> so the end-user may decide for themselves which to use. Certainly, an
> authority organization, such as ARSC, can assign an estimate of
> reliability of the information, and of course the source of the
> discographical information would be recorded as well (e.g., "Brian Rust
> Jazz Records.")
>
> Now, to quickly answer another private reply, let me offer a heretical
> proposal: the "artifact" side of the database should record exactly,
> and only, the data the artifact itself provides. This includes
> misspellings, etc. No need to extrapolate or normalize. Normalization
> is done elsewhere and at a later time -- in some cases possibly in
> the session data, and in other cases in other specialized outside
> databases (e.g., song and composition compendiums.) Doing this makes
> life *so much easier* when transcribing and organizing the audio
> artifact data. Where things get complicated is when we try to hook an
> artifact to a session recording, and auxiliary information such as
> musician bios and song/composition information.
>
> (Still trying to figure out the best way to hook a 78 side which does
> not provide matrix information, such as some pre-ARC Brunswicks, to a
> given session. It can be done using unique identifiers, but still haven't
> figured out the specifics -- I really don't think we should transfer any
> data from the session data to fill out fields on the artifact
> side-of-the-database.)
>

My thoughts here...

1) In some cases, data can be obtained from still-extant recording ledgers.
Note that these ledgers (except for many Victor items) generally do NOT provide session personnel data...often omitting even vocalists. However, they DO provide date and location of the recordings they document. In part, this answers your question immediately above; the relevant Brunswick ledgers DO exist, so matrix numbers can be entered by looking for the sheet listing the title in question (except that if more than one take was recorded, there is no reliable way of knowing which take was issued...!). Where ledgers no longer exist (virtually all minor/"indie" labels of the twenties) there is no way of knowing (accurately, anyway) matrix numbers, dates or other session-related data...and "educated best guesses," presumably prefixed with "c." or "est." or equivalents...and/or data from common sources (ADBD/CED/etc.) will have to suffice by default...!

2) Any discographic entity of whatever sort MUST list both the extant and actual information in cases where both exist (and are known to the compiler[s]). In some cases, what would appear to be an error actually is not; for example, the initial Brunswick recordings of "My Blue Heaven" are labelled as "Blue Heaven"...and play very slightly different lyrics ("...When the whippoorwills ARE calling...") which suggests they are actually the original versions of the tune...! I have always used two separate fields ("ARTCRED" and "ACTART") to track recordings issued under pseudonyms or those with credit errors. This, in turn, means I can query the database both for "Recordings on which Arthur Fields sings" and "Recordings on which the vocalist is credited as 'Mr. X'"...two entirely different questions! In fact, I can even query for "All recordings on which 'Arthur Fields' is credited as 'Mr. X'" should I need that specific data...!
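
As a rough illustration of the two-field idea in point 2, here is a minimal sketch in Python using the standard sqlite3 module. Only the field names ARTCRED and ACTART come from the text above; the table name "sides", the "title" column, the helper function and the three sample rows are assumptions invented purely for the example, not real discographic data and not the schema of any actual catalog.

```python
import sqlite3

# Hypothetical schema: only the ARTCRED/ACTART field names come from the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sides (title TEXT, artcred TEXT, actart TEXT)")

# Placeholder rows, invented for illustration only (not real discographic data).
conn.executemany(
    "INSERT INTO sides VALUES (?, ?, ?)",
    [
        ("Title A", "Arthur Fields", "Arthur Fields"),  # credited under own name
        ("Title B", "Mr. X", "Arthur Fields"),          # issued under a pseudonym
        ("Title C", "Mr. X", "Some Other Singer"),      # same pseudonym, other artist
    ],
)


def titles(where, params):
    """Return the titles matching a WHERE clause, sorted for stable output."""
    sql = f"SELECT title FROM sides WHERE {where} ORDER BY title"
    return [row[0] for row in conn.execute(sql, params)]


# "Recordings on which Arthur Fields sings" (actual artist).
sung_by_fields = titles("actart = ?", ("Arthur Fields",))

# "Recordings on which the vocalist is credited as 'Mr. X'" (label credit).
credited_mr_x = titles("artcred = ?", ("Mr. X",))

# "All recordings on which 'Arthur Fields' is credited as 'Mr. X'".
fields_as_mr_x = titles(
    "actart = ? AND artcred = ?", ("Arthur Fields", "Mr. X")
)

print(sung_by_fields)
print(credited_mr_x)
print(fields_as_mr_x)
```

Because the credit as printed and the actual artist live in separate columns, the three questions become three different WHERE clauses against the same rows, and neither piece of information has to be overwritten by the other.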

3) IMO, the "wiki-db" should provide either (A) ALL available discographic data relevant to a phonorecord (or side thereof) with actual verified data items noted as such and "best guess" entries likewise identified...OR (B) enough information to identify a given phonorecord, along with (hyper?)links to other relevant data thereon. It should also be possible to query the database on any of its fields (including related data tables in the database) and receive a list of all phonorecords (including "None" if that is the case) which fit the query's declared criteria. Regardless of how the tables are set up, the results will be the same...the only difference being in how many different tables the data is stored! Note that my first discographic catalog database was NOT relational, which often resulted in a large number of empty data fields (which, in xBase, use as much space as completed fields...!); however, in these days of 1TB (and larger?) consumer hard drives, this is no longer a consideration...or so I am told...?!

4) Are you suggesting that "songs" and "compositions" be kept in separate (but relationally connected) tables? Likewise, what are you referring to as "normalization"? (The word has a specific meaning in the database "industry.")

Steven C. Barr