----- Original Message ----- 
From: "Jon Noring"
> When I spent some time back in 2003 or so looking at discographical
> data (most of the controversy Francesco refers to is session data,
> not "artifact" data), it was clear that any type of discographical
> database has to allow alternative interpretations. That is, the
> ontology should allow different interpretations to sit side-by-side,
> so the end-user may decide for themselves which to use. Certainly, an
> authority organization, such as ARSC, can assign an estimate of
> reliability of the information, and of course the source of the
> discographical information would be recorded as well (e.g., "Brian Rust
> Jazz Records.")
> Now, to quickly answer another private reply, let me offer a heretical
> proposal: the "artifact" side of the database should record exactly,
> and only, the data the artifact itself provides. This includes
> misspellings, etc. No need to extrapolate or normalize. Normalization
> is done elsewhere and at a later time -- in some cases possibly in
> the session data, and in other cases in other specialized outside
> databases (e.g., song and composition compendiums.) Doing this makes
> life *so much easier* when transcribing and organizing the audio
> artifact data. Where things get complicated is when we try to hook an
> artifact to a session recording, and auxiliary information such as
> musician bios and song/composition information.
> (Still trying to figure out the best way to hook a 78 side which does
> not provide matrix information, such as some pre-ARC Brunswicks, to a
> given session. It can be done using unique identifiers, but still haven't
> figured out the specifics -- I really don't think we should transfer any
> data from the session data to fill out fields on the artifact
> side-of-the-database.)
My thoughts here...

1) In some cases, data can be obtained from still-extant recording ledgers.
Note that these ledgers (except for many Victor items) generally do NOT
provide session personnel data...often including vocalists. However,
they DO provide date and location of the recordings they document.
In part, this answers your question immediately above; the relevant
Brunswick ledgers DO exist, so matrix numbers can be entered by looking
for the sheet listing the title in question (except that if more than
one take was recorded, there is no reliable way of knowing which
take was issued...!). Where ledgers no longer exist (virtually all
minor/"indie" labels of the twenties) there is no way of knowing
(accurately, anyway) matrix numbers, dates or other session-
related data...and "educated best guesses," presumably prefixed
with "c." or "est." or equivalents...and/or data from common
sources (ADBD/CED/etc.) will have to suffice by default...!

2) Any discographic entity of whatever sort MUST list both the
extant and actual information in cases where both exist (and are
known to the compiler[s]). In some cases, what would appear to be
an error actually is not; for example, the initial Brunswick
recordings of "My Blue Heaven" are labelled as "Blue Heaven"...
and play very slightly different lyrics (..."When the whippoorwills
ARE calling...") which suggests they are actually the original
versions of the tune...! I have always used two separate fields
("ARTCRED" and "ACTART") to track recordings issued under 
pseudonyms or those with credit errors. This, in turn, means
I can query the database both for "Recordings on which Arthur
Fields sings" and "Recordings on which the vocalist is credited
as 'Mr. X'"...two entirely different questions! In fact, I can
even query for "All recordings on which 'Arthur Fields' is
credited as 'Mr. X'" should I need that specific data...!
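
To make the two-field idea concrete, here is a minimal sketch in Python
using the standard sqlite3 module. Only the field names ARTCRED and
ACTART come from my own setup; the table name "sides", the helper
"titles", the placeholder titles, and the performer "Somebody Else" are
all invented purely for illustration and are NOT real discographic data.

```python
import sqlite3

# In-memory database; the table name and all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sides (title TEXT, artcred TEXT, actart TEXT)")
conn.executemany(
    "INSERT INTO sides VALUES (?, ?, ?)",
    [
        # (title, credit as printed on the label, actual performer)
        ("Title A", "Mr. X", "Arthur Fields"),
        ("Title B", "Arthur Fields", "Arthur Fields"),
        ("Title C", "Mr. X", "Somebody Else"),
    ],
)


def titles(where, params):
    """Return the titles matching a WHERE clause, sorted by title."""
    sql = "SELECT title FROM sides WHERE " + where + " ORDER BY title"
    return [row[0] for row in conn.execute(sql, params)]


# "Recordings on which Arthur Fields sings"
sung_by_fields = titles("actart = ?", ("Arthur Fields",))

# "Recordings on which the vocalist is credited as 'Mr. X'"
credited_mr_x = titles("artcred = ?", ("Mr. X",))

# "All recordings on which Arthur Fields is credited as 'Mr. X'"
fields_as_mr_x = titles(
    "actart = ? AND artcred = ?", ("Arthur Fields", "Mr. X")
)
```

Because the label credit and the actual performer live in separate
fields, each of the three questions is a simple filter, and neither
piece of information overwrites the other.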

3) IMO, the "wiki-db" should provide either (A) ALL available
discographic data relevant to a phonorecord (or side thereof)
with actual verified data items noted as such and "best guess"
entries likewise identified...OR (B) enough information to
identify a given phonorecord, along with (hyper?)links to
other relevant data thereon. It should also be possible to
query the database on any of its fields (including related
data tables in the database) and receive a list of all
phonorecords (including "None" if that is the case) which
fit the query's declared criteria. Regardless of how the
tables are set up, the results will be the same...the only
difference being how many different tables the data is
stored in! Note that my first discographic catalog database
was NOT relational, which often resulted in a large number
of empty data fields (which, in xBase, use as much space
as completed fields...!); however, in these days of 1TB
(and larger?) consumer hard drives, this is no longer a
consideration...or so I am told...?!
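
As a rough illustration of the "same results regardless of table
layout" point, the sketch below (Python, standard sqlite3 module) stores
the same made-up records once in a single flat table and once split
across two related tables, then asks both the same question. Every
table name, column name, and row here is an assumption invented for the
example, not a proposed schema for the wiki-db.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Layout 1: one flat table, artist name repeated on every row.
    CREATE TABLE flat (catalog_no TEXT, title TEXT, artist TEXT);
    INSERT INTO flat VALUES
        ('100', 'Title A', 'Artist P'),
        ('101', 'Title B', 'Artist Q'),
        ('102', 'Title C', 'Artist P');

    -- Layout 2: artists kept in their own table and linked by id.
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sides (
        catalog_no TEXT,
        title TEXT,
        artist_id INTEGER REFERENCES artists(artist_id)
    );
    INSERT INTO artists VALUES (1, 'Artist P'), (2, 'Artist Q');
    INSERT INTO sides VALUES
        ('100', 'Title A', 1),
        ('101', 'Title B', 2),
        ('102', 'Title C', 1);
    """
)


def flat_query(artist):
    """Look up an artist's records in the single flat table."""
    return conn.execute(
        "SELECT catalog_no, title FROM flat "
        "WHERE artist = ? ORDER BY catalog_no",
        (artist,),
    ).fetchall()


def relational_query(artist):
    """Look up the same records by joining the two related tables."""
    return conn.execute(
        "SELECT s.catalog_no, s.title FROM sides AS s "
        "JOIN artists AS a ON a.artist_id = s.artist_id "
        "WHERE a.name = ? ORDER BY s.catalog_no",
        (artist,),
    ).fetchall()


flat_hits = flat_query("Artist P")
relational_hits = relational_query("Artist P")

# A query with no matches yields an empty list, which a front end
# could report to the user as "None".
no_hits = relational_query("Artist Z")
```

Both layouts return the same list of records for the same criteria;
the join only changes where the artist's name is stored, not what the
user gets back.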

4) Are you suggesting that "songs" and "compositions" be
kept in separate (but relationally connected) tables?
Likewise, what are you referring to as "normalization"?
(The word has a specific meaning in the database "industry.")

Steven C. Barr