On 1/9/13 12:26 PM, Eric Miller wrote:
> On Jan 6, 2013, at 8:06 PM, Kelley McGrath <[log in to unmask]> wrote:
>> However, it turned out that there were a couple situations in which this model did not work so well.
>> One is when there are multiple works on a manifestation and the expression values (such as language) related to each work vary. There was no easy way in our model to represent this.
>> For example, the English and Spanish language versions of Dracula from 1931 are often packaged together.
>> Work 1           Expression 1                   Manifestation
>> Dracula (1931)   English soundtrack             DVD (1999)
>> English          French subtitles               1 disc
>>                                                 ISBN 0783227450
>> Work 2           Expression 2                   OCLC# 46829789
>> Dracula (1931)   Spanish soundtrack
>> Spanish          English and French subtitles
>> Without a separate expression level, it is unclear how to prevent the wrong connections from being made (work 1 has English subtitles or work 2 has an English soundtrack)
>> Work 1           Version
>> Dracula (1931)   DVD (1999)
>> English          1 disc
>>                  ISBN 0783227450
>> Work 2           OCLC# 46829789
>> Dracula (1931)   English soundtrack
>> Spanish          French subtitles
>>                  Spanish soundtrack
>>                  English and French subtitles
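
A minimal Python sketch of the ambiguity described above. The class and field names (`Expression`, `soundtrack`, `subtitles`) are illustrative assumptions, not part of FRBR, RDA, or BIBFRAME. With one expression per work, each language attribute has a single owner; once the values are pooled at the version level, nothing rules out the wrong pairings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical expression record: language attributes tied to one work."""
    work: str
    soundtrack: str
    subtitles: tuple


# Modeled with a separate expression level: each attribute has one owner.
expressions = [
    Expression("Dracula (1931) English", "English", ("French",)),
    Expression("Dracula (1931) Spanish", "Spanish", ("English", "French")),
]

# Modeled without it: the version pools every value from both works.
version = {
    "works": [e.work for e in expressions],
    "soundtracks": [e.soundtrack for e in expressions],
    "subtitles": sorted({s for e in expressions for s in e.subtitles}),
}


def subtitles_for(work):
    """Answerable only because the expression level is kept."""
    return next(e.subtitles for e in expressions if e.work == work)


def possible_pairings(pooled):
    """All work/soundtrack pairs that the pooled version cannot rule out."""
    return [(w, s) for w in pooled["works"] for s in pooled["soundtracks"]]
```

Here `possible_pairings(version)` still contains the pair of the Spanish work with the English soundtrack, which is exactly the wrong connection the message warns about.
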
> The fact you're separating these out as 2 separate "things" (whether you call it Work or Expression) is a critical step in supporting such disambiguation. MARC / AACR* conflates this and over time, various conventions have been introduced to try and minimize this ambiguity but, as you've pointed out in the case of moving pictures, audio, etc., this is still a huge issue.
> Separating these Works out as first class resources is a first step. While the granularity of descriptive practices will be an issue, it should be noted that not everything need be described at once. If these Works are packaged together (and one wants to describe the package), we might think about this package as its own Work with its specific characteristics. The key here is to allow a model to evolve and to allow contextual relationships that relate these Works together to be introduced as needed.
I like the idea that the full granularity doesn't have to happen at the
outset -- either of the model or of the data. What I've been trying to
articulate for a while is a way that the "things" at different levels
could be created "on demand." That is, for a simple bibliographic item
it could all be in one graph; for Kelley's complex case, additional
graphs could be created that represent the Expression level that she
needs. This would also allow different communities to adopt practices
most appropriate to their resources and their users rather than forcing
everyone into the same mold. This means that "my" ideal model would
allow Work and Instance (or the whole of FRBR) to collapse into a single
simple graph, or to expand to as many "levels" as needed. The data,
rather than the data structure, would allow "Workness" or
"Expression-ness" to be inferred from the data based on the community's
What I obviously haven't figured out yet is how these different graphs
could interact usefully with each other, but I am hoping that having the
data as a graph will make that possible. Where it gets tricky, or at
least where I get stuck, is in managing relationships like "translation
of" -- FRBR only allows "translation of" between two Expressions; would
it be possible to have relationships that make sense when the
bibliographic data is expressed with a variety of entities?
I keep drawing pictures of this, and if I finally get one that seems to
work I will definitely share it. :-)
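
One way to picture the "on demand" idea, as a rough Python sketch only: keep a flat list of property/value statements and let a community-supplied profile decide which properties count as Work-level. The property names, the sample values, and the `WORK_PROPERTIES` profile are all hypothetical and are not drawn from FRBR or BIBFRAME.

```python
# A flat description: one simple graph, written as (property, value) pairs.
flat_record = [
    ("title", "Dracula"),
    ("creator", "Bram Stoker"),
    ("language", "Spanish"),
    ("publisher", "Example Press"),  # made-up value
    ("isbn", "0000000000"),          # placeholder, not a real ISBN
]

# Hypothetical community profile: properties treated as Work-level.
WORK_PROPERTIES = {"title", "creator"}


def expand(record, work_properties):
    """Split a flat record into two levels using the given profile."""
    work = [(p, v) for p, v in record if p in work_properties]
    instance = [(p, v) for p, v in record if p not in work_properties]
    return {"work": work, "instance": instance}


def collapse(levels):
    """Merge the levels back into a single flat record."""
    return levels["work"] + levels["instance"]


expanded = expand(flat_record, WORK_PROPERTIES)
```

A different community could pass a different profile without touching the underlying data. The sketch does not address the harder question of how a relationship like "translation of" should behave across differently expanded graphs.
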
> In this case, I'd assert there are 3 separate Works (the original in
> Japanese, the one dubbed into en-uk and the one dubbed into en-us
> which includes the voices of various famous actors, etc.).
I wish we could get rid of the term "Work." I think it is the source of
confusion because we all seem to have strong ideas attached to that
word. If we called it "thing7" and allowed thing7's to either stand
alone or be joined into sets based on various criteria we might be
better able to reach agreement. Different people could use different
criteria for gathering their thing7's, although it would be best if they
made their criteria clear so that others could understand it if they want.
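
As a small illustration of gathering "thing7"s into sets by stated criteria, here is a Python sketch. The records and their keys (`id`, `title`, `language`, `year`) are invented for the example; the point is only that the grouping criterion is an explicit, swappable function rather than something fixed in the data structure.

```python
from collections import defaultdict

# Hypothetical "thing7" descriptions; the keys are illustrative only.
things = [
    {"id": "t1", "title": "Dracula", "language": "English", "year": 1931},
    {"id": "t2", "title": "Dracula", "language": "Spanish", "year": 1931},
    {"id": "t3", "title": "Dracula", "language": "English", "year": 1958},
]


def gather(items, criterion):
    """Group item ids into sets keyed by a caller-supplied criterion."""
    sets = defaultdict(set)
    for item in items:
        sets[criterion(item)].add(item["id"])
    return dict(sets)


# Two communities, two explicitly stated criteria over the same data.
by_title = gather(things, lambda t: t["title"])
by_title_and_year = gather(things, lambda t: (t["title"], t["year"]))
```

Because each criterion is a named function, it can be published alongside the resulting sets so that others can see how the grouping was made.
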
>> Expressions that consist of a cluster of related attributes are particularly important for musical expressions (performers, conductor, location, date, arrangement) and also some literary works.
>> It is also unclear to me whether it is possible to realize the full potential of RDA without the ability to encode all the FRBR group 1 entities separately.
>> I can see why the focus on translation from MARC led to the existing model. It is clearly the most practical approach for legacy data. Although many researchers have tried, no one has found an effective way to automate the identification of expressions in legacy data. It is not always possible even with manual review.
> Agreed. And that is why the translation from MARC is only one of several factors that went into the BIBFRAME design. For BIBFRAME we tried to balance the following:
> * Flexibility to accommodate future cataloguing domains, and entirely new use scenarios and sources of information
> * The Web as an architectural model for expressing and connecting decentralized information
> * Social and technical adoption outside the Library community
> * Social and technical deployment within the Library community
> * Previous efforts in expressing bibliographic material as Linked Data
> * Application of machine technology for mechanical tasks while amply accommodating the subject matter expert (the librarian) as the explicit brain behind the mechanics.
> * Previous efforts for modeling bibliographic information in the library, publishing, archival and museum communities
> * The robust and beneficial history and aspects of a common method of bibliographic information transfer
> - http://www.loc.gov/marc/transition/pdf/marcld-report-11-21-2012.pdf
> The current BIBFRAME list discussion has focused on the translation to MARC (I believe) simply because sample translation code has been made available. As cataloging use-cases, end-user scenarios (very important), vocabulary browsers, more tools, more examples, etc. are made available, I anticipate a shift in the dialog.
>> However, it seems to me that Bibframe does need to support the separation of all the WEMI entities, as well as the best possible environment for entering new data going forward. Perhaps there could be some parallel way to allow the creation of a Bibframe work record for an expression with an instance record that only describes the manifestation and that is linked as follows:
>> Bibframe Work (FRBR work) --> Bibframe Work (FRBR expression) --> Bibframe Instance (FRBR manifestation)
> The above model is certainly accomplishable from a BIBFRAME perspective. The named relationships (e.g. "-->"), however, are critical. What we call these Classes is important, but more so are the relationships that contextualize them.
> (Thing -- hasExpression --> Thing) conveys some meaning. But if hasExpression is a high level, general relationship that is a surrogate for more useful detail, I'd encourage the use of richer relationships.
> (Thing -- hasTranslation | hasVariant | hasPart | isBasisFor, etc. --> Thing) conveys more useful and actionable context. In a Linked Data / Web environment, these contextual relationships are key.
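
The difference between a general and a richer relationship can be sketched in Python with plain triples. The identifiers (`thing:A` and so on) are made up, and the relationship names are simply the ones mentioned in the message above; this is not an excerpt from any BIBFRAME vocabulary.

```python
# Triples of (subject, relationship, object). Identifiers are invented;
# relationship names are taken from the message being discussed.
triples = [
    ("thing:A", "hasExpression", "thing:B"),
    ("thing:A", "hasTranslation", "thing:C"),
    ("thing:A", "hasPart", "thing:D"),
]


def related(statements, subject, relationship):
    """Return the objects linked to `subject` by `relationship`."""
    return [o for s, p, o in statements if s == subject and p == relationship]


translations = related(triples, "thing:A", "hasTranslation")
generic = related(triples, "thing:A", "hasExpression")
```

A consumer can act directly on `hasTranslation`, whereas `hasExpression` on its own says only that some link exists.
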
>> I also wonder how hardcoded the mapping of attributes to Bibframe classes is going to be.
> The initial code bases build their mappings from declarative mapping tables. Quick changes to these tables change the results. I would like to see this abstracted away in favor of a more configurable end-user interface that allows more customized, collection-specific mappings to be performed. Unfortunately, we're just not there yet.
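
To make "declarative mapping tables" concrete, here is a minimal Python sketch, assuming a table keyed by MARC tag. The rows and the target property names are hypothetical and do not reproduce the actual BIBFRAME transformation tables; the sketch shows only that editing the table, not the code, changes where a field lands.

```python
# Hypothetical mapping table: MARC tag -> (target class, property name).
# These rows are illustrative, not the real BIBFRAME mappings.
MAPPING = {
    "245": ("Work", "title"),
    "100": ("Work", "creator"),
    "020": ("Instance", "isbn"),
    "260": ("Instance", "publication"),
}


def apply_mapping(fields, mapping):
    """Sort (tag, value) pairs into per-class property lists."""
    result = {}
    for tag, value in fields:
        if tag not in mapping:
            continue  # unmapped tags are skipped in this sketch
        target, prop = mapping[tag]
        result.setdefault(target, []).append((prop, value))
    return result


record = [("245", "Dracula"), ("020", "0783227450"), ("999", "local note")]
mapped = apply_mapping(record, MAPPING)

# A collection-specific table: one row changed, same code.
custom = {**MAPPING, "020": ("Work", "isbn")}
remapped = apply_mapping(record, custom)
```
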
>> For example, there was a post that suggested that actors would probably be mapped to instances.
> While different groups are exploring different ways of modeling this, in the current BIBFRAME model (and from my perspective) that would be incorrect. Actors (1xx, 7xx) would be defined as relationships contextualizing Works and People.
>> For film actors, this is counter to the approach that makes sense to the moving image cataloging community. The majority of film actors should be associated with the work. This also makes sense from the point of view of efficient data modeling since we want to reuse the list of actors from the work record in all instances rather than recording them redundantly at the instance level. Will there be any mechanism in Bibframe to accommodate differing viewpoints such as these?
> Yes (but in this particular case I think there is a shared viewpoint).
> Thanks for your insightful email. I hope this response helps.
> Eric Miller
> President, Zepheira "The Art of Data"
> http://zepheira.com/ tel:+1.617.395.0229
[log in to unmask] http://kcoyle.net