Hi Kelley,
I hope this note is not too late to share some of my experience.
I was trying to respond to your very well thought-out questions
earlier. Then I got sick. I so appreciated the opportunity to get
an inside look at the WPE at OLAC with your presentation.
Firstly, GW, as one of the Early Experimenters (EE), has very
limited time, experience, and expertise to experiment with
cutting-edge technology. In my case, on top of testing the model
and providing feedback, my daily routine did not stop. The EE were
asked to experiment with the Bibframe model as proposed (the model
was released in November). Interestingly, though no specific
assignments were given for the test, each took on different aspects
of and approaches to the project. As you see, members of the group
looked at the big elephant in the room from various angles. In some
ways, the outcomes and discussions enriched the process (I hope).
This process has helped me personally to understand the intent and
potential of BibFrame (BF) a lot better. It was also gratifying to
get to know GW's data a lot better: for example, the 21% of invalid
tags that still exist in our ILS!! Yikes!
Secondly, your point about the tools and crosswalks. Various levels
of mappings and crosswalks were done by the EE. Each set of mappings
was based on a very selective set of records or tasks that each
institution decided to focus on. In the December meeting, there were
also discussions about tools, plug-in ones in particular. The tools
would let interested libraries experiment with their own data. Both
LC and OCLC alluded to the possibility of providing such tools for
libraries to transform and examine their data. The tools may be
mounted on their respective infrastructures (OCLC or LC)?! Perhaps
we will hear more updates at Midwinter or a little before then?!
Thirdly, the ability to dig deeper into the data and parse out the
relationships, as you pointed out. There was a great deal of
discussion on the concept of relationship, its definition, and how
best to parse the data in the Bibframe environment. Take collection
relationships (naturally within or outside a resource, such as
serials, series, "In" analytics, bound-withs, even TOC?, etc.),
which libraries have taken different approaches to expressing in
MARC. In GW's data, I found much older data coded in a general note
without subfields (such as "Bound with"). But the consistent use of
phraseology and punctuation for certain types of material helped
when transforming MARC data to BF data. Consistent punctuation most
certainly helps the transformation of data from the current encoding
environment to the Work and Instance environment. The transforming
exercise during the experiment helped to strategize the prepping of
data too. Subgroups were tasked to work on various aspects. This
point paper on relationships will hopefully help us think through
the relationships on various levels as related to the BF environment
of Work and Instance.
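
To illustrate the point about consistent phraseology in general notes,
here is a minimal Python sketch of how a "Bound with" note might be
picked out and turned into a relationship. It is only an illustration,
not what any of the experimenters actually ran: the function name, the
dictionary keys, and the "bound_with" label are all made up for the
example and are not BIBFRAME vocabulary.

```python
import re

# Matches notes that begin with the conventional phrase "Bound with",
# optionally followed by a colon, and captures the rest of the note.
BOUND_WITH = re.compile(
    r"^\s*bound with[:\s]+(?P<target>.+?)\.?\s*$", re.IGNORECASE
)


def note_to_relationship(note):
    """Return a relationship dict for a "Bound with" note, else None.

    The returned keys and the "bound_with" label are hypothetical
    placeholders chosen for this sketch.
    """
    match = BOUND_WITH.match(note)
    if match is None:
        return None
    return {"relationship": "bound_with", "target": match.group("target")}


if __name__ == "__main__":
    for note in ["Bound with: Every book its reader.", "Includes index."]:
        print(note, "->", note_to_relationship(note))
```

The sketch only works because the phrase is used consistently. Notes
that word the same relationship differently would fall through and
need manual review.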
Lastly, the discussion of new data to be created in BF did not take
up much real estate, as the group has been trying to nail down the
definitions of Work and Instance, and how Authority relates to the
two elements. Your point about born-BF data is an important piece.
Surely, as the community moves forward, a group of professionals
such as you will see to this, as it definitely has implications for
and influence on establishing future workflows.
Thank you.
Subject: [BIBFRAME] Thoughts on the direction of Bibframe
Date: Mon, 7 Jan 2013 00:50:43 +0000
From: Kelley McGrath <[log in to unmask]>
To: [log in to unmask]
I thought I would share some thoughts I've had about the
recent exchanges (or lack thereof) on the Bibframe list
in case they're helpful to someone else. In another
thread someone asked, "What is the point of the current
exercise?" That question wasn't really satisfactorily
answered that I can see.
In the month since the "early experimentation code" to translate MARC to Bibframe was made available, there has been little substantive discussion. It seems to me that the biggest reason is that this process has effectively disenfranchised a huge percentage of the potential audience who lack the time or skills or inclination to set up their own tool. I would like to give a shout out to Karen Coyle for putting up a few examples for the rest of us. It seems to me that putting up even 10 or 12 well-chosen examples could have stimulated a lot of discussion. There's also the question of making the display accessible. I can more-or-less follow what Karen put up, but I suspect it will be harder for many catalogers. Just as the end user needs a pretty display so too will there need to be a human-friendly display of Bibframe for the cataloger. Sample records in such a display would probably be reassuring to many.
Alternatively, catalogers would get a lot from a plain table showing what's mapped from MARC and to where, as well as what's not being mapped. Something like what RDA did: http://access.rdatoolkit.org/document.php?id=jscmap2.
On a more fundamental level, I wonder why we are not only starting by testing transformation tools (which would seem to me to come near the end), but why we are starting with transformations at all. Of course, it's essential that MARC translate into Bibframe in some useful fashion, but it makes more sense to me to start the discussion with questions like "What should Bibframe do?" I find that with what I have seen so far, I am somehow missing the big picture, the shape of Bibframe. To me, the most important question is not "How do we translate MARC to linked data/RDF?" but "What should Bibframe look like to do as much as possible of what we want to do?" We would benefit from a more thorough analysis of what MARC is really doing now, such as the work that Karen Coyle describes in her article MARC21 as Data: A Start (http://journal.code4lib.org/articles/5468), as well as a list of current complaints and desires. What is MARC doing and what is it not doing that we want it to do—from the answers to these questions we should decide what Bibframe will do.
I am very interested in the potential of Bibframe to deal with things that MARC doesn't handle well (like anthologies and other items that include multiple works). How would you put data into it if you were doing it from scratch? How can we make Bibframe an improvement on MARC and what it can do? To take an easy example, once we are freed from the constraints of letters and numbers for subfields, we should do better than the woefully inadequately tagged:
245 00 $a Library = $b Bibliothek ; Every book its reader / $c directed by John J. Smith. Cataloging is fun : a short / directed by J. Johnson, Jr. and Anna Allen ; produced by Jane Jones.
Something like this would seem better (clearly this is totally not how you would construct this, but I think you can see the point of this quick example):
<first work>
<title proper> Library </title proper>
<parallel title> Bibliothek </parallel title>
<statement of responsibility> directed by John J. Smith </statement of responsibility>
</first work>
<second work>
<title proper> Every book its reader </title proper>
<statement of responsibility> directed by John J. Smith </statement of responsibility>
</second work>
<third work>
<title proper> Cataloging is fun </title proper>
<other title information> a short </other title information>
<statement of responsibility>
directed by J. Johnson, Jr. and Anna Allen
</statement of responsibility>
<statement of responsibility> produced by Jane Jones </statement of responsibility>
</third work>
Perhaps you could even have something like:
<statement of responsibility>
<function statement> directed by </function statement>
<name statement> J. Johnson, Jr. </name statement>
<conjunction> and </conjunction>
<name statement> Anna Allen </name statement>
</statement of responsibility>
Do we need parallel tracks where born-Bibframe records look a little different from translated-from-MARC records? Since it is unreasonable to expect a computer to reliably parse things like the above 245 field on the necessary scale, we'll need something to do with things like "$b Bibliothek ; Every book its reader." Perhaps some intermediate step, such as labeling it "<additional title info (other title info, parallel title; additional titles by same authors)>" and then cleaning it up later (if at all), would work. In most cases, a better job could be done with 245 $b conversion by taking into account punctuation, but I see little hope for $c.
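
As a rough illustration of the punctuation-based approach to the title portion of the 245 ($a and $b), here is a minimal Python sketch. It assumes the field follows ISBD punctuation consistently (" = " before a parallel title, " : " before other title information, " ; " between titles by the same author), which real data often will not. The function name and dictionary keys are invented for this example. It deliberately leaves $c alone.

```python
import re


def split_title_portion(subfield_a, subfield_b=""):
    """Split 245 $a/$b text into titles using ISBD punctuation.

    Hypothetical helper for illustration only; it assumes consistent
    spacing around the ISBD marks " = ", " : " and " ; ".
    """
    text = " ".join(p.strip() for p in (subfield_a, subfield_b) if p.strip())
    # Drop the trailing " /" that introduces the statement of responsibility.
    text = re.sub(r"\s*/\s*$", "", text)

    works = []
    # " ; " separates successive titles by the same author.
    for chunk in re.split(r"\s+;\s+", text):
        # Split on " = " or " : ", keeping the mark so each piece can be labeled.
        pieces = re.split(r"\s+([=:])\s+", chunk)
        work = {
            "title_proper": pieces[0].strip(),
            "parallel_titles": [],
            "other_title_info": [],
        }
        for mark, value in zip(pieces[1::2], pieces[2::2]):
            key = "parallel_titles" if mark == "=" else "other_title_info"
            work[key].append(value.strip())
        works.append(work)
    return works


if __name__ == "__main__":
    # The $a and $b of the sample 245 field quoted earlier in this message.
    for work in split_title_portion(
        "Library =", "Bibliothek ; Every book its reader /"
    ):
        print(work)
```

On the sample field this separates "Library" (with parallel title "Bibliothek") from "Every book its reader". A title that itself contains " : " or " ; " would be split wrongly, so output like this would still need the kind of later cleanup described above.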
I am also interested in the extensibility and hospitality of Bibframe. Despite all the complaints about the complexity of MARC, there are still areas where it lacks desirable granularity. In addition to the above lack of subfields in 245, there was recently a discussion on the OLAC list about the conflation of subtitles, captions and intertitles in $j. The sense was that at least that last should be coded separately and I think a case could be made for the first two as well. I suspect there are many such things lurking.
Kelley