Hi Kelley,

I hope this note is not too late to share some of my experience. I was trying to respond to your very well thought-out questions earlier, but then I got sick. I so appreciated the opportunity to get an inside look at the WPE at OLAC through your presentation.

Firstly, GW, as one of the Early Experimenters (EE), has very limited time, experience and expertise to experiment with cutting-edge technology. In my case, on top of testing the model and providing feedback, my daily routine did not stop. The EE were asked to experiment with the Bibframe model as proposed (the model was released in November). Interestingly, though no specific assignments were given for the test, each took on different aspects of and approaches to the project. As you see, members of the group looked at the big elephant in the room from various angles. In some ways, the outcomes and discussions enriched the process (I hope). This process has helped me personally to understand the intent and potential of BibFrame (BF) a lot better. It was also gratifying to get to know GW data a lot better, such as the 21% of invalid tags that still exist in our ILS!! Yikes!

Secondly, your point about tools and crosswalks. Various levels of mappings and crosswalks were done by the EE. Each set of mappings was based on a very selective set of records or tasks that each institution decided to focus on. In the December meeting, there were also discussions about tools, plug-in ones in particular. Such tools would let interested libraries experiment with their own data. Both LC and OCLC alluded to the possibility of providing such tools for libraries to transform and examine their data. The tools may be mounted on their respective infrastructures (OCLC or LC)?! Perhaps we will hear more updates at Midwinter or a little before then?!

Thirdly, the ability to dig deeper into the data and parse out the relationships, as you pointed out. There was a great deal of discussion on the concept of relationship, its definition, and how best to parse the data in the Bibframe environment. Collection relationships (naturally within or outside a resource, such as serials, series, "In" analytics, bound-withs, even TOCs?, etc.) are something libraries have taken different approaches to expressing in MARC. In GW's data, I found much older data coded in a general note without subfields (such as "Bound with"). But where phraseology and punctuation were used consistently for certain types of material, it helped when transforming MARC data to BF data. Consistent punctuation most certainly helps the transformation of data from the current encoding environment to the Work/Instance environment. The transformation exercise during the experiment also helped in strategizing how to prep the data.
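
To illustrate how consistent phraseology makes such notes machine-detectable, here is a minimal Python sketch. It is only an illustration, not the code used in the experiment: the function name, the sample notes, and the " -- " separator between titles are all assumptions.

```python
import re

# Assumption: bound-with information sits in a general note that starts
# with the phrase "Bound with", optionally followed by a colon.
BOUND_WITH = re.compile(r"^\s*bound with\b:?\s*(?P<rest>.*)$",
                        re.IGNORECASE | re.DOTALL)


def extract_bound_with(note: str) -> list[str]:
    """Return the titles named in a 'Bound with' note, or [] if the note
    does not use that phrasing (hypothetical helper)."""
    match = BOUND_WITH.match(note)
    if match is None:
        return []
    # Remove surrounding whitespace and the final full stop of the note.
    rest = match.group("rest").strip().rstrip(".").strip()
    # Assumption: several titles are separated by " -- ".
    return [title.strip() for title in rest.split(" -- ") if title.strip()]


# Made-up sample notes, not taken from GW's data.
print(extract_bound_with("Bound with: The art of angling -- A treatise on fishing."))
print(extract_bound_with("Includes index."))
```

Notes phrased any other way would simply be missed, which is why consistent wording and punctuation matter so much for the transformation.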

Subgroups were tasked to work on various aspects. This point paper on relationships will hopefully help us think through the relationships at various levels as they relate to the BF environment of Work and Instance.

Lastly, the discussion of new data to be created in BF did not take up much real estate, as the group has been trying to nail down the definitions of Work and Instance and how Authority relates to the two. Your point about born-BF data is an important piece. Surely, as the community moves forward, a group of professionals such as you will see to this, as it definitely has implications for establishing future workflows.

Thank you.


Jackie Shieh
Resource Description
George Washington University Libraries
2130 H Street, NW
Washington, DC 20052
jshieh @gwu.edu
Phone: 202.994.4366
Fax: 202.994.6376

-------- Original Message --------
Subject: [BIBFRAME] Thoughts on the direction of Bibframe
Date: Mon, 7 Jan 2013 00:50:43 +0000
From: Kelley McGrath <[log in to unmask]>
To: [log in to unmask]

I thought I would share some thoughts I've had about the recent exchanges (or lack thereof) on the Bibframe list in case they're helpful to someone else. In another thread someone asked, "What is the point of the current exercise?" That question wasn't really satisfactorily answered that I can see.


In the month since the "early experimentation code" to translate MARC to Bibframe was made available, there has been little substantive discussion. It seems to me that the biggest reason is that this process has effectively disenfranchised a huge percentage of the potential audience who lack the time or skills or inclination to set up their own tool. I would like to give a shout out to Karen Coyle for putting up a few examples for the rest of us. It seems to me that putting up even 10 or 12 well-chosen examples could have stimulated a lot of discussion. There's also the question of making the display accessible. I can more-or-less follow what Karen put up, but I suspect it will be harder for many catalogers. Just as the end user needs a pretty display so too will there need to be a human-friendly display of Bibframe for the cataloger. Sample records in such a display would probably be reassuring to many.


Alternatively, catalogers would get a lot from a plain table showing what's mapped from MARC and to where, as well as what's not being mapped. Something like what RDA did: http://access.rdatoolkit.org/document.php?id=jscmap2.


On a more fundamental level, I wonder why we are not only starting by testing transformation tools (which would seem to me to come near the end), but why we are starting with transformations at all. Of course, it's essential that MARC translate into Bibframe in some useful fashion, but it makes more sense to me to start the discussion with questions like "What should Bibframe do?" I find that with what I have seen so far, I am somehow missing the big picture, the shape of Bibframe. To me, the most important question is not "How do we translate MARC to linked data/RDF?" but "What should Bibframe look like to do as much as possible of what we want to do?" We would benefit from a more thorough analysis of what MARC is really doing now, such as the work that Karen Coyle describes in her article MARC21 as Data: A Start (http://journal.code4lib.org/articles/5468), as well as a list of current complaints and desires. What is MARC doing and what is it not doing that we want it to do—from the answers to these questions we should decide what Bibframe will do.


I am very interested in the potential of Bibframe to deal with things that MARC doesn't handle well (like anthologies and other items that include multiple works). How would you put data into it if you were doing it from scratch? How can we make Bibframe an improvement on MARC and what it can do? To take an easy example, once we are freed from the constraints of letters and numbers for subfields, we should do better than the woefully inadequately tagged:


245 00 $a Library = $b Bibliothek ; Every book its reader / $c directed by John J. Smith.  Cataloging is fun : a short / directed by J. Johnson, Jr. and Anna Allen ; produced by Jane Jones.


Something like this would seem better (clearly this is totally not how you would construct this, but I think you can see the point of this quick example):


<first work>

<title proper> Library </title proper>

<parallel title> Bibliothek </parallel title>

<statement of responsibility> directed by John J. Smith </statement of responsibility>

</first work>


<second work>

<title proper> Every book its reader </title proper>

<statement of responsibility> directed by John J. Smith </statement of responsibility>

</second work>


<third work>

<title proper> Cataloging is fun </title proper>

<other title information> a short </other title information>

<statement of responsibility> directed by J. Johnson, Jr. and Anna Allen </statement of responsibility>
<statement of responsibility> produced by Jane Jones </statement of responsibility>

</third work>


Perhaps you could even have something like:


<statement of responsibility>

<function statement> directed by </function statement>

<name statement> J. Johnson, Jr. </name statement>

<conjunction> and </conjunction>

<name statement> Anna Allen </name statement>

</statement of responsibility>


Do we need parallel tracks where born-Bibframe records look a little different from translated-from-MARC records? Since it is unreasonable to expect a computer to reliably parse things like the above 245 field on the necessary scale, we'll need something to do with things like "$b Bibliothek ; Every book its reader." Perhaps some intermediate step, such as labeling it "<additional title info (other title info, parallel title; additional titles by same authors)>" and then cleaning it up later (if at all), would work. In most cases, a better job could be done with 245 $b conversion by taking into account punctuation, but I see little hope for $c.
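
As a rough illustration of the punctuation-based approach, here is a minimal Python sketch. It assumes ISBD-style punctuation (" = " before a parallel title, " : " before other title information, " ; " before a further title by the same author, " / " before the statement of responsibility). The function name and output keys are hypothetical, and everything after the first " / " is deliberately left unparsed:

```python
import re


def split_title_portion(field_245: str) -> dict:
    """Split the title portion of a 245 field on ISBD punctuation
    (hypothetical helper; the output keys are illustrative only)."""
    # Drop subfield codes ("$a", "$b", "$c"); the sketch relies on punctuation alone.
    text = re.sub(r"\$[a-z0-9]\s*", "", field_245).strip()
    # Text after the first " / " is the statement of responsibility plus
    # whatever follows it. It is kept whole rather than parsed.
    title_part, _, remainder = text.partition(" / ")
    titles = []
    for chunk in title_part.split(" ; "):
        proper_and_other, *parallel = chunk.split(" = ")
        proper, _, other = proper_and_other.partition(" : ")
        entry = {"title_proper": proper.strip()}
        if other.strip():
            entry["other_title_information"] = other.strip()
        if parallel:
            entry["parallel_titles"] = [p.strip() for p in parallel]
        titles.append(entry)
    return {"titles": titles, "unparsed": remainder.strip()}


FIELD_245 = ("$a Library = $b Bibliothek ; Every book its reader / "
             "$c directed by John J. Smith.  Cataloging is fun : a short / "
             "directed by J. Johnson, Jr. and Anna Allen ; produced by Jane Jones.")

result = split_title_portion(FIELD_245)
for entry in result["titles"]:
    print(entry)
print("Left unparsed:", result["unparsed"])
```

The first two titles come apart cleanly, but the third work stays buried in the unparsed remainder: splitting on ". " to find it would also break "J. Johnson, Jr. and Anna Allen", which is the difficulty with $c.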


I am also interested in the extensibility and hospitality of Bibframe. Despite all the complaints about the complexity of MARC, there are still areas where it lacks desirable granularity. In addition to the above lack of subfields in 245, there was recently a discussion on the OLAC list about the conflation of subtitles, captions and intertitles in $j. The sense was that at least the last should be coded separately, and I think a case could be made for the first two as well. I suspect there are many such things lurking.