I'm reminded of this video.


The blue thing is the blue thing, regardless of whether the cat figures it out or not. Knowing a name or type is informative, but those bits of information can also be misleading if you take them too literally.


On Feb 3, 2015, at 3:31 PM, Murray, Ronald <[log in to unmask]> wrote:

Well: You just follow Alexander the Great’s lead – and cut it out.

Describe This: There’s a lot of discussion about technology old and new and the benefits it conveys as a technology. Things are heating up.

But no one is describing anything that’s particularly challenging with their favorite technologies.

Thomas Kuhn used the term exemplars to mean problem-solution sets whose understanding and solution demonstrated mastery of a given scientific area. Note that this does not refer to the software engineering concept of a use-case. So: given any of the contending resource description technologies, describe these exemplars:

A Norton Critical Edition titled Eight Modern Plays – http://books.wwnorton.com/books/detail-contents.aspx?ID=11299

The Second Norton Critical Edition of Moby-Dick – http://books.wwnorton.com/books/detail.aspx?id=11008
Edited from numerous enumerated prior printings (without looking at the first Norton Critical edition)
Using a checklist of Moby-Dick editions to guide you: https://books.google.com/books?id=0YhBAAAAIAAJ&q=checklist+of+moby-dick+editions

Spring Dawn: Sign language calligraphy, created from an ASL interpretation of a poem by a seventh-century Tang Dynasty poet: http://www.lapiak.com/media/index-book.php?media=springdawn. (Does your resource description theory allow for "ghost works" that must have existed but have left no physical remnants?)

And good old Orson Whales, mashing up Melville with a tipsy and morose Orson Welles: http://vimeo.com/182925

Then we can move on to something that's challenging.

Ron Murray

From: Karen Coyle <[log in to unmask]>
Reply-To: Bibliographic Forum <[log in to unmask]>
Date: Tuesday, February 3, 2015 at 2:11 PM
To: Bibliographic Forum <[log in to unmask]>
Subject: Re: [BIBFRAME] Have your MARC and link it too (was 2-tier BIBFRAME)

Tim, +1 ... yet...

When first announced, it seemed that BIBFRAME *could* be the future bibliographic framework, similar to what you describe below. Then I noticed that although the original BIBFRAME model document [1] from November 2012 does not mention MARC, it is named: marcld-report-11-21-2012.pdf. "marcld" *sigh*

What we *should* do, from a data modeling point of view, is much clearer than understanding why it isn't happening that way, other than too many separate bailiwicks.

The cataloging rules (RDA) claim to be technology independent; the technology developments (FRBR, BIBFRAME) claim to be cataloging rule neutral. None of them are as neutral as they claim to be, by a long shot. It can't be a coincidence that they all make reference to a title statement of responsibility (not in those exact words). Yet each is a separate project, as if they aren't potentially all part of a whole.

There are a number of separate library linked data projects going on, each pursuing a different goal. The National Library of Spain has announced that it will not adopt RDA (but it has already developed a linked data-based catalog). NLM has announced that it will develop its own, non-MARC-ish version of BIBFRAME. RDA has its own linked data vocabulary and beta data creation tool.

This is all a Gordian knot that must be untangled, and quite honestly I wouldn't know where to begin.

[1] http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf

On 2/3/15 7:37 AM, Tim Thompson wrote:

Here's a thought (apologies in advance if this message is naive, either from a technical or practical standpoint--it lacks detail and does nothing to provide specific recommendations--but here goes nothing anyway).

Why not make BIBFRAME about the future of bibliographic data rather than its past? Libraries have invested tremendous resources over the years to produce and share their catalog records; it is only natural that they (and the catalogers who have worked so hard to encode those records--and to master the arcane set of rules behind their creation) would want to preserve that investment. Ergo the desire to devise a lossless (or nearly lossless) crosswalk from MARC to an RDF vocabulary (i.e., BIBFRAME). For years, libraries have been driven by just-in-case approaches to their services (certainly in the acquisition of new materials). But when we're dealing with data, do we really need to follow the same costly pattern? Rather than spending additional time and resources to attempt the quixotic task of converting all of MARC into actionable linked data (just in case we might need access to the contents of some obscure and dubiously useful MARC field), why not embrace a just-in-time approach to data conversion?

As Karen has pointed out here, MARC records are structured as documents: much of our access to their contents comes through full-text keyword searching. Now, we already have a standardized way to encode data-rich documents: namely, XML. The MARCXML[1] format already gives us a lossless way to convert our legacy data into an interoperable format. And the W3C has spent the last 15 years developing standards around XML: XQuery 3.1[2] and XSLT 3.0[3] are now robust functional programming languages that even support working with JSON-encoded data. Needless to say, the same kind of ecosystem is not available for working with binary MARC. Next-generation Web application platforms like Graphity[4] and Callimachus[5] utilize the XML stack for conversion routines or as a data integration pipeline into RDF linked data. The NoSQL (XML) database MarkLogic (which I believe the Library of Congress itself uses) now includes an integrated triplestore. Archives-centric tools like Ethan Gruber's xEAC[6] also provide a hybrid model for leveraging XML to produce linked data (as an aside: leveraging XML for data integration could promote interoperability between libraries and archives, which continue to rely heavily on XML document structures--see EAD3[7]--to encode their data).

So, why not excise everything from BIBFRAME that is mostly a reflection of MARC and work to remodel the vocabulary according to best practices for linked data? We can store our legacy MARC data as MARCXML (a lossless conversion), index it, link it to its BIBFRAME representation, and then access it on a just-in-time basis, whenever we find we need something that we didn't think was worth modeling as RDF. This would let BIBFRAME be the "glue" that it is supposed to be and would allow us to draw on the full power of XQuery/XSLT/XProc and SPARQL, together, to fit the needs of our user interfaces. This is still a two-tiered approach, but it does not include the overhead of trying to pour old wine into new wineskins (terrible mixed metaphor, but couldn't resist the biblical allusion).
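The just-in-time access described above could be sketched roughly as follows. This is only an illustration in Python using the standard library, not anything proposed in the thread: the sample record, its identifier, and the helper name `subfields` are invented for the example, and a real deployment would more likely query an indexed XML store with XQuery than parse strings in memory. The only fixed detail is the MARCXML namespace URI.

```python
import xml.etree.ElementTree as ET

# MARCXML ("MARC 21 slim") namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, standing in for a legacy record stored as MARCXML.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">rec-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Moby-Dick /</subfield>
    <subfield code="c">Herman Melville.</subfield>
  </datafield>
  <datafield tag="583" ind1=" " ind2=" ">
    <subfield code="a">rebound</subfield>
  </datafield>
</record>"""


def subfields(record_xml, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`.

    Hypothetical helper: it pulls a value out of the stored MARCXML only
    when a caller asks for it, instead of converting every field up front.
    """
    root = ET.fromstring(record_xml)
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in root.findall(path, MARC_NS)]


if __name__ == "__main__":
    # A field that was (hypothetically) never modeled as RDF is still reachable.
    print(subfields(SAMPLE, "583", "a"))
```

The point of the sketch is the division of labor: the linked-data representation carries what was worth modeling, and the stored MARCXML answers the occasional request for anything else.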

This kind of iterative approach seems more scalable and locally customizable than trying to develop an exhaustive algorithm that accounts for every possible permutation present in the sprawling MARC formats.

Similar suggestions to this may have already been made on this list, but I think it's at least worth reviving the possibility in the context of the current thread. In short: we could extract the essence from our legacy bibliographic records, remodel it, and then, from here on out, start encoding things in new ways, without being beholden to an outmoded standard and approach. All the old data would still be there, and would be computationally tractable as XML, but our new data wouldn't need to be haunted by its ghost.


Karen Coyle
[log in to unmask] http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600