Tim, +1 ... yet...
When first announced, it seemed that BIBFRAME *could* be the future
bibliographic framework, similar to what you describe below. Then I
noticed that although the original BIBFRAME model document from
November 2012 does not mention MARC, it is named:
marcld-report-11-21-2012.pdf. "marcld" *sigh*
What we *should* do, from a data modeling point of view, is much
clearer than why it isn't happening that way, other than that there
are too many separate bailiwicks.
The cataloging rules (RDA) claim to be technology independent; the
technology developments (FRBR, BIBFRAME) claim to be cataloging rule
neutral. None of them are as neutral as they claim to be, by a long
shot. It can't be a coincidence that they all make reference to a
title statement of responsibility (not in those exact words). Yet
each is a separate project, as if they aren't potentially all part
of a whole.
There are a number of separate library linked data projects going
on, each pursuing a different goal. The National Library of Spain
has announced that it will not adopt RDA (but it has already
developed a linked data-based catalog). NLM has announced that it
will develop its own, non-MARC-ish version of BIBFRAME. RDA has its
own linked data vocabulary and beta data creation tool.
This is all a Gordian knot that must be untangled, and quite
honestly I wouldn't know where to begin.
On 2/3/15 7:37 AM, Tim Thompson wrote:
Here's a thought (apologies in advance if this message is
naive, either from a technical or practical standpoint--it
lacks detail and does nothing to provide specific
recommendations--but here goes nothing anyway).
Why not make BIBFRAME about the future of bibliographic
data rather than its past? Libraries have invested
tremendous resources over the years to produce and share
their catalog records; it is only natural that they (and
the catalogers who have worked so hard to encode those
records--and to master the arcane set of rules behind
their creation) would want to preserve that investment.
Ergo the desire to devise a lossless (or nearly lossless)
crosswalk from MARC to an RDF vocabulary (i.e., BIBFRAME).
For years, libraries have been driven by just-in-case
approaches to their services (certainly in the acquisition
of new materials). But when we're dealing with data, do we
really need to follow the same costly pattern? Rather than
spending additional time and resources to attempt the
quixotic task of converting all of MARC into
actionable linked data (just in case we might need access
to the contents of some obscure and dubiously useful MARC
field), why not embrace a just-in-time approach to data?
As Karen has pointed out here, MARC records are structured
as documents: much of our access to their contents comes
through full-text keyword searching. Now, we already have a
standardized way to encode data-rich documents: namely, XML.
The MARCXML format already gives us a lossless way to
convert our legacy data into an interoperable format. And
the W3C has spent the last 15 years developing standards
around XML: XQuery 3.1 and XSLT 3.0 are now robust
functional programming languages that even support working
with JSON-encoded data. Needless to say, the same kind of
ecosystem is not available for working with binary MARC.
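To make the tooling point concrete, here is a rough sketch (not something from the thread) of reading a MARCXML record with a generic XML library. The sample record and the `subfield_values` helper are invented for illustration; only the MARCXML namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace; everything else here is illustrative.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample record, invented for this sketch.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">ex0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
    <subfield code="c">by an example author.</subfield>
  </datafield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">An example general note.</subfield>
  </datafield>
</record>"""


def subfield_values(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text for sf in record.findall(path, NS)]


record = ET.fromstring(SAMPLE)
print(subfield_values(record, "245", "a"))  # title proper
print(subfield_values(record, "500", "a"))  # general note
```

Nothing here is MARC-specific beyond the tag and subfield codes, which is the point: the same query style would carry over to XQuery or XSLT.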
Next-generation Web application platforms like Graphity
and Callimachus utilize the XML stack for conversion
routines or as a data integration pipeline into RDF linked
data. The NoSQL (XML) database MarkLogic (which I believe
the Library of Congress itself uses) now includes an
integrated triplestore. Archives-centric tools like Ethan
Gruber's xEAC also provide a hybrid model for leveraging
XML to produce linked data (as an aside: leveraging XML for
data integration could promote interoperability between
libraries and archives, which continue to rely heavily on
XML document structures--see EAD3--to encode their data).
So, why not excise everything from BIBFRAME that is mostly a
reflection of MARC and work to remodel the vocabulary
according to best practices for linked data? We can store
our legacy MARC data as MARCXML (a lossless conversion),
index it, link it to its BIBFRAME representation, and then
access it on a just-in-time basis, whenever we find we need
something that we didn't think was worth modeling as RDF.
This would let BIBFRAME be the "glue" that it is supposed to
be and would allow us to draw on the full power of
XQuery/XSLT/XProc and SPARQL, together, to fit the needs of
our user interfaces. This is still a two-tiered approach,
but it does not include the overhead of trying to pour old
wine into new wineskins (terrible mixed metaphor, but
couldn't resist the biblical allusion).
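The two-tiered, just-in-time idea above might be sketched as follows. This is only an illustration under stated assumptions: the in-memory `MARCXML_STORE`, the `TRIPLES` list, the `lookup` helper, and the `ex:` predicates are all hypothetical placeholders (they are not BIBFRAME terms), and a real system would use an XML database and a triplestore instead of Python literals.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Tier 1: legacy records kept as MARCXML, keyed by a hypothetical record ID.
MARCXML_STORE = {
    "ex0001": """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">ex0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">A note nobody modeled as RDF.</subfield>
  </datafield>
</record>""",
}

# Tier 2: the small set of statements we chose to model, as plain triples.
# The "ex:" predicates are placeholders, not real BIBFRAME properties.
TRIPLES = [
    ("ex:work1", "ex:title", "An example title"),
    ("ex:work1", "ex:sourceRecord", "ex0001"),
]


def objects(subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def lookup(subject, predicate, marc_tag=None, marc_code="a"):
    """Answer from the triples if possible; otherwise fall back to MARCXML."""
    found = objects(subject, predicate)
    if found or marc_tag is None:
        return found
    values = []
    # Follow the link from the modeled resource back to its source record(s).
    for record_id in objects(subject, "ex:sourceRecord"):
        record = ET.fromstring(MARCXML_STORE[record_id])
        path = f"marc:datafield[@tag='{marc_tag}']/marc:subfield[@code='{marc_code}']"
        values.extend(sf.text for sf in record.findall(path, NS))
    return values


print(lookup("ex:work1", "ex:title"))                # answered from the triples
print(lookup("ex:work1", "ex:note", marc_tag="500"))  # fetched from MARCXML on demand
```

The design choice being illustrated is that only the link back to the source record has to be modeled up front; anything else can be pulled from the stored document when it turns out to be needed.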
This kind of iterative approach seems more scalable and
locally customizable than trying to develop an exhaustive
algorithm that accounts for every possible permutation
present in the sprawling MARC formats.
Similar suggestions to this may have already been made on this
list, but I think it's at least worth reviving the possibility
in the context of the current thread. In short: we could
extract the essence from our legacy bibliographic records,
remodel it, and then, from here on out, start encoding things
in new ways, without being beholden to an outmoded standard
and approach. All the old data would still be there, and would
be computationally tractable as XML, but our new data wouldn't
need to be haunted by its ghost.
http://kcoyle.net