Thomas,

Fragmentation in a linked data context doesn't seem like such a bad thing,
as long as the various models can be aligned conceptually. In this regard, it's
encouraging to see that the Library of Congress and OCLC are actively
working to align BIBFRAME and schema.org. However, there doesn't seem to be
a parallel discussion about aligning BIBFRAME and the RDA linked data
element set. It's been said that many of BIBFRAME's design choices were
made to accommodate RDA (the content standard) rather than MARC (the
encoding format), and yet there still seem to be some inconsistencies.

Take the BF Authority class, for example. The recent OCLC/LC "Common
ground" document states the following: "people, places, and organizations,
which are typically described in library authority files, are represented
not as curated strings or concepts but as real world objects in the LC and
OCLC models"[1]. If that is the case, why does BF still model "Authority"
as the superclass of Agent? Isn't an "authority," as traditionally defined,
a curated string? The BF vocab explicitly defines Authority as a
"representation of a key concept or thing"[2].

This is not the case in the RDA element set, where the corresponding
top-level class is simply Agent[3]. So, is bf:Agent conceptually
equivalent to rdac:C10002 ("Agent") if the former has been modeled as a
representation of an agent rather than as the agent itself?

Tim

[1]
http://www.oclc.org/research/publications/2015/oclcresearch-loc-linked-data-2015.html
[2] http://bibframe.org/vocab/Authority.html
[3] http://www.rdaregistry.info/Elements/c/


On Wed, Feb 4, 2015 at 4:17 AM, Meehan, Thomas <[log in to unmask]> wrote:

>  I wonder whether seeking a Single Answer is really the best way forward
> anyway. Many of Bibframe’s problems seem to come from trying to keep
> everything from the past (MARC) while moving to something fit for the
> future, an ambition that has, I think, also afflicted RDA. Both RDA and
> Bibframe also seek to unite all formats, all types of collection, and all
> types of library under one standard. Bibframe further aims to be “the
> foundation for the future of bibliographic description that happens on the
> web and in the networked world”[1]. Given that bibliographic description
> doesn’t just happen in libraries (and within libraries also happens outside
> the MARC cataloguing section), its ambition is wider still, pursued
> essentially under the aegis of a single library. Existing linked data
> projects that have published data (BL, European Library, Oslo, WorldCat,
> presumably now also the NLM) appear to have selected or created what they
> decided would work best for their specific purposes and got on with it; the
> RDA element set as linked data in some ways naturally[2] follows from the
> creation of the rules and element set itself. Would the dangers of
> fragmentation be better or worse than trying to create one monolithic
> solution, especially as time is slipping by?
>
>
>
> Thanks,
>
>
> Tom
>
>
>
> [1] http://www.loc.gov/bibframe/faqs/#q02
>
> [2] Although I appreciate the amount of work this must have involved to
> actually put together is considerably more than is implied by “naturally”.
>
>
>
> ---
>
>
>
> Thomas Meehan
>
> Head of Current Cataloguing
>
> Library Services
>
> University College London
>
> Gower Street
>
> London WC1E 6BT
>
>
>
> [log in to unmask]
>
>
>
> From: Bibliographic Framework Transition Initiative Forum [mailto:
> [log in to unmask]] On Behalf Of Karen Coyle
> Sent: 03 February 2015 19:11
> To: [log in to unmask]
> Subject: Re: [BIBFRAME] Have your MARC and link it too (was 2-tier
> BIBFRAME)
>
>
>
> Tim, +1 ... yet...
>
>
> When first announced, it seemed that BIBFRAME *could* be the future
> bibliographic framework, similar to what you describe below. Then I noticed
> that although the original BIBFRAME model document [1] from November 2012
> does not mention MARC, it is named: marcld-report-11-21-2012.pdf. "marcld"
> *sigh*
>
> What we *should* do, from a data modeling point of view, is much clearer
> than why it isn't happening that way; I can't explain the latter other
> than by too many separate bailiwicks.
>
> The cataloging rules (RDA) claim to be technology independent; the
> technology developments (FRBR, BIBFRAME) claim to be cataloging rule
> neutral. None of them are as neutral as they claim to be, by a long shot.
> It can't be a coincidence that they all make reference to a title statement
> of responsibility (not in those exact words). Yet each is a separate
> project, as if they aren't potentially all part of a whole.
>
> There are a number of separate library linked data projects going on, each
> pursuing a different goal. The National Library of Spain has announced that
> it will not adopt RDA (but it has already developed a linked data-based
> catalog). NLM has announced that it will develop its own, non-MARC-ish
> version of BIBFRAME. RDA has its own linked data vocabulary and beta data
> creation tool.
>
> This is all a Gordian knot that must be untangled, and quite honestly I
> wouldn't know where to begin.
>
> kc
> [1] http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf
>
>  On 2/3/15 7:37 AM, Tim Thompson wrote:
>
>    All,
>
> Here's a thought (apologies in advance if this message is naive, either
> from a technical or practical standpoint--it lacks detail and does nothing
> to provide specific recommendations--but here goes nothing anyway).
>
> Why not make BIBFRAME about the future of bibliographic data rather than
> its past? Libraries have invested tremendous resources over the years to
> produce and share their catalog records; it is only natural that they (and
> the catalogers who have worked so hard to encode those records--and to
> master the arcane set of rules behind their creation) would want to
> preserve that investment. Ergo the desire to devise a lossless (or nearly
> lossless) crosswalk from MARC to an RDF vocabulary (i.e., BIBFRAME). For
> years, libraries have been driven by just-in-case approaches to their
> services (certainly in the acquisition of new materials). But when we're
> dealing with data, do we really need to follow the same costly pattern?
> Rather than spending additional time and resources to attempt the quixotic
> task of converting all of MARC into actionable linked data (just in case
> we might need access to the contents of some obscure and dubiously useful
> MARC field), why not embrace a just-in-time approach to data conversion?
>
> As Karen has pointed out here, MARC records are structured as documents:
> much of our access to their contents comes through full-text keyword
> searching. Now, we already have a standardized way to encode data-rich
> documents: namely, XML. The MARCXML[1] format already gives us a lossless
> way to convert our legacy data into an interoperable format. And the W3C
> has spent the last 15 years developing standards around XML: XQuery 3.1[2]
> and XSLT 3.0[3] are now robust functional programming languages that even
> support working with JSON-encoded data. Needless to say, the same kind of
> ecosystem is not available for working with binary MARC. Next-generation
> Web application platforms like Graphity[4] and Callimachus[5] utilize the
> XML stack for conversion routines or as a data integration pipeline into
> RDF linked data. The NoSQL (XML) database MarkLogic (which I believe the
> Library of Congress itself uses) now includes an integrated triplestore.
> Archives-centric tools like Ethan Gruber's xEAC[6] also provide a hybrid
> model for leveraging XML to produce linked data (as an aside: leveraging
> XML for data integration could promote interoperability between libraries
> and archives, which continue to rely heavily on XML document
> structures--see EAD3[7]--to encode their data).
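>
> To make that concrete: here is a minimal sketch of the XML-stack idea,
> in Python rather than XQuery/XSLT only for brevity (the bf:title mapping
> and the work IRI are hypothetical choices of mine, not an official
> conversion routine):
>
>     from lxml import etree
>     from rdflib import Graph, Literal, Namespace, URIRef
>
>     MARC = "http://www.loc.gov/MARC21/slim"
>     BF = Namespace("http://bibframe.org/vocab/")
>
>     marcxml = b"""<record xmlns="http://www.loc.gov/MARC21/slim">
>       <datafield tag="245" ind1="1" ind2="0">
>         <subfield code="a">Don Quixote</subfield>
>       </datafield>
>     </record>"""
>
>     record = etree.fromstring(marcxml)
>     g = Graph()
>     work = URIRef("http://example.org/work/1")  # hypothetical IRI
>
>     # Pull the 245 $a out of the document and re-express it as a triple
>     path = (f".//{{{MARC}}}datafield[@tag='245']"
>             f"/{{{MARC}}}subfield[@code='a']")
>     for sf in record.findall(path):
>         g.add((work, BF.title, Literal(sf.text)))
>
>     print(g.serialize(format="turtle"))
>
> The same routine could just as well be written in XQuery or XSLT 3.0
> over the full record store; the point is that the lossless MARCXML layer
> is already queryable with standard tools.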
>
> So, why not excise everything from BIBFRAME that is mostly a reflection of
> MARC and work to remodel the vocabulary according to best practices for
> linked data? We can store our legacy MARC data as MARCXML (a lossless
> conversion), index it, link it to its BIBFRAME representation, and then
> access it on a just-in-time basis, whenever we find we need something that
> we didn't think was worth modeling as RDF. This would let BIBFRAME be the
> "glue" that it is supposed to be and would allow us to draw on the full
> power of XQuery/XSLT/XProc and SPARQL, together, to fit the needs of our
> user interfaces. This is still a two-tiered approach, but it does not
> include the overhead of trying to pour old wine into new wineskins
> (terrible mixed metaphor, but couldn't resist the biblical allusion).
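>
> As a rough sketch of what the just-in-time lookup might look like (the
> ex:sourceRecord link and the file layout are assumptions of mine, not
> part of any spec):
>
>     from lxml import etree
>     from rdflib import Graph, Namespace, URIRef
>
>     EX = Namespace("http://example.org/")  # hypothetical
>     MARC = "http://www.loc.gov/MARC21/slim"
>
>     def field_on_demand(g: Graph, resource: URIRef, tag: str):
>         """Follow the resource's ex:sourceRecord link to its archived
>         MARCXML file and read one field only when actually needed."""
>         for record_file in g.objects(resource, EX.sourceRecord):
>             tree = etree.parse(str(record_file))
>             return [sf.text for sf in tree.findall(
>                 f".//{{{MARC}}}datafield[@tag='{tag}']"
>                 f"/{{{MARC}}}subfield")]
>         return []
>
>     # e.g. field_on_demand(g, EX.work1, "563") -- binding information,
>     # the sort of field few of us would bother to model as RDF up front
>
> Nothing here has to be promoted to a triple in advance; the MARCXML
> stays computationally tractable until the day a field is wanted.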
>
> This kind of iterative approach seems more scalable and locally
> customizable than trying to develop an exhaustive algorithm that accounts
> for every possible permutation present in the sprawling MARC formats.
>
>
> Similar suggestions may already have been made on this list, but I think
> it's worth reviving the possibility in the context of the current
> thread. In short: we could extract the essence from our legacy
> bibliographic records, remodel it, and then, from here on out, start
> encoding things in new ways, without being beholden to an outmoded standard
> and approach. All the old data would still be there, and would be
> computationally tractable as XML, but our new data wouldn't need to be
> haunted by its ghost.
>
> Tim
>
>
> [1] http://www.loc.gov/standards/marcxml/
> [2] http://www.w3.org/TR/xquery-31/
> [3] http://www.w3.org/TR/xslt-30/
> [4] http://graphityhq.com/
> [5] http://callimachusproject.org/
> [6] https://github.com/ewg118/xEAC
> [7]
> http://www2.archivists.org/groups/technical-subcommittee-on-encoded-archival-description-ead/ead3-gamma-release
>
>   --
> Tim A. Thompson
> Metadata Librarian (Spanish/Portuguese Specialty)
> Princeton University Library
>
>
>
>  --
>
> Karen Coyle
>
> [log in to unmask] http://kcoyle.net
>
> m: +1-510-435-8234
>
> skype: kcoylenet/+1-510-984-3600