BIBFRAME Archives


BIBFRAME@LISTSERV.LOC.GOV


BIBFRAME  February 2015

Subject:

Re: 2-tier BIBFRAME

From:

"Young,Jeff (OR)" <[log in to unmask]>

Reply-To:

Bibliographic Framework Transition Initiative Forum <[log in to unmask]>

Date:

Sun, 1 Feb 2015 22:54:27 +0000

Content-Type:

text/plain


I agree that you can't avoid 2 tiers, but there are other ways to conceptualize those tiers. You suggest mapping to a first-tier RDF vocabulary and then using a triplestore and SPARQL to do the reconciliation/cleanup into a second vocabulary. An alternative is to map directly into the target vocabulary and then use Map/Reduce (possibly using SPARQL on transient subgraphs) to do probabilistic matching and cleanup. The second way doesn't require a 2-tier vocabulary.
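
A minimal sketch of the second approach, assuming the records have already been mapped directly into the target vocabulary. Everything here is hypothetical and for illustration only: the sample triples, the blank-node labels, the three-character blocking key, and the 0.9 threshold. A plain string-similarity ratio stands in for real probabilistic matching, and an in-memory dictionary stands in for a distributed Map/Reduce shuffle.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical triples, already expressed in the target vocabulary.
TRIPLES = [
    ("_:p1", "foaf:name", "Johnson, John"),
    ("_:p2", "foaf:name", "Johnson, John."),
    ("_:p3", "foaf:name", "Thompson, Tom"),
]


def normalize(name):
    """Lower-case the name and keep only letters, digits and spaces."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch == " ")
    return kept.strip()


def map_phase(triples):
    """Emit (blocking key, (subject, normalized name)) for each name triple."""
    for subject, predicate, obj in triples:
        if predicate == "foaf:name":
            name = normalize(obj)
            # Assumed blocking key: the first three characters of the name.
            yield name[:3], (subject, name)


def reduce_phase(members, threshold=0.9):
    """Compare all names within one block and emit candidate duplicate links."""
    for (s1, n1), (s2, n2) in combinations(sorted(members), 2):
        score = SequenceMatcher(None, n1, n2).ratio()
        if score >= threshold:
            yield (s1, "owl:sameAs", s2, round(score, 2))


def match(triples):
    """Group the mapped pairs by key (the shuffle step), then reduce each block."""
    blocks = defaultdict(list)
    for key, value in map_phase(triples):
        blocks[key].append(value)
    links = []
    for key in sorted(blocks):
        links.extend(reduce_phase(blocks[key]))
    return links


if __name__ == "__main__":
    for link in match(TRIPLES):
        print(link)
```

The candidate links carry their score so that a later cleanup step can decide which ones to accept; only subjects that share a blocking key are ever compared, which is what keeps the pairwise step small.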

Jeff

> On Feb 1, 2015, at 3:30 PM, Martynas Jusevičius <[log in to unmask]> wrote:
> 
> Jeff,
> 
> I don't think you can avoid 2 tiers, since the lower one is MARC and
> the higher one is Linked Data. What I'm saying is that by doing the
> mapping indirectly and within RDF, the implementation will be much
> more progressive. And I don't even need to know MARC to see that.
> 
> And you should definitely try out Dydra, which is a cloud triplestore:
> http://dydra.com
> 
>> On Sun, Feb 1, 2015 at 9:00 PM, Young,Jeff (OR) <[log in to unmask]> wrote:
>> Martynas,
>> 
>> I'm skeptical of the 2 tiered mapping approach, but I like the look of the tool. :-)
>> 
>> Jeff
>> 
>> 
>>> On Feb 1, 2015, at 8:57 AM, Martynas Jusevičius <[log in to unmask]> wrote:
>>> 
>>> Hey again,
>>> 
>>> I want to illustrate what I mean with the 2 tiers and the mapping
>>> between them with an example.
>>> 
>>> I used one of the data samples from Jörg's link (about "Aristotle on
>>> mind and the senses") and created a SPARQL query:
>>> 
>>> PREFIX field: <rdfmab:field#>
>>> PREFIX dct: <http://purl.org/dc/terms/>
>>> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
>>> PREFIX bf: <http://bibframe.org/vocab/>
>>> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>> 
>>> CONSTRUCT
>>> {
>>> ?work a bf:Work ;
>>>   dct:created ?createdDate ;
>>>   dct:modified ?modifiedDate ;
>>>   dct:language ?language ;
>>>   bf:title ?title ;
>>>   bf:creator [ a foaf:Person ; foaf:name ?creatorName ] ;
>>>   dct:subject [ a skos:Concept ; skos:prefLabel ?categoryLabel ] .
>>> }
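>>> WHERE
>>> {
>>> # category label: either subfield p or subfield s of field 902
>>> ?work field:902__p|field:902__s ?categoryLabel .
>>> # dates arrive as plain strings; cast them to xsd:date
>>> ?work field:002a_a ?createdString .
>>> BIND (STRDT(?createdString, xsd:date) AS ?createdDate)
>>> ?work field:003__a ?modifiedString .
>>> BIND (STRDT(?modifiedString, xsd:date) AS ?modifiedDate)
>>> # remaining values are copied through unchanged
>>> ?work field:036a_a ?language .
>>> ?work field:101__a ?creatorName .
>>> ?work field:331__a ?title .
>>> }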
>>> 
>>> You can try running it here:
>>> http://graphity.dydra.com/graphity/marc-test/sparql#marcrdf2bibframe
>>> 
>>> This query is a declarative, platform-independent mapping between 2
>>> "tiers" (levels of abstraction) of bibliographic data:
>>> 1. syntactic MARC-RDF, in this case specified by Jörg
>>> 2. conceptual real-world representation, as a mix of BIBFRAME, SKOS,
>>> Dublin Core and other relevant vocabularies
>>> 
>>> By no means do I claim that this is a complete or semantically correct
>>> example, but I hope it gives a better idea of my suggestion.
>>> Feel free to expand and modify it.
>>> 
>>> Martynas
>>> graphityhq.com
>>> 
>>> On Sat, Jan 31, 2015 at 5:16 PM, [log in to unmask]
>>> <[log in to unmask]> wrote:
>>>> That MARC is a key/value stream with string-based encoding semantics
>>>> does not justify a "direct mapping" to RDF. This is problematic.
>>>> 
>>>> From an implementor's view, a correct migration to RDF means parsing MARC
>>>> records, mapping selected MARC-encoded values to functions/objects, evaluating
>>>> contextual information and further semantic information from other sources
>>>> (catalog codes, authority files), and letting the mapping functions create RDF
>>>> graphs that contain the transformed information, adding datatype information,
>>>> links etc.
>>>> 
>>>> I invented something similar to your bfm proposal internally back in 2010.
>>>> 
>>>> https://wiki1.hbz-nrw.de/display/SEM/RDF-ISO2709+-+eine+RDF-Serialisierung+fuer+ISO+2709-basierte+bibliografische+Formate+%28MARC%2C+MAB%29
>>>> 
>>>> When asked to document the format, I hesitated and tried to describe it
>>>> as an intermediate serialization format and called it "RDF/ISO2709". But the
>>>> very bad side effect was that librarians who were not familiar with RDF and
>>>> its semantics thought RDF was just a wrapper mechanism like XML. They
>>>> called RDF a "format" and did not take it seriously as a modern graph model
>>>> for the bibliographic data of future catalogs.
>>>> 
>>>> Karen picked up the idea in 2011 in
>>>> 
>>>> http://lists.w3.org/Archives/Public/public-lld/2011Apr/0137.html
>>>> 
>>>> So, from my personal experience, I do not recommend proposing a
>>>> MARC-centered "serialization only" Bibframe dialect. It will not improve
>>>> Bibframe or ease the migration; it will just add a truncated RDF without
>>>> links, without URIs, with another migration path.
>>>> 
>>>> Whether Bibframe can be seen as the "one-size-fits-all" RDF model for MARC is
>>>> another question. For much of the data I have, Bibframe is not my first
>>>> choice.
>>>> 
>>>> Jörg
>>>> 
>>>> 
>>>> On Sat, Jan 31, 2015 at 1:14 PM, Martynas Jusevičius <[log in to unmask]>
>>>> wrote:
>>>>> 
>>>>> Jeff,
>>>>> 
>>>>> there is one reality, but it can be described in many different ways.
>>>>> And yes, there should be a separate RDF vocabulary for each level.
>>>>> 
>>>>> Here's a completely fictional example to illustrate what I mean:
>>>>> 
>>>>> 1. MARC-syntax level
>>>>> 
>>>>> _:record a bfm:Record ;
>>>>> bfm:recordType "Book" ;
>>>>> bfm:isbn "123456789" ;
>>>>> bfm:title "The Greatest Works" ;
>>>>> bfm:author1givenName "John" ;
>>>>> bfm:author1familyName "Johnson" ;
>>>>> bfm:author2givenName "Tom" ;
>>>>> bfm:author2familyName "Thompson" .
>>>>> 
>>>>> 2. Linked Data level
>>>>> 
>>>>> <books/123456789#this> a bld:Work, bldtypes:Book ;
>>>>> dct:title "The Greatest Works" ;
>>>>> bld:isbn "123456789" ;
>>>>> bld:authors (<persons/john-johnson#this> <persons/tom-thompson#this>) .
>>>>> 
>>>>> <persons/john-johnson#this> a foaf:Person, bld:Author ;
>>>>> foaf:givenName "John" ;
>>>>> foaf:familyName "Johnson".
>>>>> 
>>>>> <persons/tom-thompson#this> a foaf:Person, bld:Author ;
>>>>> foaf:givenName "Tom" ;
>>>>> foaf:familyName "Thompson".
>>>>> 
>>>>> 
>>>>> Both examples contain the same information, but it is encoded very
>>>>> differently. Clearly the Linked Data style is preferred, and the MARC
>>>>> vocabulary could in theory go away when there are no more legacy MARC
>>>>> systems to support.
>>>>> 
>>>>> I haven't seen any actual MARC data, but if someone has a simple
>>>>> example, we could work on that.
>>>>> 
>>>>> Martynas
>>>>> 
>>>>> 
>>>>> On Sat, Jan 31, 2015 at 4:21 AM, Jeff Young <[log in to unmask]>
>>>>> wrote:
>>>>>> Tim,
>>>>>> 
>>>>>> The semantics behind MARC is based on reality. MARC cares (maybe) too much
>>>>>> about which names and codes should be used in various structural positions,
>>>>>> but there are real things lurking behind those.
>>>>>> 
>>>>>> Jeff
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Jan 30, 2015, at 9:58 PM, Tim Thompson <[log in to unmask]> wrote:
>>>>>> 
>>>>>> Karen,
>>>>>> 
>>>>>> Aren't the semantics behind MARC just the semantics of card catalogs and
>>>>>> ISBD, with its nine areas of bibliographic description? ISBD has already
>>>>>> been published by IFLA as a linked data vocabulary
>>>>>> (http://metadataregistry.org/schema/show/id/25.html)--although, sadly,
>>>>>> they left out the punctuation ;-)
>>>>>> 
>>>>>> Tim
>>>>>> 
>>>>>> --
>>>>>> Tim A. Thompson
>>>>>> Metadata Librarian (Spanish/Portuguese Specialty)
>>>>>> Princeton University Library
>>>>>> 
>>>>>> On Fri, Jan 30, 2015 at 9:01 PM, Young,Jeff (OR) <[log in to unmask]>
>>>>>> wrote:
>>>>>>> 
>>>>>>> What if it was two different vocabularies, rather than two different
>>>>>>> levels of abstraction?
>>>>>>> 
>>>>>>> There is only one reality. A rose by any other name would smell as
>>>>>>> sweet. :-)
>>>>>>> 
>>>>>>> Jeff
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jan 30, 2015, at 8:02 PM, Martynas Jusevičius <[log in to unmask]> wrote:
>>>>>>>> 
>>>>>>>> Karen,
>>>>>>>> 
>>>>>>>> let's call those specifications BM (BIBFRAME MARC) and BLD (BIBFRAME
>>>>>>>> Linked Data).
>>>>>>>> 
>>>>>>>> What I meant is two different levels of abstraction, each with its
>>>>>>>> own vocabulary and semantics. And a mapping between the two, for
>>>>>>>> which SPARQL would be really convenient.
>>>>>>>> 
>>>>>>>> In the 2-tier approach, these are the main tasks:
>>>>>>>> 1. convert MARC data to RDF at the syntax level (BM)
>>>>>>>> 2. design semantically correct bibliographic Linked Data structure (BLD)
>>>>>>>> 3. define a mapping from BM to BLD
>>>>>>>> 
>>>>>>>> So in that sense I don't think it is similar to profiles, as profiles
>>>>>>>> deal with a subset of properties, but they still come from the same
>>>>>>>> vocabulary.
>>>>>>>> 
>>>>>>>> A somewhat similar approach is W3C work on relational databases:
>>>>>>>> 1. direct mapping to RDF: http://www.w3.org/TR/rdb-direct-mapping/
>>>>>>>> 2. customizable declarative mapping to RDF:
>>>>>>>> http://www.w3.org/TR/r2rml/
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Martynas
>>>>>>>> graphityhq.com
>>>>>>>> 
>>>>>>>>> On Fri, Jan 30, 2015 at 10:15 PM, Karen Coyle <[log in to unmask]>
>>>>>>>>> wrote:
>>>>>>>>> Martynas,
>>>>>>>>> 
>>>>>>>>> I agree that the requirement to accommodate legacy MARC is a hindrance
>>>>>>>>> to the development of a more forward-looking RDF vocabulary. I think
>>>>>>>>> that your suggestion of using SPARQL CONSTRUCT queries is not unlike the
>>>>>>>>> concepts of selected views or application profiles -- where you work
>>>>>>>>> with different subsets of a fuller data store, based on need.
>>>>>>>>> 
>>>>>>>>> I wonder, however, how an RDF model designed "from scratch" would
>>>>>>>>> interact with a model designed to replicate MARC. I know that people
>>>>>>>>> find this to be way too far out there, but I honestly don't see how
>>>>>>>>> we'll get to "real" RDF if we hang on not only to MARC but to the
>>>>>>>>> cataloging rules we have today (including RDA). We'd have to start
>>>>>>>>> creating natively RDF data, and until we understand what that means
>>>>>>>>> without burdening ourselves with pre-RDF cataloging concepts, it's hard
>>>>>>>>> to know what that means.
>>>>>>>>> 
>>>>>>>>> All that to say that I would love to see a test implementation of
>>>>>>>>> your idea!
>>>>>>>>> 
>>>>>>>>> kc
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 1/30/15 9:03 AM, Martynas Jusevičius wrote:
>>>>>>>>> 
>>>>>>>>> Hey,
>>>>>>>>> 
>>>>>>>>> after following discussions and developments in the BIBFRAME space,
>>>>>>>>> it seems to me that it tries to be too many things for too many people.
>>>>>>>>> 
>>>>>>>>> I think many of the problems stem from the fact that (to my
>>>>>>>>> understanding) BIBFRAME is supposed to accommodate legacy MARC data
>>>>>>>>> and be the next-generation solution for bibliographic Linked Data.
>>>>>>>>> Attempting to address both cases, it fails to address either of them
>>>>>>>>> well.
>>>>>>>>> 
>>>>>>>>> In my opinion, a possible solution could be to have 2 tiers of RDF
>>>>>>>>> vocabularies:
>>>>>>>>> - a lower-level one that precisely captures the semantics of MARC
>>>>>>>>> - a higher-level one that is designed from scratch for bibliographic
>>>>>>>>> Linked Data
>>>>>>>>> 
>>>>>>>>> The conversion between the two (or at least from the lower to the
>>>>>>>>> higher level) could be expressed simply as SPARQL CONSTRUCT queries.
>>>>>>>>> 
>>>>>>>>> Any thoughts?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Martynas
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Karen Coyle
>>>>>>>>> [log in to unmask] http://kcoyle.net
>>>>>>>>> m: +1-510-435-8234
>>>>>>>>> skype: kcoylenet/+1-510-984-3600
>>>> 
>>>> 
