First of all, very well said, Roy!

I think we need to quickly get past the MARC21 conversion conversation 
and immediately start to understand what metadata management looks like 
when we have a clear focus on systems that serve the user's convenience 
in finding things that match their needs.  Let's stop the obsession with 
this or that format and recognize that we are going to have to 
experiment with lots of expressions of the data for different consumers 
we care about.

That doesn't contradict Roy's emphasis on entities; on the contrary, it 
supports it--let's entify the data and then figure out the right 
channels to most effectively expose it.

I think a focus on this answers most of James' questions below.  I don't 
think there is a magical linked library catalog with lots of flowing 
connections from print to video to data sets--that's just incremental 
improvement on the tool that isn't used much anymore for discovery.

Building our data in such a way that it can be consumed on the web seems 
like the most useful way to get the most value out of library 
collections.  We should assume that for the moment discovery happens in 
social networks and on the web.  The library catalog will remain a 
useful source for harvesting data and the system of record for 
inventory--so it can continue in its role of satisfying offers.
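
As a minimal sketch of what "consumed on the web" can mean in practice, 
here is one catalog item expressed as schema.org JSON-LD, the vocabulary 
the major search engines harvest. The item URL and values below are 
invented for illustration:

  import json

  # A schema.org description of one catalog item, ready to embed in a
  # <script type="application/ld+json"> block on the item's web page.
  item = {
      "@context": "https://schema.org",
      "@type": "Book",
      "@id": "http://example.org/catalog/item/1",  # hypothetical URL
      "name": "Moby-Dick",
      "author": {"@type": "Person", "name": "Melville, Herman"},
  }
  print(json.dumps(item, indent=2))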

If we can expose library collection data appropriately library users 
will continue to use their preferred search engines and the catalog can 
continue to be an agent for fulfillment.  These are legitimate roles, 
but we can forget about building magical catalogs that will bring back 
unicorns to our midst.  ;-)

-Ted

> James Weinheimer <mailto:[log in to unmask]>
> February 22, 2016 at 8:33 AM
> On 2/19/2016 10:15 PM, Tennant,Roy wrote:
>> You created a plausible outline that I'm afraid is missing a rather 
>> large and important step. For the lack of a better term I'll call it 
>> "entification," which is what we call it around here.
> ...
>> I get the sense sometimes that the library community doesn't fully 
>> grasp the nature of this transition yet, and it worries me. We need 
>> to shake off the shackles of our record-based thinking and think in 
>> terms of an interlinked Bibliographic Graph. As long as we keep 
>> talking about translating records from one format to another we 
>> simply don't understand the meaning of linked data and both the 
>> transformative potential it has for our workflows and user interfaces 
>> as well as the plain difficult and time consuming work that will be 
>> required to get us there.
>>
>> Sure, we at OCLC are a long way down a road that should do a lot to 
>> help our member libraries make the transition, but there will be 
>> plenty of work to go around. The sooner we fully grasp what that work 
>> will be, the better off we will all be in this grand transition. No, 
>> let's call it what it really is: a bibliographic revolution. Before 
>> this is over there will be broken furniture and blood on the floor. 
>> But at least we will be free of the tyrant.
>
> I completely agree that the library community doesn't fully grasp the 
> nature of the transition. We are only at the beginning of a "long, 
> strange trip"--and the resources of some libraries (and librarians 
> themselves!) are almost exhausted already.
>
> All of this in the pursuit of a highly abstract goal: an interlinked 
> bibliographic graph. I haven't come across that term before, but I 
> guess it is a take on the "Giant Global Graph" of Tim Berners-Lee that 
> many people consider to be the ultimate goal of linked data. To 
> achieve this goal of an interlinked bibliographic graph, we see that 
> much will have to be sacrificed, but the revolution will be worthwhile 
> because we will be free of the "tyrant". Once again, I am not sure 
> precisely what you mean here, but I assume the tyrant is the MARC 
> record, which is a "unified bibliographic record" that contains all of 
> the information for a bibliographic item.  (I prefer to call it the 
> "unit record" or the traditional catalog card, which was made to deal 
> with the 19th-century transition from the earlier book catalogs, which 
> were structured quite differently.)
>
> The unified bibliographic record found in MARC must undergo 
> "entification," which again, I assume means to turn as much as 
> possible of the current, unified bibliographic record into entities, 
> i.e. URIs, that in turn can be linked to--by anyone, I guess. (That 
> is, if it is to be linked OPEN data; linked closed data is an entirely 
> different matter.) In any case, if all this is done, I completely agree 
> that the data that is now in our bibliographic records will become 
> almost infinitely flexible.
>
> There are a few questions of course. Chief among them, the obvious one:
>
> 1) Is this what libraries signed up for? What will be the final costs 
> in terms of budgets, careers, redoing so much yet again? And how long 
> will it take?
>
> 2) It remains to be seen whether any of this is what the public wants. 
> I guess I'm just an old-fashioned kind of guy, or maybe just naive, 
> but it seems to me that when people come to a library (either 
> virtually or physically) they come to use the items in the collection, 
> and not to use the catalog. In other words, people do not come to a 
> library, or the library's website, just to look up something in the 
> catalog and then.... go home. They use the catalog to get into the 
> materials in the library's collection. If they already know what they 
> want and where it is, they ignore the catalog. (Maybe they shouldn't, 
> but they do.)
>
> The best catalogs are those that I can use as quickly and as easily as 
> possible so that I can spend the least amount of time with the catalog 
> and spend the most amount of time in the items I find in the 
> collection. This is why I personally prefer Google. It is not that I 
> spend a great deal of time on Google, but paradoxically, I spend the 
> *least* amount of time there compared to the other search engines. 
> That's why I prefer it.
>
> So, even if we make the "100% entified, interlinked bibliographic 
> graph tool" that brings in information from hither and yon, that gives 
> me charts from the IMF and images from Flickr, videos from YouTube, 
> the latest news from Bing, plus, of course, all the Wikipedia info, 
> along with the library materials--and I'll assume here that it will 
> even be on the specific topics I want--that might be great. Pardon my 
> skepticism: I think lots of people would still like to see it in 
> action before concluding that it really is great.
>
> It may be that the idea is to get rid of or replace the catalog 
> completely, but I think the public will continue to demand a quick and 
> easy-to-use list to get into the materials in a library's collection. 
> The proposed linked data tools do not provide this; they only add 
> complexity to the catalog by adding more and more stuff into a search 
> result. It seems to me that we can entify things until Doomsday and it 
> still won't make it one bit easier for the public to find materials in 
> library collections.
>
> The problem is: our catalogs have never been easy to use, and they 
> became even worse when they went online with keyword searching. There 
> are tons of problems, and those issues have yet to be addressed. But 
> just because the public doesn't like to use library catalogs doesn't 
> mean that they do not want a "listing of materials" in the collection 
> they are using. And that list should be made as simple to use as 
> possible. Such a listing is also called a catalog. A lot could be done 
> to make it easier to use than it is today. But nobody seems to be 
> talking about that.
>
> But maybe I'm wrong. Maybe the public doesn't want an easy-to-use 
> listing of materials in a library's collection. Like I said, maybe I'm 
> just an old-fashioned kind of guy, or just naive.
>
> James Weinheimer [log in to unmask]
> First Thus http://blog.jweinheimer.net
> First Thus Facebook Page https://www.facebook.com/FirstThus
> Personal Facebook Page https://www.facebook.com/james.weinheimer.35
> Google+ https://plus.google.com/u/0/+JamesWeinheimer
> Cooperative Cataloging Rules 
> http://sites.google.com/site/opencatalogingrules/
> Cataloging Matters Podcasts 
> http://blog.jweinheimer.net/cataloging-matters-podcasts
> The Library Herald http://libnews.jweinheimer.net/
>
> Tennant,Roy <mailto:[log in to unmask]>
> February 19, 2016 at 4:15 PM
> Eric,
> You created a plausible outline that I'm afraid is missing a rather 
> large and important step. For the lack of a better term I'll call it 
> "entification," which is what we call it around here. This might 
> encompass the creation of your own linked data entities or the use of 
> those created by others (such as, dare I say it, OCLC). In other 
> words, Step 5 is deceptively simple when in fact it is devilishly 
> complex.
>
> We witnessed this recently when we took a look at some BIBFRAME 
> records produced by a large research university and they were punting 
> on the entification. That is, by simply taking records in MARC and 
> translating them to BIBFRAME in a one-to-one operation, you are 
> basically left with a BIBFRAME record that really isn't linked data at 
> all. You have assertions that are basically meaningless, as they link 
> to nothing and nothing links to them. How many URIs do you think 
> Washington, DC should have? I would argue one, at the very least 
> within your own dataset, but that isn't what you end up with without 
> taking a great deal of time and trouble to do the entification step -- 
> whether using your own data or reconciling your data against someone 
> else's entities, such as LCSH.
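>
> To make the "one URI" point concrete, here is a rough sketch of that 
> reconciliation step in Python. It is untested, and the details of the 
> id.loc.gov known-label lookup (the URL pattern and the X-URI response 
> header) are an assumption, so verify them before relying on this:
>
>   import requests
>
>   def reconcile_subject(label):
>       # Ask id.loc.gov whether it knows this exact heading. If it
>       # does, the service should answer with the matched concept URI
>       # in an X-URI header; None means no match, i.e. the heading
>       # needs cleanup before it can become an entity.
>       response = requests.head(
>           "https://id.loc.gov/authorities/subjects/label/" + label,
>           allow_redirects=True,
>       )
>       return response.headers.get("X-URI")
>
>   # Ideally every record's form of the heading resolves to one URI:
>   print(reconcile_subject("Washington (D.C.)"))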
>
> I get the sense sometimes that the library community doesn't fully 
> grasp the nature of this transition yet, and it worries me. We need to 
> shake off the shackles of our record-based thinking and think in terms 
> of an interlinked Bibliographic Graph. As long as we keep talking 
> about translating records from one format to another we simply don't 
> understand the meaning of linked data and both the transformative 
> potential it has for our workflows and user interfaces as well as the 
> plain difficult and time consuming work that will be required to get 
> us there.
>
> Sure, we at OCLC are a long way down a road that should do a lot to 
> help our member libraries make the transition, but there will be 
> plenty of work to go around. The sooner we fully grasp what that work 
> will be, the better off we will all be in this grand transition. No, 
> let's call it what it really is: a bibliographic revolution. Before 
> this is over there will be broken furniture and blood on the floor. 
> But at least we will be free of the tyrant.
> Roy Tennant
> OCLC Research
>
>
>
>
> Eric Lease Morgan <mailto:[log in to unmask]>
> February 19, 2016 at 3:15 PM
>
>
> Very interesting. Thank you, and based on this input, I’ve outlined a 
> possible workflow for creating, maintaining, and exposing 
> bibliographic description in the form of BIBFRAME linked data:
>
> 1. Answer the questions, "What is bibliographic
> description, and how does it help facilitate the goals
> of librarianship?"
>
> 2. Understand the concepts of the Semantic Web,
> specifically, the ideas behind Linked Data.
>
> 3. Embrace & understand the strengths & weaknesses of
> BIBFRAME as a model for bibliographic description.
>
> 4. Design or identify and then install a system for
> creating, storing, and editing your bibliographic data.
> This will be some sort of database application whether
> it be based on SQL, NoSQL, XML, or a triple store. It
> might even be your existing integrated library system.
>
> 5. Using the database system, create, store, import/edit
> your bibliographic descriptions. For example, you might
> simply use your existing integrated library system for these
> purposes, or you might transform your MARC data into
> BIBFRAME and pour the result into a triple store.
>
> 6. Expose your bibliographic description as Linked Data
> by writing a report against the database system. This
> might be as simple as configuring your triple store, or
> as complicated as converting MARC/AACR2 from your
> integrated library system to BIBFRAME.
>
> 7. Facilitate the discovery process, ideally through
> the use of a triple store/SPARQL combination, or
> alternatively, directly against the integrated library
> system.
>
> 8. Go to Step #5 on a daily basis.
>
> 9. Go to Step #1 on an annual basis.
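>
> As a rough sketch of Steps #5 through #7 -- using rdflib as the 
> triple store and waving away the genuinely hard MARC-to-BIBFRAME 
> mapping -- something like the following, where the URIs and the 
> title are invented:
>
>   from rdflib import Graph, Literal, Namespace, URIRef
>
>   BF = Namespace("http://id.loc.gov/ontologies/bibframe/")
>
>   # Step #5: pour (hand-waved) BIBFRAME triples into a store. Note
>   # that real BIBFRAME models titles as bf:Title resources; a plain
>   # literal is used here only to keep the sketch short.
>   g = Graph()
>   work = URIRef("http://example.org/work/1")  # locally minted URI
>   g.add((work, BF.title, Literal("Moby-Dick")))
>
>   # Step #6: expose the data; serializing is the simplest "report".
>   print(g.serialize(format="turtle"))
>
>   # Step #7: facilitate discovery with SPARQL.
>   query = "SELECT ?w WHERE { ?w bf:title ?t }"
>   for row in g.query(query, initNs={"bf": BF}):
>       print(row.w)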
>
> If the profession continues to use its existing integrated library 
> systems for maintaining bibliographic data (Step #4), then the hard 
> problem to solve is transforming and exposing the bibliographic data 
> as linked data in the form of BIBFRAME. If the profession designs a 
> storage and maintenance system rooted in BIBFRAME to begin with, then 
> the problem is accurately converting existing data into BIBFRAME and 
> then designing mechanisms for creating/editing the data. I suppose the 
> latter option is “better”, but the former option is more feasible and 
> requires less retooling.
>
> —
> Eric Lease Morgan
> Joy Nelson <mailto:[log in to unmask]>
> February 19, 2016 at 1:12 PM
> Eric-
> I am starting to explore this same issue.  It seems that there are two 
> (probably more) 'road humps' in the process of moving data from 
> MARC/MARCXML to RDF triples in BIBFRAME.  The first is the idea of 
> garbage in/garbage out.  If the data isn't clean to begin with, the 
> transformation to triples will fail (values will remain as literals, 
> not URIs).  The first step in the process should probably involve 
> cleaning the data.
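>
> A quick way to measure the garbage, assuming the conversion output 
> can be loaded with rdflib (the filename and the property list below 
> are invented for illustration):
>
>   from rdflib import Graph, Literal, Namespace
>
>   BF = Namespace("http://id.loc.gov/ontologies/bibframe/")
>
>   g = Graph().parse("converted-bibframe.rdf")  # hypothetical output
>
>   # Properties that ought to point at entities, not strings.
>   for prop in (BF.agent, BF.subject):
>       stuck = [o for o in g.objects(None, prop)
>                if isinstance(o, Literal)]
>       print(prop, "->", len(stuck), "values stuck as literals")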
>
> Secondly, there is the issue of which URIs to use.  Will your system 
> create its own URIs to use for works?  Or will you reference an 
> existing URI at LOC?  The second benefits the LOC with 'pingback' but 
> doesn't benefit your institution.  It seems that you would be creating 
> your own URIs for works and instances and using existing URIs for 
> things like authors, publishers, etc.  But the choice of URIs will be 
> an issue.
>
> In our system we store MARC as MARC and as MARCXML.  In my initial 
> thoughts on this process, I'm wondering if the system just needs to 
> become more 'agnostic' about the data format.  If I provide BIBFRAME 
> in RDF/XML, then the system should be able to pull out the bits it 
> needs for display.  We would need some logic in the inner workings to 
> deal with various types of XML data.  And using an indexer on the 
> system that can handle various XML formats would help in searching by 
> users.  (I'm thinking Elasticsearch here.)  Right now I tend to think 
> of the BIBFRAME descriptions as distinct units that would be similar 
> to a MARCXML record.  It is conceivable that there would be an 
> additional layer on top that would store ALL the triples and use some 
> kind of SPARQL querying/searching.  I don't know about that yet.  An 
> ILS needs a relational database structure since it is transactional.  
> But...could there be a component that is a graph database?
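>
> For the indexing piece, the sketch I have in mind (untested; the 
> index name, document ID, and field choices are invented) is to 
> flatten whatever XML arrives into a small JSON document and hand it 
> to Elasticsearch over plain HTTP:
>
>   import json
>   import urllib.request
>   import xml.etree.ElementTree as ET
>
>   # Flatten one description -- MARCXML, BIBFRAME RDF/XML, whatever --
>   # into a single searchable text field.
>   tree = ET.parse("description.xml")  # hypothetical input file
>   text = " ".join(t.strip() for t in tree.getroot().itertext()
>                   if t.strip())
>
>   # PUT one JSON document per description into Elasticsearch.
>   request = urllib.request.Request(
>       "http://localhost:9200/bibframe/_doc/1",
>       data=json.dumps({"text": text}).encode("utf-8"),
>       headers={"Content-Type": "application/json"},
>       method="PUT",
>   )
>   print(urllib.request.urlopen(request).read())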
>
> All in all, it's a fun concept to play around with in my head. Where 
> it ends up or how it looks will be an interesting journey.
>
>
> Joy Nelson
> Director of Migrations
> ByWater Solutions <http://bywatersolutions.com>
>
>
>
>
>
>
> Eric Lease Morgan <mailto:[log in to unmask]>
> February 19, 2016 at 12:10 PM
>
>
> Use case? Hmm… Okay. Say I’m a library that is convinced that BIBFRAME 
> is the way to go. How might I get from where I am with my MARC/AACR2 
> data to a discovery system rooted in a triple store and a “kewl” 
> SPARQL front-end?
>
> Maybe I could put my question a different way. Of the folks who have 
> created sample implementations, what was the process used? [1] 
> Actually, I can (sort of) answer my own question by reading the 
> implementation descriptions. That said, I believe the process to 
> create and maintain new triples for new content is/will be a difficult 
> one.
>
> [1] sample implementations - 
> https://www.loc.gov/bibframe/implementation/register.html
>
> —
> Eric Morgan
> Still Lost In Philadelphia