I can't argue against the fact that linked data holds no immediate promise of making things easier to do with our data. There is a not-insignificant chance that migrating to linked data will result in no real net benefit. There's really no way to know right now.
However, I can say with some confidence that we're near the logical ceiling of what we can do with our current data and cataloging practices. You say that *librarians* understand the MARC codes as if there aren't a whole slew of really smart developers who also understand them (a whole lot of them are even librarians!): the issue isn't understanding the data; it's getting a computer to understand the data.
There are *some* ways to relate *some* of our data to external data: standard identifiers (LCCN, OCLC numbers, ISBN, ISSN, BNB, etc.), controlled headings (subjects, titles, names, etc.), geographic codes, language codes, etc., although much of this is merely linking to other library data. It also assumes that it's actually been entered in the first place or is correct (which, certainly, is a problem with *any* data format).
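To make that concrete, here is a minimal sketch in Python (standard library only) of pulling those identifier hooks out of a MARCXML record. The sample record, its values, and the function name are made up for illustration. The field/subfield pairs (010 $a, 020 $a, 022 $a, 035 $a) are the usual MARC 21 homes for these identifiers, but real records vary.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC 21 slim") documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# (tag, subfield code) pairs where a few standard identifiers usually live.
IDENTIFIER_FIELDS = {
    "lccn": ("010", "a"),
    "isbn": ("020", "a"),
    "issn": ("022", "a"),
    # OCLC numbers typically show up here with an "(OCoLC)" prefix.
    "system_control_number": ("035", "a"),
}

# Hypothetical record with invented values, for illustration only.
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="010" ind1=" " ind2=" ">
    <subfield code="a">  2001012345 </subfield>
  </datafield>
  <datafield tag="020" ind1=" " ind2=" ">
    <subfield code="a">9780000000002 (pbk.)</subfield>
  </datafield>
  <datafield tag="035" ind1=" " ind2=" ">
    <subfield code="a">(OCoLC)12345678</subfield>
  </datafield>
</record>"""


def extract_identifiers(marcxml):
    """Return {identifier name: [raw values]} for identifiers present in the record."""
    record = ET.fromstring(marcxml)
    found = {}
    for name, (tag, code) in IDENTIFIER_FIELDS.items():
        path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
        values = [
            sf.text.strip()
            for sf in record.findall(path, MARC_NS)
            if sf.text and sf.text.strip()
        ]
        if values:  # absent or empty fields are simply skipped
            found[name] = values
    return found


if __name__ == "__main__":
    for name, values in extract_identifiers(SAMPLE_RECORD).items():
        print(name, values)
```

Even here the values come back as free text (an ISBN with a trailing qualifier, an OCLC number wrapped in a prefix), so anything downstream still has to normalize them before they are usable as links, and a field that was never entered is simply missing.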
Your Lucene argument in favor of MARCXML is, unfortunately, not terribly compelling: first off, those systems rarely use MARCXML (they generally use binary MARC if they're indexing catalog data, since it's much smaller); they have to do a *ton* of pre- and post-processing to strip out the AACR2-isms; librarians are constantly railing against them for "dumbing down the catalog"; and, most importantly, what do they bring that literally any other modern search interface doesn't already bring?
Forget about grouping things by publisher or location of the subject material (unless it's specifically *about* a place). Forget about tying in enrichment data for things that aren't books with an ISBN. Forget about limiting to things that are used in particular course or department syllabi/reading lists.
None of this means we throw that data away or that it holds no value. It's just not enough on its own, and libraries simply don't have enough resources to make it do the things that we could get by linking it out into the larger world.
On Fri, Mar 6, 2015 at 1:47 PM James Weinheimer wrote:
Ross Singer wrote:
Counterpoint: if libraries can do "anything they want" with their data and
have had 40+ years to do so, why haven't they done anything new or
interesting with it for the past 20?
How, with my MARC records alone, do I let people know that they might be
interested in "Clueless" if they're looking at "Sense and Sensibility"? How
do I find every Raymond Carver short story in the collection? The albums
that Levon Helm contributed to? How can I find every introduction by Carl
Sagan? What do we have that cites them?
How, with my MARC records alone, can I definitively limit only to ebooks?
What has been published in the West Midlands?
You *could* make a 3-D day-glo print of a MARC record, I suppose - but that
seems like exactly the sort of tone deaf navel gazing that has rendered our
systems and interfaces more and more irrelevant to our users.
Why haven't libraries done anything new or interesting with our data for
the past 20 years? Is it because it has been *impossible* due to our
formats, even though we now have XML? You ask an excellent and important
question that I was hoping somebody would bring up. It deserves a
separate discussion. But first I want to emphasize: I am not saying that
we need to work with MARC records alone--never said that at all. What I
am saying is that for the library community, that is, the people who
already know and understand--and even control--MARC format, changing the
format they already control to Bibframe will not give them any new
capabilities over what they have been able to do with MARCXML.
*Librarians* understand the MARC codes and that means they can work with
MARCXML to fold in their records with what else exists on the Internet;
they can do that now, and they've been able to do it for a while.
Changing to Bibframe/RDF will not change anything for librarians, but it
will change matters for non-librarians who may want to use our data for
their purposes. Nevertheless, a *lot* of work will remain to be done. It
isn't like after we change to Bibframe, we can fly onto the deck of the
aircraft carrier festooned with banners that proclaim "Mission
Accomplished". It will only be the beginning of a vast amount of work
and expense. It seems to me to make sense to talk about that now.
So, if we can already do anything and haven't, the obvious question is:
why will anything change with Bibframe/RDF? Again, I stress: this
concerns *the library community*. Non-librarians will have new options
but there will not be any new capabilities for the library community.
Perhaps Bibframe will be a catalyst for change among librarians,
providing a needed kick-in-the-pants to get them to do something they
haven't until now. OK, I'd go along with that. But let's be fair and say
that it is just as possible that it won't. Going back to the reason why
we haven't done anything interesting in the last 20 years: maybe it's
money, maybe it's imagination, maybe it's proprietary catalogs, maybe
it's power.... I don't know, but there may be a whole host of other reasons.
Perhaps with Bibframe the non-librarian community will come riding to
the rescue and they will figure out what to do. We can hope.
I wrote that message on Autocat to combat the popular idea that the
reason libraries haven't done anything new or interesting is because of
the limitations of the format. That was true until MARCXML arrived and
then it became possible to do all sorts of new things. MARCXML may be
nasty and difficult to work with, but no matter: if somebody wants to,
it *can* be worked with *within the library community*. And people have
worked with it, such as we see in catalogs that utilize Lucene indexing
(which is based on MARCXML) to create the facets we see in different
library catalogs. (That is one thing that has been done in the last 20
years, and it is due to XML.)
I gave the example of printing day-glo colors merely to emphasize that
we can currently do anything we want right now, but of course, I was not
suggesting we should waste our time on that. I want to try to open
people's minds to what *can* be possible. *Anything* is a tremendous
concept that is difficult to grasp. Once we accept and begin to
comprehend the idea that "anything can be done," the question of what
would be better, or worse, uses of our labor and resources becomes far
more complex and takes on different subtleties. Those who believe that
the problems we have faced are because of the *format*, and that the
solution is therefore to get a "better format" that will solve things,
will be sadly disillusioned.
Finally, in answer to some other posts, I repeat once again that I am
FOR the library community's implementation of linked data but we need to
do it with our eyes open. I'll copy that part of my original message:
"I want again to emphasize that libraries should go into linked data,
but when we do so, there will probably be more question marks than
exclamation points. Just as when a couple is expecting a baby and they
experience pregnancy: at least when I experienced it, I imagined that
the birth of my son would be an end of the pregnancy. But suddenly, I
had a crying baby on my hands! Linked data will be similar: it will be a
beginning and not an end."
James Weinheimer
First Thus http://blog.jweinheimer.net
First Thus Facebook Page https://www.facebook.com/FirstThus
Cooperative Cataloging Rules http://sites.google.com/site/opencatalogingrules/
Cataloging Matters Podcasts http://blog.jweinheimer.net/cataloging-matters-podcasts [delay