"And with linked data, I am very skeptical about the usefulness of mixing
content data with our directional data"

I often hear that mixing content data (full text) with directional data
(which I understand to mean descriptions of full text) leads to bad results.

The question is why Google started a Google Book Scan project and invested
many millions of dollars instead of relying merely on the catalog data that
librarians compiled for over a hundred years and handed over to them. Google
has had access to library catalogs, and it seems the catalogs are in such
bad shape for the Web that they do not appear in Google's services to this
day. Google is just an example.

We know about the importance of text & data mining (TDM). Librarians want
to have the right to unrestricted TDM. Only "full text" can give the full
scope for contextualization about what information is stored in libraries.
Without context, information is useless. The challenge is that today's
catalogs still follow the model of ancient Callimachus' pinakes, codes for
inventory lists, designed for librarians, often lacking contextual
information and public exposure. They are good for scholars and experts who
need to identify items, but not so good for patrons who are looking for
extra knowledge and services through links between library items and the
Web. With RDF, all kinds of
statements / assertions can be recorded: assertions about things in the
full text, or in the descriptions, or things on the Web, or descriptions
about things on the Web.
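
To make this concrete, here is a minimal sketch of how such statements might sit side by side in one graph. It uses plain Python tuples rather than a real RDF library, and every URI and property name (everything under example.org, "describes", "mentions", and so on) is invented for illustration and is not taken from any actual vocabulary or catalog.

```python
# Minimal sketch: RDF-style statements as (subject, predicate, object) tuples.
# All URIs and property names below are hypothetical, for illustration only.
EX = "http://example.org/"

triples = [
    # Directional data: a catalog record describing a book.
    (EX + "record/123", EX + "describes", EX + "book/123"),
    (EX + "book/123", EX + "subject", EX + "topic/WarOfTheSpanishSuccession"),
    # Content data: an assertion about something inside the full text.
    (EX + "book/123#chapter2", EX + "mentions", EX + "person/PhilipV"),
    # A link from a library topic to a thing on the Web.
    (EX + "topic/WarOfTheSpanishSuccession", EX + "seeAlso",
     "http://example.com/some-web-page"),
]


def statements_about(subject_prefix, graph):
    """Return all statements whose subject starts with the given prefix."""
    return [t for t in graph if t[0].startswith(subject_prefix)]


# Both the catalog-level subject statement and the full-text statement
# are found with the same query, because they share one graph.
book_statements = statements_about(EX + "book/123", triples)
for s, p, o in book_statements:
    print(s, p, o)
```

The point of the sketch is only that descriptions and full-text assertions use the same statement form, so a single query can reach both.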

Jörg



On Sun, Mar 8, 2015 at 8:07 PM, James Weinheimer <[log in to unmask]> wrote:

> On 3/7/2015 9:37 PM, Martynas Jusevičius wrote:
>
>> I find these statements hard to believe. Data is just data. Data,
>> metadata - there is no difference.
>>
>> People are using RDF to describe proteins, semiconductor products,
>> horoscope signs, antique coins and who knows what else. What makes you
>> think libraries are special? Again, I mean real technical limitations
>> -- all the history and the "traditional ways of doing things" are
>> irrelevant here.
>>
>
> There are different types of data, and we experience it in all kinds of
> ways every day. I have gone into greater detail in those podcasts and
> presentations I mentioned, but I'll try to redo a little of it here. The
> differences are subtle, but clear.
>
> Before I begin however, what you have claimed to be history, and
> traditional ways of doing things, is not history at all. Whether we like it
> or not, what I described is the way libraries still work. It is what users
> are supposed to do when they use a library, and if people don't do it, they
> will get bad results. Of course, few people do it and this explains a lot
> of the frustration that people currently have with library catalogs.
>
> The solution that libraries have tried is called "information literacy"
> and "bibliographic instruction" which, instead of fixing library tools to
> work in a modern environment, means to teach everybody how to use our tools
> the way they are. In my own opinion, this hasn't worked and everything
> needs to be rethought, but what I described is not history--unfortunately
> it is still happening today.
>
> About catalog data, it isn't that it is special, but it is different from
> the other types of data that you point out. When someone comes to a
> library, they don't come specifically to search the catalog (or at least,
> those that do are exceedingly rare). Instead, the vast majority are there
> because they have a question and want information. My example has been
> "What were the causes of the War of the Spanish Succession". The catalog
> does not contain the information I want--the information that can answer my
> question is contained in the books, journal articles, and other materials
> in the collection--but if I use the catalog correctly, it can direct me to
> the resources that have the information I want. In this way, the
> information found in a catalog is similar to information found on ...
> traffic signs.
>
> If you want to drive from Rome to Paris, you need signs to help you get
> there. The better the signs, the better, the easier, and the more enjoyable
> the trip. Poor signs, or the absence of them (which happens in Italy all
> the time), can lead to frustration, anger or even disasters.
>
> So, people want and need decent and reliable road signs, but they are very
> rarely interested in the signs themselves: who made them, where and when,
> what materials they are composed of and so on. Still, those in charge of
> the road signs need to know that information, so that they can replace
> them, update them, add to them, etc.
>
> Using this same reasoning with catalogs and how things are changing,
> compare this with the person who is interested in the "War of the Spanish
> Succession" and searches the library catalog. They can sit there, quite
> literally, all day long and not learn anything about the War. All
> they see are *catalog records* and if they are to learn about the war
> itself, they need to get into the books of the collection. But when they
> search Google, in just a half-an-hour they have gotten some real
> information. This leads them to expect that library tools will work
> similarly to what works (apparently) so easily and simply on Google, which
> seems logical but is completely wrong.
>
> Google works with a different type of information: content; library
> catalogs work by giving people directional information: so even when the
> searcher does everything correctly, all they see are directions: for
> general books on the War, look here; for books on the politics, look there;
> for battles, look here, etc.
>
> Those who use catalogs incorrectly are practically doomed to disaster. For
> them it is similar to a driver who hasn't seen a road sign for hours, and
> ends up at the end of a road in the middle of a field at midnight.
>
> Believe me, this happens to students all the time when they are
> researching their papers at the last minute! Both end up in tears and/or
> almost screaming.
>
> Catalogers see this difference in information clearly because they work
> with the actual materials that people want: the books, the recordings, the
> maps, etc. all go through their hands. The mistake that many catalogers
> make (again in my opinion) is that they believe people, who care about the
> information in the collection (i.e. who want to learn about The War of the
> Spanish Succession), also care about the catalog records they make. Of
> course for the public, these records are the equivalent of road signs that
> help them get where they want to go. They don't care about the road signs
> and once they reach their destination, they completely forget about all the
> helpful road signs. I confess I remember only the frustrations and anger
> during the trips that had lousy signs. I think the same thing happens with
> catalog records.
>
> While our methods still "work" in a sense, they are strange for people in
> the 21st century. They need to be, in a sense, translated so that they work
> in today's environment.
>
> So, all data is definitely not equal. I think there is still a need for
> our type of data but it needs to be reconsidered. Tools that work well for
> content data, don't work so well with directional data. And with linked
> data, I am very skeptical about the usefulness of mixing content data with
> our directional data. Nevertheless, we should try it, to find out what
> happens. I would be very happy to be proven wrong.
>
> There are other options, too.
>
>
> James Weinheimer [log in to unmask]
> First Thus http://blog.jweinheimer.net
> First Thus Facebook Page https://www.facebook.com/FirstThus
> Personal Facebook Page https://www.facebook.com/james.weinheimer.35
> Google+ https://plus.google.com/u/0/+JamesWeinheimer
> Cooperative Cataloging Rules http://sites.google.com/site/opencatalogingrules/
> Cataloging Matters Podcasts http://blog.jweinheimer.net/cataloging-matters-podcasts
> The Library Herald http://libnews.jweinheimer.net/
>