On 4/2/2015 7:52 PM, Kelley McGrath wrote:
> We will have to agree to disagree. It may be*easier* to get information out of MARCXML, but you can't get*more* out of MARCXML than out of binary MARC. If you add things into a MARCXML record that won't go back into binary MARC then it may be XML, but it isn't MARCXML anymore. MARCXML gives you a wider variety of tools that you can use to interact with the data, but it doesn't address any other limitations of MARC.
>
> My problem is that I want information that isn't easy to get at. Or at least I can't find an easy way to do it. Can you tell me (via algorithm and not human eyes) whether the contents of 245$b are a subtitle, a parallel title or title(s) that are part of a resource without a collective title? Can you find the titles that are hidden in 245$c?
That hasn't been my argument. My argument is that you can't get more out
of RDF/Bibframe than you can out of MARCXML, so long as you *know the
schema.* This means that if a developer understands the complexities of
MARC, they can do just as much with MARCXML as they can with
RDF/Bibframe. Most developers do not understand the complexities of
MARCXML, but they will be able to use Bibframe, which is supposed to be
more accessible to non-library developers.
Concerning your question about various types of titles, parallel,
subtitle, etc., if the information is not encoded separately, it cannot
be extracted separately, at least not without some additional work. To
see this in action, I took a MARCXML record with a parallel title
http://lccn.loc.gov/94009455/marcxml and put it into Bibframe. Here is
the 245 from the MARCXML:
<datafield tag="245" ind1="1" ind2="0">
<subfield code="a">Eleven short stories =</subfield>
<subfield code="b">Undici novelle /</subfield>
<subfield code="c">Luigi Pirandello ; translated and edited by Stanley
Appelbaum.</subfield>
</datafield>
and the parallel title is coded in a 740, not as a 246 11 (which would
display the parallel title note):
<datafield tag="740" ind1="0" ind2=" ">
<subfield code="a">Undici novelle.</subfield>
</datafield>
What came out of the Bibframe conversion was (look at the bottom line):
<bf:Title
rdf:about="http://bibframe.org/resources/YUI1428045287/1123585title29">
<bf:titleValue>Eleven short stories </bf:titleValue>
<bf:subtitle>Undici novelle </bf:subtitle></bf:Title>
<bf:Title
rdf:about="http://bibframe.org/resources/YUI1428045287/1123585title30">
<bf:titleValue>Undici novelle
</bf:titleValue><bf:titleType>parallel</bf:titleType></bf:Title>
I saw that it did come out labelled as a parallel title. I wondered why,
so I checked the Bibframe repository on GitHub and found this
https://github.com/lcnetdev/marc2bibframe/blob/master/modules/module.MBIB-2-BIBFRAME-Shared.xqy.
In lines 3564+ (I won't copy the code), we see that it is digging out
the equals sign, which denotes the parallel title, from the MARCXML 245
field, e.g.
($d/marcxml:subfield[@code="a"]),"=")
and if it finds it, then adds
element bf:titleType {"parallel"}
which outputs as
<bf:titleType>parallel</bf:titleType>
It looks like it works with the $b and $c too but I haven't tested it.
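The same punctuation test can be sketched outside of XQuery. The following is a minimal Python illustration, not the marc2bibframe code: the function name classify_245 and the returned keys are hypothetical, the 245 field is the one quoted above with the MARCXML namespace left off for brevity, and only the "=" (parallel title) and ":" (other title information) cases of ISBD punctuation are handled.

```python
import xml.etree.ElementTree as ET

# The 245 field quoted in this message. A real MARCXML record would carry
# the MARC21 slim namespace; it is omitted here to keep the sketch short.
FIELD_245 = """
<datafield tag="245" ind1="1" ind2="0">
  <subfield code="a">Eleven short stories =</subfield>
  <subfield code="b">Undici novelle /</subfield>
  <subfield code="c">Luigi Pirandello ; translated and edited by Stanley Appelbaum.</subfield>
</datafield>
"""


def classify_245(field_xml):
    """Guess what 245$b holds from the ISBD punctuation that ends 245$a.

    Hypothetical helper: the result keys are illustrative, not a standard.
    """
    field = ET.fromstring(field_xml)
    # Keep the last occurrence of each subfield code (enough for a sketch).
    subs = {sf.get("code"): (sf.text or "").strip()
            for sf in field.findall("subfield")}
    a = subs.get("a", "")
    b = subs.get("b", "")

    if a.endswith("="):
        b_type = "parallel"   # "=" introduces a parallel title
    elif a.endswith(":"):
        b_type = "subtitle"   # ":" introduces other title information
    else:
        b_type = "unknown"    # anything else is left unclassified

    return {
        "title": a.rstrip(" =:/").strip(),
        "b": b.rstrip(" /").strip(),
        "b_type": b_type,
    }


result = classify_245(FIELD_245)
print(result)
```

Titles buried in 245$c, or several titles without a collective title, would need further rules. The sketch only shows that the "=" convention is enough to recover the label for this record.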
All very nicely done, but as we see, it can be done (is done) with
MARCXML and is an example of something that is too difficult for a
non-librarian developer to do. After Bibframe, it will be easier for
others to take it, if they want it. While I see nothing wrong with this
and am all for it, it is also something that libraries could have been
working with for a long time, since the beginning of MARCXML. You don't
have to have RDF if you know the schema. This illustrates my point.
From another viewpoint, it is also important to realize that this
represents no *additional access* for the user from what they have
always had, because the catalogers have made added title entries (246,
7xx$t, 730, 740) for all of these types of titles. We saw it in this
record which used the older practice of 740. Therefore, the computer
processing that finds the equals sign (=) does *not* create additional
access because people have always been able to find it by searching the
extra title supplied by the cataloger. What the processing actually does
is translate the librarian's secret language (=) into the words
"parallel title".
I think it is worthwhile at this point to step back a moment and
reconsider: all of this is for the user, or in other words,
non-librarians. How do they understand "parallel title"? "Parallel
title" is a very library concept, and not even all libraries have the
concept of a "parallel title". For instance, the AGRIS model has
different titles for different languages, English title, French title,
Spanish title, etc. but not the precise concept of a "parallel title".
For an example of how it is treated, see http://bit.ly/1xHtSoS where one
item has three parallel titles, handled as three equal language titles.
In this sense, the AGRIS model has been more exact than ISBD-type
practices and I think that for a non-librarian, the AGRIS-type practice
is much more understandable than the more abstract ISBD concept of
"parallel title". I have no idea how a web developer, who may not have
stepped into a library for the last decade or so, would understand terms
such as "parallel title", "alternative title", "running title", much less
"work title", "expression title", and so on. Lots of librarians don't
understand the differences.
To consider further, does the public need these various titles labelled
so precisely, or do they just need the titles themselves? It seems to me
that the vast majority of searchers don't understand these distinctions
and anyway, they don't need these distinctions to find the information
they need. They just need assurance that the titles really are entered
into the record so that they can be found. Who cares if it's a parallel,
alternative, spine or whatever title? Catalogers care. A lot, and for
all kinds of reasons, but others, hmmm....
These are some of the issues that I have been hoping would be discussed,
but it would take broad participation--not just catalogers and IT
people, but public services especially, and regular users. People in
other bibliographic endeavors who have different bibliographic concepts
should be included. After all, everything is supposed to be linked now.
That includes their stuff. It will take many groups to find out what the
public(s) really need and want.
This is not the correct list for discussing these considerations, but
they should be discussed somewhere.
James Weinheimer
First Thus http://blog.jweinheimer.net
First Thus Facebook Page https://www.facebook.com/FirstThus
Personal Facebook Page https://www.facebook.com/james.weinheimer.35
Google+ https://plus.google.com/u/0/+JamesWeinheimer
Cooperative Cataloging Rules
http://sites.google.com/site/opencatalogingrules/
Cataloging Matters Podcasts
http://blog.jweinheimer.net/cataloging-matters-podcasts
The Library Herald http://libnews.jweinheimer.net/