Hi Steven!
Thanks for your excellent answer! This was really helpful.
Steven Michael Folsom wrote on 27.11.2017 at 22:39:
> I informally polled the PCC URIs in MARC Task Group, and those that responded agreed:
>
> - With the presence of a $t, the $0 should refer to a Work. (The converter is working as expected.*)
> o This would go for 700 $a$t and other fields as well.
That's what I suspected. Thanks for confirming.
> - There should not be a URI for the Author in that field as the author alone does not represent the entire subject of the work as defined by the cataloger. [The Task Group would like to promote a practice of not including $0 URIs that represent different objects from the (albeit implicit) objects of the “triple” in MARC. There’s a subgroup working on making clearer when certain subfields trigger different types of resources.]
> o Just to be clear, if the text was about the Author (without a $t), a $0 for the Author would be advisable.
> o If there are not URIs for the Work, there should not be $0 in the field. That shouldn’t stop the converter from creating one, but I think we can all agree stable/canonical Work URIs would be great. (
Right. I agree, this sounds sensible. So we should just strip the $0
subfields (with author IDs) from the x00 fields with $t that currently
have them.
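As a rough sketch of that clean-up step: the field layout below (a dict with a tag and a list of (code, value) pairs), the function name and the sample values are all made up for illustration and do not come from any particular MARC library.

```python
# Hypothetical in-memory field: {"tag": "600", "subfields": [(code, value), ...]}
X00_TAGS = {"100", "600", "700", "800"}

def strip_author_ids(field):
    """Drop $0 from an x00 field that carries $t; leave other fields as-is."""
    has_title = any(code == "t" for code, _ in field["subfields"])
    if field["tag"] not in X00_TAGS or not has_title:
        return field
    # With $t present the field stands for a Work, so an author $0 is misleading.
    kept = [(code, value) for code, value in field["subfields"] if code != "0"]
    return {"tag": field["tag"], "subfields": kept}

# Placeholder data, not real identifiers.
work_subject = {"tag": "600", "subfields": [
    ("a", "Doe, Jane,"), ("t", "Example title."), ("0", "(EXAMPLE)0001")]}
person_subject = {"tag": "600", "subfields": [
    ("a", "Doe, Jane."), ("0", "(EXAMPLE)0001")]}

cleaned_work = strip_author_ids(work_subject)
cleaned_person = strip_author_ids(person_subject)
```

Only the field with $t loses its $0; the plain person subject keeps its author ID.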
Unfortunately, since we don't have work IDs yet, we have no ID to put
in those x00 $0 subfields instead. The work/author combination is then
identified only by the work title and the author name, with no author
ID. I can live with that, but it means I have to do some post-processing
to merge these authors with the ones that do have IDs. I've already
implemented that in my processing pipeline [1], but it relies on
matching names even though I could in theory have used person IDs
instead.
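To illustrate what name-based merging involves, here is a minimal sketch. It is not the code from the pipeline at [1]; the normalization rules, the function names and the sample IDs are assumptions made for the example.

```python
import unicodedata

def normalize_name(name):
    """Fold case, trim trailing MARC punctuation and collapse whitespace."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return " ".join(folded.strip(" .,").split())

def merge_by_name(identified, unidentified_names):
    """Map each ID-less author name to a person ID when the match is unambiguous.

    identified: iterable of (person_id, name) pairs that already have IDs.
    Returns {original_name: person_id or None}.
    """
    index = {}
    for person_id, name in identified:
        index.setdefault(normalize_name(name), set()).add(person_id)
    merged = {}
    for name in unidentified_names:
        ids = index.get(normalize_name(name), set())
        # Two different people sharing a name cannot be told apart by name alone.
        merged[name] = next(iter(ids)) if len(ids) == 1 else None
    return merged

# Placeholder IDs and names.
known = [("p1", "Doe, Jane"), ("p2", "Roe, Richard"), ("p3", "Roe, Richard")]
merged = merge_by_name(known, ["DOE,  Jane,", "Roe, Richard", "Poe, Edgar"])
```

The ambiguous "Roe, Richard" case stays unmerged, which is exactly the weakness of matching on names rather than person IDs.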
> - *With respect to converters, perhaps when there is a URI in the $0, the converter should not assert a type on the resource. Rather, the RDF generated from the converter could just link to the resource, and not try to further describe it. In theory, the RDF description of the resource will include its own type assertions.
> o This assumes (as a colleague put it) the resource description isn’t too skimpy.
> o I’m not sure if this is true, but would this complicate your work to use the BF converter output as an intermediary to create schema.org data? For things that already have URIs, are you creating schema assertions about them that require knowing, for example, that something from the converter is a bf:Work?
I'm not sure this would be helpful. If there are reasonable conventions
in place here, I think the converter should also assert the type of the
resource, as it already does. So 600 with $t should become a Work and
600 without $t should become a Person. I think types like this are
helpful, and I use them as "anchors" for my conversion to Schema.org, so
losing them would make the conversion much more difficult. In many cases
I don't have any more information about those identifiers, so extracting
as much as possible from the MARC record is good. I think not asserting
the types would probably complicate other applications too.
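The convention described above can be written down as a small rule. This is only a sketch of the idea, not the converter's actual logic; the function name is made up, and it ignores indicators and tags other than 600.

```python
def resource_type(tag, subfield_codes):
    """Type implied by a 600 field: a Work if $t is present, else a Person."""
    if tag != "600":
        return None  # other tags are outside the scope of this sketch
    return "bf:Work" if "t" in subfield_codes else "bf:Person"

with_title = resource_type("600", ["a", "t", "0"])
without_title = resource_type("600", ["a", "0"])
```

A downstream conversion (for example to Schema.org) can then branch on the returned type instead of having to look up each $0 identifier.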
-Osma
[1] https://github.com/NatLibFi/bib-rdf-pipeline/issues/77
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
http://www.nationallibrary.fi