Jörg, thanks for the demonstration of "sameAs" linking. There will no doubt
be linking of that nature in our environment since obviously there
will be duplicate data that must be linked. It's a bit orthogonal to
the nature of my question, although in the end it does relate to it.
I suspect that my question is more relevant to the US cataloging
work-flow than to Germany's because we've had a more centralized
sharing of bibliographic data than the German libraries (as I
understand it).
The following explanation is background for my question, and is NOT
based on an assumption that you aren't aware of this. But I need to
say where my question, which probably needs to be in multiple parts,
is heading.
Today, with a few exceptions, libraries in the US mainly engage in
"copy cataloging." This has some incredible efficiencies in terms of
cataloger time. At the same time, we have decentralized catalogs for
the most part - each library (or group of libraries) has its own
system with a bibliographic database. The "shared" bibliographic
resources get copied into each library database. So the cataloging
activity is shared but the storage of data is local. (This is what
OCLC appears to be trying to address with WorldCat local -
eliminating that redundant local storage.)
I am assuming that there will continue to be local systems in
libraries because they are part of the overall management
functionality of the institutions (accounting, inventory, etc.). I
also consider those outside of the immediate purview of BIBFRAME.
What people do with data in their local systems is kind of their own
business. I am assuming (whether or not I am right is part of the
question) that BIBFRAME addresses solely (primarily?) bibliographic
data sharing. If you want to keep your local database in MARC or
ISBD or MAB, you can, as long as when you share you participate with
data that conforms to the agreed standard.
What my question relates to is: what is meant by "local" in a shared
data environment? There is reference in BIBFRAME to the "local
authority" and I'm not clear what that means. For example, in 2.1 of
the document on authorities it says: "In this way, it would function
as a local access point (HTTP URI) for the person or concept." Does
"local access point" mean that groups of libraries would not share
the same "lightweight abstraction layer"? Could Stanford and Harvard
choose to share some
BIBFRAME authority data, on the open Web? (Which will then lead to:
could they choose to share some bibliographic data?) The Stanford
data would be published onto the Web from Stanford's system and the
Harvard data would be published from the Harvard system; the users
would do searches and see displays but since they don't see URIs the
identities of the elements would be irrelevant.
I probably was skipping forward when I put forth my examples. I'm
trying to imagine a work-flow for shared bibliographic data. In my
mind it goes like this:
- National library X receives book and creates bibliographic and
authority data.
- That data goes into one or more open repositories, or on the open
web, where other libraries can make use of it.
- ... then there are questions:
? - if data is copied to some local system, does it matter if URIs
are stored as is, or are re-minted to that library's distinctive
domain? (I answer that above, in a way: local system choices are a
local matter)
? - when a library shares its data with others (data that it has
received via copy cataloging, not that it has created), does it
matter if the original URIs are shared, or must the library export
all data with its own minted URIs?
? - in the authority document "direct linking" (that is, using the
actual URI of the shared authority data rather than a local URI) is
discouraged. If a library is adhering strictly to a shared authority
pool, isn't direct linking a local system choice, as is caching of
display strings?
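To make the re-minting question concrete, here is a minimal Python sketch. It assumes triples are plain string tuples and reuses the hypothetical hu:/su: prefixes from this thread; the `remint` helper is invented for illustration and is not part of BIBFRAME or of any library. It shows one possible answer, not a recommended practice: re-mint under the local prefix, keep the external lcna: link untouched, and record owl:sameAs links back to the original URIs.

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# All identifiers are the hypothetical ones used in this thread.
HU_TRIPLES = [
    ("hu:Work9", "author", "hu:PersonF"),
    ("hu:PersonF", "label", "some name here"),
    ("hu:PersonF", "authority", "lcna:PersonA"),
]


def remint(triples, source_prefix, local_prefix):
    """Rewrite terms under source_prefix to local_prefix.

    Returns (rewritten triples, owl:sameAs links back to the originals).
    Simplification: any string starting with source_prefix is treated as
    a URI, so a literal that happened to start with it would be rewritten.
    """
    def swap(term):
        if term.startswith(source_prefix):
            return local_prefix + term[len(source_prefix):]
        return term

    rewritten = [(swap(s), p, swap(o)) for s, p, o in triples]
    originals = sorted({t for s, _, o in triples for t in (s, o)
                        if t.startswith(source_prefix)})
    links = [(swap(t), "owl:sameAs", t) for t in originals]
    return rewritten, links


local, links = remint(HU_TRIPLES, "hu:", "su:")
for triple in local + links:
    print(triple)
```

The alternative (storing the hu: URIs as is) needs no code at all, which is part of why the choice looks like a local system matter.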
The examples that I gave essentially asked about the sharing of this
"local authority record" vs. intake of any valid triple (and thus,
by extension, use of direct linking).
Again, I put some diagrams in my blog post that were more at a macro
level, but now I think I need a new set that is more specific to
this case.
kc
On 5/25/13 8:16 AM, Jörg Prante wrote:
I'd like to elaborate on this a little bit. It's not really a
Bibframe-specific thing, it's more about how information sharing
on the Semantic Web can work for library cataloging.
I.
RDF Turtle example with anonymous resources. Only works can be
linked.
Harvard:
hu:work9 a bf:Work;
hu:author [
a hu:Author;
hu:label "some name here";
hu:authority lcna:PersonA
] .
Stanford:
su:WorkSu7 a bf:Work;
su:author [
a su:Author;
su:label "some name here";
su:authority lcna:PersonA
] ;
owl:sameAs hu:work9.
This is a perfectly valid example. The information is duplicated.
By saying su:WorkSu7 owl:sameAs hu:work9, Stanford tells a semantic
reasoner to treat those bf:Work resources as identical, whether or
not the authors are equal (whether that makes sense is another
question). Equivalence of the two author properties could be
asserted by a further statement:
hu:author owl:equivalentProperty su:author .
II.
RDF Turtle example with explicit identified resources. Works and
Persons can be linked.
Harvard:
hu:PersonF a bf:Authority;
hu:label "some name here";
hu:authority lcna:PersonA .
hu:Work9 a bf:Work;
hu:author hu:PersonF .
Stanford:
su:PersonAbc a bf:Authority;
su:label "some name here";
su:authority lcna:PersonA .
su:WorkSu7 a bf:Work;
su:author su:PersonAbc .
Now, the equivalence of two resources can be asserted by Stanford:
su:PersonAbc owl:sameAs hu:PersonF .
su:WorkSu7 owl:sameAs hu:Work9 .
or vice versa by Harvard:
hu:Work9 owl:sameAs su:WorkSu7 .
hu:PersonF owl:sameAs su:PersonAbc .
This also makes perfect sense. Each library is cataloging in
its own domain, "hu:" or "su:". Note that neither Bibframe nor any
other institution sets a fixed rule for declaring owl:sameAs.
Maybe the assertion was added by a deduplication routine, maybe by
manual editing. If both catalog departments work in parallel, one
could say, there is duplication of work. And by consulting the
RDF, the triples would show the "truth".
(This is where I can't resist recalling Andrew Osborn's "The
Crisis in Cataloging", 1941, and the subsequent work of Seymour
Lubetzky.)
III.
The idea of having shared information by an authoritative source
is having something like this:
lcna:PersonA a bf:Authority;
rdfs:label "some name here" .
So Harvard and Stanford can use inferencing to get to the label
"some name here" (and all other properties of lcna:PersonA) with a
single triple:
su:WorkSu56 a bf:Work;
su:authority lcna:PersonA .
hu:Work10 a bf:Work;
hu:authority lcna:PersonA .
And the whole LCNA information is shared, no duplication of work.
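As a rough illustration of how one shared triple gives both libraries the label, here is a minimal Python sketch. It assumes the triples are held as plain string tuples, and `labels_for` is a hypothetical helper. Strictly speaking it follows a link rather than doing reasoning, which is enough for this case.

```python
# One shared authority description plus one link per library.
# Identifiers are the hypothetical ones from the example above.
TRIPLES = [
    ("lcna:PersonA", "rdfs:label", "some name here"),
    ("su:WorkSu56", "su:authority", "lcna:PersonA"),
    ("hu:Work10", "hu:authority", "lcna:PersonA"),
]

# Assumption: we know which local properties point at a shared authority.
AUTHORITY_PROPS = {"su:authority", "hu:authority"}


def labels_for(work, triples):
    """Follow the work's authority link(s) and return the shared labels."""
    authorities = {o for s, p, o in triples
                   if s == work and p in AUTHORITY_PROPS}
    return sorted(o for s, p, o in triples
                  if s in authorities and p == "rdfs:label")


print(labels_for("su:WorkSu56", TRIPLES))
print(labels_for("hu:Work10", TRIPLES))
```

Neither library stores the label itself; both reach the same string through lcna:PersonA.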
And if Harvard wants to point to the Stanford work, because they
later discovered the usefulness or correctness of it, they could
again assert semantic equivalence by:
hu:Work10 owl:sameAs su:WorkSu56 .
As a consequence, when third parties reuse catalog
information from Harvard and from Stanford, a semantic reasoning
engine will add information by looking at the rules for both
resources, no matter whether hu:Work10 or su:WorkSu56 is selected
for reasoning.
The choice between "hu:" and "su:" can be tedious, so a union
catalog might come in handy. This would look like this:
union:WorkABC a bf:Work;
owl:sameAs hu:Work10, su:WorkSu56 .
and for those who want to trust "union:", they can use
union:WorkABC instead of hu:Work10 or su:WorkSu56.
These examples all make perfect sense.
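What a reasoner does with these owl:sameAs statements can be approximated in a few lines. The following is a minimal Python sketch, assuming plain string triples; `same_as_groups` is a hypothetical helper that uses a union-find structure to compute the symmetric, transitive closure of owl:sameAs. A real OWL reasoner does considerably more, such as merging the properties of equivalent resources.

```python
def same_as_groups(triples):
    """Group terms that are connected by owl:sameAs, in either direction."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for s, p, o in triples:
        if p == "owl:sameAs":
            parent[find(s)] = find(o)  # merge the two groups

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return list(groups.values())


# The union catalog example: one statement set links all three works.
TRIPLES = [
    ("union:WorkABC", "owl:sameAs", "hu:Work10"),
    ("union:WorkABC", "owl:sameAs", "su:WorkSu56"),
]
groups = same_as_groups(TRIPLES)
print(groups)
```

Because the grouping is symmetric, it makes no difference whether Harvard, Stanford, or the union catalog published the assertion.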
Here in Germany, we have distributed union catalogs that are not
yet linked at the work level but were recently linked at the
authority level. I'm very happy about GND, which was the result of
unifying locally scattered authorities, so we now have a unified
German authority catalog for getting started with RDA. And we are
fully aware of the challenges ahead of us when more globally
valid entities are introduced.
With Bibframe and the Semantic Web, all variants will exist, they
can't be forbidden. But the understanding of what is happening
with the catalogs and the automatic resolving towards a "best
effort" solution by logical rules executable by machines will be
easier than without Bibframe.
Jörg
On 25.05.13 15:09, Karen Coyle wrote:
On 5/24/13 1:01 PM, Ford, Kevin wrote:
I do think that BIBFRAME should have a way to keep together an
external authority identifier and the local display forms.
-- I don't know if I understand your point correctly, but, to
me, the BIBFRAME Authority as a lightweight abstraction layer
meets this need quite nicely. It is a resource that provides
a means to store local display forms and link to an external
authority.
I'm beginning to wonder what we mean by "local". I think that I
fall into thinking about local systems and what they will have,
but that is an assumption about the future that may not prevail.
I earlier asked about the re-use of BIBFRAME authorities, using
the example below. The primary question is: does each library
holding an item mint a new URI for an authority? In other words,
do the 6,000+ libraries that hold a copy of Harry Potter #1 each
have a separate "local" URI for J K Rowling?
Here's the example I gave:
- There is an LCNA identifier and description for PersonA, call it
lcna:PersonA
- Harvard catalogs a book by that author (original cataloging).
Harvard creates a BIBFRAME Work description and a BIBFRAME Authority,
using the Harvard domain (call it HU). We now have:
HU:Work9 -> author -> HU:PersonF
HU:PersonF -> label -> "some name here"
HU:PersonF -> authority -> lcna:PersonA
Later, Stanford uses the HU data for copy cataloging. Does Stanford
now have:
HU:Work9 -> author -> HU:PersonF
HU:PersonF -> label -> "some name here"
HU:PersonF -> authority -> lcna:PersonA
Or does Stanford have:
SU:WorkSu7 -> author -> SU:PersonAbc
SU:PersonAbc -> label -> "some name here"
SU:PersonAbc -> authority -> lcna:PersonA
That is, does the original BIBFRAME authority identity get
re-used, or does copy cataloging result in the minting of new
URIs for each entity? (This is more a "best practice" question
than a "what is technically possible" question.)
Then if at some later date Stanford does original cataloging for
another Work by PersonA, would it create:
SU:WorkSu56 -> author -> SU:Person12
SU:Person12 -> label -> "some name here"
SU:Person12 -> authority -> lcna:PersonA
Or would Stanford re-use its own URI for that person?
SU:WorkSu56 -> author -> SU:PersonAbc
SU:PersonAbc -> label -> "some name here"
SU:PersonAbc -> authority -> lcna:PersonA
Obviously, I'm not asking what *will* happen, I'm asking what we
think *should* happen. And to me this is mixed in with the
question of "local" and what "local" means.
kc
I'm still not sure when it makes sense to link between the local
BIBFRAME "authority" and other external authorities (VIAF, or
national libraries in other countries).
-- As you surmise, ideally "national and other shared
authority files would link to each other" and one could just
follow the links. But there may also be times when someone
finds a source of information about a Person that may not be
available via one of the links of an authority file. So, for
example (and this is off the top of my head), perhaps a German
library uses the GND for its authority control for the form of
a name, but then finds an alternate source for more detailed
information about the Person (that is not linked to from the
GND or VIAF). That library could then link its BIBFRAME
Authority resource for that Person to this additional source
in order to include information from that source in its
display.
Cordially,
Kevin
-----Original Message-----
From: Bibliographic Framework Transition Initiative Forum
[mailto:[log in to unmask]] On Behalf Of Karen Coyle
Sent: Friday, May 24, 2013 2:28 PM
To: [log in to unmask]
Subject: Re: [BIBFRAME] Authorities: updating
On 5/24/13 6:52 AM, Trail, Nate wrote:
As far as "where is the link for updates", I think it would probably
not be the same link, since so many flavors of update would need to
be handled. A system would need to know how to interpret the link
and get to the flavor of update it wants (JSON serialization of the
full info, RDF/XML of just the label, etc.).
That makes sense, Nate. And therefore it is possible that the update
link may be different from the more general "authority" link. And
presumably this is another area where versioning will be important --
e.g. if you don't update your system from "cookery" to "cooking",
that wouldn't be so much an error as an earlier version.
I do think that BIBFRAME should have a way to keep together an
external authority identifier and the local display forms. I'm still
not sure when it makes sense to link between the local BIBFRAME
"authority" and other external authorities (VIAF, or national
libraries in other countries). I'm not against it, I'm just not
coming up with a use case at the moment. In general, I would assume
that national and other shared authority files would link to each
other as a matter of course.
kc
Each authority would have to maintain an API for such updates. That
said, the ID link, with content negotiation, is actually a good step
in that direction, since you can get all these formats of the record
right there:
Alternate Formats
. RDF/XML (MADS and SKOS)
. N-Triples (MADS and SKOS)
. JSON (MADS/RDF and SKOS/RDF)
. MADS - RDF/XML
. MADS - N-Triples
. MADS/RDF - JSON
. SKOS - RDF/XML
. SKOS - N-Triples
. SKOS - JSON
. MADS/XML
. MARC/XML
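For readers unfamiliar with content negotiation, here is a minimal Python sketch of what a client request for one of these formats could look like. The URI is the id.loc.gov identifier from the Authorities example quoted in this thread; the media type string and the `negotiated_request` helper are assumptions for illustration, and which media types the server actually honors should be checked against its documentation. The sketch only builds the request and does not send it.

```python
from urllib.request import Request

# Identifier taken from the Authorities example quoted in this thread.
AUTHORITY_URI = "http://id.loc.gov/authorities/names/nr98013265"


def negotiated_request(uri, media_type):
    """Build a GET request asking for one serialization of the record.

    media_type is an assumption here, e.g. "application/rdf+xml".
    """
    return Request(uri, headers={"Accept": media_type})


req = negotiated_request(AUTHORITY_URI, "application/rdf+xml")
print(req.full_url, req.get_header("Accept"))

# To actually fetch the record (requires network access):
#   from urllib.request import urlopen
#   body = urlopen(req).read()
```

The same URI serves every format; only the Accept header changes, which is why one identifier can double as the entry point for several "flavors" of the record.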
We do not, of course, have a push service that tracks change dates
(yet).
Nate
From: Bibliographic Framework Transition Initiative Forum
[mailto:[log in to unmask]] On Behalf Of Karen Coyle
Sent: Thursday, May 23, 2013 6:34 PM
To: [log in to unmask]
Subject: [BIBFRAME] Authorities: updating
Here is the example of a BIBFRAME Authority from the Authorities
document:
<Organization id="http://bibframe/auth/org/ifla">
  <label>
    IFLA Study Group on the Functional Requirements for
    Bibliographic Records
  </label>
  <link>http://www.ifla.org/</link>
  <hasIDLink resource="http://id.loc.gov/authorities/names/nr98013265" />
  <hasVIAFLink resource="http://viaf.org/viaf/148620313" />
  <hasDNBLink resource="http://d-nb.info/gnd/2167628-8" />
</Organization>
I still haven't had an answer to an earlier question (now lost in
all of this email) as to whether there will be a specific link from
the BIBFRAME Authority to the actual shared authority file used by
the library - that is, the file from which the library derives what
here is shown as the "<label>". The example above shows links to
LCNA, DNB and VIAF, but it isn't clear if any one of those is
singled out as the authority being used by the library. Why does
this matter? It matters because if the library intends to be part of
an authority community, it has to be able to receive updates from
the shared authority file, and therefore there must be a link
between the shared authority file and the local usage of a term. I
illustrate this in the diagrams I did at:
http://kcoyle.blogspot.com/2013/05/bibframe-authorities.html
(see esp. 2nd diagram).
If there isn't such a link then I do not see how libraries will be
able to keep their names in sync with the authority file to which
they adhere.
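The dependency described here can be sketched in a few lines of Python. Everything in the sketch is hypothetical: the "derived_from" field stands in for the missing link that singles out the one shared authority a label was taken from, the ex: identifiers are placeholders, and `sync_labels` is an invented helper. It is meant only to show that an update cannot be applied without such a link.

```python
# Local authority records keyed by local URI. "derived_from" is a
# hypothetical field naming the shared authority the label came from.
local_records = {
    "ex:local/topic1": {"label": "Cookery",
                        "derived_from": "ex:shared/topic1"},
    "ex:local/person1": {"label": "some name here",
                         "derived_from": "ex:shared/personA"},
}

# Current preferred labels as published by the shared authority file.
shared_labels = {
    "ex:shared/topic1": "Cooking",
    "ex:shared/personA": "some name here",
}


def sync_labels(records, current_labels):
    """Update local labels that lag behind the shared file.

    Returns the local URIs whose labels were changed.
    """
    changed = []
    for uri, record in records.items():
        new_label = current_labels.get(record["derived_from"])
        if new_label is not None and new_label != record["label"]:
            record["label"] = new_label
            changed.append(uri)
    return changed


changed = sync_labels(local_records, shared_labels)
print(changed, local_records["ex:local/topic1"]["label"])
```

A record carrying three undifferentiated links (LCNA, DNB, VIAF) gives this loop nothing to key on, which is the gap the question points at.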
We also haven't talked about alternate names. The examples show a
single name form. Indexing requires alternates. Both single names
and alternates often come from a shared authority file (that isn't
local) and both types of name forms can change. What is the link
that makes change management work with the BIBFRAME Authorities?
kc
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet