BIBFRAME@LISTSERV.LOC.GOV


BIBFRAME  August 2015

Subject:

Re: BIBFRAME Identifier, Role, and Authority Proposals

From:

Thomas Berger <[log in to unmask]>

Reply-To:

[log in to unmask]

Date:

Sun, 23 Aug 2015 19:14:04 +0200

Content-Type:

text/plain

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 18.08.2015 at 17:02, Fulford, Elizabeth Anne wrote:
> The following three proposals have been added to the list of
> BIBFRAME Vocabulary Change Proposals
> <http://www.loc.gov/bibframe/docs/index.html>

I'd like to comment first on the

> *         BIBFRAME Identifier Proposal
> <http://www.loc.gov/bibframe/docs/pdf/bf-identifierproposal-08-12-2015.pdf>
> [PDF, 47KB] (August 12, 2015)


We had a lengthy discussion about the nature of identifiers about
one year ago, unfortunately without a clear result. Real-world
identifiers (i.e. those which exist outside of the URIs used
exclusively in mainstream semantic web environments) seem to be a
complex matter.

Some insights, or recapitulations of facts, were:

* there is no (RDF) way of "inspecting" URIs and therefore no
  method of answering the question "which dataset does the
  resource belong to" from the URI alone (I consider the "which
  dataset" question roughly equivalent to "what kind of
  identifier is it")

* most identifier schemes which interest us (even relatively recent
  ones like ISIL) do not define canonical URIs as part of their
  specification.

* There exist different string representations for a given
  identifier scheme: for "1455502626" from the example given,
  the official representation would be "ISBN 978-1-4555-0262-2"
  (whether 1455502626 and 9781455502622 are two representations
  of the same ISBN or two ISBNs from two different ISBN schemes
  may be open to dispute).

* These format or formatting issues could be levelled by associating
  each identifier with a specifically crafted data type instead of
  "string", but this would impose a huge barrier for usage.

* However even the choice of identifier scheme may be arbitrary:
  From "ISBN 978-1-4555-0262-2" there is a derived UPC/EAN
  9781455502622, often represented by a barcode with backup
  notation 978 1 4555 0262 2. Or "ISBN 978 1 4555 0262 2" in
  cases where a US publisher puts it at a different place on the
  resource...

* Being Real World Identifiers means we do not necessarily detect
  them ourselves; they may be communicated to us. Hence there is a
  need to record (and mark) syntactically wrong ("illegal")
  identifiers and to attempt to make use of them as if they were
  legal ones. (A datatype solution probably could not cope with
  that situation.)

* Recording identifiers makes sense even in situations where they
  pertain to entities different from those we are dealing with:
  An ISSN imprinted on a monograph means that the monograph is
  published in a series which can be identified by that ISSN.
  We just happen to know enough ISSN semantics to notice that,
  others might not. ISBN semantics are very strictly controlled
  by the ISBN agency (the manual prescribes in great detail what
  can be assigned an ISBN and in which cases new ISBNs have to
  be handed out), but a museum description could provide an
  ISBN as part of the physical description of a certain item.
  Usage of Real World Identifiers in the sense of owl:sameAs
  might be very dangerous in those situations (or be asking too
  much from the producers of that description: The strength
  of identifiers lies in the fact that they can be successfully
  /used/ without the full grasp of the domain-specific knowledge
  necessary to /assign/ them).
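
To make the representation issue concrete, here is a minimal Python
sketch that reduces the ISBN variants mentioned above to one comparable
form. The function names are hypothetical and chosen for illustration
only; the check-digit rules are the standard ISBN-10 (modulo 11) and
EAN-13 (weights 1 and 3) ones. It deliberately rejects anything that is
syntactically wrong, so by itself it does not cover the "illegal
identifier" case from the list.

```python
import re


def normalize_isbn(raw: str) -> str:
    """Drop an 'ISBN' prefix, blanks and hyphens; keep digits and a check 'X'."""
    return re.sub(r"[^0-9Xx]", "", raw).upper()


def is_valid_isbn10(s: str) -> bool:
    """ISBN-10 rule: the sum with weights 10..1 must be divisible by 11."""
    if not re.fullmatch(r"\d{9}[\dX]", s):
        return False
    total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
    return total % 11 == 0


def ean13_check_digit(first12: str) -> str:
    """EAN-13 rule: alternate weights 1 and 3 over the first twelve digits."""
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(first12))
    return str((10 - total % 10) % 10)


def to_isbn13(raw: str) -> str:
    """Return the 13-digit form, converting a valid ISBN-10 via the 978 prefix."""
    s = normalize_isbn(raw)
    if re.fullmatch(r"\d{13}", s) and ean13_check_digit(s[:12]) == s[12]:
        return s
    if is_valid_isbn10(s):
        core = "978" + s[:9]
        return core + ean13_check_digit(core)
    raise ValueError(f"not a syntactically valid ISBN: {raw!r}")


variants = [
    "1455502626",
    "1 4555 0262 6",
    "ISBN 978-1-4555-0262-2",
    "978 1 4555 0262 2",
]
# All four spellings collapse to a single comparable value.
canonical = {to_isbn13(v) for v in variants}
```

Whether treating the 10-digit and 13-digit forms as "the same ISBN" is
appropriate is exactly the disputed point above; the sketch simply
assumes it is.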

So, unlike concept URIs, Bibframe transports Real World
Identifiers in connection with a resource; however, the correct
interpretation of that connection cannot be recorded and
provided and is left to the consuming application.

The example from the proposal

<http://example.com/xyz/Instance2>
  bf:identifiedBy [
    a bf:Identifier ;
    bf:scheme "xyz" ;
    rdf:value "1455502626"
  ] .

reflects the attempt of the recorder of the information to gather
enough context "xyz" for the identifier to be meaningful.

Having a meaning for "xyz" enables us to conclude the Class of
objects "1455502626" might stand for *and* to compare it with
a "978 1 4555 0262 2" for the same meaning found on a different
resource.

In the context of a given scheme we may have grossly diverging
representations of the same identifier, thus a minor question
would be whether rdf:value or rdfs:label might be more appropriate
to transport that. Also I'm missing which representation is
expected to be inserted here:
The resource may read "1 4555 0262 6" and the official representation
according to that scheme is "ISBN 978-1-4555-0262-2", thus "1455502626"
(which might be something extracted from a MARC record) is a
representation not very tightly connected to either the bf:instance
or the rules governing application of the identifier scheme in question...

Certainly, a URI would be preferable to a string "xyz" to facilitate
identification of the same scheme. Again, there are no generally
accepted URIs for concepts like "ISBN as such", and introduction of
a vocabulary for that should be the way to go. However reuse of
the http://id.loc.gov/vocabulary/identifiers is not a good idea,
even for an example, since that describes a collection of rdf:Property.

So one can always state

<http://example.com/xyz/Instance2>
  identifier:isbn "1455502626".

and now also saying

<http://example.com/xyz/Instance1>
  bf:identifiedBy [
    a identifier:isbn ;
    rdf:value "1455502626"
  ] .

would mean to confine the meaning of "bf:identifiedBy" on the fly? From
my understanding of RDF it rather declares the string "1455502626"
to be a specific instance of a property of class identifier:isbn ...

Building up a Bibframe specific registry of "recognized" Identifier
Schemes could indicate preferred notational conventions, and also give
pointers to definitions and agencies controlling these schemes.
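
Such a registry could be sketched roughly as follows in Python. All
names here (`Scheme`, `REGISTRY`, `record`, the status strings) are
hypothetical and not part of any Bibframe proposal, and the patterns
only check the overall shape of a value, not its check digit. The
sketch keeps the identifier exactly as found next to a normalized form,
so that syntactically wrong or unregistered identifiers can still be
recorded and marked.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional


def compact(pattern: str) -> Callable[[str], Optional[str]]:
    """Build a normalizer: strip blanks and hyphens, then require a full match."""
    def normalize(raw: str) -> Optional[str]:
        s = re.sub(r"[\s-]", "", raw).upper()
        return s if re.fullmatch(pattern, s) else None
    return normalize


@dataclass(frozen=True)
class Scheme:
    label: str                                   # human-readable scheme name
    agency: str                                  # body controlling the scheme
    normalize: Callable[[str], Optional[str]]    # None means "not well-formed"


# Illustrative entries only; a real registry would carry definition URLs too.
REGISTRY: Dict[str, Scheme] = {
    "isbn": Scheme("ISBN", "International ISBN Agency",
                   compact(r"\d{9}[\dX]|\d{13}")),
    "issn": Scheme("ISSN", "ISSN International Centre",
                   compact(r"\d{7}[\dX]")),
    "viaf": Scheme("VIAF", "OCLC", compact(r"\d+")),
}


def record(scheme: str, as_found: str) -> dict:
    """Keep the string as found, plus a normalized form and a status marker."""
    entry = REGISTRY.get(scheme)
    normalized = entry.normalize(as_found) if entry else None
    if entry is None:
        status = "unknown-scheme"
    elif normalized is None:
        status = "invalid"
    else:
        status = "ok"
    return {"scheme": scheme, "as_found": as_found,
            "normalized": normalized, "status": status}


good = record("isbn", "1 4555 0262 6")
bad = record("isbn", "1 4555 0262")      # one digit short, still recorded
unknown = record("xyz", "1455502626")    # scheme not registered
```

Two records can then be compared on `normalized` when both are "ok",
while `as_found` preserves whatever the resource or the informant
actually supplied.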




> *         BIBFRAME  Role Proposal
> <http://www.loc.gov/bibframe/docs/pdf/bf-roleproposal-08-12-2015.pdf>
[PDF, 56KB] (August 12, 2015)


The traditional elements bf:creator and bf:contributor might have been
provided with the corresponding (unqualified or qualified) Dublin Core
elements in mind: Being able to map creators to creators and
contributors to contributors without deeper knowledge seems to be a
good thing. To make this practice valid, Bibframe would have to
submit to DC semantics, i.e. bf:creator is tied to "primary
responsibility" and bf:contributor to "making contributions", and
personally I'm not sure whether classifying roles along these lines
will be a smooth operation for all possible Bibframe applications
(e.g. are "creators" for an frbr:expression to be considered
"contributors" of a non-FRBR-aware description?)

In the example

bf:contributor [
  a bf:Contributor ;
  bf:role [
    a bf:Role ;
    rdfs:label "creator" ;
    bf:identifiedBy <http://id.loc.gov/vocabulary/relators/cre>
  ] ;
  bf:agent [...]
] .

my naive understanding would be that bf:identifiedBy for the bf:role
provides a concept URI in the sense of owl:sameAs and therefore the
contraction

bf:contributor [
  a bf:Contributor ;
  bf:role <http://id.loc.gov/vocabulary/relators/cre> ;
  bf:agent [...]
] .

should also be a valid way to express the same relation.
(<http://id.loc.gov/vocabulary/relators/cre> directly being of class
bf:Role would be implied then, determining a suitable label would
be left to applications, I can't see any problems in that)

However objects of bf:identifiedBy are of class bf:Identifier, so we
must read the example as a shorthand form of

  bf:role [
    a bf:Role ;
    rdfs:label "creator" ;
    bf:identifiedBy [
      a bf:Identifier ;
      bf:scheme "????" ;
      rdf:value "???"
    ] ;
  ] ;
or
  bf:role [
    a bf:Role ;
    rdfs:label "creator" ;
    bf:identifiedBy [
      a suitablesubclassofRoles ;
      rdf:value "???"
    ] ;
  ] ;


Does <http://id.loc.gov/vocabulary/relators/cre> provide us with the
necessary statements? It may be advantageous to regard
http://id.loc.gov/vocabulary/relators as a collection of *codes*,
so would "cre" be the identifier value and
"http://id.loc.gov/vocabulary/relators" the scheme (as string)? How
would one be able to use that as a (registered) subclass of
bf:Identifier, i.e. who would assert
<http://id.loc.gov/vocabulary/relators> rdfs:subClassOf bf:Identifier


In natural language I would perceive a "contributor" as being a subclass
of "agent" *and* I'm not sure to what extent the concept of an agent
(any entity that acts or may act) exceeds that of an ordinary Person
or Corporate Body in the sense that the action (i.e. the role!) in
a fixed situation is already included.

So wouldn't a "relator" instead of "contributor", i.e.

bf:relator [
  a bf:Relator ;
  bf:role [
    a bf:Role ;
    rdfs:label "creator" ;
    bf:identifiedBy <http://id.loc.gov/vocabulary/relators/cre>
  ] ;
  bf:agent [...]
] .

stress the fact that bf:Relators are some "abstract location" where
agents and their roles are brought together and prevent casual readers
from falling in the trap of imagining the range of bf:relator to be
already persons, corporate bodies &c.?

The relator could be accompanied by a snippet of text extracted
from an actual resource, e.g. "put into words by X.Y.", or more
generally: We are "fattening" a simple relation between resource
and agent by introducing an intermediary node and edge to record the
precise kind of role. That should give us enough room for further
elaboration, e.g. providing some justification or annotating the
statement. But if I understand correctly, it is possible to *embed*
a bf:Annotation via bf:hasAnnotation parallel to bf:role and
bf:agent in our relator/contributor container, and this would give
cataloguers the opportunity to comment on any peculiarities like
spelling errors, unusual places for information found, degrees
of uncertainty and so on related to the act of determining the
relation between resource and agent?


The

> *         BIBFRAME Authority Proposal
> <http://www.loc.gov/bibframe/docs/pdf/bf-authorityproposal-08-12-2015.pdf>
> [PDF, 45KB] (August 12, 2015)

exemplifies

bf:contributor [
  a bf:Person ;
  rdfs:label "Joachim Knape" ;
  bf:identifiedBy <http://id.loc.gov/authorities/names/n80103961#RWO>;
  bf:identifiedBy <https://viaf.org/viaf/268367832/#Knape,_Joachim>
] .

which in the light of the role proposal might be read as

bf:agent [
  a bf:Person ;
  rdfs:label "Joachim Knape" ;
  bf:identifiedBy <http://id.loc.gov/authorities/names/n80103961#RWO>;
  bf:identifiedBy <https://viaf.org/viaf/268367832/#Knape,_Joachim>
] .

and if I understand the proposal correctly it is about removing all
authority-specific classes or relations from Bibframe: That would be
completely taken over by the identifier proposal.

Thus the following comment is rather about usage of the identifier
proposal in the authority proposal example:

Taking "RWO" as a hint for Real World Object, this IMHO runs completely
at cross purposes:

We can state identity of our bf:Person in question with some entities
known to others by means of URIs for the non-information resources:

bf:agent [
  a bf:Person ;
  rdfs:label "Joachim Knape" ;
  owl:sameAs <http://id.loc.gov/authorities/names/n80103961>;
  owl:sameAs <http://viaf.org/viaf/268367832>
] .

(or - following VIAF's example - rather <http://schema.org/sameAs>
instead of owl:sameAs?)

On the other hand we can transport identifiers known for that
resource:

bf:agent [
  a bf:Person ;
  rdfs:label "Joachim Knape" ;
  bf:identifiedBy [
    a bf:Identifier;
    bf:identifierScheme "lcnaf";
    bf:identifierValue "n  80103961  ";
    ];
  bf:identifiedBy [
    a bf:Identifier;
    bf:identifierScheme "viaf";
    bf:identifierValue "268367832";
    ];
] .

Leaving aside my concerns about using id.loc.gov/vocabulary/identifiers
(the LC identifiers "belong" to
<http://id.loc.gov/authorities#conceptscheme> and I'd rather like
that URI to be used), the schemes could be made explicit:

bf:agent [
  a bf:Person ;
  owl:sameAs <http://id.loc.gov/authorities/names/n80103961>;
  owl:sameAs <http://viaf.org/viaf/268367832> ;
  rdfs:label "Joachim Knape" ;
  bf:identifiedBy [
    a identifier:lccn;
    bf:identifierValue "n 80103961 ";
  ];
  bf:identifiedBy [
    a identifier:viaf;
    bf:identifierValue "268367832";
  ];
] .
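
The identifier string "n  80103961  " and the URI
<http://id.loc.gov/authorities/names/n80103961> used above differ only
in blanks. As a rough Python sketch of that correspondence (the function
names and the base-URI constant are hypothetical, and removing
whitespace is a simplification that does not implement the full LCCN
normalization rules):

```python
NAMES_BASE = "http://id.loc.gov/authorities/names/"


def lccn_token(as_recorded: str) -> str:
    """Remove all whitespace from an LCCN as recorded, e.g. in a MARC field."""
    return "".join(as_recorded.split())


def lcnaf_uri(as_recorded: str) -> str:
    """Build the names-authority URI from the compacted LCCN."""
    return NAMES_BASE + lccn_token(as_recorded)


# Both spellings used in the examples above lead to the same URI.
uri_a = lcnaf_uri("n  80103961  ")
uri_b = lcnaf_uri("n 80103961 ")
```

This only shows that the string form and the URI form can be related
mechanically; it says nothing about whether the URI denotes the person
or the authority record, which is the question at issue here.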


One now would wish to optionally add two additional data items:

1. The information resource associated with the authority number
For library authority files I would say statements like

<http://id.loc.gov/authorities/names/n80103961>
  rdfs:isDefinedBy <http://id.loc.gov/authorities/names/n80103961.rdf>

make much sense: The (concrete person as a) concept can - at least
approximately - be defined by the data given in the authority record.

We want to keep identifier, scheme and definition URL close together,
and therefore provide the URL in the context of our bf:Identifier,
which itself is /not/ defined by the authority record.

 bf:identifiedBy [
   a identifier:lccn;
   bf:identifierValue "n 80103961 ";
   bf:authorityData <http://id.loc.gov/authorities/names/n80103961.rdf>
];

Here "bf:authorityData" provides the link to actual data from the
authority record associated to our identifier. Stronger than
rdfs:isDefinedBy the statement promises that an actual document
can be retrieved from that URI.

Analogously:


bf:identifiedBy [
  a identifier:viaf;
  bf:identifierValue "268367832";
  bf:authorityData <http://viaf.org/viaf/268367832/rdf.xml>
];

[this tentative bf:authorityData may be the old "bf:hasAuthority", I'm
not quite sure about that]


2. VIAF is an example of an authority file not providing one (single)
preferred label for its resources, but legacy applications probably
would appreciate it if the "heading" were provided, not only the
identifier (the rdfs:label "Joachim Knape" from the example obviously
is not recorded from the sources the identifiers are provided for).
Especially in datasets like LCNAF, where authorized access points
actually encode most of the information needed for
definition/identification, this label is identifying by itself:

bf:identifiedBy [
  a identifier:lccn;
  bf:identifierValue "n 80103961 ";
  bf:authorityLabel "Knape, Joachim (1950-)";
  bf:authorityData <http://id.loc.gov/authorities/names/n80103961.rdf>
];

(slightly doctored, atypically the LC heading for the person does
not contain the birth date. Note that
http://id.loc.gov/authorities/names/n80103961.rdf redirects to
http://id.loc.gov/authorities/names/n80103961.rdf and
http://viaf.org/viaf/sourceID/LC%7Cn+80103961.rdf.xml
redirects to http://viaf.org/viaf/268367832/rdf.xml and rdf.xml is not
the only description format available)

[I'm quite certain that bf:authorityLabel is the current
bf:authorizedAccessPoint. However this is not about stating the
authoritative form of the identifier, but our bf:Person is only
described by the document (can be identified with the help of data
contained in the document) http://viaf.org/viaf/268367832/rdf.xml
As an additional step we notice that this description employs the
concept of a preferred name and note this in a standardized way
as bf:authorityLabel. We could have used that form for our
rdfs:label of the person, but in the example we obviously didn't and
preferred "Joachim Knape"]

My impression is that the identifier proposal is taking over the role
of bf:Authorities. Taking the natural language meaning of "identifiedBy"
into account, it seems legitimate to optionally enrich the object of
such statements with actual links to authority records (or entries in
biographical resources) and to report key data which can be extracted
from these. However, considering these objects to be of class
"Identifier" seems to massively overstretch the natural language
meaning. The question arises whether the identifier proposal is really
about (real world) identifiers or rather about general identification
mechanisms, where knowledge of identifiers-as-strings is only one,
albeit central, aspect.

Best regards
Thomas Berger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iJwEAQECAAYFAlXZ/1sACgkQYhMlmJ6W47O9egP/fXIX33iAPDVYpyuC+lGGeyak
3qsXqM2tH+2x/vruUOZQ8wRwxAPZ/rdn8+H11qnPeKeZkWBqf4528xosjQaCgjRi
UQBi78N2lrE25r/vyt/EazSwX48LEj4UqcILzx1C9CrsnNx2XaWmlia/FDsgJUcv
F8XEp3kYVk0dfmkuDSw=
=7dCb
-----END PGP SIGNATURE-----
