On 11.07.2014 23:49, Denenberg, Ray wrote:

> I agree with Jeff Young who said (if it really was Jeff – hard to tell)
> ‘ The abandoned "info" URI effort leaves me skeptical that non-HTTP URIs can be systematically described in general.’
> (This is a battle that I fought for years, but I long ago accepted defeat.)  And I honestly think we should treat isbn, issn, etc. – even fully formulated URNs – as string identifiers and not try to turn these into actionable URIs.

Perhaps with emphasis on "we".

To give two examples:

The Gemeinsame Normdatei (GND) of the German National Library clearly
has identifiers. To many of us they are known in the form of the
example string "123799465". However, in the MARC community they are
known as "(DE-588)123799465". The DNB Website and MARC21 representations
of the authority record state as "other standard identifier" some
"" either to be considered as a "weblink"
or as an identifier sourced from some "uri" identifier system.

Following Ray's comment maybe we have outwitted ourselves by "knowing" that
<> is the URI for the identifier "123799465"
and anyhow "(DE-588)" and "" are just some namespacey
way to identify the identifier system for the real identifier following
these prefixes.

Maybe all three forms just are string representations for some abstract
GND identifier of the resource: Equivalent with respect to the
resource they are identifying and distinct when it comes to different
contexts where their use is encouraged or not permissible:
* "(DE-588)123799465" is mandatory in MARC contexts
* "" is the string representation for
  the /official/ URI <>
* "" is the string representation for
  an officially provided actionable URI / URL <>

The important point is, these equivalences, transformations and
interpretations are properties of that particular identifier system
and their validity is declared, guaranteed and technically maintained
by some body (DNB) responsible for "operating" this identifier system.
This body issues statements that identifiers like "123799465" and
"" pertain to the same resource, may
be turned into actionable URLs and how this can be done. (One might
argue that the equivalence of "123799465" and "(DE-588)123799465"
is a statement issued by LC in its role as the MARC standards body
and there especially as maintainer of the list of organizational codes.
Or - since these codes are defined to be ISILs - a joint statement
of LC and the ISIL agency maintaining "DE-588" as identifier for the
GND as such)
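
To make the "(DE-588)" equivalence concrete, here is a minimal Python sketch. It is my own illustration, not anything DNB or LC publishes: the function names are hypothetical, and it assumes the organization code contains no parentheses and that a bare string is already known from context to be a GND identifier.

```python
import re

# MARC-style form "(<organization code>)<local identifier>".
# Assumption: the code in parentheses contains no "(" or ")".
_PREFIXED = re.compile(r"^\((?P<source>[^()]+)\)(?P<value>.+)$")


def split_prefixed(identifier):
    """Return (source, value); source is None for a bare string."""
    text = identifier.strip()
    match = _PREFIXED.match(text)
    if match is None:
        return None, text
    return match.group("source"), match.group("value")


def same_gnd_record(a, b, gnd_source="DE-588"):
    """Treat bare and (DE-588)-prefixed strings as equivalent.

    Assumption: a bare string is only passed in when the caller
    already knows it is meant as a GND identifier.
    """
    source_a, value_a = split_prefixed(a)
    source_b, value_b = split_prefixed(b)
    for source in (source_a, source_b):
        # A prefix naming some other system rules out equivalence.
        if source is not None and source != gnd_source:
            return False
    return value_a == value_b
```

The point of the sketch is that the comparison only works because the rule "prefix (DE-588) means GND" is supplied from outside; nothing in the strings themselves tells us that.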

[Note that the "prefix URI" < > is not web
actionable and there is no evidence that this URI was ever used
to identify the GND as a database or web application, or the dataset
of all concepts covered by individual GND records, nor the set of
all GND identifiers emitted so far or the space of all possible
GND identifiers or GND URIs]

Now GND and VIAF are some of the few identifier systems which provide
us with official URIs and actionable URLs at all. Many more systems
do not have these properties, even quite recent ones like ISIL or ISNI.

Consider ISBNs as another example:

* There is the "old" form "1-59158-509-0" of an ISBN and the "new"
  (EAN) form "978-1-59158-509-1" (I've chosen a publication from
  2007 for my example to avoid discussions that one should be
  preferred over the other).
* If I recall correctly the ISBN agency states that ISBNs shall
  be used (imprinted) with dashes and a prefix "ISBN" followed by
  a space: "ISBN 1-59158-509-0" resp. "ISBN 978-1-59158-509-1"
* And there are the forms "1 59158 509 0" and "978 1 59158 509 1"
  commonly printed by US publishers in the resource.
* Not to forget the forms "1591585090" and "9781591585091" as
  recorded in 020$a of MARC21 records.

* And there are URN:ISBN:1-59158-509-0 by RFC 3187 and
  info:isbn/1591585090 from the "info" URI scheme/registry
  To my knowledge neither of the two approaches has ever been
  acknowledged or endorsed by the ISBN agency.

All these strings are equivalent identifiers when considered /as/
ISBN but again in different usage contexts only certain representations
are allowed: MARC21 does not allow recording "ISBN 978-1-59158-509-1"
in field 020 although the ISBN agency declares this as /the/ official
form.

In this situation we have many communities issuing equivalence
statements for string representations of "abstract" ISBNs:
- the agency (ISBN-10 <-> ISBN-13 transformation of the dashed forms)
- (some) librarians (MARC21 form)
- (some) publishers ("blanked" forms)
- ??? (ubiquitous equivalence of ...-x and ...-X)
- IETF (URN:ISBN scheme)
- OCLC (info:isbn scheme)
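
The equivalence statements listed above can be collected into one normalizer. What follows is only a sketch under my own assumptions: the function names are made up, the prefix handling covers just the forms quoted in this mail, and mapping everything to the undashed 13-digit form is an arbitrary choice, not a recommendation of the ISBN agency. The check-digit arithmetic is the standard ISBN-10 (mod 11) and ISBN-13 (mod 10) calculation.

```python
import re


def _check13(first12):
    # ISBN-13: weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(d) * (1 if i % 2 == 0 else 3)
                for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def _check10(first9):
    # ISBN-10: weights 10 down to 2; a check value of 10 is written "X".
    total = sum(int(d) * (10 - i) for i, d in enumerate(first9))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def normalize_isbn(text):
    """Map the string forms discussed above to undashed ISBN-13."""
    s = text.strip()
    # Strip the scheme prefixes and the printed "ISBN " label.
    s = re.sub(r"^(urn:isbn:|info:isbn/|isbn[:\s]*)", "", s,
               flags=re.IGNORECASE)
    # Drop dashes and blanks, and treat "x" and "X" alike.
    s = re.sub(r"[\s-]", "", s).upper()
    if re.fullmatch(r"\d{9}[\dX]", s):
        if _check10(s[:9]) != s[9]:
            raise ValueError("bad ISBN-10 check digit: " + text)
        body = "978" + s[:9]
        return body + _check13(body)
    if re.fullmatch(r"\d{13}", s):
        if _check13(s[:12]) != s[12]:
            raise ValueError("bad ISBN-13 check digit: " + text)
        return s
    raise ValueError("not recognizable as an ISBN: " + text)
```

Every form of my example, from "ISBN 1-59158-509-0" to "info:isbn/1591585090", comes out as "9781591585091". Note how much knowledge from six different communities had to be hard-wired to get there.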

I don't think bibframe will ever be able to enforce the usage of
one of these representation styles as preferred over all of the others
 - even the ISBN agency has not been able to enforce the official
form. And it will not be desirable to always supply the complete zoo
of equivalent strings for every resource.

Thus there will be systems (as there are people) which will not
be able to detect that the identifier strings presented by two
resources are equivalent within the ISBN context and actually
represent the same (abstract) ISBN. And neither bibframe itself
nor the kind of reasoning or dereferencing currently available in
the semantic web will be able to remedy that.

To conclude:
- Many of our favourite identifiers are and will remain strings

- Since not all of these strings are URIs we'll have to indicate
  what identifier system they belong to (bibframe might provide
  a registry providing URIs for the identifier systems in a
  vocabulary-like manner)

- Also strings which look like URIs should be accompanied by
  information about which identifier system they are to be
  associated with - when used as identifiers: Distilling a common
  prefix from uniformly built URIs is not a permitted operation
  /and/ we would not know whether the thus extracted URI
  "URN:ISBN:" should represent the ISBN identifier system as
  such or the dataset of all resources identified by ISBNs
  (and we probably cannot afford to neglect that distinction)

- Most of our identifier systems have specific and non-trivial
  equivalence rules for the strings (considering them to be opaque
  as demanded for URIs won't be of any help) often reflecting common
  usage in different communities. Not even the maintainers of
  the identifier systems will have knowledge about all of these
  convenience forms, let alone bibframe.
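
A rough Python sketch of what these conclusions amount to in data terms: an identifier is a pair of (identifier system, string), and equivalence is decided by rules registered per system. Everything named here is hypothetical, in particular the example.org URIs, which merely stand in for whatever a registry might provide, and the deliberately incomplete ISBN rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical URIs for identifier systems; no such registry exists yet.
ISBN_SYSTEM = "http://example.org/identifier-systems/isbn"
LOCAL_SYSTEM = "http://example.org/identifier-systems/local"


@dataclass(frozen=True)
class Identifier:
    system: str  # URI naming the identifier system
    value: str   # the string exactly as found in the source data


def _isbn_key(value: str) -> str:
    # Deliberately partial rule: ignores the "ISBN" label, dashes,
    # blanks and the case of "x". It does not relate ISBN-10 to ISBN-13.
    return (value.upper().replace("ISBN", "")
            .replace("-", "").replace(" ", ""))


# Equivalence rules are a property of each system, so they are looked
# up by system URI; unknown systems fall back to exact string match.
NORMALIZERS: Dict[str, Callable[[str], str]] = {ISBN_SYSTEM: _isbn_key}


def equivalent(a: Identifier, b: Identifier) -> bool:
    if a.system != b.system:
        return False
    normalize = NORMALIZERS.get(a.system, lambda v: v)
    return normalize(a.value) == normalize(b.value)
```

With this, Identifier(ISBN_SYSTEM, "ISBN 1-59158-509-0") and Identifier(ISBN_SYSTEM, "1591585090") compare as equivalent, while the same two strings filed under a system without registered rules do not. The registry will always lag behind the convenience forms in actual use.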

Best regards
Thomas Berger