-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I very much object to crafting new non-http URI schemes, please let me
elaborate:

On 18.07.2014 22:17, Denenberg, Ray wrote:
[...]
> While there are no subspaces common to:
>
> ‘info:’ (http://info-uri.info/registry/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc) and
>
> ‘urn:’ (http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xml)
>
> There is quite a body of literature that claims that this or that
> identifier, in the ‘info’ table, but not in the ‘urn’ table, is expressible as a
> URN. These include sici, doi, and hdl, and perhaps others. I am assuming that
> the rule here would be to express these as ‘info:’ URIs, and never as URNs.
>
>
>
> No URI form
>
> I am hoping that there is unanimous consensus that an identifier whose
> scheme is not listed in the URN registry should never be expressed as a URN.

Historically, "URN" (uniform resource *name*) was a concept contrasting with
"URL" (uniform resource *locator*). The OAI-identifier specification
< http://www.openarchives.org/OAI/2.0/guidelines-oai-identifier.htm > (dated 2002,
document last revised 2006), for instance, has to be interpreted against that
background:

"oai-identifiers are Uniform Resource Names (URNs) in the sense of RFC1737; they
are resource identifiers and not resource locators (URLs)."


  % --

Current usage of "URN", however, is based on RFC 2141, i.e. URIs within the
"urn:" scheme, and on RFC 2611, which outlines registration and namespace
procedures (superseded by RFC 3406).

Accordingly, IANA maintains two lists of prefixes in the sense of a registry:

1. Uniform Resource Identifier (URI) Schemes
  (< http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml >,
  long since almost completely unrelated to the "Service Name and Transport
  Protocol Port Number Registry") grouped in
 a) Permanent URI schemes like "gopher", "info", "http", "mailto", "z39.50r"
    and "urn"
 b) Provisional URI schemes like "cvs", "secondlife", "svn", "webcal"
 c) Historical URI schemes like "fax", "wais", "z39.50"

"urn" stands out here since it is the only prefix whose namespace is
also administered as an IANA registry:

2. Uniform Resource Names (URN) Namespaces
  (< http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml >)
  lists "Formal URN namespaces" /within/ the urn: prefix system like

  "urn:isbn", "urn:issn", "urn:nbn", or "urn:ietf", "urn:iso", and probably
  a couple more of the schemes listed there are highly relevant to us.

Many of "our" common schemes, such as "ark", "hdl", "oai" or "doi", are not
mentioned in either of the two registries. Therefore the "info" registry remains
important, listing
(< http://info-uri.info/registry/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc >):

info:ark/
info:doi/
info:hdl/
info:sici/
(and also some "common" database identifiers like "lc", "lccn", "oclcnum",
"bnf" ...)

There is still no "oai". However, the "info:ofi" scheme for the OpenURL
framework allows mapping *any* syntactically correct (but otherwise privately
crafted) URI into the IANA URI delegation scheme by prefixing it with
"info:ofi/nam:info:" ;-)

The OAI identifier standard proposes "oai:" URIs as an elegant means to
transform DNS-based namespace prefixing into URNs (as they call them;
we'd say URIs). Although there is no internet police preventing any
community from using convenience URI schemes not registered with IANA,
these IMHO should generally be avoided for Bibframe purposes.


  % --

In 2010 the info registry officially declared itself closed for the
registration of new schemes, cf. the "Notice" < http://info-uri.info/ >.
The argument was (and is) that current Linked Data activities stress the
utility of web-actionable (dereferenceable) URIs, and that the http: scheme
together with DNS-provided namespace delegation provides a much more flexible
and lightweight mechanism than centralized registries of URI schemes and
sub-schemes. The latter usually have no resolution mechanisms, and even
where they do, the standard tools (browsers and so on) do not know about
them.

And indeed we see web friendly, official, canonical and persistent
identifiers emerge, like

http://catalogue.bnf.fr/ark:/12148/cb12284219c
http://www.idref.fr/03167142X/id
http://id.loc.gov/authorities/names/n82141174
http://lccn.loc.gov/n82141174
http://d-nb.info/gnd/117712582

(of which only the last is declared "the official URI in the Linked Data
sense" of the resource by the maintaining agency; the others are mere
"common convenience wrappers" for the string identification numbers
established by broad usage within several communities).

  % --

There are "good" URIs (persistently dereferenceable), but no
better or "best" ones. Whoever is making statements about a resource
does so by assigning a URI to it and formulating his assertions
with that URI in subject position. Our previous discussion has shown
that nothing is gained by recommending to always use private URIs
(within a namespace controlled by the agent formulating the statements),
but it has likewise shown that there is no universally canonical URI
all statements could (or should) converge on.
[< https://en.wikipedia.org/wiki/Uniform_resource_identifier#History > reports
that the "U" in URI stood for "Universal" in RFC1738 (actually RFC1630 used as
reference there) and was changed to "Uniform" in RFC 2396]

Our traditional identifiers however usually make claims for universality
(and sadly lack uniformity as we have seen), at least for the community
they are created for.
The GND number "117712582" as expressed in http://d-nb.info/gnd/117712582
is /published/ as an identifier together with the promise of a
major institution to reserve this number (and URI) eternally to
the resource currently described by the content you can fetch by
resolving the URL or looking up the number in the Web database.

Second and third parties referring to these numbers and/or URIs /implicitly/
claim identity of their respective resources, regardless of whether they
use the published URI syntactically as the subject URI of their statements
or a privately crafted one.

bf:identifiers provide a mechanism to model this usage even in case
of string-only identifiers (or /only/ in this case), but admittedly
they introduce a degree of indirection one would like to avoid.
However

< http://d-nb.info/gnd/117712582 > bf:type bf:person.
does not express the fact that this person is known to the GND
authority file and has a certain identifier there.
I have to combine it with the statements at
< http://d-nb.info/117712582/about/rdf >, namely
< http://d-nb.info/gnd/117712582 > gndo:gndIdentifier "117712582".
But for that I need deeper knowledge of the gndo vocabulary, typically
beyond my average reasoning capabilities (gndo:gndIdentifier serves
the same purpose as bf:identifierValue, only the identifierScheme
is implicit).

The more practical problem however is the lack of uniformity in
real world identifiers: "n82141174" is a semi-official variant of
the "real" identifier "n  82141174 " as noted in Field 010$a of
< http://id.loc.gov/authorities/names/n82141174.marcxml.xml >.
Thus I too see the need for providing prospective bibframe users
with guidance when it comes to canonicalization of identifiers
occurring in common identification systems.
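
As a minimal sketch of what such canonicalization guidance could look like
in code, the following Python function collapses the whitespace-padded form
from field 010$a into the compact form used in the id.loc.gov URIs. The
function name is hypothetical, and removing whitespace is only the simplest
part of the job; any further LCCN normalization rules are not covered here.

```python
def compact_lccn(raw: str) -> str:
    """Remove all whitespace from an LCCN-like string (illustrative only)."""
    # str.split() without arguments splits on any run of whitespace and
    # drops leading/trailing blanks, so joining the parts removes them all.
    return "".join(raw.split())


# The padded MARC form and the compact URI form become comparable.
print(compact_lccn("n  82141174 "))  # n82141174
```

A registry document for a scheme could state such a rule once, so that every
party arrives at the same string before crafting a URI from it.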

However I would propose to do this *not* by means of the URN or info
URI schemes, but rather by a space of http URIs within some
"bibframe" namespace. With that I could state:

<http://d-nb.info/gnd/117712582> bf:type bf:person;
  bf:identifier <http://bibframe.org/registry/GND/117712582>.

and resolving <http://bibframe.org/registry/GND/117712582>
could provide me with additional statements like

<http://bibframe.org/registry/GND/117712582> a bf:Identifier;
   bf:identifierRegisteredScheme <http://bibframe.org/registry/GND>;
   bf:identifierValue "117712582";
   bf:commonURI <http://d-nb.info/gnd/117712582>.

and <http://bibframe.org/registry/GND> would be a "registration"
document describing the "GND" authority file and give hints about
common forms of identifiers, their usage and especially what
forms of URIs or URLs using these identifiers are commonly
known (in the GND example there is not much to tell but to
report the fact that these http-URIs actually already are intended
to serve all purposes and demands in linked data contexts).

NOT resolving <http://bibframe.org/registry/GND/117712582> (or
not even making the corresponding statement) would still leave
the advantage that <http://d-nb.info/gnd/117712582> is a
common way of denoting the resource, which gives a high probability
that my statements can "coincidentally" be merged with statements
from other parties.
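
The "coincidental" merge can be illustrated with a small Python sketch,
assuming statements are modelled as plain (subject, predicate, object)
tuples. The second party's data and the example.org URI are made up for
illustration; only the GND URI and the two predicates come from the
examples above.

```python
# Statements as (subject, predicate, object) triples; values are illustrative.
GND_URI = "http://d-nb.info/gnd/117712582"

mine = {
    (GND_URI, "bf:type", "bf:person"),
}
theirs = {
    (GND_URI, "gndo:gndIdentifier", "117712582"),
    ("http://example.org/other", "bf:type", "bf:person"),  # hypothetical
}


def statements_about(subject, *graphs):
    """Collect every triple with the given subject across all graphs."""
    merged = set().union(*graphs)
    return sorted(t for t in merged if t[0] == subject)


# Both parties used the same subject URI, so their statements line up
# without any prior agreement between them.
for triple in statements_about(GND_URI, mine, theirs):
    print(triple)
```

Had either party used a privately crafted subject URI instead, the merge
would need an explicit identity statement first.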

In the /absence/ of any community practice for crafting URLs or
URNs from identifiers (which usually evolves rather straightforwardly
as soon as there are resolvable URLs related to the numbers)
the hypothetical bibframe registry could provide a second
namespace as a kind of crystallization point for URIs:

<http://bibframe.org/registry/ISIL> would be a "registration"
document describing "ISIL", give hints about common forms
of identifiers and their usage /and/ it would document that
a prefix "http://bibframe.org/aux/ISIL/" has been reserved
for crafting "good" URIs.
[German Wikipedia states that the ISIL system has applied for
registration within the info URI registry, but considering that the
registry is closed I don't see that ever happening. Also, ISILs are
defined by ISO 15511, but to my knowledge this does not imply that
a registry for that standard can automatically extend the official
urn:iso space with its identifiers.]

Therefore I am encouraged to formulate:

<http://bibframe.org/aux/ISIL/DE-Bo413> bf:type bf:Agent

and gain advantages from that alone. Optionally I can amend this
by
   bf:identifier <http://bibframe.org/registry/ISIL/DE-Bo413>
which would provide me with additional statements saying that
"DE-Bo413" /is/ an ISIL, where to find data associated with
it in the ISIL system and so on.

Note that the registry and the aux namespace would never be
implemented as real RDF stores; they are (for each of the
individual registered schemes) mere "resolvers", providing
programmatically derived reformulations and transformations
of identifiers according to the facts and knowledge about
those kinds of identifiers collected in the registration
process and expressed in the "registration document" mentioned
above.
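
To make the "resolver" idea concrete, here is a rough Python sketch of how
such statements could be derived purely from registration data, without any
store behind it. Everything in it is an assumption for illustration: the
function and variable names, the shape of the registration data, and in
particular the choice to report the aux URI as bf:commonURI for schemes
without an established URI practice.

```python
# Hypothetical registration data: scheme name -> template of the commonly
# used URI, or None where no community practice exists and the aux
# namespace is used instead.
REGISTRY = "http://bibframe.org/registry/"
AUX = "http://bibframe.org/aux/"
COMMON_URI_TEMPLATES = {
    "GND": "http://d-nb.info/gnd/{value}",
    "ISIL": None,
}


def describe(scheme: str, value: str) -> str:
    """Derive Turtle statements for one identifier; no store is consulted."""
    if scheme not in COMMON_URI_TEMPLATES:
        raise KeyError(f"scheme not registered: {scheme}")
    template = COMMON_URI_TEMPLATES[scheme]
    # Fall back to the aux namespace when no common URI form is registered.
    common = template.format(value=value) if template else f"{AUX}{scheme}/{value}"
    return (
        f"<{REGISTRY}{scheme}/{value}> a bf:Identifier;\n"
        f"   bf:identifierRegisteredScheme <{REGISTRY}{scheme}>;\n"
        f'   bf:identifierValue "{value}";\n'
        f"   bf:commonURI <{common}>.\n"
    )


print(describe("GND", "117712582"))
print(describe("ISIL", "DE-Bo413"))
```

The sketch does no validation or escaping of the identifier value; a real
resolver would apply the per-scheme canonicalization rules from the
registration document before building any URI.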


Best regards
Thomas Berger

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iJwEAQECAAYFAlPKhKEACgkQYhMlmJ6W47MXxwQAm4ujXh2JTyu3CURWDLl2eDQ9
TK9+blwZ2rGgfuXgVntyxUp+UPGk8iOwi5/VSHvLMYNvEuLb2ARMWyVlcvX0Nig/
yjsuZYCJxR88PmZfMnhztTS0xGX4p66ayqjBdQ4lf5Rr+WrjWQv6fBpI8lKkhvC4
uBHBfzcoMQKp3IhDTSg=
=RL3s
-----END PGP SIGNATURE-----