Karen Coyle wrote:
> The difference that I see between PURLs and VIAF IDs is that VIAF IDs
> are limited to the VIAF service, while anyone can create a PURL, and
> there is not a single application for things identified by PURLs.
> It seems to me that many URIs tend to be created around a single service
> or application.
> http://xmlns.com/foaf/ for FOAF
> http://rdvocab.info/ for RDA elements and vocabularies
> http://dewey.info/ for the Dewey in RDF
These URIs will only remain actionable as long as the service exists and
the domain name registrations are maintained. It will be difficult to
evaluate whether the level of organizational commitment behind a service
/ domain is sufficient. Still, such an evaluation should be made, since
there will be plenty of links in our data, and they should all be kept
alive. Dead links would / will seriously endanger the usefulness of
linked data. I wonder if the pioneers have gathered any statistics on
how long the links keep working?
Generally, if an identifier has to be truly permanent, it should be
independent of any underlying technologies (such as the HTTP protocol or
the Domain Name System).
> Something in the URI gives you context. With PURLs that is not the case.
> (Ditto URIs under vocab.org, which also make use of the PURL service.)
Trustworthiness of the context provided by the domain name in the URI
will vary a lot, depending on the stability of the organization which
has registered the domain. Domains belonging to e.g. the national
libraries and archives may be relatively stable, but not every domain
will be stable enough for creating identifiers based on it.
> So here is yet another area of "best practices" worth discussing. I see
> nothing wrong with
> http://id.loc.gov/ for LC authorities
> because LC is truly the owner for these vocabularies. VIAF makes sense
> to me because OCLC *hosts* the service but is not the owner of the data.
> It would make sense to me that ISBNs would have URIs under
> http://isbn.org rather than http://bowker.com.
These days the lifetime of a publisher can be short. I have no problem
indicating the publisher in the registrant element within the ISBN
string (not all people agree; when the ISBN was last revised, some other
members in the working group were in favour of a "dumb", ISSN-like
ISBN), but it would be short-sighted indeed to rely on publishers'
domain names to make ISBNs actionable. A good best practice is to have
as few and as reliable resolution services as possible.
As few as possible will mean different things, depending on the
identifier system. Since there is no global ISBN database, it would be
hard to provide a satisfactory resolution service via http://isbn.org.
There is also an additional problem that this domain does not belong to
the International ISBN Agency, but to the U.S. ISBN Agency, hosted by
The ISSN International Centre could support a single point of access
thanks to the ISSN Registry. Some other non-semantic ISO identifiers
also require a centralized database which will simplify establishment of
resolution services. But ISBNs are more complex. When used as URNs,
they can be made actionable at the national level, provided that the
resolution discovery service can parse the registration group element in
order to see which ISBN resolver(s) will be able to respond. RFC 3187
discusses this at length.
Please note that in the absence of a global resolver discovery service
it is necessary to express URNs as HTTP URIs, and in the case of ISBNs
these URIs will identify the correct resolution service, so URNs such as:
do not need any further processing to be actionable.
Parsing the URN namespace specific string in order to find the correct
urn:isbn resolver will only become necessary when it is possible to use
plain URNs as hyperlinks. We do not know when that will happen.
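To make the parsing step a bit more concrete, here is a minimal sketch in
Python. It is only an illustration, not what RFC 3187 specifies: the
registration group values in the table are real ISBN groups, but the
resolver base URLs, the function names and the tiny table itself are
assumptions made up for this example. A real service would need the full
registration group ranges from the International ISBN Agency.

```python
# Hypothetical table: ISBN prefix + registration group -> national resolver.
# The group values are real; the resolver base URLs are invented placeholders.
RESOLVERS = {
    "978951": "https://resolver.example.fi/",
    "978952": "https://resolver.example.fi/",
    "97891": "https://resolver.example.se/",
    "9783": "https://resolver.example.de/",
}


def isbn_digits(urn):
    """Return the bare 13 digits of the ISBN carried in a urn:isbn string."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or parts[1].lower() != "isbn":
        raise ValueError(f"not a urn:isbn: {urn!r}")
    digits = parts[2].replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError(f"expected a 13-digit ISBN: {urn!r}")
    return digits


def find_resolver(urn):
    """Return the resolver base URL for the ISBN's registration group, or None."""
    digits = isbn_digits(urn)
    for prefix, base in RESOLVERS.items():
        if digits.startswith(prefix):
            return base
    return None


def as_http_uri(urn):
    """Embed the URN in an HTTP URI of the chosen resolver (None if unknown)."""
    base = find_resolver(urn)
    return None if base is None else base + urn
```

With this table, a made-up URN such as urn:isbn:978-951-0-00000-0 would be
routed to the placeholder Finnish resolver, while an ISBN from a group not
in the table gets no resolver at all.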
> At the same time, if I needed to create a single RDF property to fill
> out my metadata (for example, if I was using DCterms for everything
> else) then it would make sense to me to use http://purl.org or
> http://vocab.org as a simple way to obtain a URI, rather than getting a
> separate domain name (that I might forget to renew).
Persistence of domain names is an organizational problem. If an
individual forgets to renew a registration (which can happen), the
mistake can be corrected later on only if somebody else has not already
re-registered the domain (nobody ever owns a domain name permanently).
If the organization responsible disappears, somebody else must carry the
torch onwards. We do not know in advance if this will happen or not. In
contrast, persistent identifiers are permanent and will never be
re-assigned. This applies to URN namespace registrations as well as ISBN
registration group elements.
> I don't, however, have a rule for this, nor even an idea of what are the
> practical reasons for choosing one of these possibilities over another.
> Having a best practice doesn't mean that you would force people to
> follow it, it means that if people are pondering their choices you give
> them useful guidance. Why make everyone spend energy thinking about this?
Different strategies may co-exist, so not everyone needs to apply the
same best practice.
RDA elements and vocabularies use URIs based on the domain
http://rdvocab.info/. If there were any doubt concerning the longevity
of this domain, or quality of services available via it, a library could
create linked data by assigning persistent identifiers (URNs, Handles,
PURLs) to the data elements. The PID resolver maintained by the library
would then provide the links from the PID to the RDA element description
at rdvocab.info. The local resolver, being smart, might also provide
additional links to e.g. translations of the element descriptions stored
locally, or other services of local importance.
If for any reason http://rdvocab.info were no longer accessible,
a replacement service could be established anywhere on the Web. There
would be no need to change the PID-based links in the data; an update of
the PID-to-URL mapping in the resolution service would do.
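As a rough sketch of what such a mapping update could look like (an
in-memory toy, not a description of any existing resolver): the class
name, the method names, the example PIDs and the replacement host are all
hypothetical, and the element path under rdvocab.info is made up.

```python
class PidResolver:
    """Toy resolver: maps a persistent identifier to its current URL."""

    def __init__(self):
        self._targets = {}

    def register(self, pid, url):
        """Record (or overwrite) the URL a PID currently points to."""
        self._targets[pid] = url

    def resolve(self, pid):
        """Return the current URL for a PID; raises KeyError if unknown."""
        return self._targets[pid]

    def rehost(self, old_base, new_base):
        """Repoint every target under old_base to new_base.

        The PIDs themselves are untouched, so links stored in the data
        keep working. Returns the number of mappings that were changed.
        """
        moved = 0
        for pid, url in list(self._targets.items()):
            if url.startswith(old_base):
                self._targets[pid] = new_base + url[len(old_base):]
                moved += 1
        return moved


resolver = PidResolver()
# Hypothetical PID and made-up element path, for illustration only.
resolver.register("urn:nbn:example-1", "http://rdvocab.info/some/element")
# If the original host disappears, only the mapping is updated:
resolver.rehost("http://rdvocab.info/", "https://rda.example.org/")
```

After the rehost call, resolving urn:nbn:example-1 returns the address on
the replacement host, and no record containing the PID had to be edited.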
Since there can be a lot of copies of data, we should at all costs avoid
using un-cool links at the data level. And when the level of coolness is
evaluated it is better to be sceptical than to believe that all services
will stay alive for a long time.
>> The separate domain names make them much cooler.
Perhaps. It may be true that viaf.org is cooler than e.g. viaf.oclc.org
would have been, but I cannot see much difference between these domains
if both of them have been registered by the same organization. And it is
not possible to use separate domain names systematically. For instance,
viaf.org was available and could be used, but isni.org was not, so the
name of the ISNI database is isni.oclc.nl. But I assume that both
viaf.org and isni.oclc.nl are meant to be (and will be) cool.
>> -----Original Message-----
>> From: Bibliographic Framework Transition Initiative Forum
>> [mailto:[log in to unmask]] On Behalf Of Juha Hakala
>> Sent: Thursday, February 09, 2012 8:53 AM
>> To: [log in to unmask]
>> Subject: Re: [BIBFRAME] The German National Library's response
>> Karen Coyle wrote:
>>> Juha, thanks for the info regarding IETF activity. The issue I see with
>>> URNs is not the structure but the minting: should libraries begin to
>>> link their data I see a need for thousands or even tens of thousands of
>>> identifiers (hundreds of thousands?) when we figure out a way to make
>>> library holdings available to the linked data space. Surely we'll need
>>> at least an identifier for each library. At least URIs piggy-back on the
>>> domain system, which already exists.
>> Yes, a lot of identifiers will be needed. And if someone prefers to use
>> URNs for this purpose, RFC 3188bis (the revised namespace registration
>> request for National Bibliography Numbers, NBNs) makes it clear that
>> these identifiers can be assigned to data elements as well.
>> Where these URN:NBNs resolve to and what kind of services they will be
>> able to support will depend on the technical infrastructure available.
>>> Definitely, this gives us something to think about, and I have no
>>> doubt that we could develop some kind of naming/identifying system to carry
>>> this data. Obviously the first step is to figure out what we need to
>>> identify, a kind of requirements study.
>> Yes; and in addition we may need to consider what kind of services the
>> identified things require.
>>> What I dislike about the persistent identifier is that you lose the
>>> to the originating agency that you have in the URI. That might be just
>>> a "human thing" - that I feel better when looking at the URI that I can
>>> see WHO is responsible.
>> A persistent identifier may show the originating agency as well. Whether
>> they do or don't, depends on the identifier system used. With URN:NBN
>> the namespace specific string (the identifier part of the URN) may be
>> semantic, if that is the preference of the organization assigning those
>> identifiers. But in the long run it may not be a good idea to include
>> the originating agency into the identifier, since organisations (and
>> even more so, their domain names) may be more short-lived than the
>> things they create. Cool URIs, just like semantic identifiers, may tell
>> who originated the resource, but there is a good chance that they do not
>> tell who is currently responsible for keeping the resource available. A
>> different method for finding this out must be available.
>>> ARKs, of course, give you both, at least in
>>> theory. Is anyone using the "?" feature of ARKs that lets you query
>>> that information? Should such info be part of our best practices?
>> I don't know if the "?" and "??" features of ARK are in use, and if so,
>> by whom. John Kunze may be able to tell that. But I do think that
>> providing this functionality in a PID system is a good idea, and will
>> "lend" it into the URN system (in case John doesn't mind ;-)). Although
>> the practical implementation in the URN system will probably be an
>> option of retrieving preservation metadata / rights metadata about the
>> Revised version of the URN syntax (RFC2141bis) allows the use of <query>
>> and <fragment>. <query> will never be part of the URN, but it could be
>> used to carry service-related information. For example, this base URN:
>> provides the user the default service (splash page describing the
>> resource, and providing a link to the book), but this URN:
>> will supply descriptive metadata about the resource in the default
>> format, provided that the resolution service knows how to deal with the
>> service request in <query> (I2C = URI to resource description).
>> In the context of linked data, we might be interested in enabling for
>> instance retrieval of the definition of a concept in the chosen language
>> (?ENG for English, ?SWE for Swedish, and so on). Whatever linking
>> mechanisms are used (PIDs, cool URIs or something else) they should
>> enable us to do whatever needs to be done.
>> Links are an essential feature in linked data, and we should plan
>> carefully the implementation of this functionality - and not take for
>> instance the functionality cool URIs are currently providing as the
>> predetermined basis for our work.
>> All the best,
>>>>>> - what should the URI resolve to?
>>>> URN-related RFCs are currently being revised (see
>>>> http://datatracker.ietf.org/wg/urnbis/). I am currently writing a new
>>>> version of RFC 2483, which specifies the resolution services URN can
>>>> provide. In the present RFC 2483 the list of services is fixed. RFC
>>>> 2483bis will be based on the idea that IANA should establish a
>>>> registry of informal and formal resolution services. Then URN user
>>>> communities could register new services at will (and parameters to these
>>>> services, for instance for requesting descriptive metadata about the
>>>> resource in different formats).
>>>> Existing persistent identifier systems provide a diverse set of
>>>> services. With ARK, for instance, it is possible to check the
>>>> preservation commitment of the organisation holding a resource. I
>>>> don't know if the PID systems will become more homogeneous in this
>>>> respect in the future.
>>>> Nobody knows what the URIs utilized within this initiative should
>>>> resolve to, but I am sure that the mechanism to be built should be
>>>> flexible so that it can be adjusted to meet the future needs we don't
>>>> foresee yet.
>>>> Best regards,
>>>>>> That kind of thing.
>>>>> Does anyone know an answer to any of these questions? Therefore, I
>>>>> think, no URI is better than no URI at all. Use brief and simple and
>>>>> easily memorized codes for vocabularies like the terms in 337-338,
>>>>> use IDnumbers for names and subjects and titles.
>>>>> Any implementation can easily relate them to all sorts of URIs that
>>>>> be in current use or follow best practice or resolve to something
>>>>> useful for the purpose at hand. Verbal terms need changes and are
>>>>> language-bound, URLs are perishable, only codes and numbers are
>>>>> easy to handle, and versatile.
Senior advisor, standardisation and IT
The National Library of Finland
P.O.Box 15 (Unioninkatu 36, room 503), FIN-00014 Helsinki University
Email [log in to unmask], tel +358 50 382 7678