On 2/13/12 2:17 AM, Juha Hakala wrote:

>
> These URIs will only remain actionable as long as the service exists and
> the domain name registrations are maintained. It will be difficult to
> evaluate whether the level of organizational commitment behind a service
> / domain is sufficient.

Domain names or URNs are not going to make identifiers more or less 
reliable. Those are just names, and reliability comes from the 
organization or institutional commitment behind the name. My interest is 
in being able to know what the institution or project is, so that I get 
a hint of the reliability of the identifier. The issue is "hint" vs. "no 
idea", which is how I see the PURL situation.

Even with a seemingly rock-solid domain name, like xmlns.com/foaf/0.1/, 
you have to dig a bit to determine who is behind it. My objections to 
using FOAF for library data have been that although it was developed as 
a kind of W3C project it's really the work of a few individuals, with no 
institutional commitment, and not a very broad base of standards 
participants.

Tom Baker, of DCMI, has been suggesting that libraries take on the 
commitment of archiving and keeping alive ontologies that lose their 
backing, somewhat in the same way that we store and make available other 
kinds of materials. It does seem that there will need to be an actual 
effort to keep identifiers alive in this new environment, and I'm not 
yet sure who should do that. Libraries are one logical idea.

I like the idea of the ARK with its "call and response" about the 
institution and commitment, but it isn't being used, AFAIK.

Over the next few years we are going to see an explosion of identifier 
creation, and some of this will be under our control. Should we be 
thinking about how we go about this?

kc


> Still, such an evaluation should be made, since
> there will be plenty of links in our data, and they should all be kept
> alive. Dead links would / will seriously endanger the usefulness of
> linked data. I wonder if the pioneers have gathered any statistics of
> how long the links work?
>
> Generally, if an identifier has to be truly permanent, it should be
> independent of any underlying technologies (such as http protocol or
> domain name system).
>
>> Something in the URI gives you context. With PURLs that is not the
>> case. (Ditto URIs under vocab.org, which also make use of the PURL
>> service.)
>
> Trustworthiness of the context provided by the domain name in the URI
> will vary a lot, depending on the stability of the organization which
> has registered the domain. Domains belonging to e.g. the national
> libraries and archives may be relatively stable, but not every domain
> will be stable enough for creating identifiers based on it.
>
>> So here is yet another area of "best practices" worth discussing. I
>> see nothing wrong with
>>
>> http://id.loc.gov/ for LC authorities
>>
>> because LC is truly the owner for these vocabularies. VIAF makes sense
>> to me because OCLC *hosts* the service but is not the owner of the
>> data. It would make sense to me that ISBNs would have URIs under
>> http://isbn.org rather than http://bowker.com.
>
> These days the lifetime of a publisher can be short. I have no problem
> indicating the publisher in the registrant element within the ISBN
> string (not all people agree; when the ISBN was last revised, some other
> members in the working group were in favour of a "dumb", ISSN-like
> ISBN), but it would be short-sighted indeed to rely on publishers'
> domain names to make ISBNs actionable. A good best practice is to have
> as few and as reliable resolution services as possible.
>
> As few as possible will mean different things, depending on the
> identifier system. Since there is no global ISBN database, it would be
> hard to provide a satisfactory resolution service via http://isbn.org.
> There is also an additional problem that this domain does not belong to
> the International ISBN Agency, but to the U.S. ISBN Agency, hosted by
> Bowker.
>
> The ISSN International Centre could support a single point of access
> thanks to the ISSN Registry. Some other non-semantic ISO identifiers
> also require a centralized database which will simplify establishment of
> resolution services. But ISBNs are more complex. When used as URNs, they
> can be made actionable at the national level, provided that the
> resolution discovery service can parse the registration group element in
> order to see which ISBN resolver(s) will be able to respond. RFC 3187
> discusses this at length.
>
> Please note that in the absence of the global resolver discovery service
> it is necessary to use URNs as HTTP URIs, and in the case of ISBNs
> these URIs will identify the correct resolution service, so URNs such as:
>
> http://urn.fi/URN:ISBN:978-952-10-7579-7
>
> do not need any further processing to be actionable.
>
> Parsing the URN namespace specific string in order to find the correct
> urn:isbn resolver will only become necessary when it is possible to use
> plain URNs as hyperlinks. We do not know when that will happen.
>
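
The routing step described above (read the registration group element, then pick a resolver) could be sketched roughly as follows. This is only an illustration: the RESOLVERS table and both function names are hypothetical, and the single table entry is inferred from the example URI quoted above rather than taken from any registry. The sketch also assumes a hyphenated ISBN; splitting an unhyphenated one requires the ISBN range tables.

```python
# Hypothetical routing table: ISBN registration group -> resolver base URL.
# The one entry is an assumption inferred from the example URI in the thread.
RESOLVERS = {"952": "http://urn.fi/"}


def registration_group(urn: str) -> str:
    """Return the registration group element of a hyphenated urn:isbn."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn" or nid.lower() != "isbn":
        raise ValueError(f"not a urn:isbn identifier: {urn!r}")
    parts = nss.split("-")
    if len(parts) == 5:   # ISBN-13: prefix-group-registrant-publication-check
        return parts[1]
    if len(parts) == 4:   # ISBN-10: group-registrant-publication-check
        return parts[0]
    raise ValueError("expected a hyphenated ISBN")


def to_http_uri(urn: str) -> str:
    """Prefix the URN with the base URL of the resolver for its group."""
    group = registration_group(urn)
    base = RESOLVERS.get(group)
    if base is None:
        raise LookupError(f"no resolver known for registration group {group}")
    return base + urn


print(to_http_uri("URN:ISBN:978-952-10-7579-7"))
```

With URNs already embedded in HTTP URIs, none of this parsing is needed; it only matters once plain URNs have to be routed.
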
>> At the same time, if I needed to create a single RDF property to fill
>> out my metadata (for example, if I was using DCterms for everything
>> else) then it would make sense to me to use http://purl.org or
>> http://vocab.org as a simple way to obtain a URI, rather than getting
>> a separate domain name (that I might forget to renew).
>
> Persistence of domain names is an organizational problem. If an
> individual forgets to renew a registration (which can happen), the
> mistake can be corrected later on only if somebody else has not already
> re-registered the domain (nobody ever owns a domain name permanently).
> If the organization responsible disappears, somebody else must carry the
> torch onwards. We do not know in advance if this will happen or not. In
> contrast, persistent identifiers are permanent and will never be
> re-assigned. This applies to URN namespace registrations as well as ISBN
> registration group elements.
>
>> I don't, however, have a rule for this, nor even an idea of what are
>> the practical reasons for choosing one of these possibilities over
>> another. Having a best practice doesn't mean that you would force
>> people to follow it, it means that if people are pondering their
>> choices you give them useful guidance. Why make everyone spend energy
>> thinking about this?
>
> Different strategies may co-exist, so not everyone needs to apply the
> same best practice.
>
> RDA elements and vocabularies use URIs based on the domain
> http://rdvocab.info/. If there were any doubt concerning the longevity
> of this domain, or quality of services available via it, a library could
> create linked data by assigning persistent identifiers (URNs, Handles,
> PURLs) to the data elements. The PID resolver maintained by the library
> would then provide the links from the PID to the RDA element description
> at rdvocab.info. The local resolver, being smart, might also provide
> additional links to e.g. translations of the element descriptions stored
> locally, or other services of local importance.
>
> If for any reason http://rdvocab.info were no longer accessible,
> a replacement service could be established anywhere on the Web. There
> would be no need to change the PID-based links in the data; an update of
> the PID-to-URL mapping in the resolution service would do.
>
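
The indirection described above can be illustrated with a very small sketch, assuming an in-memory mapping. The PidResolver class, the PID and both target URLs are hypothetical placeholders, not real identifiers or an existing resolver API.

```python
class PidResolver:
    """Toy resolver: maps a persistent identifier to its current URL."""

    def __init__(self):
        self._targets = {}

    def register(self, pid: str, url: str) -> None:
        # Registering an existing PID again simply repoints it.
        self._targets[pid] = url

    def resolve(self, pid: str) -> str:
        return self._targets[pid]


# Hypothetical PID and URLs, for illustration only.
PID = "urn:nbn:fi:example-0001"
resolver = PidResolver()
resolver.register(PID, "http://rdvocab.info/example-element")

# The data stores only the PID, never the URL.
record = {"element": PID}

# If the original service disappears, only the mapping is updated;
# the record itself is untouched.
resolver.register(PID, "http://vocab.example.org/example-element")
print(resolver.resolve(record["element"]))
```

However many copies of the data exist, the fix is made in one place, the resolver.
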
> Since there can be a lot of copies of data, we should at all costs avoid
> using un-cool links at the data level. And when the level of coolness is
> evaluated it is better to be sceptical than to believe that all services
> will stay alive for a long time.
>
>>> The separate domain names make them much cooler.
>
> Perhaps. It may be true that viaf.org is cooler than e.g. viaf.oclc.org
> would have been, but I cannot see much difference between these domains
> if both of them have been registered by the same organization. And it is
> not possible to use separate domain names systematically. For instance,
> viaf.org was available and could be used, but isni.org was not, so the
> name of the ISNI database is isni.oclc.nl. But I assume that both
> viaf.org and isni.oclc.nl are meant to be (and will be) cool.
>
> Best regards,
>
> Juha
>>>
>>> --Th
>>>
>>> -----Original Message-----
>>> From: Bibliographic Framework Transition Initiative Forum
>>> [mailto:[log in to unmask]] On Behalf Of Juha Hakala
>>> Sent: Thursday, February 09, 2012 8:53 AM
>>> To: [log in to unmask]
>>> Subject: Re: [BIBFRAME] The German National Library's response
>>>
>>> Hello,
>>>
>>> Karen Coyle wrote:
>>>
>>>> Juha, thanks for the info regarding IETF activity. The issue I see with
>>>> URNs is not the structure but the minting: should libraries begin to
>>>> link their data I see a need for thousands or even tens of thousands of
>>>> identifiers (hundreds of thousands?) when we figure out a way to make
>>>> library holdings available to the linked data space. Surely we'll need
>>>> at least an identifier for each library. At least URIs piggy-back on the
>>>> domain system, which already exists.
>>>
>>> Yes, a lot of identifiers will be needed. And if someone prefers to use
>>> URNs for this purpose, RFC 3188bis (the revised namespace registration
>>> request for National Bibliography Numbers, NBNs) makes it clear that
>>> these identifiers can be assigned to data elements as well.
>>>
>>> Where these URN:NBNs resolve to and what kind of services they will be
>>> able to support will depend on the technical infrastructure available.
>>>>
>>>> Definitely, this gives us something to think about, and I have no doubt
>>>> that we could develop some kind of naming/identifying system to carry
>>>> this data. Obviously the first step is to figure out what we need to
>>>> identify, a kind of requirements study.
>>>
>>> Yes; and in addition we may need to consider what kind of services the
>>> identified things require.
>>>
>>>> What I dislike about the persistent identifier is that you lose the link
>>>> to the originating agency that you have in the URI. That might be just a
>>>> "human thing" - that I feel better when looking at the URI that I can
>>>> see WHO is responsible.
>>>
>>> A persistent identifier may show the originating agency as well. Whether
>>> it does or doesn't depends on the identifier system used. With URN:NBN
>>> the namespace specific string (the identifier part of the URN) may be
>>> semantic, if that is the preference of the organization assigning those
>>> identifiers. But in the long run it may not be a good idea to include
>>> the originating agency in the identifier, since organisations (and
>>> even more so, their domain names) may be more short-lived than the
>>> things they create. Cool URIs, just like semantic identifiers, may tell
>>> who originated the resource, but there is a good chance that they do not
>>> tell who is currently responsible for keeping the resource available. A
>>> different method for finding this out must be available.
>>>
>>>> ARKs, of course, give you both, at least in
>>>> theory. Is anyone using the "?" feature of ARKs that lets you query for
>>>> that information? Should such info be part of our best practices?
>>>
>>> I don't know if the "?" and "??" features of ARK are in use, and if so,
>>> by whom. John Kunze may be able to tell that. But I do think that
>>> providing this functionality in a PID system is a good idea, and will
>>> "lend" it into the URN system (in case John doesn't mind ;-)). Although
>>> the practical implementation in the URN system will probably be an
>>> option of retrieving preservation metadata / rights metadata about the
>>> resource.
>>>
>>> Revised version of the URN syntax (RFC2141bis) allows the use of <query>
>>> and <fragment>. <query> will never be part of the URN, but it could be
>>> used to carry service-related information. For example, this base URN:
>>>
>>> http://urn.fi/URN:ISBN:978-952-10-7612-1
>>>
>>> provides the user the default service (splash page describing the
>>> resource, and providing a link to the book), but this URN:
>>>
>>> http://urn.fi/URN:ISBN:978-952-10-7612-1?I2C
>>>
>>> will supply descriptive metadata about the resource in the default
>>> format, provided that the resolution service knows how to deal with the
>>> service request in <query> (I2C = URI to resource description).
>>>
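
How a resolver might separate the base URN from the service request carried in the query can be sketched as below. This is a minimal illustration using Python's standard urllib.parse; the function name split_service_request is hypothetical, and treating an empty query as "default service" is an assumption based on the description above.

```python
from urllib.parse import urlsplit


def split_service_request(uri: str):
    """Split an HTTP-embedded URN into (urn, service request or None)."""
    parts = urlsplit(uri)
    urn = parts.path.lstrip("/")   # e.g. "URN:ISBN:978-952-10-7612-1"
    service = parts.query or None  # e.g. "I2C"; None means default service
    return urn, service


for uri in (
    "http://urn.fi/URN:ISBN:978-952-10-7612-1",
    "http://urn.fi/URN:ISBN:978-952-10-7612-1?I2C",
):
    print(split_service_request(uri))
```

The URN itself is identical in both cases; only the requested service differs, which is the point of keeping the query outside the identifier.
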
>>> In the context of linked data, we might be interested in enabling, for
>>> instance, retrieval of the definition of a concept in the chosen language
>>> (?ENG for English, ?SWE for Swedish, and so on). Whatever linking
>>> mechanisms are used (PIDs, cool URIs or something else) they should
>>> enable us to do whatever needs to be done.
>>>
>>> Links are an essential feature in linked data, and we should plan
>>> carefully the implementation of this functionality - and not take for
>>> instance the functionality cool URIs are currently providing as the
>>> predetermined basis for our work.
>>>
>>> All the best,
>>>
>>> Juha
>>>>
>>>> kc
>>>>
>>>>>
>>>>>>> - what should the URI resolve to?
>>>>>
>>>>> URN-related RFCs are currently being revised (see
>>>>> http://datatracker.ietf.org/wg/urnbis/). I am currently writing a new
>>>>> version of RFC 2483, which specifies the resolution services URN can
>>>>> provide. In the present RFC 2483 the list of services is fixed. RFC
>>>>> 2483bis will be based on the idea that IANA should establish a registry
>>>>> of informal and formal resolution services. Then URN user communities
>>>>> could register new services at will (and parameters to these services,
>>>>> for instance for requesting descriptive metadata about the resource in
>>>>> different formats).
>>>>>
>>>>> Existing persistent identifier systems provide a diverse set of
>>>>> services. With ARK, for instance, it is possible to check the
>>>>> preservation commitment of the organisation holding a resource. I don't
>>>>> know if the PID systems will become more homogeneous in this respect in
>>>>> the future.
>>>>>
>>>>> Nobody knows what the URIs utilized within this initiative should
>>>>> resolve to, but I am sure that the mechanism to be built should be
>>>>> flexible so that it can be adjusted to meet the future needs we don't
>>>>> foresee yet.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Juha
>>>>>
>>>>>>>
>>>>>>> That kind of thing.
>>>>>>>
>>>>>>
>>>>>> Does anyone know an answer to any of these questions? Therefore, I
>>>>>> think, no URI is better than no URI at all. Use brief and simple and
>>>>>> easily memorized codes for vocabularies like the terms in 337-338, and
>>>>>> use IDnumbers for names and subjects and titles.
>>>>>> Any implementation can easily relate them to all sorts of URIs that may
>>>>>> be in current use or follow best practice or resolve to something
>>>>>> useful for the purpose at hand. Verbal terms need changes and are
>>>>>> language-bound, URLs are perishable, only codes and numbers are robust,
>>>>>> easy to handle, and versatile.
>>>>>>
>>>>>> B.Eversberg
>>>>>
>>>>
>>>
>>
>

-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet