Hello Thomas; all,
Some comments in-line.
On 16.7.2014 18:41, Thomas Berger wrote:
> Just for the record:
> ISBNs, especially since 2007, have very strict semantics, at least when
> you ask the ISBN agency (their FAQ will gladly tell you).
I was a member of the ISO working group that revised the ISBN
standard. In my opinion ISBN has always had clearly defined semantics,
and that did not change for better or worse in the latest revision.
> Sadly, libraries in particular played their part in blurring those semantics,
> but that is a different story.
Can you elaborate a bit on what you mean by this?
In the ISO working group I mentioned, people representing publishers
wanted to change ISBN into a non-semantic (ISSN-like) identifier. One of
the arguments used was that when a publisher disappears, the information
provided by the ISBN is no longer valid. However, the working group members
representing libraries and retailers did not want that kind of overhaul
of the system. And from a library's point of view, the disappearance of a
publisher has no impact on who published the book in the first place.
Another problem with ISBN semantics is that in some countries publishers
want to assign the same ISBN to all versions of a book. Libraries
are opposed to such a practice.
> Real world identifiers like ISBN play an important role and it would be
> bold to deny that or to assume that this role has become obsolete because
> there is now the new and shiny semantic web and its URIs solving any
> problems the legacy identifiers are stuck with.
Not only bold but foolish. Identifier assignment must be a managed
process; otherwise it is difficult to guarantee identifier persistence
and uniqueness. With something like urn:isbn:<isbn> I know that this
string is a reliable identifier; with random URIs it is anybody's guess
whether they were even intended (and by whom?) to serve as identifiers.
> In a sense, real world identifiers are quite different from semantic
> web identifiers. ISBNs for instance have an internal structure
> reflecting the delegation of the identifier space to agencies and
> publishers, and they have a check digit for processing. And contrary
> to semantic web principles these properties are "leaked" on purpose
> to the community. This helps in recognizing ISBNs in different contexts,
> validating them and hunting down resources - very practical tasks
> in the real world but of no relevance to the semantic web.
There are traditional bibliographic identifiers which are non-semantic;
ISSN is an example of that. Many new ISO identifier systems have no
internal structure (apart from an identifier string and a check digit),
and identifier assignment is centralized.
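As an aside, the check digit mentioned above is easy to illustrate for ISBN-13. The following is only a minimal sketch; the function names are mine and are not taken from any standard or library. It uses the usual alternating 1/3 weighting over the first twelve digits.

```python
def isbn13_check_digit(first12: str) -> int:
    """Return the check digit for the first 12 digits of an ISBN-13."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check length, character set and check digit; hyphens and spaces are ignored."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])
```

A check like this only tells us that the string is well formed; it says nothing about whether the ISBN has actually been assigned to a publication.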
Even if the identifier is semantic, it may be difficult to find out where
it can be resolved. ISBN is a good example of this, as you can see from
the ISBN namespace registration and its revised version. For instance,
the resolver for ISBNs starting with 978-3 may be located in Germany,
Austria or Switzerland (assuming that URN:ISBN resolvers are maintained
at the national level).
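To make the ambiguity concrete, here is a rough sketch of how a client might map the leading part of an ISBN-13 to the countries where a resolver could live. The table is a tiny hand-picked excerpt that I made up for illustration (it is not the official range data), and the function name is hypothetical.

```python
# Illustrative excerpt only; a real client would need the complete,
# officially maintained range data.
CANDIDATE_COUNTRIES = {
    "9783": ["Germany", "Austria", "Switzerland"],  # German language group
    "978951": ["Finland"],
    "978952": ["Finland"],
}


def candidate_resolver_countries(isbn: str) -> list[str]:
    """Return the countries whose national resolver might know this ISBN."""
    digits = isbn.replace("-", "").replace(" ", "")
    for prefix, countries in CANDIDATE_COUNTRIES.items():
        if digits.startswith(prefix):
            return countries
    return []  # prefix not covered by this small excerpt
```

For a 978-3 ISBN the function returns three candidates, so the client still has to query each of them or rely on some other routing mechanism.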
>> An interesting challenge for library catalogs would be if e.g. the
>> publishing industry started to move their ISBN numbering system into the
>> web and introduced URIs, also for existing ISBNs. Then we'd have two
>> aspects of the same thing - the web resource of the ISBN and the legacy
>> ISBN thing, the "string thing". In such cases, a skilled programmer must
>> perform the "heavy lifting" so that a catalog still can ensure that
>> equivalent ISBNs of whatever semantics are still identifying the same thing
>> - since it is the user of the library catalog that is looking for the one
>> and only result that may be denoted by many different forms of an
ISBN does not belong to the publishing industry; the main interest
groups involved with its development are publishers, book retailers and
libraries. These groups have occasionally had conflicting interests but
have always been able to agree on how to proceed.
Both DOI and URN have specified how to migrate ISBNs into URIs. People
may also incorporate ISBNs into ARKs, Handles, PURLs, etc. These
persistent identifiers provide (different) services and therefore they
should all be incorporated in bibliographic records.
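For the URN case the mapping is mechanical. Below is a minimal sketch, assuming that we normalize everything to an unhyphenated ISBN-13 first (dropping the hyphens is a choice made for this sketch; the function names are mine). An ISBN-10 is converted by prefixing 978 and recomputing the check digit. The input check digit is not verified here.

```python
def to_isbn13(isbn: str) -> str:
    """Normalize an ISBN-10 or ISBN-13 string to 13 digits without hyphens."""
    digits = isbn.replace("-", "").replace(" ", "").upper()
    if len(digits) == 13 and digits.isascii() and digits.isdigit():
        return digits
    if len(digits) == 10 and digits[:9].isascii() and digits[:9].isdigit():
        # Drop the old ISBN-10 check character, prefix 978, then recompute
        # the check digit with the ISBN-13 weighting (1, 3, 1, 3, ...).
        body = "978" + digits[:9]
        total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
        return body + str((10 - total % 10) % 10)
    raise ValueError(f"not an ISBN-10 or ISBN-13: {isbn!r}")


def isbn_to_urn(isbn: str) -> str:
    """Build a urn:isbn:<isbn> string from either ISBN form."""
    return "urn:isbn:" + to_isbn13(isbn)
```

With this, the legacy "string thing" and the URN form can be matched in a catalog by normalizing both to the same 13 digits.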
> A more interesting question would be why the ISBN agency never introduced
> official URIs for their numbers or endorsed the URN:ISBN scheme: Maybe
> they too feared some kind of pollution: Imagine what would happen if
> publishers started to print URIs on the book jackets, probably in
> addition to the existing representations as strings and bar codes. And
> the potential for confusion in the presence of misspellings, typos or
> general cluelessness. Also I think that current business applications
> would not gain from URI versions of ISBNs: They already /know/ what they
> are dealing with.
The International ISBN Agency was closely involved with the registration of
the URN namespace. The registration request was written by myself and
the (past) chairperson of the International ISBN Agency. I am working with the
current chairperson to revise the existing ISBN namespace registration
(revision is required due to the introduction of ISBN-13).
In my opinion, the International ISBN Agency has definitely endorsed
the URN:ISBN scheme.
I agree that current ISBN users know what they are dealing with. We need
to make ISBNs actionable, and URN enables us to do this. For instance,
several universities and polytechnics in Finland are using URN:ISBNs to
provide persistent links to digitized versions of (doctoral)
dissertations. These links include the address of the URN resolver
(maintained by the national library) like this:
Our business applications have definitely benefited from this. And these
URLs could be printed in books, since URLs like the one above are likely
to be persistent.
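The construction of such links is simple enough to sketch. Please note that the resolver address below is a placeholder I invented for illustration, not the address of any real service, and the function name is hypothetical as well.

```python
from urllib.parse import quote

# Hypothetical resolver address; substitute the real one for your service.
RESOLVER_BASE = "https://resolver.example.org/"


def actionable_link(isbn: str, resolver_base: str = RESOLVER_BASE) -> str:
    """Combine a resolver address and a URN:ISBN into a clickable URL."""
    urn = "URN:ISBN:" + isbn.strip()
    # Keep ':' readable in the path; everything unsafe is percent-encoded.
    return resolver_base.rstrip("/") + "/" + quote(urn, safe=":-")
```

The persistence of such a link depends on the organization running the resolver, not on the string itself, which is why the resolver address is kept as a separate parameter.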
>> My perception is that Bibframe is not providing any consensus mechanisms
>> except for "bibframe entities", which are only a small excerpt of the
>> Semantic Web. An open story is when e.g. catalogs are used outside the
>> library community scope, or merged into new data pools, and entities have
>> to be matched, Bibframe and non-Bibframe ones. This is not a new topic and
>> it is not specific to Semantic Web, but it is a strong advantage of the
>> Semantic Web. Maybe there is high hope for improved library catalog data
>> consensus by using Bibframe, but I am about to lose my optimism.
> Bibframe will not be in a position to cut the bonds of libraryland with
> real world phenomena. Thus even if it could impose exactly one kind of
> string or - even better - URI representation for ISBNs and other identifiers
> for all libraries in the world - what would be gained by that? Publishers
> and patrons will continue to confront us with real world forms of real
> world identifiers and continuously transforming between different
> representations of identifiers or denominations of resources will remain
> one of the main tasks of our applications.
If URI representations of ISBNs and other traditional identifiers are
based on persistent identifiers such as URNs, DOIs, Handles etc., the
chances of avoiding chaos are improved because identification of
resources (and establishment of links from identifiers to resources
themselves) remains a managed process. If any patron can mint URIs for
any resource and claim that they are (persistent) identifiers, the whole
system is standing on feet of clay, starting from the fact that nobody will
ever own a domain; we can only rent them.
PS. As a creature of "libraryland", my views may be slanted, but I have
never thought that libraries would not be part of the real world :-).
Any organization which has managed to stay in business as long as we
have must have done a lot of things right.
> viele Gruesse
> Thomas Berger
The National Library of Finland
Library Network Services
P.O.Box 26 (Teollisuuskatu 23)
FIN-00014 Helsinki University
Tel. +358 9 191 44293
Mobile +358 50 3827678