Hi,

I would like to share my personal opinion. 

The German National Library (DNB) has released the GND in an RDF Turtle Dump under a CC0 
license. More information: 

http://www.dnb.de/DE/Service/DigitaleDienste/LinkedData/linkeddata_node.html

-> "Download der Linked Data Dumps" (download of the Linked Data dumps)

What does that mean? Well, each of us can download a GND base set in RDF, put it into a search 
engine, for example into Elasticsearch (it took only a few minutes to index 9,493,987 subject-URI-based 
documents consisting of a total of 97,267,642 triples), or into a triple store like 4store, and 
start using GND locally as a source for authority control and for building mashups with other 
bibliographic and non-bibliographic data.
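
As a rough illustration of that workflow, here is a minimal Python sketch, assuming the dump has 
been saved locally as GND.ttl (a hypothetical file name) and that the rdflib and elasticsearch 
client libraries are installed. A full-sized dump of ~97 million triples would of course be 
streamed rather than loaded into memory at once; this only shows the idea:

from collections import defaultdict

from elasticsearch import Elasticsearch, helpers
from rdflib import Graph

# Parse the downloaded GND dump (hypothetical file name).
g = Graph()
g.parse("GND.ttl", format="turtle")

# Group all triples by their subject URI, so each subject becomes one
# searchable document.
docs = defaultdict(list)
for s, p, o in g:
    docs[str(s)].append({"predicate": str(p), "object": str(o)})

# Bulk-index one document per subject URI into a local Elasticsearch.
es = Elasticsearch("http://localhost:9200")
actions = (
    {"_index": "gnd", "_id": subject, "_source": {"subject": subject, "statements": stmts}}
    for subject, stmts in docs.items()
)
helpers.bulk(es, actions)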

Setting up an OAI-PMH client, for example, completes the scenario. By fetching RDF/XML updates from 
the DNB on a regular basis you will always have the most recent authoritative data. This is not a 
vision, it is reality.
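
For illustration, a minimal harvesting sketch in Python using plain OAI-PMH over HTTP. The endpoint 
URL and the metadataPrefix value are placeholders of my own and should be taken from the DNB 
documentation linked above:

import requests
import xml.etree.ElementTree as ET

OAI_ENDPOINT = "https://services.dnb.de/oai/repository"  # assumed endpoint
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def harvest(from_date):
    """Fetch all records changed since from_date, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": "RDFxml", "from": from_date}
    while True:
        root = ET.fromstring(requests.get(OAI_ENDPOINT, params=params).content)
        for record in root.iterfind(".//oai:record", NS):
            yield record                      # each record carries the RDF/XML payload
        token = root.find(".//oai:resumptionToken", NS)
        if token is None or not (token.text or "").strip():
            break                             # no more pages
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

for rec in harvest("2012-01-01"):
    pass  # update the local index / triple store with the changed record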

A central triple store, by contrast, would be a major drawback: it does not scale. Each library has 
a lot of users and applications, and a single shared triple store would soon collapse under that 
load. A good strategy is to put URI-based authorities under an open data license and to encourage 
and enable everybody around the world to use them, too.

With RDF you can organize the data not only in records, but also as a graph of bibliographic 
entities. Such a graph has a wealth of sub-graphs, attributes, and other properties. Well, if you 
prefer, you can interpret the RDF graph of the GND as a sequence of records keyed by subject URI, 
just as you would with MARC record collections, for example to build searchable documents. But you 
are no longer restricted to the record model. An RDF graph has an abstract semantic interpretation 
and follows the rules of the W3C: it describes statements about resources and their facts 
(literals), and its vocabulary is governed by ontologies that are themselves part of the Semantic Web.
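
To make the contrast concrete, here is a small sketch of the graph view: a SPARQL query over the 
locally loaded dump that follows links between entities instead of reading record fields. The 
property names from the GND ontology namespace are quoted from memory here and should be verified 
against the published element set:

from rdflib import Graph

g = Graph()
g.parse("GND.ttl", format="turtle")   # hypothetical local copy of the dump

q = """
PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
SELECT ?person ?name ?occupation WHERE {
  ?person gndo:preferredNameForThePerson ?name ;
          gndo:professionOrOccupation ?occupation .
}
LIMIT 10
"""
for person, name, occupation in g.query(q):
    print(person, name, occupation)   # entities linked across the graph, not record fields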

By using one of the many RDF serializations, bibliographic data can be packaged for transport 
purposes. If you need to transport such packages over the wire, you can choose between formats 
such as N-Triples, N3, Turtle, or RDF/XML. You are no longer restricted to the record-centered ISO 
2709 format family with its ancient character encodings, or to XML wrappers around ISO 2709 that 
inherit all of its weaknesses, since they have no way to link to external bibliographic entities 
or to reference them in a stable, reliable manner.
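
A short sketch of such repackaging with rdflib, using an invented two-statement Turtle example 
rather than real GND data:

from rdflib import Graph

turtle_data = """
@prefix gndo: <http://d-nb.info/standards/elementset/gnd#> .
<http://d-nb.info/gnd/123456789> gndo:preferredNameForThePerson "Example, Person" .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

print(g.serialize(format="nt"))       # N-Triples, one statement per line
print(g.serialize(format="xml"))      # RDF/XML, e.g. for OAI-PMH payloads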

We all know that with the Internet, the massive number of mobile devices, and broadband connectivity, 
transporting records in file packages from one place to another, as our elders had to do on 
magnetic tapes for lack of affordable online transport capacity, is becoming more and 
more the exception. The typical read access on catalog entities today is performed as lookups by a 
growing number of web browsers and other web clients. These clients need to search documents, 
traverse links, and reference related information in many unforeseeable ways. So, methods for 
bibliographic file packaging should be seamlessly connected to such popular use cases, that is, to 
how the data is used later on the web.

Technologically, an RDF-based framework makes a remarkable difference: it means that libraries are 
joining Tim Berners-Lee's effort to interpret the World Wide Web as a global database where 
everyone (even machines) can use (bibliographic) entities automatically, simply because they are 
part of the Web, and not just exposed to the Web.
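
As a small illustration of what "part of the Web" means in practice, a machine can dereference a 
GND URI directly and ask for RDF via HTTP content negotiation. The example identifier and the 
accepted media type below are assumptions of mine, to be checked against the DNB documentation:

import requests

uri = "http://d-nb.info/gnd/118540238"           # example GND URI (assumed)
resp = requests.get(uri, headers={"Accept": "text/turtle"})

print(resp.status_code)
print(resp.text[:500])                           # RDF statements about the entity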

Best regards,

Jörg