We discussed this question in 2011 in the culturegraph project; see, e.g., my question at answers.semanticweb.com [1] (caution: I didn't fully understand XSD and custom datatypes at that time). The replies to that question suggest either using URIs as identifiers (like Rob) or using specific properties; nobody advocates custom datatypes. Generally, specific properties are the more elegant approach, and it is quite awkward to write a SPARQL query that has to take datatypes into account. Also, as Jeff already said, SPARQL runs into problems comparing custom datatypes (see also [3]).
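To illustrate why the datatype approach makes queries more awkward, here is a minimal sketch that models triples as plain Python tuples rather than using an RDF library. The subject URI, the datatype URIs and the bf:lccn property are assumptions made up for the illustration; only bf:identifier and bf:isbn13 come from Nate's message.

```python
# Triples as (subject, predicate, object); a literal is a
# (lexical form, datatype) pair. All URIs and the bf:lccn property
# are illustrative assumptions, not part of any published vocabulary.
BOOK = "http://example.org/book/1"
ISBN13_TYPE = "http://example.org/isbn13"
LCCN_TYPE = "http://example.org/lccn"

# Option 1: generic property, identifier kind encoded in a custom datatype.
by_datatype = [
    (BOOK, "bf:identifier", ("9780394856308", ISBN13_TYPE)),
    (BOOK, "bf:identifier", ("56014046", LCCN_TYPE)),
]

# Option 2: identifier kind encoded in a specific property.
by_property = [
    (BOOK, "bf:isbn13", ("9780394856308", None)),
    (BOOK, "bf:lccn", ("56014046", None)),
]


def isbn13_via_datatype(triples):
    """Match the generic property, then filter on the literal's datatype."""
    return [
        obj[0]
        for _, pred, obj in triples
        if pred == "bf:identifier" and obj[1] == ISBN13_TYPE
    ]


def isbn13_via_property(triples):
    """Match the specific property; the literal itself is not inspected."""
    return [obj[0] for _, pred, obj in triples if pred == "bf:isbn13"]
```

In SPARQL terms, the first lookup corresponds to a triple pattern plus a FILTER on datatype(?id), while the second is a single triple pattern.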

The same question came up again while developing the DINI-KIM (KIM = Competence Centre Interoperable Metadata) working group's recommendations for publishing bibliographic data about textual resources as RDF, see [2] (in German). The DINI KIM recommendations are basically an application profile for publishing bibliographic data in RDF, with the goal of reusing as many existing properties as possible without creating new ones. These recommendations include both the use of identifier-specific properties and of custom datatypes (see below).

Another question in the context of representing identifiers in RDF is that some identifiers identify a bibliographic resource such as a book or journal (ISBN, ISSN, DOI etc.), while others identify a bibliographic record (OCLC number [4], LCCN [5] etc.). See [6] for an illustration of the two kinds of identifiers. So, if you take the Linked Data world's rigorous distinction between a resource and its description into account, you might have problems asserting something like:

<http://example.org/moby-dick>
    dc:title "Moby Dick" ;
    bf:identifier <http://lccn.loc.gov/56014046> .

I think this is a pseudo-problem, but I thought it might make sense if Bibframe made this clear once and for all. As it is already common practice to use record IDs "metaphorically" as IDs for the bibliographic resource described by the identified record, and as one can assume that this practice can't be changed, we should stick with this approach. Also, I can't see where it does any harm.

However, as I said, we pondered this again in the DINI KIM working group, together with the closely related question of how to link to the same or a similar resource for which an HTTP URI and an RDF description already exist. We came up with the following proposal:

1. Identifiers that exist in the form of a URI (like URN and DOI) won't be asserted with dc:identifier or something similar; instead there will be a link to the resource using the umbel:isLike property. (Using owl:sameAs is always problematic if you are not totally sure that you have two URIs for the very same resource. False identity assertions might lead to incorrect inferencing. For the problems with the use of owl:sameAs, see the resources listed at [7].)
2. Well-known identifiers like OCLC number or LCCN will be expressed using the respective properties from the Bibliographic Ontology. (As you don't reuse existing properties in Bibframe, that means you would have to create new, redundant properties if you follow this approach.)
3. For identifiers for which no property exists, we recommend using dc:identifier in combination with a custom datatype. Additionally, for local and regional identifiers (e.g. German National Bibliography ID, regional IDs from the different German library networks), we encourage the decentralized creation of new properties by the respective institution that mints the IDs in the first place.
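The three rules can be read as a simple decision procedure. The following is a minimal sketch of that reading, not part of the DINI KIM recommendations themselves. The function name, the datatype base URI, the "dnb" kind and the test for "is already a URI" are assumptions made for the illustration; bibo:lccn and bibo:oclcnum are meant as the Bibliographic Ontology properties referred to in rule 2.

```python
# Identifier kinds for which a dedicated property is assumed to exist
# (rule 2). The table is illustrative and deliberately short.
BIBO_PROPERTIES = {"lccn": "bibo:lccn", "oclcnum": "bibo:oclcnum"}

# Hypothetical base URI for custom datatypes (rule 3).
DATATYPE_BASE = "http://example.org/datatype/"


def identifier_statement(kind, value):
    """Return (predicate, object) in Turtle-like notation for one identifier."""
    # Rule 1: the identifier already is a URI, so link to it instead of
    # asserting it as a literal.
    if value.startswith(("http://", "https://", "urn:")):
        return ("umbel:isLike", "<" + value + ">")
    # Rule 2: a specific property exists for this kind of identifier.
    if kind in BIBO_PROPERTIES:
        return (BIBO_PROPERTIES[kind], '"' + value + '"')
    # Rule 3: fall back to dc:identifier with a custom datatype.
    return ("dc:identifier", '"' + value + '"^^<' + DATATYPE_BASE + kind + ">")
```

For example, identifier_statement("lccn", "56014046") yields the pair bibo:lccn and "56014046". The sketch leaves out the second half of rule 3, the minting of new properties by the institution that issues a local or regional ID.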

As you can see, we recommend both the use of properties, if they exist, and the use of custom datatypes. I tried to avoid the recommendation of custom datatypes but didn't prevail. Hopefully, LoC won't start creating new datatypes but rather new properties, if at all.

All the best
Adrian

[1] http://answers.semanticweb.com/questions/3572/xsd-or-vocabulary

[2] https://wiki.dnb.de/x/TILvAw

[3] http://patterns.dataincubator.org/book/custom-datatype.html

[4] Jeff Young wrote in 2011 on W3C's public-lld mailing list: "OCLC numbers identify bibliographic records, not manifestations". URL: http://lists.w3.org/Archives/Public/public-lld/2011Mar/0163.html

[5] The LCCN permalinks FAQs read: "LCCN Permalinks are persistent URLs for bibliographic records in the Library of Congress Online Catalog and authority records in Library of Congress Authorities. These links are constructed using the record's LCCN (or Library of Congress Control Number), an identifier assigned by the Library of Congress to bibliographic and authority records." URL: http://lccn.loc.gov/lccnperm-faq.html#n1

[6] https://wiki1.hbz-nrw.de/download/attachments/2328255/Biblio-Identifier.png

[7] http://www.bibsonomy.org/user/acka47/owl%3AsameAs


Adrian Pohl
- Linked Open Data -
hbz - Hochschulbibliothekszentrum des Landes NRW
Tel: (+49)(0)221 - 400 75 235
http://www.hbz-nrw.de



>>> On 1.8.2013 at 20:12, "Trail, Nate" <[log in to unmask]> wrote: 
> All,
> 
> We're thinking about modeling identifiers (and other properties?) in two 
> ways:
> 
> 1) generic property with a more specific data type:
> 
>                 bf:identifer  "9780394856308"^^http://example/org/isbn13
> 
> or
> 
> 2) specific property:
>                                                bf:isbn13 "9780394856308"
> 
> where 'bf:isbn' is a subproperty of 'bf:identifier'.
> 
> How does the community feel about these two options, and why?
> 
> Thanks,
> 
> Nate
> 
> -------------------------------------------
> Nate Trail
> -------------------------------------------
> LS/TECH/NDMSO
> Library of Congress
> 202-707-2193
> [log in to unmask]<mailto:[log in to unmask]>