Karen,

You can use constraints with RDF; do not put the blame on RDF, it is just
a modeling language. RDF is not per se only for inferring new facts: it
can also be instrumented with rules that are interpreted to restrict
classes to domains, check properties for valid values, and so on. It
depends on how the rules are interpreted and how the reasoner works.

Here is one example of a product that implements integrity constraints on
RDF (there may be more):

http://docs.stardog.com/#_validating_constraints
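
For illustration, such a constraint can be written as a plain SPARQL query
that reports violations instead of inferring new facts; the `ex:`
vocabulary and property names below are made up for the sketch, not taken
from any real ontology:

```sparql
# Hypothetical integrity check, run as an ordinary query over the store:
# report every resource typed ex:Work that lacks an ex:title, rather than
# inferring anything from the missing triple.
PREFIX ex: <http://example.org/>

SELECT ?work
WHERE {
  ?work a ex:Work .
  FILTER NOT EXISTS { ?work ex:title ?title }
}
```

Under a closed-world interpretation a non-empty result is a constraint
violation; under the usual open-world reading the same data simply
licenses no conclusion. That difference is entirely in how the rule is
interpreted, not in RDF itself.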

The operations over library data that I expect are quite clear to me;
they were formulated in the Paris Principles in 1961:

"Functions of the Catalogue
The catalogue should be an efficient instrument for ascertaining
2.1 whether the library contains a particular book specified by
(a) its author and title, or
(b) if the author is not named in the book, its title alone, or
(c) if author and title are inappropriate or insufficient for
identification, a suitable substitute for the title; and
2.2 (a) which works by a particular author and
 (b) which editions of a particular work are in the library."

In the meantime, they have been reformulated as the International
Cataloguing Principles (ICP) in
http://www.ifla.org/files/assets/cataloguing/icp/icp_2009-en.pdf

RDF stores can be queried with SPARQL, but for document-based information
retrieval SPARQL complicates matters for users and implementors, and it
is not very efficient at operations such as GROUP BY or ORDER BY. Also, I
am not interested in presenting triples to patrons; I want to present
documents with answers according to the Paris Principles / ICP. Therefore,
I load the RDF as JSON-LD into Elasticsearch for faster and more
convenient document retrieval with filters, rankings, and aggregations,
which offer more powerful solutions to requirements such as:

- when queried, the catalogue should display a complete result set with
the most relevant documents first
- the catalogue should allow the user to display all relevant works
grouped by author and all relevant editions grouped by work
- the catalogue should allow the user to reorder, page, refine, and
extend result sets with simple operations
- a union catalogue should ascertain the complete list of libraries that
hold a particular edition of a work and display the services each library
offers for that item
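
As a sketch of what this looks like on the Elasticsearch side (the index
fields `title`, `language`, `author`, and `work_id` are invented for the
example, not taken from any real catalogue schema):

```json
{
  "query": {
    "bool": {
      "must":   { "match": { "title": "moby dick" } },
      "filter": { "term":  { "language": "eng" } }
    }
  },
  "aggs": {
    "works_by_author": {
      "terms": { "field": "author" },
      "aggs": { "editions_by_work": { "terms": { "field": "work_id" } } }
    }
  },
  "from": 0,
  "size": 10
}
```

The result set comes back ranked by relevance, the nested aggregations
group editions under works and works under authors, and paging and
refinement are a matter of changing "from"/"size" or adding filters;
these are exactly the operations in the list above.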

The RDF triple model is nevertheless important: it provides better
maintainability and portability of information sets on the Web (e.g.
linking to other catalogues, export/import, incremental loads, merging
catalogues into a union catalogue, referencing other data sets such as
research data), and therefore it promises to reduce costs massively
(which still has to be proven by calculations). But that is nothing to
concern a user who just wants to search the catalogue.

Jörg

On Wed, Jan 7, 2015 at 8:13 PM, Karen Coyle <[log in to unmask]> wrote:

> Joe, I like the suggestion of using classes. Some RDF-using communities
> seem to be very class-heavy, others less so. The implications of lots of
> classes vs. a few classes still aren't clear to me in terms of how they
> affect practice, but clearly classes provide functionality that we may
> not be used to exploiting.
>
> On 1/7/15 8:49 AM, Joseph Kiegel wrote:
>
>> I agree with you that mapping BF to "constrained" (typed) RDA will be
>> necessary and useful.
>>
>> At the end of my message, I tried to make the point that this won't be
>> possible.  I used classes but it is better to use properties instead.  Once
>> you map rdam:reproductionOfManifestation to bf:reproduction and
>> rdai:reproductionOfItem to bf:reproduction, you can't go back the other
>> way. That is, bf:reproduction does not contain the information you need to
>> choose the correct RDA property in the BF -> RDA mapping.  You no longer
>> know whether you came from reproductionOfManifestation or
>> reproductionOfItem.
>>
>
> I suspect that "mapping" is not the right term here, and maybe that's the
> issue. If you look at some of the recent presentations that Gordon has
> done,[1] you see that you can create relationships between terms, e.g.
> bf:reproduction is a super-property of rdam:reproductionOfManifestation
> and rdai:reproductionOfItem. You don't change the two RDA properties to
> bf:reproduction -- they stay what they are, and you navigate the
> relationship. That doesn't entirely solve the problem, because as is always
> the case with data it is very hard to go from less specific to more
> specific. However, I go back to an earlier question, which is: what do we
> need to do with this data, and under what circumstances do these
> differences matter? For example, if you have
>
> resourceA a bf:Work .
> resourceA bf:workTitle "Moby Dick" .
> resourceA bf:creator http://..
> resource3 a rdac:Work .
> resourceA bf:language "ENG" .
> resource8 a rdac:Expression .
> resource8 rdae:language "ENG" .
> resource8 rdae:expressionOf resource3 .
> resource3 rdaw:workTitle "Moby Dick" .
> resource3 rdaw:personalCreator http://...
>
> You actually have a lot of information here. If this information exists in
> open linked data space, you can find resources that are in language ENG,
> and you have essentially the same (well, close to the same) data elements
> for the RDA and the BF descriptions, even though they are structured
> differently. In both you have access to the Work and Expression
> information. (This would be more easily explained with a diagram ;-))
>
> As Gordon says, however, there may still be differences. bf:Work may not
> be one-to-one on *all* information with rdac:Work+rdac:Expression. But
> linked data is designed to be used across heterogeneous data, and allows
> for gaps and differences. It will probably be no less precise than any
> previous mappings that we did (e.g. MARC to Dublin Core - from 1100 data
> elements to 15!).
>
> The question, therefore, is not "Can I map property1 to propertyZ" but "do
> I have the information I need?" This involves not just property definitions
> but the whole meaning provided by the graph.
>
> This describes an open world usage, and doesn't touch on the question of
> what data our library system/closed world will use. There can be a
> considerable difference between the closed world and the open world, and
> many enterprise systems (banks, medical data...) export to the open world
> data that is very different from their internal view of their data. What I
> find unclear at the moment in library-land is: what we are designing for,
> and, once again, what do we expect to do with it?
>
>
> kc
> [1] http://www.slideshare.net/GordonDunsire
>
>>
>> Joe
>>
>> --------------------------------------------------
>> From: "Fallgren, Nancy (NIH/NLM) [E]" <[log in to unmask]>
>> Sent: Wednesday, January 07, 2015 8:08 AM
>> To: <[log in to unmask]>
>> Subject: Re: [BIBFRAME] Constrained vs unconstrained schemas
>>
>>  Hi All,
>>>
>>> FWIW . . .
>>> We are working with the "constrained" version (with a nod to Karen's
>>> comments re use of the term 'constrained') of RDA/RDF and mapping that to a
>>> BIBFRAME core vocabulary precisely because we don't know what a cataloging
>>> input UI will look like post-MARC or how BF will be generated from that
>>> input.  Since BF and RDA have different structures, our thinking is to use
>>> the "constrained" RDA/RDF so that the RDA data can be reconstructed easily
>>> and losslessly back into its WEMI entities structure from BF should that
>>> prove useful or necessary.
>>>
>>> -Nancy
>>>
>>> Nancy J. Fallgren
>>> Metadata Specialist Librarian
>>> Cataloging and Metadata Management Section
>>> Technical Services Division
>>> National Library of Medicine
>>>
>>> [log in to unmask]
>>>
>>> -----Original Message-----
>>> From: Gordon Dunsire [mailto:[log in to unmask]]
>>> Sent: Tuesday, January 06, 2015 7:42 AM
>>> To: [log in to unmask]
>>> Subject: Re: [BIBFRAME] Constrained vs unconstrained schemas
>>>
>>> All
>>>
>>> Many applications based on RDF data will need to know what type of thing
>>> is being described by a triple. An application can get that information
>>> implicitly, from the domain and range of the triple's property, or
>>> explicitly, from a separate triple stating the thing's type. There is no
>>> guarantee that such a type triple exists, or is connected to the local
>>> graph, or can be retrieved from the global graph.
>>>
>>> The quality (effectiveness, efficiency, etc.) of these applications is
>>> likely to depend on the accuracy and completeness of entity typing. More
>>> sophisticated applications are likely to depend also on the semantic
>>> coherence of the results of typing.
>>>
>>> Publishers of data based on specific ontologies should be able to choose
>>> whether to provide type triples implicitly or explicitly. Using properties
>>> constrained by domain and range allows implicit typing by applications
>>> intended to consume the data. The maintainers of the specific ontology are
>>> probably the best agents to provide data publishers and consumers with the
>>> RDF element sets for the constrained properties and, indeed, the type
>>> classes used to constrain them.
>>>
>>> Publishing data using constrained properties does not prevent its use by
>>> applications that are simple, low-quality, or do not require entity typing.
>>> Such applications may use RDF maps to dumb-down constrained properties
>>> to unconstrained versions, or simply ignore domains and ranges. The RDF
>>> maps may be local to the application, or provided by the maintainers of the
>>> constrained elements or some other agent.
>>>
>>> I agree that the publishers of library data in RDF should be able to
>>> specify how it is intended to be used by libraries: this is a closed-world
>>> assumption. The BF model seems to be mainly influenced by the data
>>> currently used by library applications based on MARC21; the FRBR model
>>> reflects the functional requirements to support world-wide consensus on
>>> user tasks. I think both of these bases, data and users, are good
>>> indicators of the needs of future library applications. I therefore think
>>> it is a benefit that the BIBFRAME Initiative (BFI), IFLA, and the JSC for
>>> RDA are providing constrained RDF element sets for BF, FRBR, ISBD, and RDA.
>>> I also think the provision of unconstrained element sets is a good thing,
>>> together with mappings from constrained to unconstrained properties. I do
>>> not know whether BFI intends to publish unconstrained properties. I do know
>>> that the FRBR Review Group decided not to do so because of its plans to
>>> consolidate the FRBR, FRAD, and FRSAD models (now approaching completion),
>>> and that the ISBD Review Group has an unconstrained element set ready for
>>> publication in the near future with a corresponding map.
>>>
>>> The JSC and ISBD Review Group have collaborated on a map between the
>>> ISBD and RDA elements [1]. The map, based on an updated version of the
>>> agreed element alignment [2] will be published in the next few weeks. It
>>> necessarily uses unconstrained properties to link well-formed ISBD and RDA
>>> data together, and was a stimulus to the development of the unconstrained
>>> ISBD element set. As noted in the pre-print cited by Karen, there is also a
>>> map between ISBD and FRBR classes which requires local semantics for
>>> "aspect" relationships [3].
>>>
>>> I am not convinced that the assumption that RDA Work and RDA Expression
>>> are equivalent to/same as BF Work is a useful or valid one [4]. I think
>>> there may be similar problems with RDA Manifestation, RDA Item, and BF
>>> Instance.
>>> The ISBD/RDA experience shows that careful consideration of implicit
>>> semantics in definitions and scope notes is required, as well as explicit
>>> semantics in domain, range, and sub-property relationships.
>>>
>>> So I do not advise mapping either the constrained or unconstrained RDA
>>> properties to constrained BF properties without further clarification of
>>> the class relationships. It is ok to map constrained BF properties to
>>> unconstrained RDA properties. A full map between RDA and BF requires the
>>> use of unconstrained RDA and BF properties. And, by definition, a roundtrip
>>> from constrained to unconstrained to constrained is somewhat lossy (as well
>>> as incoherent).
>>>
>>> I think we need further investigation of the relationship between the
>>> RDA/FRBR models and BF, probably best carried out by the JSC and BFI. And
>>> we need to test interoperability using orthodox RDA and BF data.
>>> Fortunately, we now have the beta of version 3 of RIMMF to create orthodox
>>> RDA data [5].
>>> So perhaps we can do something useful with RDA and BF data after the
>>> Jane-athon [6].
>>>
>>> Cheers
>>>
>>> Gordon
>>>
>>> [1] http://www.rda-jsc.org/docs/6JSC-Chair-4.pdf
>>> [2] http://www.ifla.org/files/assets/cataloguing/isbd/OtherDocumentation/ISBD2RDA%20Alignment%20v1_1.pdf
>>> [3] http://www.ifla.org/files/assets/cataloguing/isbd/OtherDocumentation/resource-wemi.pdf
>>> [4] http://www.gordondunsire.com/pubs/pres/RDAMARCBIBFRAME.pptx
>>> [5] http://www.rdaregistry.info/rimmf/index.html
>>> [6] http://www.rdatoolkit.org/janeathon
>>>
>>> If it is a camel, a weasel, and a whale, then it is a cloud (inferred
>>> from Hamlet, Act 3, Scene 2).
>>>
>>>
>>> -----Original Message-----
>>> From: Bibliographic Framework Transition Initiative Forum [mailto:
>>> [log in to unmask]] On Behalf Of Joseph Kiegel
>>> Sent: 05 January 2015 23:21
>>> To: [log in to unmask]
>>> Subject: Re: [BIBFRAME] Constrained vs unconstrained schemas
>>>
>>> Thanks, this helps a lot.  I had viewed domains as more restrictive than
>>> they are.
>>>
>>> I agree with your larger question that we need to understand the
>>> operations that will be performed on our data in RDF.  Perhaps we can't
>>> anticipate what other people will do, but we should be able to specify what
>>> libraries will do.
>>>
>>>
>>> Joe
>>>
>>> --------------------------------------------------
>>> From: "Karen Coyle" <[log in to unmask]>
>>> Sent: Monday, January 05, 2015 1:38 PM
>>> To: <[log in to unmask]>
>>> Subject: Re: [BIBFRAME] Constrained vs unconstrained schemas
>>>
>>>  Joseph, You might want to look at my blog post on RDF classes:
>>>>
>>>> http://kcoyle.blogspot.com/2014/11/classes-in-rdf.html
>>>>
>>>> and the article by Baker-Coyle-Petiya
>>>>
>>>> http://kcoyle.net/LHTv32n4preprint.pdf
>>>>
>>>> There are actually no "constraints" in RDF, just potential inferences.
>>>> The inferences are based on the stated domains and ranges of the
>>>> properties.
>>>>
>>>> There are examples of this in the Baker et al article using RDA,
>>>> FRBRer and BIBFRAME. There is no conflict with a subject being
>>>> inferred as being an instance of more than one class as long as the
>>>> classes themselves are not declared as disjoint. (The article explains
>>>> this better than I can in an email.) The documentation for RDA,
>>>> BIBFRAME and FRBRer all present classes as determinants of data
>>>> structure. This, to me, is a common error in RDF development. That any
>>>> subject can be an instance of more than one class is necessary for the
>>>> RDF graph's flexibility, and should be proof that classes do not
>>>> constrain your data to a single graph structure.
>>>>
>>>> The declared domains of properties only come into play if inferencing
>>>> is applied. A big question, therefore, is whether any inferencing will
>>>> be done at all over the data. The utility of, for example, the RDA
>>>> classes to me is that it allows you to do simple queries for
>>>> categories of triples, e.g. "give me all of the work triples for the
>>>> manifestation with this ISBN." Other than that you can ignore the fact
>>>> that domains have been declared if they don't serve your needs.
>>>>
>>>> Your question, however, brings up a much larger question that I
>>>> haven't seen discussed anywhere, which is: what kinds of operations do
>>>> we expect to perform over library data in RDF? That question really
>>>> should be answered before domains and ranges are defined, because that
>>>> is the function of those capabilities of RDF.
>>>>
>>>> kc
>>>>
>>>> On 1/5/15 12:52 PM, Joseph Kiegel wrote:
>>>>
>>>>> A comparison of BIBFRAME and RDA in RDF (referred to below as RDA),
>>>>> in an attempt to map RDA to BIBFRAME, raised the issue of constrained
>>>>> vs unconstrained schemas.
>>>>>
>>>>> The full set of RDA properties is constrained by the RDA classes of
>>>>> Agent, Work, Expression, Manifestation and Item.  That is, each
>>>>> property is related to a specific class when appropriate: e.g.
>>>>> abridgementOfExpression and abridgementOfWork.  A parallel set of
>>>>> properties has been created where the constraints of class are lifted:
>>>>> e.g. abridgementOf.  This unconstrained version of RDA loses the
>>>>> context of some properties but is intended to facilitate mapping to
>>>>> schemas that do not use the FRBR model underlying RDA.
>>>>>
>>>>> BIBFRAME is a constrained schema, but constrained by different classes:
>>>>> Agent, Work, and Instance.  There is no unconstrained version of
>>>>> BIBFRAME.
>>>>>
>>>>> A mapping of RDA to BIBFRAME presents choices and challenges.
>>>>>
>>>>> Is it better to use constrained RDA, which causes explicit conflicts
>>>>> of
>>>>> domain:  e.g. mapping rdam:reproductionOfManifestation to
>>>>> bf:reproduction and rdai:reproductionOfItem to bf:reproduction?
>>>>>
>>>>> Or is it better to use unconstrained RDA, which still has conflicts
>>>>> (an unconstrained domain vs a constrained one in BIBFRAME): e.g.
>>>>> mapping rdau:reproductionOf to bf:reproduction?
>>>>>
>>>>> It is not obvious which is the better choice.  Although perhaps we
>>>>> need both mappings, each with its own problems regarding original and
>>>>> destination domains.
>>>>>
>>>>> A corollary of the question is that any roundtrip RDA -> BF -> RDA is
>>>>> lossy. If constrained RDA is used as a starting point, RDA classes
>>>>> are lost in the mapping itself, and if unconstrained RDA is used,
>>>>> classes are lost prior to mapping. Either way, RDA classes cannot be
>>>>> recovered in a BF -> constrained RDA mapping.
>>>>>
>>>>>
>>>> --
>>>> Karen Coyle
>>>> [log in to unmask] http://kcoyle.net
>>>> m: +1-510-435-8234
>>>> skype: kcoylenet/+1-510-984-3600
>>>>
>>>>
>>>
> --
> Karen Coyle
> [log in to unmask] http://kcoyle.net
> m: +1-510-435-8234
> skype: kcoylenet/+1-510-984-3600
>