BIBFRAME Archives

BIBFRAME@LISTSERV.LOC.GOV


BIBFRAME  November 2014

Subject:

Re: Closed and Open Assumptions was [BIBFRAME] [Topic] Types

From:

"[log in to unmask]" <[log in to unmask]>

Reply-To:

Bibliographic Framework Transition Initiative Forum <[log in to unmask]>

Date:

Fri, 7 Nov 2014 14:53:31 -0500

My point has not changed at all. It has always been that requiring inference in Bibframe applications would be a mistake. I'm not sure how you understood that to mean anything about "requiring explicit types". Again, the question about validation is interesting, but not in any way to my point. Whether or not validation is brought into play in a given application, the Bibframe vocabulary itself should use good RDF practices.

I agree that use cases would be helpful here, but there seems to me to be a more fundamental problem: that Bibframe is not entirely sure who its users actually are. The Bibframe site itself says, "In addition to being a replacement for MARC, BIBFRAME serves as a general model for expressing and connecting bibliographic data." I would take this to mean that Bibframe is essentially an inward-facing project of the cataloging community, because by and large, most consumers of bibliographic data do not have much interest themselves in expressing bibliographic data in RDF. In that case, there is little point in worrying about how to make good Linked Data. Bibframe could use whatever the cataloging community likes of the Semantic Web technologies and ignore what doesn't seem comfortable, because the only people who will use it are catalogers and library technologists (much like MARC, today).

If, on the other hand, it is an intention of the project to publish bibliographic data into the wider Web, as well as to serve as a replacement for MARC, it seems to me that there will always be a tension at play, and not a healthy or creative tension. After a few months of participating in the discussion on this list, I'm brought to question very strongly whether it is in fact possible to develop a technology that will fulfill both goals in a reasonable way.

---
A. Soroka
The University of Virginia Library

On Nov 7, 2014, at 2:21 PM, Karen Coyle <[log in to unmask]> wrote:

> I agree. I think your point has changed somewhat, but I like this approach better. So the insistence on requiring explicit types has now become a recommendation that relying on types for application functionality is likely to incur costs because you cannot count on the presence of explicit types. I think this is sensible, and it's a good point to arrive at through this discussion. It's worth an analysis, although once again that can only be compared to use cases, which we do not have enough of.
>
> One thing that makes me nervous about the RDF validation discussion taking place at W3C is that many of the technologies being considered base their validation triggers on types/classes in the instance data. This is fine if you have them, but it does mean that you need to design your ontology to include the types that you need for validation. That is a different reason to use types than the standard RDF/RDFS/OWL purposes.
>
> kc
>
>
> On 11/7/14 9:33 AM, [log in to unmask] wrote:
>>> If you have reason to make use of data that does not, and the type makes a difference to your application, you will have to use inferencing.
>> This is exactly my point. The question is whether the type makes a difference.
>>
>> If Bibframe makes a condition on applications which process it that types make a difference to the successful operation of the application, then either in every case where that kind of difference occurs, it can only be predicated on the appearance of an explicit type, or inferencing is required in the application. Therefore, Bibframe should _not_ make that condition for the most basic kind of operation. That means that for any set of Bibframe triples, there should be a sensible interpretation into the bibliographical universe that Bibframe purports to describe, _whether or not explicit types are present, and whether or not inferencing is available_. In some cases, it may be possible for Bibframe applications to do a _better_ interpretation if types are present or can be inferred, but it should never be the case that an application can make no sense or can only offer a bizarre or nonsensical interpretation of some Bibframe triples without typing information.
>>
>>
>> ---
>> A. Soroka
>> The University of Virginia Library
>>
>> On Nov 7, 2014, at 12:14 PM, Karen Coyle <[log in to unmask]> wrote:
>>
>>> On 11/7/14 8:29 AM, [log in to unmask] wrote:
>>>> That is very interesting, but it is not at all to the point I was making.
>>>>
>>>> The issue is not between CWA and OWA. It is whether or not an application consuming Bibframe triples will be able to operate correctly over them without using RDFS inferencing. It is not possible to "require" any given set of triples in the world, Bibframe aside, to have explicit typing, at least not in any currently widely-understood way. On the other hand, if it is not possible to interpret a set of Bibframe-using triples into a meaningful bibliographic universe without inferencing, then you _have_ required the presence of inferencing _in applications_. There is an enormous difference between requiring some condition on some set of triples (which is the interest of the groups you mention below) and requiring a particular capability from applications dealing with a particular kind of data, which is what this discussion was about.
>>> I still think you are talking closed world. No one, definitely not I, has said that one should ban the use of explicit types. But in the open world you cannot count on everyone using explicit types. If you operate in the open world, relying on explicitly defined types is going to be problematic. So I don't see what your point is. Some data (BIBFRAME and others) will have explicit type declarations. Other data will not. If you have reason to make use of data that does not, and the type makes a difference to your application, you will have to use inferencing. To me, these are just facts.
>>>
>>> kc
>>>
>>>
>>>> ---
>>>> A. Soroka
>>>> The University of Virginia Library
>>>>
>>>> On Nov 7, 2014, at 11:09 AM, Karen Coyle <[log in to unmask]> wrote:
>>>>
>>>>> On 11/7/14 4:55 AM, [log in to unmask] wrote:
>>>>>>> Adding the rdf:type in instance data is convenient for data consumers, and that's fine; however note that it is a convenience, not a requirement.
>>>>>> I disagree. It is in fact very much a requirement if you would like to avoid requiring inferencing regimes for Bibframe.
>>>>> I'm getting a hint of closed-world assumption in some of this discussion. Most likely, the future library system software that hypothetically uses BIBFRAME or some other RDF ontology may use explicit typing to make those systems more efficient. And if we all contribute to some RDF/OCLC of the future, it may do the same. But in the open world of LOD that is just one giant graph, anyone can use BIBFRAME properties however they wish. (Anyone can say Anything about Anything). In that open world you cannot "require" anything beyond what you define in your ontology, which comes along not as constraints but as semantic baggage (hopefully useful baggage).
>>>>>
>>>>> I'm fine with anticipating a bibliographic closed world since it seems likely to happen, for practical reasons. If that's what we're addressing here, though, we should be clear about it, and separate the closed-world discussion from the open-world one. We should also then talk about whether that closed-world BIBFRAME that is being designed is also what will be opened to the LOD world, and how it will play in that world. My gut feeling is that use cases and requirements for that closed world could be quite different from those of the open world. So another set of use cases is: what do we anticipate today as uses for bibliographic data in the open world? I'd expect a lot of linking to diverse data, and trying to identify the same resource when it appears in different contexts (like connecting article citations to library holdings).
>>>>>
>>>>> Since RDF/RDFS/OWL do not provide constraints, just open-world based inferences, something else is needed to meet the requirements of the closed world. This is the topic of a newly formed W3C group called "Shapes" [1] and a Dublin Core RDF validation group [2]. The existing technologies that address this are SPIN, ICV, and Resource Shapes. [3] BIBFRAME can work on its own closed world design, perhaps extending the BF Profiles, and feel fairly confident that the W3C work will meet our needs. If we think it won't, we can contribute our own use cases to that process. The easiest way to do that is through the Dublin Core group, [4] which is then feeding a set of cultural heritage use cases to the W3C effort (since that group is heavily business based). [Note that the DC group invited participation and use case info from BIBFRAME but did not receive a response.]
>>>>>
>>>>> I encourage anyone who can do so to sign up for the relevant mailing lists (they are open) and contribute to this work. The DC group has no limitations on who can participate (W3C requires institutional membership).
>>>>>
>>>>> kc
>>>>>
>>>>> [1] http://www.w3.org/2014/data-shapes/wiki/Main_Page
>>>>> [2] http://wiki.dublincore.org/index.php/RDF-Application-Profiles
>>>>> [3] http://www.w3.org/2012/12/rdf-val/submissions/Stardog (not a complete explanation, but covers all three)
>>>>> [4] DC group has a database of case studies, use cases, and requirements that is still being worked on:
>>>>> http://lelystad.informatik.uni-mannheim.de/rdf-validation/
>>>>> It also has a testing environment, also in progress, where you can try out different scenarios:
>>>>> http://purl.org/net/rdfval-demo
>>>>>
>>>>>
>>>>>> Take two applications, one RDFS-inferencing and one not. Give each a set of triples, one explicitly typed and one not explicitly typed, but typed correctly under RDFS semantics. For the first set of triples, both applications will work correctly. For the second, only the inferencing application will work correctly. If you do not want to require inferencing in Bibframe applications, you must not rely on it.
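The two-application comparison above can be sketched in a few lines of Python. This is only an illustration, assuming a toy ontology in which bf:workTitle has domain bf:Work (the example used elsewhere in this thread). The data, the resource `:X`, and the helper names `rdfs_domain_closure` and `works_in` are invented for the sketch and come from no Bibframe tool.

```python
RDF_TYPE = "rdf:type"

# Assumed ontology fragment: property -> declared rdfs:domain.
DOMAINS = {"bf:workTitle": "bf:Work"}

# Hypothetical data: the same description with and without an explicit type.
explicit = {
    (":X", RDF_TYPE, "bf:Work"),
    (":X", "bf:workTitle", "Here's my title"),
}
implicit = {
    (":X", "bf:workTitle", "Here's my title"),
}


def rdfs_domain_closure(triples, domains):
    """Add the rdf:type triples entailed by rdfs:domain (RDFS rule rdfs2)."""
    inferred = set(triples)
    for s, p, _ in triples:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))
    return inferred


def works_in(triples):
    """What a type-dependent application sees: subjects typed bf:Work."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == "bf:Work"}


# Non-inferencing application: reads the triples exactly as given.
plain_explicit = works_in(explicit)
plain_implicit = works_in(implicit)

# Inferencing application: applies the domain rule before looking for Works.
inferring_explicit = works_in(rdfs_domain_closure(explicit, DOMAINS))
inferring_implicit = works_in(rdfs_domain_closure(implicit, DOMAINS))

print(plain_explicit, plain_implicit, inferring_explicit, inferring_implicit)
```

Only the non-inferencing application on the untyped data comes back empty, which is the case the paragraph above is concerned with.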
>>>>>>
>>>>>>> Yet I do not have the impression that we are thinking beyond the creation of data that, if at all possible, doesn't disrupt our MARC21 past. In the IT design world, you usually begin with what functions you wish to perform (use cases, requirements) before determining the structure of your data.
>>>>>> I couldn't agree more. Examining:
>>>>>>
>>>>>> http://bibframe.org/documentation/bibframe-usecases/
>>>>>>
>>>>>> I find 15 use cases, of which only 5 feature patrons as the user. The others feature one or more catalogers. Benefits to patrons from this effort might seem to be somewhat incidental to it.
>>>>>>
>>>>>> ---
>>>>>> A. Soroka
>>>>>> The University of Virginia Library
>>>>>>
>>>>>> On Nov 6, 2014, at 8:41 PM, Karen Coyle <[log in to unmask]> wrote:
>>>>>>
>>>>>>> Simeon, I do not feel that bf data should not have explicit typing. I simply do not see that as negating the typing provided by the RDFS function of domain. One does not override or invalidate the other. Adding the rdf:type in instance data is convenient for data consumers, and that's fine; however note that it is a convenience, not a requirement. Because we have multiple ways of providing typing, however, we have to be careful how the typing in the ontology and the typing in the instance data interact.
>>>>>>>
>>>>>>> If you do provide sub-class and sub-property relationships and domains and ranges, you cannot prevent others from using these for inferencing -- since that is the defined use for those declarations in your ontology as per the semantic web standards.
>>>>>>>
>>>>>>> All of these arguments, however, are empty without some real use cases. What is the use case behind the declaration of types? Do we anticipate particular searches that make use of them? If, as some feel, we should eschew inferencing, then what *is* the role of the type in our data, whether explicitly defined or inferred from the ontology? We talk about the technology as if it exists in some kind of virtual space. This is our data that we are talking about! What do we intend to do with it? What kind of searches (of the SPARQL kind) do we anticipate running over this data? How do we see our data interacting with data from other communities?
>>>>>>>
>>>>>>> This isn't a question to answer in a vacuum. Yet I do not have the impression that we are thinking beyond the creation of data that, if at all possible, doesn't disrupt our MARC21 past. In the IT design world, you usually begin with what functions you wish to perform (use cases, requirements) before determining the structure of your data. This has been the case for decades, so it shouldn't be a surprise today. Use cases would reveal things like:
>>>>>>>
>>>>>>> - what do we see as the workflow for data creation?
>>>>>>> - how will we share data among libraries for copy cataloging?
>>>>>>> - what uses do we anticipate for a Work (apart from an Instance)? and for an Instance?
>>>>>>> - if a user does a search on "Mark Twain" as author, what will the system provide as a response? A Work? A combined Work/Instance? What would be optimal?
>>>>>>> - reiterating Joyce Bell's comments on the editor, what role do types have in the cataloging function?
>>>>>>> - what kinds of searches do we want to do over our data?
>>>>>>> - how will our data interact with the many many millions of bibliographic descriptions on the web?
>>>>>>> - if someone does do inferencing over our data, what kinds of results do we hope that they will obtain?
>>>>>>> - ... ad infinitum...
>>>>>>>
>>>>>>> Without answers to these questions, I don't see how we can evaluate BIBFRAME as it exists today. If we don't know what needs it is responding to, how can we know if it meets any needs at all?
>>>>>>>
>>>>>>> This is system development 101, folks. I'm not asking anything out of the norm.
>>>>>>>
>>>>>>> kc
>>>>>>>
>>>>>>>
>>>>>>> On 11/6/14 4:22 PM, Simeon Warner wrote:
>>>>>>>> To me the key motivations for expressing types explicitly are to make the data easy and efficient to use. To be able to "get things of type X meeting condition Y" seems likely to be an extremely common need. Why make the "things of type X" part harder than it need be? If I look through the set of use cases we came up with for the LD4L project [1], most of them have some component of "finding things of type X".
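As a rough illustration of "get things of type X meeting condition Y" over explicitly typed data, here is a minimal Python sketch. The triples are made up, titles are simplified to plain strings, and `things_of_type` is a hypothetical helper. The point is only that no ontology lookup is needed when rdf:type is stated in the data.

```python
RDF_TYPE = "rdf:type"

# Hypothetical, explicitly typed data (titles simplified to literals).
triples = [
    (":w1", RDF_TYPE, "bf:Work"),
    (":w1", "bf:workTitle", "Adventures of Huckleberry Finn"),
    (":w2", RDF_TYPE, "bf:Work"),
    (":w2", "bf:workTitle", "Roughing It"),
    (":i1", RDF_TYPE, "bf:Instance"),
    (":i1", "bf:instanceOf", ":w1"),
]


def things_of_type(triples, wanted_type, condition):
    """Return subjects explicitly typed `wanted_type` with a triple satisfying `condition`."""
    typed = {s for s, p, o in triples if p == RDF_TYPE and o == wanted_type}
    return sorted({s for s, p, o in triples if s in typed and condition(p, o)})


# "Things of type bf:Work whose title mentions Finn."
matches = things_of_type(
    triples,
    "bf:Work",
    lambda p, o: p == "bf:workTitle" and "Finn" in o,
)
print(matches)
```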
>>>>>>>>
>>>>>>>> It seems a fallacy to argue that, because some external data will require type inference to be used with bf data, the bf data should not have explicit typing.
>>>>>>>>
>>>>>>>> I think a less important but not insignificant secondary reason is that it makes the data and model easier to grok. Karen suggests this is an unhelpful crutch: "My impression is that the primary use of rdf:type is to make data creators feel like they've created a 'record structure' or graph based on the type." but I think the additional clarity of intent is useful (and such redundancy permits various sorts of checks). IMO, one of the costs/downsides of RDF is complexity/subtlety to understand (see discussions on this list to make that plain!) and so anything we can do to make this less of a problem with bf is good.
>>>>>>>>
>>>>>>>> 2 yen,
>>>>>>>> Simeon
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://wiki.duraspace.org/display/ld4l/LD4L+Use+Cases
>>>>>>>>
>>>>>>>> On 11/7/14, 6:33 AM, Karen Coyle wrote:
>>>>>>>>> On 11/6/14 12:12 PM, [log in to unmask] wrote:
>>>>>>>>>>> bf:workTitle with domain=bf:Work
>>>>>>>>>>>
>>>>>>>>>>> makes these two statements equivalent, although in #2 you must first
>>>>>>>>>>> infer the type of :X from the predicate "bf:workTitle":
>>>>>>>>>>>
>>>>>>>>>>> :X a bf:Work ;
>>>>>>>>>>> bf:workTitle [blah] .
>>>>>>>>>>>
>>>>>>>>>>> :X bf:workTitle [blah] .
>>>>>>>>>>>
>>>>>>>>>>> In both cases, the type of :X is bf:Work.
>>>>>>>>>> This is predicated on the operation of an inference regime, presumably
>>>>>>>>>> RDFS or stronger. It is not true under plain RDF entailment.
>>>>>>>>>>
>>>>>>>>>> It's important to notice that assumption when it comes into play. RDF
>>>>>>>>>> processing does not normally make it, because it is expensive, the
>>>>>>>>>> expense varying with the strength of inference regime. For a strong
>>>>>>>>>> regime and for applications that require processing with strong
>>>>>>>>>> guarantees about response time, the expense can be prohibitive. It is
>>>>>>>>>> possible to make inference a requirement for Bibframe applications,
>>>>>>>>>> but I agree with Rob Sanderson: that would be a mistake. It should be
>>>>>>>>>> possible for a machine to process Bibframe without engaging such
>>>>>>>>>> machinery, and I say that even though I believe very strongly that
>>>>>>>>>> inference is the most important frontier for these technologies.
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> A. Soroka
>>>>>>>>>> The University of Virginia Library
>>>>>>>>> I don't at all disagree, although in other venues I am seeing use of
>>>>>>>>> inferencing, at least experimentally. But if you *do* include domains
>>>>>>>>> and ranges for the properties in your ontology, then they should not
>>>>>>>>> return inconsistencies when presented to a reasoner if someone *does*
>>>>>>>>> wish to employ inferencing. Having those defined in the ontology means
>>>>>>>>> that you support inferencing for those who wish to use it. Otherwise,
>>>>>>>>> why even include domains and ranges in your ontology?
>>>>>>>>>
>>>>>>>>> And note that BF uses rdfs and domains and ranges on some properties:
>>>>>>>>>
>>>>>>>>> <rdf:Property rdf:about="http://bibframe.org/vocab/contentCategory">
>>>>>>>>> <rdfs:domain rdf:resource="http://bibframe.org/vocab/Work"/>
>>>>>>>>> <rdfs:label>Content type</rdfs:label>
>>>>>>>>> <rdfs:range rdf:resource="http://bibframe.org/vocab/Category"/>
>>>>>>>>> <rdfs:comment>Categorization reflecting the fundamental form of
>>>>>>>>> communication in which the content is expressed and the human sense
>>>>>>>>> through which it is intended to be perceived.</rdfs:comment>
>>>>>>>>> </rdf:Property>
>>>>>>>>>
>>>>>>>>> You can't prevent anyone from using reasoning on the data. You still
>>>>>>>>> have to get it right.
>>>>>>>>>
>>>>>>>>> kc
>>>>>>>>>
>>>>>>>>>> On Nov 6, 2014, at 2:46 PM, Karen Coyle <[log in to unmask]> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 11/6/14 8:00 AM, Simon Spero wrote:
>>>>>>>>>>>> On Nov 6, 2014 10:10 AM, "Karen Coyle" <[log in to unmask]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> If the bf:workTitle were of type bf:Work instead of bf:Title, you
>>>>>>>>>>>>> would get:
>>>>>>>>>>>>>
>>>>>>>>>>>>> <X> rdf:type bf:Work .
>>>>>>>>>>>>> <X> bf:workTitle _:aa .
>>>>>>>>>>>>> _:aa rdf:type bf:Work .
>>>>>>>>>>>>> _:aa bf:titleValue "Here's my title" .
>>>>>>>>>>>>> Does that clear it up?
>>>>>>>>>>>> Ah- now I think I understand-when you are talking about a property
>>>>>>>>>>>> being of a certain type, you are talking about the range of the
>>>>>>>>>>>> property, not the type of the property itself (i.e., the thing named
>>>>>>>>>>>> bf:workTitle). Did it clear up? :-)
>>>>>>>>>>>>
>>>>>>>>>>> No, actually, I'm talking about the domain of properties, not the
>>>>>>>>>>> range. The domain of the property asserts "instance of class" on the
>>>>>>>>>>> subject of the property relation. So
>>>>>>>>>>>
>>>>>>>>>>> bf:workTitle with domain=bf:Work
>>>>>>>>>>>
>>>>>>>>>>> makes these two statements equivalent, although in #2 you must first
>>>>>>>>>>> infer the type of :X from the predicate "bf:workTitle":
>>>>>>>>>>>
>>>>>>>>>>> :X a bf:Work ;
>>>>>>>>>>> bf:workTitle [blah] .
>>>>>>>>>>>
>>>>>>>>>>> :X bf:workTitle [blah] .
>>>>>>>>>>>
>>>>>>>>>>> In both cases, the type of :X is bf:Work.
>>>>>>>>>>>
>>>>>>>>>>> For others, perhaps, note that a subject (":X" here) can be of more
>>>>>>>>>>> than one type. So there's nothing wrong with saying:
>>>>>>>>>>>
>>>>>>>>>>> :X a bf:Work;
>>>>>>>>>>> a bf:mapType;
>>>>>>>>>>> a bf:digitalObject .
>>>>>>>>>>>
>>>>>>>>>>> if you want to do that. And those types could either be explicit ("a
>>>>>>>>>>> bf:xxx") or inferred. The latter could take advantage of something like
>>>>>>>>>>>
>>>>>>>>>>> bf:coordinates domain=mapType
>>>>>>>>>>> bf:digForm domain=digitalObject
>>>>>>>>>>>
>>>>>>>>>>> And an instance that goes like:
>>>>>>>>>>>
>>>>>>>>>>> :X a bf:Work;
>>>>>>>>>>> bf:coordinates "blah" ;
>>>>>>>>>>> bf:digForm <URI-for-PDF> .
>>>>>>>>>>>
>>>>>>>>>>> That's probably not how you'd do forms, but it's the example that
>>>>>>>>>>> came to mind.
>>>>>>>>>>>
>>>>>>>>>>> What this does mean is that you have to be careful what domains you
>>>>>>>>>>> define for your properties, because they add semantics to your
>>>>>>>>>>> subjects. (The most visible way to test this, IMexperience, is by
>>>>>>>>>>> defining classes as disjoint and then mixing properties from those
>>>>>>>>>>> classes in a single graph. Reasoners come back with an "inconsistent"
>>>>>>>>>>> conclusion, telling you your data doesn't match your ontology.)
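The disjointness test described above can be sketched as follows. This is a simplified stand-in for what a reasoner does, not a reasoner. The `ex:` classes and properties, the disjointness declaration, and the helper names are all hypothetical and are not taken from the BIBFRAME vocabulary.

```python
RDF_TYPE = "rdf:type"

# Assumed ontology: two properties whose domains are declared disjoint.
DOMAINS = {"ex:workProp": "ex:Work", "ex:itemProp": "ex:Item"}
DISJOINT = {frozenset({"ex:Work", "ex:Item"})}


def inferred_types(triples, domains):
    """Collect explicit types plus the types implied by property domains."""
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
        elif p in domains:
            types.setdefault(s, set()).add(domains[p])
    return types


def inconsistencies(triples, domains, disjoint):
    """List subjects that end up in two classes declared disjoint."""
    clashes = []
    for subject, types in sorted(inferred_types(triples, domains).items()):
        for pair in disjoint:
            if pair <= types:
                clashes.append((subject, tuple(sorted(pair))))
    return clashes


# Mixing properties from both classes on one subject triggers the clash.
mixed = [(":X", "ex:workProp", "a"), (":X", "ex:itemProp", "b")]
# Keeping them on separate subjects does not.
clean = [(":X", "ex:workProp", "a"), (":Y", "ex:itemProp", "b")]

print(inconsistencies(mixed, DOMAINS, DISJOINT))
print(inconsistencies(clean, DOMAINS, DISJOINT))
```

Neither graph states a type explicitly. The clash in `mixed` comes entirely from the declared domains, which is why domain choices need care.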
>>>>>>>>>>>
>>>>>>>>>>> kc
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> If the range of workTitle is declared to be Work, then the value of the
>>>>>>>>>>>> property as well as the subject of the property would also be an
>>>>>>>>>>>> instance of bf:Work.
>>>>>>>>>>>>
>>>>>>>>>>>> Is a goal to treat titles as Works in their own right, and to be
>>>>>>>>>>>> able to have titleValue asserted directly on X?
>>>>>>>>>>>>
>>>>>>>>>>>> Is a goal to find triples that may be reachable from instances of
>>>>>>>>>>>> Work? In that situation, SPARQL 1.1 sub queries or property paths
>>>>>>>>>>>> may do some of the work.
>>>>>>>>>>>>
>>>>>>>>>>>> Outside of SPARQL, some approaches to serving linked data return
>>>>>>>>>>>> closely related entities alongside the base object, trading off
>>>>>>>>>>>> bandwidth for latency or server load. id.loc.gov does this quite a
>>>>>>>>>>>> bit; the work on linked data fragments looks to combine this with
>>>>>>>>>>>> client side query processing.
>>>>>>>>>>>>
>>>>>>>>>>>> Simon
>>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Karen Coyle
>>>>>>>>>>>
>>>>>>>>>>> [log in to unmask] http://kcoyle.net
>>>>>>>>>>>
>>>>>>>>>>> m: +1-510-435-8234
>>>>>>>>>>> skype: kcoylenet/+1-510-984-3600
>>>>>>>>>>>
