Simeon, I am not arguing that bf data should lack explicit typing. I
simply do not see explicit typing as negating the typing provided by the
RDFS function of domain. One does not override or invalidate the other.
Adding the rdf:type in instance data is convenient for data consumers,
and that's fine; however, note that it is a convenience, not a
requirement. Because we have multiple ways of providing typing, however,
we have to be careful how the typing in the ontology and the typing in
the instance data interact.
If you do provide sub-class and sub-property relationships and domains
and ranges, you cannot prevent others from using these for inferencing
-- since that is the defined use for those declarations in your ontology
as per the semantic web standards.
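To make that interaction concrete, here is a rough sketch of the RDFS domain rule (rdfs2) in plain Python, with no RDF library involved. The ex: identifiers, the blank node labels, and the function names are invented for illustration; bf:workTitle with domain bf:Work follows the example quoted further down in this thread, and is an assumption of that example rather than a statement about the published vocabulary.

```python
# Minimal sketch of RDFS domain inference (rule rdfs2), using plain tuples
# instead of a real RDF library. All data below is illustrative.

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

# Ontology: the property bf:workTitle is declared with domain bf:Work
# (an assumption taken from the example in this thread).
ontology = {
    ("bf:workTitle", RDFS_DOMAIN, "bf:Work"),
}

# Instance data: ex:X carries an explicit rdf:type, ex:Y does not.
instance_data = {
    ("ex:X", RDF_TYPE, "bf:Work"),
    ("ex:X", "bf:workTitle", "_:t1"),
    ("ex:Y", "bf:workTitle", "_:t2"),
}


def infer_types_from_domain(ontology, data):
    """Return data plus the rdf:type triples entailed by rdfs:domain."""
    # Collect the declared domain classes for each property.
    domains = {}
    for s, p, o in ontology:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    # rdfs2: if (p rdfs:domain C) and (s p o), then (s rdf:type C).
    inferred = set(data)
    for s, p, o in data:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


def things_of_type(triples, cls):
    """Subjects with an rdf:type triple for cls, sorted for stable output."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == cls)


# "Get things of type X" with and without the domain rule applied.
explicit_only = things_of_type(instance_data, "bf:Work")
with_inference = things_of_type(
    infer_types_from_domain(ontology, instance_data), "bf:Work"
)

print(explicit_only)
print(with_inference)
```

A consumer that looks only at explicit types finds ex:X alone; one that applies the domain rule finds both ex:X and ex:Y. For ex:X the explicit and the inferred type agree, and that agreement is what has to hold once both kinds of typing are in play.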
All of these arguments, however, are empty without some real use cases.
What is the use case behind the declaration of types? Do we anticipate
particular searches that make use of them? If, as some feel, we should
eschew inferencing, then what *is* the role of the type in our data,
whether explicitly defined or inferred from the ontology? We talk about
the technology as if it exists in some kind of virtual space. This is
our data that we are talking about! What do we intend to do with it?
What kind of searches (of the SPARQL kind) do we anticipate running over
this data? How do we see our data interacting with data from other
communities?
This isn't a question to answer in a vacuum. Yet I do not have the
impression that we are thinking beyond the creation of data that, if at
all possible, doesn't disrupt our MARC21 past. In the IT design world,
you usually begin with what functions you wish to perform (use cases,
requirements) before determining the structure of your data. This has
been the case for decades, so it shouldn't be a surprise today. Use
cases would reveal things like:
- what do we see as the workflow for data creation?
- how will we share data among libraries for copy cataloging?
- what uses do we anticipate for a Work (apart from an Instance)? and
for an Instance?
- if a user does a search on "Mark Twain" as author, what will the
system provide as a response? A Work? A combined Work/Instance? What
would be optimal?
- reiterating Joyce Bell's comments on the editor, what role do types
have in the cataloging function?
- what kinds of searches do we want to do over our data?
- how will our data interact with the many, many millions of
bibliographic descriptions on the web?
- if someone does do inferencing over our data, what kinds of results do
we hope that they will obtain?
- ... ad infinitum...
Without answers to these questions, I don't see how we can evaluate
BIBFRAME as it exists today. If we don't know what needs it is
responding to, how can we know if it meets any needs at all?
This is system development 101, folks. I'm not asking anything out of
the norm.
kc
On 11/6/14 4:22 PM, Simeon Warner wrote:
> To me the key motivations for expressing types explicitly are to make
> the data easy and efficient to use. To be able to "get things of type
> X meeting condition Y" seems likely to be extremely common need. Why
> make the "things of type X" part harder than it need be? If I look
> through the set of use cases we came up with for the LD4L project [1],
> most of them have some component of "finding things of type X".
>
> It seems a fallacy to argue that, because some external data will
> require type inference to be used with bf data, the bf data should not
> have explicit typing.
>
> I think a less important but not insignificant secondary reason is
> that it makes the data and model easier to grok. Karen suggests this
> is an unhelpful crutch: "My impression is that the primary use of
> rdf:type is to make data creators feel like they've created a 'record
> structure' or graph based on the type." but I think the additional
> clarity of intent is useful (and such redundancy permits various sorts
> of checks). IMO, one of the costs/downsides of RDF is
> complexity/subtlety to understand (see discussions on this list to
> make that plain!) and so anything we can do to make this less of a
> problem with bf is good.
>
> 2 yen,
> Simeon
>
>
> [1] https://wiki.duraspace.org/display/ld4l/LD4L+Use+Cases
>
> On 11/7/14, 6:33 AM, Karen Coyle wrote:
>> On 11/6/14 12:12 PM, [log in to unmask] wrote:
>>>> bf:workTitle with domain=bf:Work
>>>>
>>>> makes these two statements equivalent, although in #2 you must first
>>>> infer the type of :X from the predicate "bf:workTitle":
>>>>
>>>> :X a bf:Work ;
>>>> bf:workTitle [blah] .
>>>>
>>>> :X bf:workTitle [blah] .
>>>>
>>>> In both cases, the type of :X is bf:Work.
>>>
>>> This is predicated on the operation of an inference regime, presumably
>>> RDFS or stronger. It is not true under plain RDF entailment.
>>>
>>> It's important to notice that assumption when it comes into play. RDF
>>> processing does not normally make it, because it is expensive, the
>>> expense varying with the strength of inference regime. For a strong
>>> regime and for applications that require processing with strong
>>> guarantees about response time, the expense can be prohibitive. It is
>>> possible to make inference a requirement for Bibframe applications,
>>> but I agree with Rob Sanderson: that would be a mistake. It should be
>>> possible for a machine to process Bibframe without engaging such
>>> machinery, and I say that even though I believe very strongly that
>>> inference is the most important frontier for these technologies.
>>>
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>
>> I don't at all disagree, although in other venues I am seeing use of
>> inferencing, at least experimentally. But if you *do* include domains
>> and ranges for the properties in your ontology, then they should not
>> return inconsistencies when presented to a reasoner if someone *does*
>> wish to employ inferencing. Having those defined in the ontology means
>> that you support inferencing for those who wish to use it. Otherwise,
>> why even include domains and ranges in your ontology?
>>
>> And note that BF uses rdfs and domains and ranges on some properties:
>>
>> <rdf:Property rdf:about="http://bibframe.org/vocab/contentCategory">
>> <rdfs:domain rdf:resource="http://bibframe.org/vocab/Work"/>
>> <rdfs:label>Content type</rdfs:label>
>> <rdfs:range rdf:resource="http://bibframe.org/vocab/Category"/>
>> <rdfs:comment>Categorization reflecting the fundamental form of
>> communication in which the content is expressed and the human sense
>> through which it is intended to be perceived.</rdfs:comment>
>> </rdf:Property>
>>
>> You can't prevent anyone from using reasoning on the data. You still
>> have to get it right.
>>
>> kc
>>
>>>
>>> On Nov 6, 2014, at 2:46 PM, Karen Coyle <[log in to unmask]> wrote:
>>>
>>>> On 11/6/14 8:00 AM, Simon Spero wrote:
>>>>> On Nov 6, 2014 10:10 AM, "Karen Coyle" <[log in to unmask]> wrote:
>>>>>
>>>>>> If the bf:workTitle were of type bf:Work instead of bf:Title, you
>>>>>> would get:
>>>>>>
>>>>>> <X> rdf:type bf:Work .
>>>>>> <X> bf:workTitle _:aa .
>>>>>> _:aa rdf:type bf:Work .
>>>>>> _:aa bf:titleValue "Here's my title" .
>>>>>> Does that clear it up?
>>>>> Ah, now I think I understand: when you are talking about a property
>>>>> being of a certain type, you are talking about the range of the
>>>>> property, not the type of the property itself (i.e., the thing named
>>>>> bf:workTitle). Did it clear up? :-)
>>>>>
>>>> No, actually, I'm talking about the domain of properties, not the
>>>> range. The domain of the property asserts "instance of class" on the
>>>> subject of the property relation. So
>>>>
>>>> bf:workTitle with domain=bf:Work
>>>>
>>>> makes these two statements equivalent, although in #2 you must first
>>>> infer the type of :X from the predicate "bf:workTitle":
>>>>
>>>> :X a bf:Work ;
>>>> bf:workTitle [blah] .
>>>>
>>>> :X bf:workTitle [blah] .
>>>>
>>>> In both cases, the type of :X is bf:Work.
>>>>
>>>> For others, perhaps, note that a subject (":X" here) can be of more
>>>> than one type. So there's nothing wrong with saying:
>>>>
>>>> :X a bf:Work;
>>>> a bf:mapType;
>>>> a bf:digitalObject .
>>>>
>>>> if you want to do that. And those types could either be explicit ("a
>>>> bf:xxx") or inferred. The latter could take advantage of something
>>>> like
>>>>
>>>> bf:coordinates domain=mapType
>>>> bf:digForm domain=digitalObject
>>>>
>>>> And an instance that goes like:
>>>>
>>>> :X a bf:Work;
>>>> bf:coordinates "blah" ;
>>>> bf:digForm <URI-for-PDF> .
>>>>
>>>> That's probably not how you'd do forms, but it's the example that
>>>> came to mind.
>>>>
>>>> What this does mean is that you have to be careful what domains you
>>>> define for your properties, because they add semantics to your
>>>> subjects. (The most visible way to test this, IMexperience, is by
>>>> defining classes as disjoint and then mixing properties from those
>>>> classes in a single graph. Reasoners come back with an "inconsistent"
>>>> conclusion, telling you your data doesn't match your ontology.)
>>>>
>>>> kc
>>>>
>>>>
>>>>
>>>>> If the range of workTitle is declared to be Work, then the value of the
>>>>> property as well as the subject of the property would also be an
>>>>> instance of bf:Work.
>>>>>
>>>>> Is a goal to treat titles as Works in their own right, and to be
>>>>> able to have titleValue asserted directly on X?
>>>>>
>>>>> Is a goal to find triples that may be reachable from instances of
>>>>> Work? In that situation, SPARQL 1.1 subqueries or property paths
>>>>> may do some of the work.
>>>>>
>>>>> Outside of SPARQL, some approaches to serving linked data return
>>>>> closely related entities alongside the base object, trading off
>>>>> bandwidth for latency or server load. id.loc.gov does this quite a
>>>>> bit; the work on linked data fragments looks to combine this with
>>>>> client side query processing.
>>>>>
>>>>> Simon
>>>>>
>>>> --
>>>> Karen Coyle
>>>>
>>>> [log in to unmask] http://kcoyle.net
>>>>
>>>> m: +1-510-435-8234
>>>> skype: kcoylenet/+1-510-984-3600
>>>>
>>
--
Karen Coyle
[log in to unmask] http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600