Hi Karen, Simeon,

On Thu, Nov 6, 2014 at 5:41 PM, Karen Coyle <[log in to unmask]> wrote:
Simeon, I do not feel that bf data should not have explicit typing. I simply do not see that as negating the typing provided by the RDFS function of domain.

I don't think anyone has said that it does.  Just that if you don't declare the type explicitly, you're relying on the client to do the inference... and that's not always going to happen.
 
One does not override or invalidate the other. Adding the rdf:type in instance data is convenient for data consumers, and that's fine; however note that it is a convenience, not a requirement.

I also don't think anyone said it was a requirement, just a best practice.
  
If you do provide sub-class and sub-property relationships and domains and ranges, you cannot prevent others from using these for inferencing -- since that is the defined use for those declarations in your ontology as per the semantic web standards.

Nor this.
 
All of these arguments, however, are empty without some real use cases. What is the use case behind the declaration of types?

To provide information about the resource in its own context, rather than relying on the property referencing it from another resource to imply additional information about it.

For example:

<w> a Work ;
  bf:issn <i> .

<i> a Identifier ;
  rdf:value "1234567890" .

If I dereference <i>, I have no idea that the resource is an ISSN.

Thus we need to include that information associated with the resource.  That leads to identifierScheme ... which is a URI.
So now we have:

<i> a Identifier ;
  scheme <http://.../issn/> ;
  value "1234567890" .

But an ISSN is a *type* of Identifier, which makes the scheme redundant with a class.  There can be no resource which has a scheme of <issn> that is not an Issn Identifier.

Thus we simplify the model to:

<i> a IssnIdentifier ;
  value "1234567890" .

QED?
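
A minimal sketch of the argument above, in Python with plain tuples standing in for triples (no RDF library is assumed). The `describe` and `known_types` helpers are hypothetical names for illustration only. `describe` returns just the triples whose subject is the dereferenced resource, which is roughly what a client sees when it fetches <i> on its own.

```python
# Triples as (subject, predicate, object) tuples; names mirror the example above.
UNTYPED = [
    ("w", "rdf:type", "Work"),
    ("w", "bf:issn", "i"),
    ("i", "rdf:type", "Identifier"),
    ("i", "rdf:value", "1234567890"),
]

TYPED = [
    ("w", "rdf:type", "Work"),
    ("w", "bf:issn", "i"),
    ("i", "rdf:type", "IssnIdentifier"),
    ("i", "rdf:value", "1234567890"),
]


def describe(graph, resource):
    """Return only the triples whose subject is `resource`."""
    return [t for t in graph if t[0] == resource]


def known_types(graph, resource):
    """Types visible in the resource's own description, ignoring inbound links."""
    return {o for s, p, o in describe(graph, resource) if p == "rdf:type"}


print(known_types(UNTYPED, "i"))  # only the generic class is visible
print(known_types(TYPED, "i"))    # the ISSN-specific class is visible
```

In the untyped graph, the only hint that <i> is an ISSN sits on <w> (the bf:issn triple), so it is lost as soon as <i> is looked at in isolation.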
 
Do we anticipate particular searches that make use of them?

How about, off the top of my head:
* Associate a prefLabel with each of the classes for display. 
* Provide additional linking or information based on the class
* Provide styling or ordering based on the class, due to local preferences
* Do validation on the value to ensure that it's a legal instance of a <class> value
* etc.
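
As a rough illustration of the first and last bullets, here is a Python sketch that picks a display label and a validator from the class alone. The `PREF_LABEL` and `VALIDATORS` tables and the `render` helper are assumptions made up for this example, not part of BIBFRAME; the check itself is the standard ISSN modulo-11 check digit.

```python
import re

# Hypothetical per-class display labels.
PREF_LABEL = {"IssnIdentifier": "ISSN"}


def is_valid_issn(value):
    """Check the ISSN form NNNN-NNNC and its modulo-11 check digit."""
    match = re.fullmatch(r"(\d{4})-?(\d{3})([\dX])", value)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weights run from 8 down to 2 across the first seven digits.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3) == expected


# Hypothetical per-class validators.
VALIDATORS = {"IssnIdentifier": is_valid_issn}


def render(resource_class, value):
    """Choose label and validation purely from the resource's class."""
    label = PREF_LABEL.get(resource_class, resource_class)
    validator = VALIDATORS.get(resource_class)
    if validator is None:
        status = "unchecked"
    else:
        status = "valid" if validator(value) else "invalid"
    return f"{label}: {value} ({status})"


print(render("IssnIdentifier", "0317-8471"))
print(render("IssnIdentifier", "1234567890"))
```

Note that the placeholder value used in the examples above, "1234567890", has ten digits and so fails this check; a class-driven validator would flag it.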
 
If, as some feel, we should eschew inferencing, then what *is* the role of the type in our data, whether explicitly defined or inferred from the ontology? We talk about the technology as if it exists in some kind of virtual space. This is our data that we are talking about! What do we intend to do with it? What kind of searches (of the SPARQL kind) do we anticipate running over this data? How do we see our data interacting with data from other communities?

How about starting with what we can do now with existing data?

But to call out the interaction with other communities, unless we're clear with our models and they're comprehensible without 20 years' cataloguing experience, the interaction is going to be that other communities ignore it completely.  Let's fix the vocabulary first.
 
This isn't a question to answer in a vacuum. Yet I do not have the impression that we are thinking beyond the creation of data that, if at all possible, doesn't disrupt our MARC21 past.

Disagree. For example:
   https://wiki.duraspace.org/display/ld4l/LD4L+Use+Cases

 
Without answers to these questions, I don't see how we can evaluate BIBFRAME as it exists today. If we don't know what needs it is responding to, how can we know if it meets any needs at all?
This is system development 101, folks. I'm not asking anything out of the norm.

Agreed... how about you start a thread to gather use cases, starting with some of your own? :)

Rob

 
On 11/6/14 4:22 PM, Simeon Warner wrote:
To me the key motivations for expressing types explicitly are to make the data easy and efficient to use. To be able to "get things of type X meeting condition Y" seems likely to be an extremely common need. Why make the "things of type X" part harder than it need be? If I look through the set of use cases we came up with for the LD4L project [1], most of them have some component of "finding things of type X".
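
A small Python sketch of "get things of type X meeting condition Y", using tuples as triples and no inference. The `find` helper and the literal title value are illustrative assumptions (in the thread the title is a blank node, simplified here to a string).

```python
def things_of_type(graph, wanted_type):
    """Subjects carrying an explicit rdf:type triple for `wanted_type`."""
    return {s for s, p, o in graph if p == "rdf:type" and o == wanted_type}


def find(graph, wanted_type, predicate, value):
    """Things of type X meeting condition Y, using only asserted triples."""
    typed = things_of_type(graph, wanted_type)
    return {s for s, p, o in graph if s in typed and p == predicate and o == value}


EXPLICIT = [
    (":X", "rdf:type", "bf:Work"),
    (":X", "bf:workTitle", "Here's my title"),
]
IMPLICIT = [
    (":X", "bf:workTitle", "Here's my title"),  # type left to domain inference
]

print(find(EXPLICIT, "bf:Work", "bf:workTitle", "Here's my title"))
print(find(IMPLICIT, "bf:Work", "bf:workTitle", "Here's my title"))
```

With the explicit type the lookup is a plain filter; without it, the same query finds nothing unless the client runs an inference step first.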

It seems a fallacy to argue that, because some external data will require type inference to be used with bf data, the bf data should not have explicit typing.

I think a less important but not insignificant secondary reason is that it makes the data and model easier to grok. Karen suggests this is an unhelpful crutch: "My impression is that the primary use of rdf:type is to make data creators feel like they've created a 'record structure' or graph based on the type." but I think the additional clarity of intent is useful (and such redundancy permits various sorts of checks). IMO, one of the costs/downsides of RDF is complexity/subtlety to understand (see discussions on this list to make that plain!) and so anything we can do to make this less of a problem with bf is good.

2 yen,
Simeon


[1] https://wiki.duraspace.org/display/ld4l/LD4L+Use+Cases

On 11/7/14, 6:33 AM, Karen Coyle wrote:
On 11/6/14 12:12 PM, [log in to unmask] wrote:
bf:workTitle with domain=bf:Work

makes these two statements equivalent, although in #2 you must first
infer the type of :X from the predicate "bf:workTitle":

:X a bf:Work ;
   bf:workTitle [blah] .

:X bf:workTitle [blah] .

In both cases, the type of :X is bf:Work.

This is predicated on the operation of an inference regime, presumably
RDFS or stronger. It is not true under plain RDF entailment.

It's important to notice that assumption when it comes into play. RDF
processing does not normally make it, because it is expensive, the
expense varying with the strength of inference regime. For a strong
regime and for applications that require processing with strong
guarantees about response time, the expense can be prohibitive. It is
possible to make inference a requirement for Bibframe applications,
but I agree with Rob Sanderson: that would be a mistake. It should be
possible for a machine to process Bibframe without engaging such
machinery, and I say that even though I believe very strongly that
inference is the most important frontier for these technologies.
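
To make the dependence on an inference regime concrete, here is a Python sketch of the single RDFS rule involved (rdfs2, the domain rule). The `apply_domain_rule` function and the `DOMAINS` table are hypothetical and cover only this one rule; a full RDFS regime has more rules and generally repeats them until nothing new is derived.

```python
# From the thread: bf:workTitle with domain=bf:Work.
DOMAINS = {"bf:workTitle": "bf:Work"}


def apply_domain_rule(graph, domains):
    """RDFS rule rdfs2: if p has domain C and (s, p, o) holds, then s is a C."""
    inferred = {(s, "rdf:type", domains[p]) for s, p, o in graph if p in domains}
    return set(graph) | inferred


asserted = {(":X", "bf:workTitle", "_:title")}
entailed = apply_domain_rule(asserted, DOMAINS)

# Under plain RDF the type triple is absent; it appears only after the rule runs.
print((":X", "rdf:type", "bf:Work") in asserted)
print((":X", "rdf:type", "bf:Work") in entailed)
```

A client that skips this step sees only the asserted triple, which is why the two forms are equivalent under RDFS entailment but not under plain RDF.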

---
A. Soroka
The University of Virginia Library

I don't at all disagree, although in other venues I am seeing use of
inferencing, at least experimentally. But if you *do* include domains
and ranges for the properties in your ontology, then they should not
return inconsistencies when presented to a reasoner if someone *does*
wish to employ inferencing. Having those defined in the ontology means
that you support inferencing for those who wish to use it. Otherwise,
why even include domains and ranges in your ontology?

And note that BF uses rdfs and domains and ranges on some properties:

   <rdf:Property rdf:about="http://bibframe.org/vocab/contentCategory">
     <rdfs:domain rdf:resource="http://bibframe.org/vocab/Work"/>
     <rdfs:label>Content type</rdfs:label>
     <rdfs:range rdf:resource="http://bibframe.org/vocab/Category"/>
     <rdfs:comment>Categorization reflecting the fundamental form of
communication in which the content is expressed and the human sense
through which it is intended to be perceived.</rdfs:comment>
   </rdf:Property>

You can't prevent anyone from using reasoning on the data. You still
have to get it right.

kc


On Nov 6, 2014, at 2:46 PM, Karen Coyle <[log in to unmask]> wrote:

On 11/6/14 8:00 AM, Simon Spero wrote:
On Nov 6, 2014 10:10 AM, "Karen Coyle" <[log in to unmask]> wrote:

If the bf:workTitle were of type bf:Work instead of bf:Title, you
would get:

<X> rdf:type bf:Work .
<X> bf:workTitle _:aa .
_:aa rdf:type bf:Work .
_:aa bf:titleValue "Here's my title" .
Does that clear it up?
Ah, now I think I understand: when you are talking about a property
being of a certain type, you are talking about the range of the
property, not the type of the property itself (i.e., the thing named
bf:workTitle). Did it clear up? :-)

No, actually, I'm talking about the domain of properties, not the
range. The domain of the property asserts "instance of class" on the
subject of the property relation. So

bf:workTitle with domain=bf:Work

makes these two statements equivalent, although in #2 you must first
infer the type of :X from the predicate "bf:workTitle":

:X a bf:Work ;
   bf:workTitle [blah] .

:X bf:workTitle [blah] .

In both cases, the type of :X is bf:Work.

For others, perhaps, note that a subject (":X" here) can be of more
than one type. So there's nothing wrong with saying:

:X a bf:Work;
    a bf:mapType;
    a bf:digitalObject .

if you want to do that. And those types could either be explicit ("a
bf:xxx") or inferred. The latter could take advantage of something like

bf:coordinates domain=mapType
bf:digForm domain=digitalObject

And an instance that goes like:

:X a bf:Work;
    bf:coordinates "blah" ;
    bf:digForm <URI-for-PDF> .

That's probably not how you'd do forms, but it's the example that
came to mind.

What this does mean is that you have to be careful what domains you
define for your properties, because they add semantics to your
subjects. (The most visible way to test this, IMexperience, is by
defining classes as disjoint and then mixing properties from those
classes in a single graph. Reasoners come back with an "inconsistent"
conclusion, telling you your data doesn't match your ontology.)
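
The disjointness test can be sketched in a few lines of Python. This is not a real reasoner: the helpers are hypothetical, and the disjointness between bf:mapType and bf:digitalObject is an assumption invented for the example, not something BIBFRAME declares.

```python
def inferred_types(graph, domains):
    """Map each subject to its asserted types plus those implied by domains."""
    types = {}
    for s, p, o in graph:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
        elif p in domains:
            types.setdefault(s, set()).add(domains[p])
    return types


def inconsistencies(graph, domains, disjoint_pairs):
    """Report subjects that end up in two classes declared disjoint."""
    problems = []
    for subject, classes in sorted(inferred_types(graph, domains).items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                problems.append((subject, a, b))
    return problems


DOMAINS = {"bf:coordinates": "bf:mapType", "bf:digForm": "bf:digitalObject"}
DISJOINT = [("bf:mapType", "bf:digitalObject")]  # hypothetical declaration
GRAPH = [
    (":X", "rdf:type", "bf:Work"),
    (":X", "bf:coordinates", "blah"),
    (":X", "bf:digForm", "URI-for-PDF"),
]

print(inconsistencies(GRAPH, DOMAINS, DISJOINT))
```

Without the disjointness declaration the same graph is fine and :X simply has three types; with it, mixing the two properties on one subject is reported as a clash.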

kc



If the range of workTitle is declared to be Work, then the value of the
property as well as the subject of the property would also be an
instance of bf:Work.
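
A Python sketch of that consequence, applying the RDFS domain and range rules (rdfs2 and rdfs3) once over tuple triples. The function name is hypothetical, and the range declaration of bf:Work is the "what if" under discussion, not what BIBFRAME actually declares.

```python
def apply_domain_and_range(graph, domains, ranges):
    """Add rdf:type triples implied by rdfs:domain (subject) and rdfs:range (object)."""
    result = set(graph)
    for s, p, o in graph:
        if p in domains:
            result.add((s, "rdf:type", domains[p]))
        if p in ranges:
            result.add((o, "rdf:type", ranges[p]))
    return result


graph = {
    ("X", "bf:workTitle", "_:aa"),
    ("_:aa", "bf:titleValue", "Here's my title"),
}
closed = apply_domain_and_range(
    graph,
    domains={"bf:workTitle": "bf:Work"},
    ranges={"bf:workTitle": "bf:Work"},  # hypothetical range for the sake of argument
)

print(("X", "rdf:type", "bf:Work") in closed)
print(("_:aa", "rdf:type", "bf:Work") in closed)
```

Both the subject and the title node come out typed as bf:Work, which is the outcome being questioned here.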

Is a goal to treat titles as Works in their own right, and to be
able to have titleValue asserted directly on X?

Is a goal to find triples that may be reachable from instances of
Work? In that situation, SPARQL 1.1 sub queries or property paths
may do some of the work.

Outside of SPARQL, some approaches to serving linked data return
closely related entities alongside the base object, trading off
bandwidth for latency or server load. id.loc.gov does this quite a
bit; the work on linked data fragments looks to combine this with
client side query processing.

Simon

--
Karen Coyle

[log in to unmask] http://kcoyle.net

m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600




--
Rob Sanderson
Technology Collaboration Facilitator
Digital Library Systems and Services
Stanford, CA 94305