The RDA properties defined in the Open Metadata Registry include both
constrained and unconstrained versions (cf.
http://metadataregistry.org/schemaprop/list/schema_id/82.html for the
latter).  Does BIBFRAME need something similar--constrained declarations
for prescriptive profiles and when intended use cases involve inferencing,
and unconstrained declarations for easy extensibility?
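
For illustration, the split looks roughly like this in Turtle (prefixes and
URIs here are made up for the sketch, not the registry's actual ones):

  # constrained: the property carries FRBR class commitments
  exc:titleOfWork a rdf:Property ;
      rdfs:domain exc:Work ;
      rdfs:range  exc:Title .

  # unconstrained: same meaning, but no domain or range declared, so using it
  # entails nothing about the class of the subject or object
  exu:titleOfWork a rdf:Property ;
      rdfs:label "title of work" .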

Stephen


On Fri, Nov 7, 2014 at 1:53 PM, [log in to unmask] <[log in to unmask]>
wrote:

> My point has not changed at all. It has always been that requiring
> inference in Bibframe applications would be a mistake. I'm not sure how you
> understood that to mean anything about "requiring explicit types". Again,
> the question about validation is interesting, but not in any way to my
> point. Whether or not validation is brought into play in a given
> application, the Bibframe vocabulary itself should use good RDF practices.
>
> I agree that use cases would be helpful here, but there seems to me to be
> a more fundamental problem; that Bibframe is not entirely sure about who
> its users actually are. The Bibframe site itself has "In addition to being
> a replacement for MARC, BIBFRAME serves as a general model for expressing
> and connecting bibliographic data." I would take this to mean that Bibframe
> is essentially an inward-facing project of the cataloging community,
> because by and large, most consumers of bibliographic data do not have much
> interest themselves in expressing bibliographic data in RDF. In that case,
> there is little point in worrying about how to make good Linked Data.
> Bibframe could use whatever the cataloging community likes of the Semantic
> Web technologies and ignore what doesn't seem comfortable, because the only
> people who will use it are catalogers and library technologists (much like
> MARC, today).
>
> If, on the other hand, it is an intention of the project to publish
> bibliographic data into the wider Web, as well as to serve as a replacement
> for MARC, it seems to me that there will always be a tension at play, and
> not a healthy or creative tension. After a few months of participating in
> the discussion on this list, I'm brought to question very strongly whether
> it is in fact possible to develop a technology that will fulfill both goals
> in a reasonable way.
>
> ---
> A. Soroka
> The University of Virginia Library
>
> On Nov 7, 2014, at 2:21 PM, Karen Coyle <[log in to unmask]> wrote:
>
> > I agree. I think your point has changed somewhat, but I like this
> approach better. So the insistence on requiring explicit types has now
> become a recommendation that relying on types for application functionality
> is likely to incur costs because you cannot count on the presence of
> explicit types.  I think this is sensible, and it's a good point to arrive
> at through this discussion. It's worth an analysis, although once again
> that can only be compared to use cases, which we do not have enough of.
> >
> > One thing that makes me nervous about the RDF validation discussion
> taking place at W3C is that many of the technologies being considered base
> their validation triggers on types/classes in the instance data. This is
> fine if you have them, but it does mean that you need to design the ontology to
> include the types that you need for validation. That is a different reason
> to use types than the standard RDF/RDFS/OWL purposes.
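> >
> > For example, a type-triggered check might be nothing more than a query along
> these lines (a sketch only; the candidate technologies differ in syntax, but
> the trigger idea is the same):
> >
> >   # flag explicitly typed Works that lack a workTitle
> >   ASK {
> >     ?w a bf:Work .
> >     FILTER NOT EXISTS { ?w bf:workTitle ?t }
> >   }
> >
> > Anything that does not carry "a bf:Work" in the data is simply never checked.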
> >
> > kc
> >
> >
> > On 11/7/14 9:33 AM, [log in to unmask] wrote:
> >>> If you have reason to make use of data that does not, and the type
> makes a difference to your application, you will have to use inferencing.
> >> This is exactly my point. The question is whether the type makes a
> difference.
> >>
> >> If Bibframe makes a condition on applications which process it that
> types makes a difference to the successful operation of the application,
> then either in every case where that kind of difference occurs, it can only
> be predicated on the appearance of an explicit type, or inferencing is
> required in the application. Therefore, Bibframe should _not_ make that
> condition for the most basic kind of operation. That means that for any set
> of Bibframe triples, there should be a sensible interpretation into the
> bibliographical universe that Bibframe purports to describe, _whether or
> not explicit types are present, and whether or not inferencing is
> available_. In some cases, it may be possible for Bibframe applications to
> do a _better_ interpretation if types are present or can be inferred, but
> it should never be the case that an application can make no sense or can
> only offer a bizarre or nonsensical interpretation of some Bibframe triples
> without typing information.
> >>
> >>
> >> ---
> >> A. Soroka
> >> The University of Virginia Library
> >>
> >> On Nov 7, 2014, at 12:14 PM, Karen Coyle <[log in to unmask]> wrote:
> >>
> >>> On 11/7/14 8:29 AM, [log in to unmask] wrote:
> >>>> That is very interesting, but it is not at all to the point I was
> making.
> >>>>
> >>>> The issue is not between CWA and OWA. It is whether or not an
> application consuming Bibframe triples will be able to operate correctly
> over them without using RDFS inferencing. It is not possible to "require"
> any given set of triples in the world, Bibframe aside, to have explicit
> typing, at least not in any currently widely-understood way. On the other
> hand, if it is not possible to interpret a set of Bibframe-using triples
> into a meaningful bibliographic universe without inferencing, then you
> _have_ required the presence of inferencing _in applications_. There is an
> enormous difference between requiring some condition on some set of triples
> (which is the interest of the groups you mention below) and requiring a
> particular capability from applications dealing with a particular kind of
> data, which is what this discussion was about.
> >>> I still think you are talking closed world. No one, definitely not I,
> has said that one should ban the use of explicit types. But in the open
> world you cannot count on everyone using explicit types. If you operate in
> the open world, relying on explicitly defined types is going to be
> problematic. So I don't see what your point is. Some data (BIBFRAME and
> others) will have explicit type declarations. Other data will not. If you
> have reason to make use of data that does not, and the type makes a
> difference to your application, you will have to use inferencing. To me,
> these are just facts.
> >>>
> >>> kc
> >>>
> >>>
> >>>> ---
> >>>> A. Soroka
> >>>> The University of Virginia Library
> >>>>
> >>>> On Nov 7, 2014, at 11:09 AM, Karen Coyle <[log in to unmask]> wrote:
> >>>>
> >>>>> On 11/7/14 4:55 AM, [log in to unmask] wrote:
> >>>>>>> Adding the rdf:type in instance data is convenient for data
> consumers, and that's fine; however note that it is a convenience, not a
> requirement.
> >>>>>> I disagree. It is in fact very much a requirement if you would like
> to avoid requiring inferencing regimes for Bibframe.
> >>>>> I'm getting a hint of closed-world assumption in some of this
> discussion. Most likely, the future library system software that
> hypothetically uses BIBFRAME or some other RDF ontology may use explicit
> typing to make those systems more efficient. And if we all contribute to
> some RDF/OCLC of the future, it may do the same. But in the open world of
> LOD that is just one giant graph, anyone can use BIBFRAME properties
> however they wish. (Anyone can say Anything about Anything). In that open
> world you cannot "require" anything beyond what you define in your
> ontology, which comes along not as constraints but as semantic baggage
> (hopefully useful baggage).
> >>>>>
> >>>>> I'm fine with anticipating a bibliographic closed world since it
> seems likely to happen, for practical reasons. If that's what we're
> addressing here, though, we should be clear about it, and separate the
> closed-world discussion from the open-world one.  We should also then talk
> about whether that closed-world BIBFRAME that is being designed is also
> what will be opened to the LOD world, and how it will play in that world.
> My gut feeling is that use cases and requirements for that closed world
> could be quite different from those of the open world. So another set of
> use cases is: what do we anticipate today as uses for bibliographic data in
> the open world? I'd expect a lot of linking to diverse data, and trying to
> identify the same resource when it appears in different contexts (like
> connecting article citations to library holdings).
> >>>>>
> >>>>> Since RDF/RDFS/OWL do not provide constraints, just open-world based
> inferences, something else is needed to meet the requirements of the closed
> world. This is the topic of a newly formed W3C group called "Shapes" [1]
> and a Dublin Core RDF validation group [2]. The existing technologies that
> address this are SPIN, ICV, and Resource Shapes. [3] BIBFRAME can work on
> its own closed world design, perhaps extending the BF Profiles, and feel
> fairly confident that the W3C work will meet our needs. If we think it
> won't, we can contribute our own use cases to that process. The easiest way
> to do that is through the Dublin Core group,[4] which is then feeding a set
> of cultural heritage use cases to the W3C effort (since that group is
> heavily business based). [Note that the DC group invited participation and
> use case info from BIBFRAME but did not receive a response.]
> >>>>>
> >>>>> I encourage anyone who can do so to sign up for the relevant
> mailing lists (they are open) and contribute to this work. The DC group has
> no limitations on who can participate (W3C requires institutional
> membership).
> >>>>>
> >>>>> kc
> >>>>>
> >>>>> [1] http://www.w3.org/2014/data-shapes/wiki/Main_Page
> >>>>> [2] http://wiki.dublincore.org/index.php/RDF-Application-Profiles
> >>>>> [3] http://www.w3.org/2012/12/rdf-val/submissions/Stardog (not a
> complete explanation, but covers all three)
> >>>>> [4] The DC group has a database of case studies, use cases, and
> requirements that is still being worked on:
> >>>>> http://lelystad.informatik.uni-mannheim.de/rdf-validation/
> >>>>> It also has a testing environment, also in progress, where you can
> try out different scenarios:
> >>>>> http://purl.org/net/rdfval-demo
> >>>>>
> >>>>>
> >>>>>> Take two applications, one RDFS-inferencing and one not. Give each
> a set of triples, one explicitly typed and one not explicitly typed, but
> typed correctly under RDFS semantics. For the first set of triples, both
> applications will work correctly. For the second, only the inferencing
> application will work correctly. If you do not want to require inferencing
> in Bibframe applications, you must not rely on it.
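> >>>>>>
> >>>>>> Concretely (a Turtle sketch), set one says
> >>>>>>
> >>>>>>   :X a bf:Work ;
> >>>>>>     bf:workTitle :T .
> >>>>>>
> >>>>>> while set two says only
> >>>>>>
> >>>>>>   :X bf:workTitle :T .
> >>>>>>
> >>>>>> Asked for everything that is a bf:Work, the plain application finds :X
> in the first set but not the second; only the inferencing application, using
> a domain declaration on bf:workTitle (if the ontology provides one), finds it
> in both.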
> >>>>>>
> >>>>>>> Yet I do not have the impression that we are thinking beyond the
> creation of data that, if at all possible, doesn't disrupt our MARC21 past.
> In the IT design world, you usually begin with what functions you wish to
> perform (use cases, requirements) before determining the structure of your
> data.
> >>>>>> I couldn't agree more. Examining:
> >>>>>>
> >>>>>> http://bibframe.org/documentation/bibframe-usecases/
> >>>>>>
> >>>>>> I find 15 use cases, of which only 5 feature patrons as the user.
> The others feature one or more catalogers. Benefits to patrons from this
> effort might seem to be somewhat incidental to it.
> >>>>>>
> >>>>>> ---
> >>>>>> A. Soroka
> >>>>>> The University of Virginia Library
> >>>>>>
> >>>>>> On Nov 6, 2014, at 8:41 PM, Karen Coyle <[log in to unmask]> wrote:
> >>>>>>
> >>>>>>> Simeon, I do not feel that bf data should not have explicit
> typing. I simply do not see that as negating the typing provided by the
> RDFS function of domain. One does not override or invalidate the other.
> Adding the rdf:type in instance data is convenient for data consumers, and
> that's fine; however note that it is a convenience, not a requirement.
> Because we have multiple ways of providing typing, however, we have to be
> careful how the typing in the ontology and the typing in the instance data
> interact.
> >>>>>>>
> >>>>>>> If you do provide sub-class and sub-property relationships and
> domains and ranges, you cannot prevent others from using these for
> inferencing -- since that is the defined use for those declarations in your
> ontology as per the semantic web standards.
> >>>>>>>
> >>>>>>> All of these arguments, however, are empty without some real use
> cases. What is the use case behind the declaration of types? Do we
> anticipate particular searches that make use of them? If, as some feel, we
> should eschew inferencing, then what *is* the role of the type in our data,
> whether explicitly defined or inferred from the ontology? We talk about the
> technology as if it exists in some kind of virtual space. This is our data
> that we are talking about! What do we intend to do with it? What kind of
> searches (of the SPARQL kind) do we anticipate running over this data? How
> do we see our data interacting with data from other communities?
> >>>>>>>
> >>>>>>> This isn't a question to answer in a vacuum. Yet I do not have the
> impression that we are thinking beyond the creation of data that, if at all
> possible, doesn't disrupt our MARC21 past. In the IT design world, you
> usually begin with what functions you wish to perform (use cases,
> requirements) before determining the structure of your data. This has been
> the case for decades, so it shouldn't be a surprise today. Use cases would
> reveal things like:
> >>>>>>>
> >>>>>>> - what do we see as the workflow for data creation?
> >>>>>>> - how will we share data among libraries for copy cataloging?
> >>>>>>> - what uses do we anticipate for a Work (apart from an Instance)?
> and for an Instance?
> >>>>>>> - if a user does a search on "Mark Twain" as author, what will the
> system provide as a response? A Work? A combined Work/Instance? What would
> be optimal?
> >>>>>>> - reiterating Joyce Bell's comments on the editor, what role do
> types have in the cataloging function?
> >>>>>>> - what kinds of searches do we want to do over our data?
> >>>>>>> - how will our data interact with the many many millions of
> bibliographic descriptions on the web?
> >>>>>>> - if someone does do inferencing over our data, what kinds of
> results do we hope that they will obtain?
> >>>>>>> - ... ad infinitum...
> >>>>>>>
> >>>>>>> Without answers to these questions, I don't see how we can
> evaluate BIBFRAME as it exists today. If we don't know what needs it is
> responding to, how can we know if it meets any needs at all?
> >>>>>>>
> >>>>>>> This is system development 101, folks. I'm not asking anything out
> of the norm.
> >>>>>>>
> >>>>>>> kc
> >>>>>>>
> >>>>>>>
> >>>>>>> On 11/6/14 4:22 PM, Simeon Warner wrote:
> >>>>>>>> To me the key motivations for expressing types explicitly are to
> make the data easy and efficient to use. To be able to "get things of type
> X meeting condition Y" seems likely to be an extremely common need. Why make
> the "things of type X" part harder than it need be? If I look through the
> set of use cases we came up with for the LD4L project [1], most of them
> have some component of "finding things of type X".
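> >>>>>>>>
> >>>>>>>> In SPARQL terms, "get things of type X meeting condition Y" is just a
> pattern like (a sketch; the condition is only illustrative):
> >>>>>>>>
> >>>>>>>>   SELECT ?work WHERE {
> >>>>>>>>     ?work a bf:Work ;             # things of type X
> >>>>>>>>           bf:workTitle ?title .   # meeting condition Y
> >>>>>>>>   }
> >>>>>>>>
> >>>>>>>> and the first pattern matches nothing unless the type is explicit in
> the data or the store does RDFS inference.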
> >>>>>>>>
> >>>>>>>> It seems a fallacy to argue that, because some external data will
> require type inference to be used with bf data, the bf data should not have
> explicit typing.
> >>>>>>>>
> >>>>>>>> I think a less important but not insignificant secondary reason
> is that it makes the data and model easier to grok. Karen suggests this is
> an unhelpful crutch: "My impression is that the primary use of rdf:type is
> to make data creators feel like they've created a 'record structure' or
> graph based on the type." but I think the additional clarity of intent is
> useful (and such redundancy permits various sorts of checks). IMO, one of
> the costs/downsides of RDF is complexity/subtlety to understand (see
> discussions on this list to make that plain!) and so anything we can do to
> make this less of a problem with bf is good.
> >>>>>>>>
> >>>>>>>> 2 yen,
> >>>>>>>> Simeon
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> [1] https://wiki.duraspace.org/display/ld4l/LD4L+Use+Cases
> >>>>>>>>
> >>>>>>>> On 11/7/14, 6:33 AM, Karen Coyle wrote:
> >>>>>>>>> On 11/6/14 12:12 PM, [log in to unmask] wrote:
> >>>>>>>>>>> bf:workTitle with domain=bf:Work
> >>>>>>>>>>>
> >>>>>>>>>>> makes these two statements equivalent, although in #2 you must
> first infer the type of :X from the predicate "bf:workTitle":
> >>>>>>>>>>>
> >>>>>>>>>>> :X a bf:Work ;
> >>>>>>>>>>>   bf:workTitle [blah] .
> >>>>>>>>>>>
> >>>>>>>>>>> :X bf:workTitle [blah] .
> >>>>>>>>>>>
> >>>>>>>>>>> In both cases, the type of :X is bf:Work.
> >>>>>>>>>> This is predicated on the operation of an inference regime,
> presumably RDFS or stronger. It is not true under plain RDF entailment.
> >>>>>>>>>>
> >>>>>>>>>> It's important to notice that assumption when it comes into play.
> RDF processing does not normally make it, because it is expensive, the
> expense varying with the strength of the inference regime. For a strong
> regime, and for applications that require processing with strong guarantees
> about response time, the expense can be prohibitive. It is possible to make
> inference a requirement for Bibframe applications, but I agree with Rob
> Sanderson: that would be a mistake. It should be possible for a machine to
> process Bibframe without engaging such machinery, and I say that even though
> I believe very strongly that inference is the most important frontier for
> these technologies.
> >>>>>>>>>>
> >>>>>>>>>> ---
> >>>>>>>>>> A. Soroka
> >>>>>>>>>> The University of Virginia Library
> >>>>>>>>> I don't at all disagree, although in other venues I am seeing use
> of inferencing, at least experimentally. But if you *do* include domains and
> ranges for the properties in your ontology, then they should not return
> inconsistencies when presented to a reasoner if someone *does* wish to employ
> inferencing. Having those defined in the ontology means that you support
> inferencing for those who wish to use it. Otherwise, why even include domains
> and ranges in your ontology?
> >>>>>>>>>
> >>>>>>>>> And note that BF uses rdfs and domains and ranges on some
> properties:
> >>>>>>>>>
> >>>>>>>>>   <rdf:Property rdf:about="http://bibframe.org/vocab/contentCategory">
> >>>>>>>>>     <rdfs:domain rdf:resource="http://bibframe.org/vocab/Work"/>
> >>>>>>>>>     <rdfs:label>Content type</rdfs:label>
> >>>>>>>>>     <rdfs:range rdf:resource="http://bibframe.org/vocab/Category"/>
> >>>>>>>>>     <rdfs:comment>Categorization reflecting the fundamental form of
> >>>>>>>>>     communication in which the content is expressed and the human
> >>>>>>>>>     sense through which it is intended to be perceived.</rdfs:comment>
> >>>>>>>>>   </rdf:Property>
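> >>>>>>>>>
> >>>>>>>>> or, the same declaration in Turtle (comment omitted):
> >>>>>>>>>
> >>>>>>>>>   bf:contentCategory a rdf:Property ;
> >>>>>>>>>       rdfs:label "Content type" ;
> >>>>>>>>>       rdfs:domain bf:Work ;
> >>>>>>>>>       rdfs:range bf:Category .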
> >>>>>>>>>
> >>>>>>>>> You can't prevent anyone from using reasoning on the data. You
> still have to get it right.
> >>>>>>>>>
> >>>>>>>>> kc
> >>>>>>>>>
> >>>>>>>>>> On Nov 6, 2014, at 2:46 PM, Karen Coyle <[log in to unmask]>
> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> On 11/6/14 8:00 AM, Simon Spero wrote:
> >>>>>>>>>>>> On Nov 6, 2014 10:10 AM, "Karen Coyle" <[log in to unmask]>
> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> If the bf:workTitle were of type bf:Work instead of
> bf:Title, you
> >>>>>>>>>>>>> would get:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> <X> rdf:type bf:Work .
> >>>>>>>>>>>>> <X> bf:workTitle _:aa .
> >>>>>>>>>>>>> _:aa rdf:type bf:Work .
> >>>>>>>>>>>>> _:aa bf:titleValue "Here's my title" .
> >>>>>>>>>>>>> Does that clear it up?
> >>>>>>>>>>>> Ah - now I think I understand - when you are talking about a
> property being of a certain type, you are talking about the range of the
> property, not the type of the property itself (i.e. the thing named
> bf:workTitle). Did it clear up? :-)
> >>>>>>>>>>>>
> >>>>>>>>>>> No, actually, I'm talking about the domain of properties, not
> the range. The domain of the property asserts "instance of class" on the
> subject of the property relation. So
> >>>>>>>>>>>
> >>>>>>>>>>> bf:workTitle with domain=bf:Work
> >>>>>>>>>>>
> >>>>>>>>>>> makes these two statements equivalent, although in #2 you must
> first infer the type of :X from the predicate "bf:workTitle":
> >>>>>>>>>>>
> >>>>>>>>>>> :X a bf:Work ;
> >>>>>>>>>>>   bf:workTitle [blah] .
> >>>>>>>>>>>
> >>>>>>>>>>> :X bf:workTitle [blah] .
> >>>>>>>>>>>
> >>>>>>>>>>> In both cases, the type of :X is bf:Work.
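> >>>>>>>>>>>
> >>>>>>>>>>> (The ontology triple doing the work there is just
> >>>>>>>>>>>
> >>>>>>>>>>>   bf:workTitle rdfs:domain bf:Work .
> >>>>>>>>>>>
> >>>>>>>>>>> which, under RDFS, entails ":X rdf:type bf:Work ." from statement #2.)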
> >>>>>>>>>>>
> >>>>>>>>>>> For others, perhaps, note that a subject (":X" here) can be of
> more than one type. So there's nothing wrong with saying:
> >>>>>>>>>>>
> >>>>>>>>>>> :X a bf:Work;
> >>>>>>>>>>>    a bf:mapType;
> >>>>>>>>>>>    a bf:digitalObject .
> >>>>>>>>>>>
> >>>>>>>>>>> if you want to do that. And those types could either be explicit
> ("a bf:xxx") or inferred. The latter could take advantage of something like
> >>>>>>>>>>>
> >>>>>>>>>>> bf:coordinates domain=mapType
> >>>>>>>>>>> bf:digForm domain=digitalObject
> >>>>>>>>>>>
> >>>>>>>>>>> And an instance that goes like:
> >>>>>>>>>>>
> >>>>>>>>>>> :X a bf:Work;
> >>>>>>>>>>>    bf:coordinates "blah" ;
> >>>>>>>>>>>    bf:digForm <URI-for-PDF> .
> >>>>>>>>>>>
> >>>>>>>>>>> That's probably not how you'd do forms, but it's the example
> that came to mind.
> >>>>>>>>>>>
> >>>>>>>>>>> What this does mean is that you have to be careful what domains
> you define for your properties, because they add semantics to your subjects.
> (The most visible way to test this, in my experience, is by defining classes
> as disjoint and then mixing properties from those classes in a single graph.
> Reasoners come back with an "inconsistent" conclusion, telling you your data
> doesn't match your ontology.)
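> >>>>>>>>>>>
> >>>>>>>>>>> For instance, sketching with illustrative declarations (I'm not
> claiming BIBFRAME actually makes these disjointness assertions):
> >>>>>>>>>>>
> >>>>>>>>>>>   bf:Work owl:disjointWith bf:Instance .
> >>>>>>>>>>>   bf:workTitle rdfs:domain bf:Work .
> >>>>>>>>>>>   bf:instanceTitle rdfs:domain bf:Instance .
> >>>>>>>>>>>
> >>>>>>>>>>>   :X bf:workTitle "A" ;
> >>>>>>>>>>>      bf:instanceTitle "B" .
> >>>>>>>>>>>
> >>>>>>>>>>> A reasoner concludes that :X is both a bf:Work and a bf:Instance,
> and reports the graph inconsistent.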
> >>>>>>>>>>>
> >>>>>>>>>>> kc
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> If the range of workTitle is declared to be Work, then the
> value of the property as well as the subject of the property would also be
> an instance of bf:Work.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is it a goal to treat titles as Works in their own right, and
> to be able to have titleValue asserted directly on X?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is it a goal to find triples that may be reachable from
> instances of Work? In that situation, SPARQL 1.1 subqueries or property
> paths may do some of the work.
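> >>>>>>>>>>>>
> >>>>>>>>>>>> For example, a property-path sketch:
> >>>>>>>>>>>>
> >>>>>>>>>>>>   SELECT ?work ?value WHERE {
> >>>>>>>>>>>>     ?work a bf:Work ;
> >>>>>>>>>>>>           bf:workTitle/bf:titleValue ?value .
> >>>>>>>>>>>>   }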
> >>>>>>>>>>>>
> >>>>>>>>>>>> Outside of SPARQL, some approaches to serving linked data
> return closely related entities alongside the base object, trading off
> bandwidth for latency or server load. id.loc.gov does this quite a bit; the
> work on Linked Data Fragments looks to combine this with client-side query
> processing.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Simon
> >>>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Karen Coyle
> >>>>>>>>>>>
> >>>>>>>>>>> [log in to unmask] http://kcoyle.net
> >>>>>>>>>>>
> >>>>>>>>>>> m: +1-510-435-8234
> >>>>>>>>>>> skype: kcoylenet/+1-510-984-3600
> >>>>>>>>>>>
> >>>>>>> --
> >>>>>>> Karen Coyle
> >>>>>>> [log in to unmask] http://kcoyle.net
> >>>>>>> m: +1-510-435-8234
> >>>>>>> skype: kcoylenet/+1-510-984-3600
> >>>>> --
> >>>>> Karen Coyle
> >>>>> [log in to unmask] http://kcoyle.net
> >>>>> m: +1-510-435-8234
> >>>>> skype: kcoylenet/+1-510-984-3600
> >>> --
> >>> Karen Coyle
> >>> [log in to unmask] http://kcoyle.net
> >>> m: +1-510-435-8234
> >>> skype: kcoylenet/+1-510-984-3600
> >
> > --
> > Karen Coyle
> > [log in to unmask] http://kcoyle.net
> > m: +1-510-435-8234
> > skype: kcoylenet/+1-510-984-3600
>



-- 
Stephen Hearn, Metadata Strategist
Data Management & Access, University Libraries
University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN 55455
Ph: 612-625-2328
Fx: 612-625-3428
ORCID:  0000-0002-3590-1242