To expand on my URI point...

Instead of scheme and similar categorization-by-string properties, classes and subclasses would be much more effective and understandable.

For example, to use a different but parallel case:

Currently we have:
    _:x bf:classificationNlm [ a bf:Classification ; bf:classificationValue "123" ; bf:classificationScheme "NLM" ]

With the currently suggested deproliferation of predicates, this could easily be:
    _:x bf:classification [ a bf:Classification ; bf:value "123" ; bf:scheme "NLM" ]

However, even better given the query optimization scenario would be:
    _:x bf:classification [ a bf:NlmClassification ; bf:value "123" ]

And the same for Identifier subclasses.

This also avoids the problem of:
    _:x bf:classificationNlm [ a bf:Classification ; bf:classificationValue "123" ; bf:classificationScheme "SomethingElse" ]

Is it NLM because we believe the predicate, or is it SomethingElse because we believe the resource?  Yes, "don't do that then"... but ... "don't allow that then" would be even better, surely?
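
As a rough illustration of that ambiguity, here is a toy sketch in plain Python (tuples standing in for triples, no real RDF store). The `PREDICATE_SCHEME` table and the `conflicting_schemes` function are hypothetical names made up for this example and are not part of BIBFRAME. It only shows that the predicate-per-scheme pattern has two places where the scheme is stated, so they can disagree, while the subclass pattern states it once.

```python
# Toy triples as (subject, predicate, object) tuples; not a real RDF store.
# Hypothetical lookup: which scheme a scheme-specific predicate implies.
PREDICATE_SCHEME = {"bf:classificationNlm": "NLM"}


def conflicting_schemes(triples):
    """Return (node, scheme implied by predicate, scheme stated on node) per mismatch."""
    # Scheme strings stated directly on each classification node.
    stated = {s: o for s, p, o in triples if p == "bf:classificationScheme"}
    conflicts = []
    for s, p, o in triples:
        implied = PREDICATE_SCHEME.get(p)
        if implied is not None and o in stated and stated[o] != implied:
            conflicts.append((o, implied, stated[o]))
    return conflicts


# Predicate-per-scheme style: the scheme appears twice and the two disagree.
old_style = [
    ("_:x", "bf:classificationNlm", "_:c"),
    ("_:c", "rdf:type", "bf:Classification"),
    ("_:c", "bf:classificationValue", "123"),
    ("_:c", "bf:classificationScheme", "SomethingElse"),
]

# Subclass style: the scheme is stated once, as the class of the node.
new_style = [
    ("_:x", "bf:classification", "_:c"),
    ("_:c", "rdf:type", "bf:NlmClassification"),
    ("_:c", "bf:value", "123"),
]

print(conflicting_schemes(old_style))  # [('_:c', 'NLM', 'SomethingElse')]
print(conflicting_schemes(new_style))  # []
```

A node could of course still be typed with two competing subclasses, so this narrows the problem rather than eliminating every possible inconsistency.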


On Fri, Jul 18, 2014 at 2:44 PM, Cole, Timothy W <[log in to unmask]> wrote:
Agreeing with Jörg in a slightly more long-winded way:

Creating additional predicates solely to improve query system performance seems a slippery slope -- akin to de-normalizing your relational db schema to make your SQL queries simpler. Everybody has done it, but if the project goes on long enough you usually wish you hadn't. Query engines get better and optimization / query anticipation strategies evolve. Predicates once declared are hard to deprecate.

On the other hand, if we had no sub-properties (and no sub-classes) we'd just have RDF by itself, and that would not be enough. Domain-specific properties and classes are essential ways we instantiate shared understandings and agreements.

So I think the goal should be to justify the granularity based on the inflections and differences in meaning, not based on query performance. To me the distinction between authorityAssigner, classificationAssigner, and audienceAssigner seems weak. Do these distinctions reflect real specializations? Or did we get carried away? Certainly it's hard to imagine that the ranges of these predicates are or will become meaningfully different classes.

I am a little more sympathetic to your bf:xxxValue differentiation example. Seems there could be a distinction made in ranges.  But unless we are specific about differentiating ranges of these predicates, it's hard to justify them. And I don't think falling back on what are likely to be transient query performance issues is good enough.

-Tim Cole
University of Illinois at UC
From: Bibliographic Framework Transition Initiative Forum [[log in to unmask]] on behalf of [log in to unmask] [[log in to unmask]]
Sent: Friday, July 18, 2014 16:11
To: [log in to unmask]
Subject: Re: [BIBFRAME] Deproliferation of Predicates

This is funny and sad at the same time.

I suggest that Bibframe predicates should not follow software that cannot scale with the triples; instead, software implementers should follow the Bibframe model. If there are too many triples, an inverted index of a search engine might help. Please do not make fundamental model design choices, such as a vocabulary that should last for the next 40 years, dependent on the behavior of a software product that exists today. Tomorrow, software will change.


On Fri, Jul 18, 2014 at 10:45 PM, Ford, Kevin <[log in to unmask]> wrote:
Dear Rob, all,

Thanks for this.   We here had a quick chat about this list this morning.

One of the reasons for the predicate proliferation was to address query performance.

In our experience, when we’ve loaded gobs of triples into various stores, we often saw much better query performance when the predicate was itself fairly distinctive.  When querying for a value of a “common” predicate, query performance declines.

For example, changing bf:identifierValue to bf:value jumps out in this case.    So this query:

SELECT ?s { ?s bf:value "1234567890" }

Will be considerably slower than

SELECT ?s { ?s bf:identifierValue "1234567890" }

Because the first query has to potentially interrogate so many more triples.
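
The effect described above can be sketched with a toy model in plain Python. This assumes a store that indexes triples by predicate only, which is a simplification chosen for illustration and not a description of any particular product. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def build_predicate_index(triples):
    """Group (subject, object) pairs by predicate, like a predicate-only index."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[p].append((s, o))
    return index


def subjects_with_value(index, predicate, value):
    """Return (matching subjects, number of triples examined for this predicate)."""
    candidates = index.get(predicate, [])
    matches = [s for s, o in candidates if o == value]
    return matches, len(candidates)


def sample(identifier_pred, other_pred, n_other=1000):
    """One identifier value plus n_other unrelated values (made-up data)."""
    triples = [("_:id0", identifier_pred, "1234567890")]
    triples += [(f"_:n{i}", other_pred, str(i)) for i in range(n_other)]
    return triples


# Everything shares bf:value, versus distinct predicates per kind of value.
generic = build_predicate_index(sample("bf:value", "bf:value"))
specific = build_predicate_index(sample("bf:identifierValue", "bf:classificationValue"))

print(subjects_with_value(generic, "bf:value", "1234567890"))
# (['_:id0'], 1001)
print(subjects_with_value(specific, "bf:identifierValue", "1234567890"))
# (['_:id0'], 1)
```

A store that also indexes by object value would find the literal directly in either case, so the gap shown here depends on the indexing strategy assumed in the sketch.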

Rob Sanderson
Technology Collaboration Facilitator
Digital Library Systems and Services
Stanford, CA 94305