Agreeing with Jörg in a slightly more long-winded way:

Creating additional predicates solely to improve query system performance seems a slippery slope -- akin to de-normalizing your relational db schema to make your SQL queries simpler. Everybody has done it, but if the project goes on long enough you usually wish you hadn't. Query engines get better and optimization / query anticipation strategies evolve. Predicates once declared are hard to deprecate. 

On the other hand, if we had no sub-properties (and no sub-classes) we'd just have RDF by itself, and that would not be enough. Domain-specific properties and classes are essential ways we instantiate shared understandings and agreements.

So I think the goal should be to justify the granularity based on the inflections and differences in meaning, not based on query performance. To me the distinction between authorityAssigner, classificationAssigner, and audienceAssigner seems weak. Do these distinctions reflect real specializations? Or did we get carried away? Certainly it's hard to imagine that the ranges of these predicates are or will become meaningfully different classes.  

I am a little more sympathetic to your bf:xxxValue differentiation example, since there could be a real distinction in the ranges. But unless we are specific about differentiating the ranges of these predicates, it's hard to justify them. And I don't think falling back on what are likely to be transient query performance issues is good enough.

This is funny and sad at the same time.

I suggest that Bibframe predicates should not follow software that cannot scale with the triples; instead, software implementers should follow the Bibframe model. If there are too many triples, an inverted index of a search engine might help. Please do not make fundamental model design choices, such as a vocabulary that is meant to last for the next 40 years, dependent on the behavior of a software product that exists today. Tomorrow, software will change.
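
A rough sketch of what the inverted-index suggestion could look like, in Python. It assumes a toy in-memory index keyed by literal value; the function names, the sample triples, and the ex:noteText predicate are made up for illustration and do not describe any particular search engine or triple store.

```python
from collections import defaultdict

# Illustrative sample data only: (subject, predicate, literal) triples.
# ex:noteText is a hypothetical predicate used for contrast.
TRIPLES = [
    ("ex:id1", "bf:value", "1234567890"),
    ("ex:id2", "bf:value", "9876543210"),
    ("ex:note1", "ex:noteText", "1234567890"),
    ("ex:title1", "bf:value", "Moby Dick"),
]

def build_inverted_index(triples):
    """Map each literal value to the set of (subject, predicate) pairs using it."""
    index = defaultdict(set)
    for subject, predicate, literal in triples:
        index[literal].add((subject, predicate))
    return index

def subjects_with_value(index, literal, predicate=None):
    """Look up a literal directly; optionally keep only one predicate."""
    postings = index.get(literal, set())
    return sorted(s for s, p in postings if predicate is None or p == predicate)

index = build_inverted_index(TRIPLES)
any_predicate = subjects_with_value(index, "1234567890")
only_bf_value = subjects_with_value(index, "1234567890", "bf:value")
print(any_predicate)
print(only_bf_value)
```

In this sketch the lookup starts from the literal, so its cost depends on how many triples share that value rather than on how common the predicate is.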


Dear Rob, all,

Thanks for this.   We here had a quick chat about this list this morning.

One of the reasons for the predicate proliferation was to address query performance.

In our experience, when we’ve loaded gobs of triples into various stores, we have often seen much better query performance when the predicate is itself fairly distinctive.  When querying for a value of a “common” predicate, query performance declines.

For example, changing bf:identifierValue to bf:value jumps out in this case.    So this query:

SELECT ?s { ?s bf:value "1234567890" }

Will be considerably slower than

SELECT ?s { ?s bf:identifierValue "1234567890" }

Because the first query has to potentially interrogate so many more triples.
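
A toy illustration of the effect described above, in Python. It assumes a store that buckets triples by predicate only and then scans the bucket for a matching object; that is an assumption made for the sketch, not a description of any specific product, and stores that also index on the object may behave differently. The class, the build helper, and the ex:titleText and ex:noteText predicates are hypothetical.

```python
from collections import defaultdict

class PredicateIndexedStore:
    """Hypothetical toy store: one list of (subject, object) pairs per predicate."""

    def __init__(self):
        self.by_predicate = defaultdict(list)

    def add(self, subject, predicate, obj):
        self.by_predicate[predicate].append((subject, obj))

    def subjects(self, predicate, obj):
        """Return (matching subjects, number of triples examined)."""
        candidates = self.by_predicate.get(predicate, [])
        matches = [s for s, o in candidates if o == obj]
        return matches, len(candidates)

def build(identifier_predicate, title_predicate, note_predicate, n=1000):
    """Load n identifiers, n titles and n notes under the given predicates."""
    store = PredicateIndexedStore()
    for i in range(n):
        store.add(f"ex:id{i}", identifier_predicate, str(1234567890 + i))
        store.add(f"ex:title{i}", title_predicate, f"Title {i}")
        store.add(f"ex:note{i}", note_predicate, f"Note {i}")
    return store

# Distinctive predicates versus a single shared bf:value.
specific = build("bf:identifierValue", "ex:titleText", "ex:noteText")
generic = build("bf:value", "bf:value", "bf:value")

specific_hits, specific_scanned = specific.subjects("bf:identifierValue", "1234567890")
generic_hits, generic_scanned = generic.subjects("bf:value", "1234567890")
print(specific_hits, specific_scanned)
print(generic_hits, generic_scanned)
```

Both lookups return the same subject, but under these assumptions the shared predicate forces the store to examine three times as many candidate triples.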