> Blank node identifiers are essentially variables (in SPARQL they are an alternative syntax for variables)…
I don't think this is wrong, but in my experience it is more common to think of blank nodes specifically as existentially bound variables whose scope is the document in which they are found. So the triple:
_:x :favoriteEggType :Chicken .
makes the assertion, "There exists a thing, to which we will refer as '_:x', but only inside this document, the :favoriteEggType of which is :Chicken". If we see in the same document:
_:y a :Farmer .
_:y :hates _:x .
We can read "There exists a thing, to which we will refer as '_:y', but only inside this document, which is a :Farmer, and which :hates the thing to which, only in this document, we are referring to as '_:x'".
Or if we see:
_:x :fears _:z .
And _:z doesn't appear as the subject of a triple in our document, we can read "There is a thing that, just inside this document, we will call '_:z', and the :Weasel that inside this document we call '_:x' :fears '_:z'", and so on. Because RDF makes no unique name assumption, it is not defined whether the :Weasel :fears the :Farmer, or someone else entirely.
That is all rather awkward reading, and this is a well-known complaint about blank nodes. Existential quantifiers are useful, but en masse they can become confusing. It is like listening to a story told by someone who cannot remember anyone's name. Often RDF isn't meant for human consumption anyway, but analogous problems occur in machine processing. For example, as Thomas Berger remarked:
> In practice this matters when one wants to add or remove individual statements or subgraphs from graphs: When the graphs or subgraphs have blank nodes as their origin, you usually can't.
Often you can't because it is very difficult to calculate exactly what changes you are making in the possible interpretations of the graph. Simon Spero's example shows that: we don't, as he says, know how many :Weasels we actually have. And once we are no longer in the scope of our original document, we can only refer to a :Weasel by some kind of query. If the attributes of :Weasels don't support queries that will identify them uniquely, we more-or-less lose track of them. This is bad if, for example, we discover new information about our :Weasels and would like to record it in a useful way.
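To make the "how many :Weasels" problem concrete, here is a minimal sketch in Python. It assumes a toy loader (the `load` function and the `_:b0`, `_:b1` internal labels are my own invention for illustration, not any real RDF library's API) that does what the specification permits: it treats blank node labels as scoped to a single load and gives each one a fresh internal identifier.

```python
import itertools

# Hypothetical store-wide counter for fresh internal blank node ids.
_fresh = itertools.count()

def load(triples):
    """Toy loader: relabel each blank node with an id scoped to this one load."""
    mapping = {}

    def rename(term):
        # Only terms written as blank node labels ("_:...") are relabelled.
        if term.startswith("_:"):
            if term not in mapping:
                mapping[term] = f"_:b{next(_fresh)}"
            return mapping[term]
        return term

    return {(rename(s), p, rename(o)) for s, p, o in triples}

# Simon Spero's example file, written as plain tuples.
DOC = [
    ("_:x", "rdf:type", ":Weasel"),
    ("_:x", ":favoriteEggType", ":Chicken"),
]

store = set()
store |= load(DOC)  # first ingest
store |= load(DOC)  # second ingest of the very same file

weasels = {s for s, p, o in store if p == "rdf:type" and o == ":Weasel"}
print(len(store), len(weasels))  # 4 triples, 2 distinct blank nodes
```

The store now holds two blank nodes with identical properties. Nothing in the data says whether they denote one :Weasel or two, and no query over the stated properties can pick out one of them and not the other, which is exactly why updating or deleting "that :Weasel" later is so hard.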
There are occasionally good reasons to use blank nodes, but here:
are some cautionary remarks about them from Richard Cyganiak, one of the editors of the RDF standards.
The University of Virginia Library
On Nov 18, 2014, at 12:06 PM, Simon Spero wrote:
> On Nov 18, 2014 11:13 AM, "Joseph Kiegel" wrote:
> > In RDF 1.1 Concepts and Abstract Syntax, section 3.4, we find: "Blank node identifiers [...] are always locally scoped to the file or RDF store, and are not persistent or portable identifiers for blank nodes". [...] Isn't it true, then, that blank node identifiers, which are valid at Library A, are not defined when they get to Library B? This seems like a problem.
> > Is the use of blank nodes consistent with BIBFRAME's function as a carrier?
> What the specification means is that a blank node _:x that refers to some thing in an RDF file transferred from A to B may not refer to the same thing except during the one use of that file.
> It may not be the name of the thing in the stores at A *or* B, and if the same file is ingested twice, it could refer to two different things that happen to have the same values for the stated properties.
> Blank node identifiers are essentially variables (in SPARQL they are an alternative syntax for variables)...
> Suppose we have the following file:
> _:x rdf:type :Weasel.
> _:x :favoriteEggType :Chicken .
> This says that there is something that is a Weasel and whose favorite type of egg is Chicken.
> If we see this twice, we cannot tell how many chicken pickin Weasels we have.
> A different file could use _:x to refer to some Chicken.