It is similar in some ways. I think perhaps the essential questions are these: _who_ makes the decision about what data to trust and in what ways to trust it (for decades it has been only library professionals, but new abilities to add last-mile services to OPACs and other discovery environments are changing that rapidly), and _when_ and _on what basis_ is the decision to be made (in bulk? at some more detailed scale?).

There are several levels of granularity at which that decision might be made. When a large quantity of data has a clear provenance (like a Wikidata dump), the decision can be made at a coarse level. When it doesn't, when we are dealing with data from intricate constructions like archival descriptions composed over decades, or from wildly varying ones like metadata self-submitted by users of an institutional repository, the decision has to be made in a more fine-grained way.
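To make the two granularities concrete, here is a minimal sketch. Everything in it is hypothetical: the record shape (plain dicts), the provenance labels, and the field names are illustrative only, not any particular system's schema.

```python
# Hypothetical sketch: coarse-grained (bulk) vs. fine-grained (per-field)
# trust decisions. All names here are illustrative assumptions.

# Sources whose provenance is clear enough to accept wholesale,
# a decision made once per batch.
TRUSTED_BULK_SOURCES = {"wikidata_dump"}

# Fields we are willing to accept from less-controlled sources,
# a decision made per field.
FIELD_ALLOWLIST = {"title", "creator", "date"}

def filter_record(record, provenance):
    """Decide which fields of a record to trust, based on provenance."""
    if provenance in TRUSTED_BULK_SOURCES:
        # Clear provenance: accept the whole record in bulk.
        return dict(record)
    # Mixed or unknown provenance: keep only allowlisted fields.
    return {k: v for k, v in record.items() if k in FIELD_ALLOWLIST}

# A self-submitted repository record with a field we choose to exclude:
submitted = {"title": "Thesis", "creator": "A. Author", "private_note": "draft"}
print(filter_record(submitted, "self_submitted"))
# {'title': 'Thesis', 'creator': 'A. Author'}
```

The same shape applies to the MARC case quoted below: a batch decision when the source is uniform, a field-by-field one when it isn't.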

A. Soroka
The University of Virginia Library

On Jul 10, 2014, at 4:32 PM, Stuart Yeates <[log in to unmask]> wrote:

> This is conceptually no different to importing a large batch of MARC records and looking at the various note and private fields and deciding which to include and which to exclude.