Lucky you! You've stumbled on a topic I really feel strongly about (yeah, there are others). My comments are below:

On Fri, Jan 13, 2012 at 3:43 PM, Roy Tennant <[log in to unmask]> wrote:
Diane asks the question “We want to do this well, don’t we?” My reply would be we should want to do it as well as is required to support real end-user needs that are important to support. This is because we will clearly lack the level of resourcing we enjoyed for much of the 80s and 90s, and even into the 2000s. We must choose well where to put our resources or we will regret it. Lacking any context, any cataloger will want to describe a resource to within an inch of its life. But that isn’t what we can afford to do.

I'm frustrated by the continuing assumption that, by arguing for the high value of provenance, we're proposing something that will necessarily be more expensive, and that sensible people will immediately see that we can't afford it. Certainly that is true in our current environment, but in a world where data will be moving around in very different ways than we see now, and not in MARC-like aggregations, provenance data is essential.


So I’m suggesting we need to provide the end-user use cases where knowing “where it came from, when it was last updated, how it was created (human or machine?)” is important and then we can go from there. This can be something along the lines of “without that information I can’t provide the user with a display from which they can make intelligent decisions about the resource because of X and Y”. But there must be something to justify all the work besides our deep-seated (and laudable) desire to do things “well”.

If we define 'end users' as always being human, we're missing much of the point of this shift in focus. If we're expecting machines to parse, manage, and interpret the data coming at them, we have to see them (and the services that depend on them) as 'end users' as well. Yes, as always, humans will be directing all this, but we need to provide much more information about the data itself if we're expecting all this to work in a different environment, AND to be affordable and efficient. We should have learned well enough over the last forty years how insufficient and sometimes lousy data limits what we can do.

Keep in mind that all the provenance I'm talking about is supplied for machines, by machines (but in a manner designed by people). Expensive humans aren't entering data on forms, but they do need to know how to instruct the providing and consuming machines on what to supply for downstream services, and how to interpret what they're being fed. This is not rocket science, and there are people in our community and others who 'get' this and have even written about it. There are even vendors who are actually using these ideas successfully; one example is the Summon product.

Let's keep our minds open ...