Sigh...I realize my original message was stated in such a way that it was
possible people could construe that I am against all provenance information
everywhere, when nothing could be further from the truth. My bad.

I had been responding to a specific example of the use of provenance in
terms of where the title information was taken for a bibliographic record,
for example:

>- title page title
>- cover title 
>- title from jewel case insert

And the end-user in me shouted "who cares!" Yes, I understand that there may
be variances in those titles, but I wanted to not make the assumption that
such variance would have a detrimental effect on end-user needs. Not that it
wouldn't necessarily, but I don't think we can afford to just assume that it
does. All complexity comes at a price and we should be clear when the
complexity is worth it and when it is not. We've been really bad at that in
the past.

I also understand that there will be cases where machine processes create
metadata, which means we basically get it "for free", and that's great, I
welcome it. But I don't know of a machine process that would supply the
information above. That means we would be paying a cataloger to record it,
and therefore we need to be REALLY SURE it's important. I'm not yet
convinced that it is.

I'm not claiming that any metadata we capture is "necessarily...more
expensive, and that sensible people will immediately see we can't afford
it." Far from it. I just don't want us to continue to assume that we can
afford infinite metadata. We can't. Therefore, if we can't, we need to
select wisely what we spend our staff time to capture. For my money, it
isn't where the title is taken from. However, really all I'm saying is "show
me the money". If someone can make the case that not capturing this will
have a significant deleterious effect on the end-user communities we serve,
then great, let's get it. Otherwise, we can spend our time more effectively
doing something else.

Keep in mind that the effect on our user communities can even be something
like "if we didn't have this data it would make our work so difficult as to
prevent us from spending time doing other things our community wishes we
could do for them." In other words, I'm not writing off backroom
efficiencies as others have inferred.

Diane, if as you say, "all the provenance I'm talking about is supplied for
machines, by machines" then I'm pretty much all for it, so long as the
carrier of it does not need to be so complex as to render all kinds of
difficulties down the line (but that should be rare). In other words, you
seem to think we are nearly diametrically opposed, but I sure don't think
so. At least not how you have laid out your position below.

On 1/14/12 1:47 PM, "Diane Hillmann" <[log in to unmask]> wrote:

> Roy: 
> Lucky you! You've stumbled on a topic I really feel strongly about (yeah,
> there are others).  So there are comments below: 
> On Fri, Jan 13, 2012 at 3:43 PM, Roy Tennant <[log in to unmask]> wrote:
>> Diane asks the question "We want to do this well, don't we?" My reply would
>> be we should want to do it as well as is required to support real end-user
>> needs that are important to support. This is because we will clearly lack the
>> level of resourcing we enjoyed for much of the 80s and 90s, and even into the
>> 2000s. We must choose well where to put our resources or we will regret it.
>> Lacking any context, any cataloger will want to describe a resource to within
>> an inch of its life. But that isn't what we can afford to do.
> I'm frustrated by the continuing assumption that by suggesting the high value
> for provenance, we're proposing something that will necessarily be more
> expensive, and that sensible people will immediately see that we can't afford
> it.  Certainly this is true in our current environment, but in a world where
> data will be moving around in very different ways than we see now, and not in
> MARC-like aggregation, provenance data is essential.  
>> So I'm suggesting we need to provide the end-user use cases where knowing
>> "where it came from, when it was last updated, how it was created (human or
>> machine?)" is important and then we can go from there. This can be something
>> along the lines of "without that information I can't provide the user with a
>> display from which they can make intelligent decisions about the resource
>> because of X and Y". But there must be something to justify all the work
>> besides our deep-seated (and laudable) desire to do things "well".
> If we define 'end users' as always being human, we're missing a whole lot of
> the point of all this shift in focus. If we're expecting machines to parse,
> manage, and interpret the data coming at them, we have to see them (and the
> services that depend on them) as 'end users' as well. Yes, as always, humans
> will be directing all this, but we need to provide much more information about
> the data itself if we're expecting all this to work in a different
> environment, AND to be affordable and efficient.  We should have learned well
> enough in the last forty years about how insufficient and sometimes lousy data
> limits what we can do. 
> Keep in mind that all the provenance I'm talking about is supplied for
> machines, by machines (but in a manner designed by people). Expensive humans
> aren't entering data on forms, but they need to know how to instruct the
> providing and consuming machines what to supply for downstream services, and
> how to interpret what they're being fed. This is not rocket science, and there
> are people in our community and others who 'get' this and have even written
> about it. There are even vendors who are actually using these ideas
> successfully--one example is the Summon product. 
> Let's keep our minds open ...
> Diane