What I haven't seen discussed here is the frequency with which this
data is needed. When I post about making place of publication an
actual place data element, I'm told there is rarely a need for it. How
often is a precise comparison of title pages truly essential? Is it worth
making copies of all title pages for that number of instances? Does
this apply to all works, or is there a niche where this has more use
than, say, currently published trade books?
What this comes down to is a need to look at all of our data practices
and ask ourselves:
- who needs this?
- what is the context in which they need it?
- how often is it used?
- is there a more efficient way to provide this information?
- is there a better way to achieve this goal?
If I were being asked to create a new metadata scheme for Widgets,
Inc., those are among the many questions I would ask of the providers
and users of the information.
One of the big difficulties that I see for this effort is that most of
us come to the task with deeply ingrained practices and assumptions.
We won't get very far if we can't revisit all of these and decide what
REALLY is needed today. This is why I recommend that some non-librarian
IT folks be consulted. As I said in a blog post, that is in fact exactly
how MARC was developed: by an IT person who (fortunately!) was very good
at listening to librarians.
http://kcoyle.blogspot.com/2011/08/bibliographic-framework-transition.html
kc
Quoting Ed Jones <[log in to unmask]>:
> Actually, I was thinking more of page images, trying to look at the
> two kinds of data. I was viewing transcribed data as serving the
> function of "Is this what you were looking for?" in which case, as
> Robert points out, transcription is inexact, and a page image would
> be more faithful. In a world where keyword searching is the default
> mode for most of us, I see structured access points, etc.--the other
> kind of data--as means of slicing and dicing the result set and
> triggering related-entity searches. The whole text would indeed be
> present in any contemporary e-text file--and even as imperfect OCR
> in digitized older resources--to facilitate keyword searching, but I
> wasn't thinking of any accompanying metadata. I wanted to try to
> look at the question purely in terms of the two kinds of
> data--three, if one includes the jumble of extracted text--and ask
> whether, if the purpose of the transcribed sort is really to answer
> this question--"Is this what you were looking for?"--a page image or
> two serves the purpose better.
>
> Ed
>
> Sent from my iPad
>
> On Sep 14, 2011, at 4:38 PM, "Mark Ehlert" <[log in to unmask]> wrote:
>
>> J. McRee Elrod <[log in to unmask]> wrote:
>>> Ed Jones <[log in to unmask]> wrote:
>>>
>>>> Would transcription still be necessary if a title page (or analogous
>>>> source for other types of resource) image were routinely included ...
>>>
>>> We include "thumbnails" of cover images for a major client (30,000
>>> records so far). But they are images, and cannot be keyword
>>> searched.
>>
>> Ed's not referring to keyword searching an image with text. He's
>> referring to, say, an ePub or PDF file of text (title page or whole
>> work) within the coding of which is metadata that can be searched on
>> or extracted and put into a database. You might be familiar with EXIF
>> and image metadata, which is somewhat similar.
>>
>> --
>> Mark K. Ehlert
>> Coordinator, Bibliographic & Technical Services (BATS) Unit
>> Minitex, University of Minnesota
>> 15 Andersen Library, 222 21st Avenue South
>> Minneapolis, MN 55455-0439
>> Phone: 612-624-0805
>> <http://www.minitex.umn.edu/>
>
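To make Mark's point about embedded metadata concrete: an EPUB is just a
ZIP archive whose package (OPF) document carries Dublin Core fields that
can be extracted and loaded straight into a database. Below is a minimal
sketch in Python, assuming a standard EPUB container (container.xml
pointing to an OPF file); the filename "example.epub" is hypothetical.

import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def epub_metadata(path):
    """Return the Dublin Core fields embedded in an EPUB's OPF package."""
    with zipfile.ZipFile(path) as z:
        # container.xml names the package (OPF) document inside the ZIP.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).get("full-path")
        package = ET.fromstring(z.read(opf_path))

    fields = {}
    for tag in ("title", "creator", "publisher", "date", "identifier"):
        elem = package.find(".//dc:%s" % tag, NS)
        if elem is not None and elem.text:
            fields[tag] = elem.text.strip()
    return fields

print(epub_metadata("example.epub"))  # hypothetical file

A bare page image yields nothing like this without OCR, which is exactly
the contrast Mark is drawing between the two kinds of data.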
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet