Actually, I was thinking more of page images, trying to look at the two kinds of data. I was viewing transcribed data as serving to answer the question "Is this what you were looking for?" If that is its function, then, as Robert points out, transcription is inexact, and a page image would be more faithful. In a world where keyword searching is the default mode for most of us, I see structured access points, etc. (the other kind of data) as a means of slicing and dicing the result set and of triggering related-entity searches. The whole text would indeed be present in any contemporary e-text file, and even as imperfect OCR in digitized older resources, to facilitate keyword searching, but I wasn't thinking of any accompanying metadata. I wanted to look at the question purely in terms of the two kinds of data (three, if one includes the jumble of extracted text) and ask: if the purpose of the transcribed sort is really to answer "Is this what you were looking for?", would a page image or two serve that purpose better?
On Sep 14, 2011, at 4:38 PM, "Mark Ehlert" wrote:
> J. McRee Elrod wrote:
>> Ed Jones wrote:
>>> Would transcription still be necessary if a title page (or analogous
>>> source for other types of resource) image were routinely included ...
>> We include "thumbnails" of cover images for a major client (30,000
>> records so far). But they are images, and can not be keyword
> Ed's not referring to keyword searching an image with text. He's
> referring to, say, an ePub or PDF file of text (title page or whole
> work) within the coding of which is metadata that can be searched on
> or extracted and put into a database. You might be familiar with EXIF
> and image metadata, which is somewhat similar.
> Mark K. Ehlert                 Minitex
> Coordinator                    University of Minnesota
> Bibliographic & Technical      15 Andersen Library
> Services (BATS) Unit           222 21st Avenue South
> Phone: 612-624-0805            Minneapolis, MN 55455-0439
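
To illustrate the point quoted above about metadata carried within the coding of an ePub file, here is a minimal sketch in Python, using only the standard library. It assumes the usual ePub layout, in which META-INF/container.xml points to a package (OPF) document holding Dublin Core elements. The function names, the sample title, creator, and publisher, and the in-memory sample file are all invented for the illustration; a real ePub may carry more, fewer, or differently refined elements.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the ePub container and package documents.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample data, standing in for a real e-book.
CONTAINER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

PACKAGE_OPF = b"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>An Invented Title</dc:title>
    <dc:creator>A. N. Author</dc:creator>
    <dc:publisher>Example Press</dc:publisher>
  </metadata>
</package>"""


def build_sample_epub():
    """Build a tiny ePub-like archive in memory (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", PACKAGE_OPF)
    buf.seek(0)
    return buf


def read_epub_metadata(epub_file):
    """Return the Dublin Core title, creator, and publisher values."""
    with zipfile.ZipFile(epub_file) as zf:
        # container.xml names the package document's path inside the archive.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", NS)
        package = ET.fromstring(zf.read(rootfile.get("full-path")))
    metadata = package.find("opf:metadata", NS)
    return {
        name: [el.text for el in metadata.findall(f"dc:{name}", NS)]
        for name in ("title", "creator", "publisher")
    }


if __name__ == "__main__":
    print(read_epub_metadata(build_sample_epub()))
```

The values come back as lists because an ePub may repeat an element (several creators, for example). Note that this is publisher-supplied metadata inside the file, which is a different thing from either a cataloguer's transcription or a page image.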