I think the biggest crux of the problem is the eventual replacement for MARC--if that is even absolutely necessary.
Working in the IT environment and seeing different types of data has actually given me an appreciation for the MARC record
structure as a data exchange format. (Notice I didn't say user format.)
I'm currently working with about 6 different data storage types and getting the data between different systems has been a real
problem. Everyone immediately says "oh, this is in an XML format". Let me tell you, XML is a markup language only, and not
a real storage or data exchange format. The problem is that I'm spending a considerable amount of time writing mapping software
just to load and manipulate data. Our vendors (like SCT Sungard, Novell, Cisco, Microsoft) are anything but sympathetic.
We've lost lots of small pieces of data that have nowhere to go, and the attempt to remap them is making me rethink a MARC replacement.
We should remember that Henriette Avram headed the MARC Pilot Project in the 1960s and she was a programmer. But more important,
this was not commercial software, or commercial standards, but custom software for LC and LC's standards developed as the format developed.
I wonder today what our programmers would come up with as a replacement. Our IT folks on our campus were comparing our current
storage format (MARC21) with our SCT Sungard Banner system and quite frankly said they were 'envious'. The folks that wrote
Banner are curious too since we load our Patron data in a MARC format--and actually can store it this way. I'll let you know how the meeting
goes since Banner wants to look at this thing called 'MARC'.
I wouldn't be so quick to throw it [MARC] out before having a format in place that surpasses the usability of the MARC data exchange
format. That would include compact storage to exchange the data, a user interface that is easy to work with and not cumbersome like XML markup,
and vendors that will adopt the format **before** the new standard is implemented (vendors like III, Ex Libris, VTLS, OCLC, etc.).
The cost of changing over to a new format may prove prohibitive for most. The commercial ILS vendors are just not going to rewrite
their systems and give it to the customer free. This could be one of the most expensive things that the library community will undertake, more
expensive than any migration from AACR2 to RDA could ever be. If I were an automation vendor, I'd probably see this as a goldmine--"you
have to buy my new ILS I'm selling for $3 million if you want to get off MARC". That's what I would be thinking as a business person.
(Oh Boy, I need to change careers and get ready to make lots of money!)
So, when will this post-MARC environment happen? With the economy in the dumps, and my institution in a financial crisis (who isn't)
it may be some time before this all happens--we all may be retired.
I would say, if the replacement isn't as robust as MARC, then it is doomed to fail.
I'd like to propose that the MARC format be examined again--could it be expanded? (The LDR and Directory lead one to believe so.)
Who said we can't have more than 999 tag numbers? Expand the tag to 4 digits. If the storage size is limited to 9,999 characters, expand it in the LDR.
What about subfield codes? Redefine them as three characters--combined subfield codes. This format is precise, compact and completely expandable.
Don't go throwing out the baby with the bath water.
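
To make the expansion idea concrete, here is a rough Python sketch of how the Leader and Directory describe a record, and what changes if the field widths grow. In MARC 21 each directory entry is 12 characters: a 3-character tag, a 4-digit field length, and a 5-digit starting position. Leader positions 20 and 21 (the "entry map") record those last two widths, which is why "4500" ends a normal leader. The function names build_record and parse_directory are made up for this illustration, the sample field data is invented, the 4-digit tag is hypothetical and not part of MARC 21 or ISO 2709, and the sketch assumes ASCII-only data so that characters and bytes line up.

```python
FT = "\x1e"  # field terminator
RT = "\x1d"  # record terminator
SF = "\x1f"  # subfield delimiter


def build_record(fields, tag_width=3, len_width=4, start_width=5):
    """Serialize (tag, data) pairs into a MARC-style record (ASCII only).

    tag_width=3, len_width=4, start_width=5 is the MARC 21 layout.
    Any other tag_width is a hypothetical extension.
    """
    directory, body, offset = "", "", 0
    for tag, data in fields:
        field = data + FT
        directory += (tag.zfill(tag_width)
                      + str(len(field)).zfill(len_width)
                      + str(offset).zfill(start_width))
        body += field
        offset += len(field)
    directory += FT
    base = 24 + len(directory)      # base address of data
    total = base + len(body) + 1    # +1 for the record terminator
    leader = (f"{total:05d}"        # 00-04 record length
              + "nam a"             # 05-09 status, type, level, control, coding
              + "22"                # 10-11 indicator and subfield code counts
              + f"{base:05d}"       # 12-16 base address of data
              + "   "               # 17-19 left blank in this sketch
              + f"{len_width}{start_width}00")  # 20-23 entry map
    assert len(leader) == 24
    return leader + directory + body + RT


def parse_directory(record, tag_width=3):
    """Read the widths from the leader's entry map, then walk the directory."""
    leader = record[:24]
    len_width = int(leader[20])
    start_width = int(leader[21])
    base = int(leader[12:17])
    entry_width = tag_width + len_width + start_width
    directory = record[24:base - 1]  # drop the directory's field terminator
    fields = []
    for i in range(0, len(directory), entry_width):
        entry = directory[i:i + entry_width]
        tag = entry[:tag_width]
        length = int(entry[tag_width:tag_width + len_width])
        start = int(entry[tag_width + len_width:])
        data = record[base + start:base + start + length - 1]  # strip FT
        fields.append((tag, data))
    return fields


if __name__ == "__main__":
    standard = [("001", "rec0001"), ("245", "10" + SF + "aSample title")]
    print(parse_directory(build_record(standard)))

    # Hypothetical wider layout: 4-digit tags and 5-digit field lengths.
    wide = [("1245", "10" + SF + "aSample title"), ("2000", "x" * 12000)]
    print(parse_directory(build_record(wide, tag_width=4, len_width=5),
                          tag_width=4)[0])
```

Widening the length portion only needs a different digit in the entry map, so a parser that reads the leader would cope. The tag width is not carried in the MARC 21 entry map, so a 4-digit tag would have to be agreed out of band or given a new home in the leader, and every existing parser that assumes 12-character entries would need changing.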
Jeff Trimble
On Sep 15, 2011, at 11:47 AM, Karen Coyle wrote:
> What I haven't seen discussed here is the frequency with which this data is needed. When I post about making place of publication an actual place data element, I'm told there is rarely a need for it. How often is a precise comparison of title pages of essence? Is it worth making copies of all title pages for that number of instances? Does this apply to all works, or is there a niche where this has more use than, say, currently published trade books?
>
> What this comes down to is a need to look at all of our data practices and ask ourselves:
>
> - who needs this?
> - what is the context in which they need it?
> - how often is it used?
> - is there a more efficient way to provide this information?
> - is there a better way to achieve this goal?
>
> If I were being asked to create a new metadata scheme for Widgets, Inc., those are among the many questions I would ask of the providers and users of the information.
>
> One of the big difficulties that I see for this effort is that most of us come to the task with deeply ingrained practices and assumptions. We won't go very far forward if we can't re-visit all of these and decide what REALLY is needed today. This is why I recommend that there be some non-librarian IT folks consulted. As I said in a blog post, in fact that is exactly how MARC was developed - by an IT person who (fortunately!) was very good at listening to librarians.
>
> http://kcoyle.blogspot.com/2011/08/bibliographic-framework-transition.html
>
> kc
>
>
> Quoting Ed Jones <[log in to unmask]>:
>
>> Actually, I was thinking more of page images, trying to look at the two kinds of data. I was viewing transcribed data as serving the function of "Is this what you were looking for?" in which case, as Robert points out, transcription is inexact, and a page image would be more faithful. In a world where keyword searching is the default mode for most of us, I see structured access points, etc.--the other kind of data--as means of slicing and dicing the result set and triggering related-entity searches. The whole text would indeed be present in any contemporary e-text file--and even as imperfect OCR in digitized older resources--to facilitate keyword searching, but I wasn't thinking of any accompanying metadata. I wanted to try to look at the question purely in terms of the two kinds of data--three, if one includes the jumble of extracted text--and ask whether, if the purpose of the transcribed sort is really to answer this question-- "Is this what you were looking for?"--whether a page image or two serves the purpose better.
>>
>> Ed
>>
>> Sent from my iPad
>>
>> On Sep 14, 2011, at 4:38 PM, "Mark Ehlert" <[log in to unmask]> wrote:
>>
>>> J. McRee Elrod <[log in to unmask]> wrote:
>>>> Ed Jones <[log in to unmask]> wrote:
>>>>
>>>>> Would transcription still be necessary if a title page (or analogous
>>>>> source for other types of resource) image were routinely included ...
>>>>
>>>> We include "thumbnails" of cover images for a major client (30,000
>>>> records so far). But they are images, and can not be keyword
>>>> searched.
>>>
>>> Ed's not referring to keyword searching an image with text. He's
>>> referring to, say, an ePub or PDF file of text (title page or whole
>>> work) within the coding of which is metadata that can be searched on
>>> or extracted and put into a database. You might be familiar with EXIF
>>> and image metadata, which is somewhat similar.
>>>
>>> --
>>> Mark K. Ehlert Minitex
>>> Coordinator University of Minnesota
>>> Bibliographic & Technical 15 Andersen Library
>>> Services (BATS) Unit 222 21st Avenue South
>>> Phone: 612-624-0805 Minneapolis, MN 55455-0439
>>> <http://www.minitex.umn.edu/>
>>
>
>
>
> --
> Karen Coyle
> [log in to unmask] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
Jeffrey Trimble
System Librarian
William F. Maag Library
Youngstown State University
330.941.2483 (Office)
[log in to unmask]
http://www.maag.ysu.edu
http://digital.maag.ysu.edu
"For he is the Kwisatz Haderach..."