Quoting Jeffrey Trimble:
> So, let's spend the money and the time to revise the MARC structure
> a little to make sense of things.
Before we start revising data formats, shouldn't we get very clear
about our requirements? If those requirements can be fulfilled with a
revision of MARC, then that's what we should do.
However, I have a feeling that we haven't even begun to define our
requirements (and "just like MARC" isn't a requirements definition). I
did one blog post with some very general concepts, but it needs to
be taken much further.
I've heard some requirements that we can begin to gather up:
- the field names need to be easy to 'universalize' in a
language-neutral way. MARC tags do this nicely, with the downside that
they have to be learned and therefore only work with trained users.
Perhaps we can have it both ways, in a sense the way we do today, with
underlying codes for experts who need to communicate widely with
colleagues, and display forms for the less expert (both users and
- the ability to carry both display text and identifiers for
controlled vocabularies, with the option to have both or either
without disrupting the data format.
- ISO 2709 has been used for MARC21, Unimarc, MAB and other formats.
The next data carrier needs to be at least that flexible.
- there is a need to create relationships between bibliographic items,
such as part/whole relationships. (MAB and other formats already have
I'd like to add:
- that our data needs to play well on the Web
- that it uses data where possible, not text
I'm sure there is a lot more, but we have to be clear on our goals
before we select or modify a data format.
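
To make a few of these requirements concrete, here is a minimal sketch in Python. It is only an illustration, not a proposed format: the class names, the LABELS table, and the example.org identifier are all hypothetical. It shows a language-neutral field code with display labels layered on top, and a value that can carry display text, an identifier, or both.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical label table: one language-neutral code, several display forms.
LABELS = {
    "245": {"en": "Title", "de": "Titel"},
    "650": {"en": "Subject", "de": "Schlagwort"},
}


@dataclass
class Value:
    text: Optional[str] = None        # display text, if any
    identifier: Optional[str] = None  # controlled-vocabulary identifier, if any

    def __post_init__(self):
        # Either part may be absent, but not both.
        if self.text is None and self.identifier is None:
            raise ValueError("a value needs text, an identifier, or both")


@dataclass
class Field:
    code: str     # the underlying code used by experts
    value: Value

    def label(self, lang: str) -> str:
        # Fall back to the bare code when no display form is known.
        return LABELS.get(self.code, {}).get(lang, self.code)


def to_json(fields):
    # One possible serialization; the content model does not depend on it.
    return json.dumps([asdict(f) for f in fields], sort_keys=True)


subject = Field("650", Value(text="Cataloging",
                             identifier="http://example.org/vocab/123"))
print(subject.label("de"), "/", subject.label("fr"))
print(to_json([subject]))
```

The point of the sketch is that the labels and the serialization are both separate from the content itself, so either can change without touching the data.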
> We've done this before. We had format integration in 1993/1994.
> And Henrietta Avram even admitted that her biggest mistake was
> creating different formats and then the authority format. If she
> had it all over to do again, she would have created the Authority
> MARC format and then a Bibliographic MARC format.
> Now to the limitations. I herewith make a proposal, and I should
> even be so bold as to say that this is something we need to take to
> MARBI. As for ISO 2709, let's change it, don't let it box you in.
> I propose:
> 1. Record Length. We'll need to adjust the Leader positions of
> 00-04, and move it to something much higher. Perhaps push bytes
> 05-23 further out. So we can reserve bytes
> 00-12 for record length (and bytes 05-23 become bytes 17-31). That
> gives you up to 9.999999 TB. That's one hell of a record. Do you
> think you have enough content for that large of a record? You can
> now include the actual printed book.
> 2. Expand the MARC record to have a 4 character numeric tag,
> starting with 0001 and continuing to 9999. That too is quite big,
> many fields repeated, and more fields to define. Oh boy can we
> define fields.
> 3. Indicator count. Again, expand it to 3. We may not use it, but
> let's get rolling.
> 4. Subfield code count. Again, expand it to 3. You can then tell
> the computer that after the "delimiter" ($), you have either a 1 or
> 2 byte subfield. I can see us using $aa $ab $ac (or if you go to a 4
> character count you could do something like $a-b $d-a or even
> $a$b $d$a). Or even a different delimiter sign as a secondary
> delimiter.
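
For readers who want to check the arithmetic in the proposal above, here is a rough sketch in Python. It assumes the usual MARC 21 leader layout (24 characters; record length in 00-04, indicator count in 10, subfield code count in 11, base address of data in 12-16). The function names and the sample leader string are made up for illustration.

```python
def max_record_length(digits: int) -> int:
    """Largest record length, in bytes, that fits in a decimal field
    of the given width."""
    return 10 ** digits - 1


def parse_leader(leader: str) -> dict:
    """Pull the structural counts out of a 24-character leader
    (layout as assumed in the text above)."""
    if len(leader) != 24:
        raise ValueError("expected a 24-character leader")
    return {
        "record_length": int(leader[0:5]),
        "indicator_count": int(leader[10]),
        "subfield_code_count": int(leader[11]),
        "base_address": int(leader[12:17]),
    }


# Made-up sample leader, for illustration only.
sample = "00714cam a2200205 a 4500"
print(parse_leader(sample))

print(max_record_length(5))    # today's 5-digit field (00-04)
print(max_record_length(13))   # the proposed 13-digit field (00-12)

# Numeric tag space, skipping the all-zero tag:
print(10 ** 3 - 1, "three-digit tags;", 10 ** 4 - 1, "four-digit tags")
```

Five digits cap a record at 99,999 bytes; thirteen digits allow 9,999,999,999,999 bytes, which is the 9.999999 TB figure in decimal terabytes. Note that in ISO 2709 each directory entry also carries a fixed-width field length and starting position, so those widths would presumably have to grow along with the leader.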
> So you want more content. I've just answered your question plain
> and simple, with little disruption to the current structure. We can
> easily write conversion programs to deal with current MARC records.
> We did something similar to this back in the 1999/2000 glitch. Most
> mainframes at that time stored only the last two digits of the year
> in the VSAM records. What was the answer? Well, spend trillions of
> dollars and throw out the mainframe and buy a Unix box. (Unix stores
> all 4 digits of the date, at least BSD and AIX.) Instead, what most
> people did was to address the VSAM record storage issue and expand
> it by 2 bytes. This was not an easy task but it was cheaper than
> buying new software. (Oh, yes, IBM was happy to sell you AIX--Sun
> told you that you needed to get off the mainframe--the Y2K was going
> to make you lose your hardware and it wasn't fixable.)
> Now what I've proposed is simple, straightforward, and most of our
> ILS vendors and OCLC could do this in a matter of months, maybe a
> year extra. We've just bought ourselves several decades of time
> until technology is so advanced we don't even need to worry about the
> printed word.
> I'm no luddite, but in my experience as a programmer, MARC works,
> xml is just crap. Every time I have to deal with it, I start
> charging customers more (in this case, I start to whine a lot at my
> place of business).
> Institutional Repositories have been using XML with limited success.
> In fact, DSpace software now allows you to contribute using an
> Excel spreadsheet because the XML coding is so difficult for the end
> We've stopped using XML here at YSU for DSpace contribution. It's
> Excel and then to Postgres. I'm finishing up a daemon to take an
> OCLC export and send it over to DSpace--directly to the Postgres
> database, skipping the XML part. Much simpler, and less work and
> our staff are much happier.
> I'm not trying to derail LC and its move; what I'm saying is think
> long and hard. This is a very expensive move and RDA will seem like
> peanuts--and we already know how much it is disliked by many in our
> Finally, I have to remind us all that we aren't even using all of
> the current MARC features, and we want to replace it. How do you
> know it needs to be replaced when you haven't even scratched the
> surface of seeing if we can enlarge it, restructure it, change it
> up? It was originally a communications format, not an end user
> input format. That said, I can't wait to see some poor cataloger
> given a blank OCLC screen to input an original and type in XML
> coding. Directors will really want to get rid of catalogers because
> this is really kludgy.
> I'm really glad we are having this conversation. It is long
> overdue. We need to continue the dialogue, with respect, and we
> need to begin asking the simple question "How do we know if MARC is
> dying if we haven't attempted to push it further?" And right now, I
> haven't seen a serious push to restructure and expand it.
> Best wishes in programming.
> --Jeff Trimble
>> I think we should pay less attention to the physical format of our
>> data and more to the CONTENT. I've been working on an analysis of
>> MARC content for a while as a kind of hobby. If we define
>> our content clearly, then we can choose a serialization (or two or
>> three) that simply carries our data, it doesn't define its
>> structure nor would it limit its growth.
>>  MARC as Data: A start. Code4lib journal.
>>  Futurelib wiki. MARC analysis.
> Jeffrey Trimble
> System Librarian
> William F. Maag Library
> Youngstown State University
> 330.941.2483 (Office)
> "For he is the Kwisatz Haderach..."
http://kcoyle.net