Hopefully, some food for thought.
The word "architecture" is omnipresent in the information technology
literature. However, the "Working Group for the Future of Bibliographic
Control" avoids it and chooses "framework" for its lead term - maybe
for some reason, maybe not. Of course, they are right in saying that
"an environment rather than a 'format'" is needed, and "environment"
may be seen as a more suitable metaphor than "building".
The paper then lists some "requirements for beginning the transition
to a 'new bibliographic framework'". That's all well and good.
What now? Go ahead and think hard about how to satisfy this requirement,
then that one and then the next one?
When you have to build something new, from the ground up and meant
to be sustainable, you need a principled approach. Of course, you
want to keep all your requirements in mind, but not rush ahead and
do something about this or that one at once.
Well, whatever else we think about it, what's needed is a new
information architecture for libraries and other metadata-producing
communities on the web.
As every architect learns, good architecture has to resolve a conflict
between three goals: (1)
(and I add a few pertinent subtopics to each)
Firmness (firmitas)
   Durability: Robustness against the elements and heavy use
   Security: Protection against attacks and abuse
   Extendability: Scaling up without coming down
Usefulness (utilitas)
   Cover all usage requirements
   Form supports function: Minimize effort of the user
   Flexibility: Openness for future requirements and new options
Elegance (venustas) [the hardest part]
   Appropriateness: As simple as possible, but not simpler (Einstein)
   Pleasing appearance: Well-proportioned arrangement
   Visionary design: Create something people can fall in love with
And then, today as always, there's an overarching need for economy
in all of this: No architect has unlimited means at his disposal.
Questions to be answered early on in the design stage:
1. For whom will it be home? That is, in this case,
What is it the building/framework will have to accommodate?
Topics: Objects, Data model, Data elements
2. What will the dwellers want to do? That is,
What functions will the building have to support?
Topics: Methods, Services (incl. housekeeping!)
3. What relations are there with the rest of the world?
How will it fit in with what's there and what we can't change?
Topics: External standards, Communication (Interfaces)
4. What materials and components are we going to use?
Standard elements, non-proprietary modules, encodings
Topics: Building blocks, modules, textures, surfaces
About the Data model:
Evaluate the status quo: Which are the most frequent data
elements, and which the indispensable ones? Which are not needed
(that is, not used in applications like OPACs), needed only very
rarely, or redundant? Which are inconsistent, unreliable, not
machine-actionable and therefore of questionable value in the
legacy data?
Full-fledged MARC21 with all its 11,008 elements will not be
called for in many tasks... (see http://marc21rdf.info/)
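Such an evaluation can start very simply, by counting in how many
records each element occurs. Here is a minimal Python sketch, assuming
the records have already been parsed into dictionaries keyed by field
tag. The sample records and the name element_frequencies are made up
for illustration; a real run would read an actual file of records.

```python
from collections import Counter

def element_frequencies(records):
    """Count, for each field tag, the number of records that contain it."""
    counts = Counter()
    for record in records:
        # A tag counts once per record, however often it repeats within it.
        counts.update(set(record))
    return counts

# Hypothetical sample: each record maps MARC-style tags to lists of values.
sample_records = [
    {"245": ["A title"], "100": ["An author"], "260": ["London, 1999"]},
    {"245": ["Another title"], "260": ["Oxford, 2005"], "650": ["Libraries"]},
    {"245": ["A third title"], "100": ["Someone else"]},
]

freq = element_frequencies(sample_records)
total = len(sample_records)
for tag, n in freq.most_common():
    print(f"{tag}: in {n} of {total} records")
```

Sorting by frequency gives a first candidate list for a core element
set. Elements near the bottom are the ones to question.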
About the Interfaces:
What are the most frequently asked question types, and what
would the answers have to look like?
Simple tasks must be easy to accomplish.
Based on this, a few suggestions:
Instead of a top-down, maximalist approach that tries to start with
establishing a most comprehensive and detailed element set, may we not
try it bottom-up and start with those elements that the evaluation
reveals to be the most frequent ones? And then proceed in stages.
Unlike real buildings, systems can be taken apart and rearranged.
Nonetheless, a bottom-up approach can run into a dead end.
What about the scenarios?
Might we not, instead of pursuing the full Scenario 1 with its
problematic 3-tier record structure, define a simple extension of
the conventional manifestation-based record, using a structured
modification of the uniform title field? We might have subfields
$E for expression and $M for manifestation, both with standardized
content that would make it easy to arrange records by expression
and expressions by manifestation.
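To illustrate how such subfields could be put to use, here is a small
Python sketch. It assumes each record carries its uniform title plus
the proposed $E and $M values as plain strings. The record layout, the
sample values and the function name are invented for this example;
what the "standardized content" should look like is exactly what would
still have to be defined.

```python
from itertools import groupby

def arrange(records):
    """Group records by uniform title, then by $E, sorted by $M within each."""
    ordered = sorted(records, key=lambda r: (r["title"], r["E"], r["M"]))
    result = {}
    # groupby needs its input sorted on the grouping key, which it is.
    for (title, expr), group in groupby(ordered, key=lambda r: (r["title"], r["E"])):
        result.setdefault(title, {})[expr] = [r["M"] for r in group]
    return result

# Hypothetical records: "E" and "M" stand for the proposed subfields.
records = [
    {"title": "Hamlet", "E": "German translation", "M": "1990 paperback"},
    {"title": "Hamlet", "E": "English text", "M": "2001 hardcover"},
    {"title": "Hamlet", "E": "English text", "M": "1980 paperback"},
]

for title, expressions in arrange(records).items():
    print(title)
    for expr, manifestations in expressions.items():
        print("  ", expr, "->", ", ".join(manifestations))
```

The point is only that two flat subfields are enough for this kind of
arrangement. No 3-tier record structure is involved.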
A simple web service
Let's imagine a web service that can answer a few elementary
questions and deliver data that are easy to handle.
We have to realize that outside libraries there is nothing that
could be called a standard for bibliographic data. Situations
that require bibliographic data are very diverse and hardly any
will need a web service response with all the frills of a
Here's an example service for simple tasks:
based on the "liberated" data of the British National Bibliography
(1950 - mid2010, some 3 million records,
There are two variable parts following the question mark:
Part 1: findcmd = Find-Command
isb=ISBN, e.g., isb=978-1-443-82375-3
bnb=BNB-Number, e.g., bnb=B052781
explain : Get a list of options
Part 2: fields = Fields or format wanted
Either a format:
marc [default if sho=... omitted]
Or a comma separated list of field names:
(each of these keywords may be truncated down to the first three
characters, case doesn't matter: Aut=aut=AUT)
Main title [MARC 245$a]
Creator [personal or corp.]
Date of publication [as recorded]
Year [4 digits]
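The truncation rule for field names can be sketched in a few lines of
Python. This is not the service's actual code, just one way the rule
could work. The full keyword spellings in KEYWORDS are my assumption,
since only shortened forms like Tit, Auth, Dat and Subj appear here.

```python
# Assumed full keyword names; only abbreviated forms are documented.
KEYWORDS = ["Title", "Author", "Date", "Year", "Subject"]

def resolve_field(name):
    """Return the full keyword matching a possibly truncated name, or None."""
    wanted = name.strip().lower()
    if len(wanted) < 3:
        return None  # at least the first three characters are required
    for keyword in KEYWORDS:
        if keyword.lower().startswith(wanted):
            return keyword
    return None

def parse_field_list(value):
    """Split a comma-separated field list and resolve each entry."""
    return [resolve_field(part) for part in value.split(",")]

print(parse_field_list("Tit,Auth,Dat,Subj"))
```

With three characters as the minimum, the keywords only need to differ
within their first three letters to stay unambiguous.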
Try, for example,
An ISBN may be entered with 10 or 13 digits, with or without hyphens,
with or without check digit.
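Accepting all these spellings means reducing them to one canonical
form before the lookup. A sketch of how that might be done in Python
(the function name is made up, and the real service may well work
differently): strip the hyphens, drop any check digit, add the 978
prefix to 10-digit numbers, and recompute the ISBN-13 check digit.

```python
def normalize_isbn(text):
    """Reduce any accepted ISBN spelling to 13 digits, or return None."""
    s = text.replace("-", "").replace(" ", "").upper()
    if len(s) in (10, 13):
        s = s[:-1]          # drop the check digit; it is recomputed below
    if len(s) == 9:
        s = "978" + s       # ISBN-10 body: add the 978 prefix
    if len(s) != 12 or not s.isdigit():
        return None
    # ISBN-13 check digit: weights 1 and 3 alternate over the 12 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(s))
    return s + str((10 - total % 10) % 10)

for spelling in ("978-1-443-82375-3", "978144382375", "1443823759", "144382375"):
    print(spelling, "->", normalize_isbn(spelling))
```

Note that this sketch does not verify a check digit that was supplied;
it simply discards it. A stricter service would compare and reject
mismatches.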
Instead of &sho=Tit,Auth,Dat,Subj use &sho=marc
or one of the other formats.
For non-existent ISBNs or BNBs, the response is "000"
You may use cURL or gethttp to get the service results outside a
browser, ready to use in various applications:
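From a script, the same can be done with a few lines of Python and
nothing but the standard library. This is a sketch under assumptions:
BASE_URL is a placeholder, not the real address of the service, and
the response is assumed to be UTF-8 text. The parameter names isb, bnb
and sho and the "000" reply are as described above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical address; substitute the real service URL.
BASE_URL = "http://example.org/bnb"

def build_url(command, value, show=None):
    """Assemble the query: find command first, then the optional sho= part."""
    params = [(command, value)]
    if show:
        params.append(("sho", show))
    # safe="," keeps the commas of the field list readable in the URL.
    return BASE_URL + "?" + urlencode(params, safe=",")

def interpret(body):
    """Return the response text, or None for the "000" not-found reply."""
    text = body.strip()
    return None if text == "000" else text

def fetch(command, value, show=None):
    """Send the request and interpret the reply (needs network access)."""
    with urlopen(build_url(command, value, show)) as response:
        # Assumption: the service answers in UTF-8.
        return interpret(response.read().decode("utf-8", errors="replace"))

print(build_url("isb", "978-1-443-82375-3", "Tit,Auth,Dat,Subj"))
```

A caller would then write something like
fetch("bnb", "B052781", "marc") and test the result for None.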
This service can be extended to allow more complex search commands
and the handling of result sets rather than single records. It may,
unbeknownst to the user, combine results from more than one database.
(I'm aware the theoretical potential of Z39.50 is much higher, but we
have to ask why this potential has been exploited so poorly.)
(1) For example, according to the advice of Vitruvius. His original
terms were firmitas, utilitas, venustas (chp. I.3,2):