Hopefully, some food for thought.

The word "architecture" is omnipresent in the information technology
literature. However, the "Working Group for the Future of Bibliographic
Control" avoids it and chooses "framework"  for its lead term - maybe
for some reason, maybe not. Of course, they are right in saying that
"an environment rather than a 'format'" is needed, and "environment"
may be seen as a more suitable metaphor than "building".
The paper then lists some "requirements for beginning the transition
to a 'new bibliographic framework'". That's all well and good.
What now? Go ahead and think hard how to satisfy this requirement,
then that one and then the next one?
When you have to build something new, from the ground up and meant
to be sustainable, you need a principled approach. Of course, you
want to keep all your requirements in mind, but not rush ahead and
do something about this or that one at once.

Well, whatever else we think about it, what's needed is a new
information architecture for libraries and other metadata producing
communities on the web.

As every architect learns, good architecture has to resolve a conflict
between three goals: (1)
(and I add a few pertinent subtopics to each)

Stability
   Durability: Robustness against the elements and heavy use
   Security: Protection against attacks and abuse
   Extendability: Scaling up without coming down

Usefulness
   Cover all usage requirements
   Form supports function: Minimize effort of the user
   Flexibility: Openness for future requirements and new options

Elegance [the hardest part]
   Appropriateness: As simple as possible, but not simpler (Einstein)
   Pleasing appearance: Well-proportioned arrangement
   Visionary design: Create something people can fall in love with

And then, today as always, there's an overarching need for economy
in all of this: No architect has unlimited means at his disposal.


Questions to be answered early on in the design stage:

1. For whom will it be home? That is, in this case,
    What is it the building/framework will have to accommodate?
       Topics: Objects, Data model, Data elements

2. What will the dwellers want to do? That is,
    What functions will the building have to support?
       Topics: Methods, Services (incl. housekeeping!)

3. What relations are there with the rest of the world?
    How will it fit in with what's there and what we can't change?
       Topics: External standards, Communication (Interfaces)

4. What materials and components are we going to use?
    Standard elements, non-proprietary modules, encodings
       Topics: Building blocks, modules, textures, surfaces

Related considerations
About the Data model:
   Evaluate the status quo: Which are the most frequent data
   elements, and which are the indispensable ones? Which are never
   or only very rarely needed (used in applications like OPACs),
   and which are redundant? Which are inconsistent, unreliable, or
   not machine-actionable, and therefore of questionable value in
   the legacy data?
   Full-fledged MARC21 with all its 11,008 elements will not be
   called for in many tasks...  (see  http://marc21rdf.info/)
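
   Such an evaluation could begin with a plain count. What follows is
   only a minimal sketch in Python, assuming every record has already
   been reduced to a list of element designators like "245$a". The
   function name element_frequency and the three sample records are
   invented for illustration; they do not come from any real file.

```python
from collections import Counter

def element_frequency(records):
    """Return, for each element, the share of records that contain it."""
    total = len(records)
    if total == 0:
        return {}
    counts = Counter()
    for record in records:
        # Count an element once per record, however often it repeats.
        counts.update(set(record))
    # most_common() puts the most frequent elements first.
    return {element: n / total for element, n in counts.most_common()}

# Hypothetical sample: each record is just the list of elements it uses.
sample = [
    ["245$a", "100$a", "260$c", "020$a"],
    ["245$a", "260$c", "650$a", "650$a"],
    ["245$a", "100$a", "260$c"],
]

for element, share in element_frequency(sample).items():
    print(f"{element}  {share:.0%}")
```

   Elements near the top of such a list are candidates for a first
   stage; those near the bottom are candidates for a closer look.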

About the Interfaces:
   What are the most frequently asked question types, and what
   would the answers have to look like?
   Simple tasks must be easy to accomplish.

Based on this, a few suggestions:

Instead of a top-down, maximalist approach that tries to start with
establishing a most comprehensive and detailed element set, may we not
try it bottom-up and start with those elements that the evaluation
reveals to be the most frequent ones? And then proceed in stages.
Unlike real buildings, systems can be taken apart and rearranged.
Nonetheless, a bottom-up approach can run into a dead end.

What about the scenarios?
Might we not, instead of pursuing the full Scenario 1 with its
problematic 3-tier record structure, define a simple extension of
the conventional manifestation-based record, using a structured
modification of the uniform title field? We might have subfields
$E for expression and $M for manifestation, both with standardized
content that would make it easy to arrange records by expression
and expressions by manifestation.
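
To show how little machinery such an arrangement would need, here is
a rough sketch in Python. It is one possible reading of the idea, not
a specification: the key names "uniform_title", "E" and "M" and all
sample values are invented placeholders, since the standardized
content of $E and $M remains to be defined.

```python
from itertools import groupby

# Invented sample data: the $E and $M values are placeholders only.
records = [
    {"uniform_title": "Hamlet", "E": "ger", "M": "1998"},
    {"uniform_title": "Hamlet", "E": "eng", "M": "2003"},
    {"uniform_title": "Hamlet", "E": "eng", "M": "1992"},
]

def expression_key(record):
    """Records sharing a uniform title and $E belong to one expression."""
    return (record["uniform_title"], record["E"])

def arrange(records):
    """Group records by expression, sorting by $M within each group."""
    ordered = sorted(records, key=lambda r: (expression_key(r), r["M"]))
    return {
        key: [r["M"] for r in group]
        for key, group in groupby(ordered, key=expression_key)
    }

for (title, expression), manifestations in arrange(records).items():
    print(title, expression, manifestations)
```

A plain sort on two standardized subfields is all it takes here; no
separate record tiers are involved.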

A simple web service
--------------------
Let's imagine a web service that can answer a few elementary
questions and deliver data that are easy to handle.
We have to realize that outside libraries there is nothing that
could be called a standard for bibliographic data. Situations
that require bibliographic data are very diverse and hardly any
will need a web service response with all the frills of a
MARC21 record.

Here's an example service for simple tasks:
   http://www.allegro-c.de/db/bl/?findcmd&sho=fields
based on the "liberated" data of the British National Bibliography
(1950 - mid-2010, some 3 million records,
  http://www.bl.uk/bibliographic/datafree.html)
There are two variable parts following the question mark:

Part 1: findcmd = Find-Command
  isb=ISBN,       e.g., isb=978-1-443-82375-3
  bnb=BNB-Number, e.g., bnb=B052781
  explain : Get a list of options

Part 2: fields = Fields or format wanted
  Either a format:
    marc  [default if sho=... omitted]
    isbd
    dc
    endnote
    tagged
    bibtex
  Or a comma separated list of field names:
    (each of these keywords may be truncated down to the first three
     characters, case doesn't matter: Aut=aut=AUT)
    Title
    Main title   [MARC 245$a]
    Author    [personal]
    Creator   [personal or corp.]
    Publisher
    Date of publication  [as recorded]
    Year    [4 digits]
    Imprint
    Series
    Extent
    Notes
    ISBN
    Dewey
    Subject
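
  The truncation rule is easy to implement. The following Python
  sketch shows one way a keyword could be resolved; it assumes plain
  case-insensitive prefix matching with a minimum of three characters
  and is not taken from the actual service. The names FIELD_NAMES,
  resolve_field and resolve_fields are made up for this illustration.

```python
# Field names as listed above; the matching logic is an assumption.
FIELD_NAMES = [
    "Title", "Main title", "Author", "Creator", "Publisher",
    "Date of publication", "Year", "Imprint", "Series", "Extent",
    "Notes", "ISBN", "Dewey", "Subject",
]

def resolve_field(keyword):
    """Map a possibly truncated keyword (e.g. 'aut') to its field name."""
    key = keyword.strip().lower()
    if len(key) < 3:
        raise ValueError(f"keyword too short: {keyword!r}")
    matches = [name for name in FIELD_NAMES if name.lower().startswith(key)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous keyword: {keyword!r}")
    return matches[0]

def resolve_fields(sho):
    """Resolve a comma separated list such as 'Tit,Year,Auth'."""
    return [resolve_field(part) for part in sho.split(",")]

print(resolve_fields("Tit,Year,Auth,Impr,Subj"))
```

  With the list above, the first three characters of every field name
  are unique, so three characters are always enough.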

  Try, for example,
 
http://www.allegro-c.de/db/bl/?isb=978-3-593-38817-5&sho=Tit,Year,Auth,Impr,Subj 


ISBNs may be entered with 10 or 13 digits, with or without hyphens, and
with or without the check digit.
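
Accepting all these variants takes only a few lines. The Python sketch
below is not the code of the service; it merely assumes that 9 or 12
digits mean "check digit omitted" and uses the standard ISBN-10
(modulus 11) and ISBN-13 (modulus 10) check digit rules. The function
names are invented for this example.

```python
def isbn10_check(body9):
    """Check character for the first 9 digits of an ISBN-10."""
    total = sum((10 - i) * int(d) for i, d in enumerate(body9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

def isbn13_check(body12):
    """Check digit for the first 12 digits of an ISBN-13."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body12))
    return str((10 - total % 10) % 10)

def normalize_isbn(raw):
    """Strip hyphens and spaces, then add or verify the check digit."""
    s = raw.replace("-", "").replace(" ", "").upper()
    if len(s) in (9, 10):
        body, check = s[:9], isbn10_check
    elif len(s) in (12, 13):
        body, check = s[:12], isbn13_check
    else:
        raise ValueError(f"unexpected length: {raw!r}")
    if not (body.isascii() and body.isdigit()):
        raise ValueError(f"non-digit characters: {raw!r}")
    full = body + check(body)
    # A check digit that was supplied must agree with the computed one.
    if len(s) in (10, 13) and s != full:
        raise ValueError(f"check digit mismatch: {raw!r}")
    return full

print(normalize_isbn("978-3-593-38817-5"))
print(normalize_isbn("978359338817"))
```

Both calls yield the same normalized form, which can then serve as the
search key.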

Instead of  &sho=Tit,Auth,Dat,Subj   use  &sho=marc
or one of the other formats.

For non-existent ISBNs or BNB numbers, the response is "000".

You may use cURL or gethttp to get the service results outside a
browser, ready to use in various applications:
    curl "http://allegro-c.de/db/bl/?isb=...&sho=..."
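
The same call is easily made from a program. Here is a minimal sketch
using only the Python standard library. It assumes the URL pattern
shown above and that a response body of just "000" means "not found";
the function names build_url, interpret and lookup are my own for this
illustration and are not part of the service.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://www.allegro-c.de/db/bl/"

def build_url(find_key, find_value, sho=None):
    """Build a request URL, e.g. build_url('isb', '978-...', 'Tit,Year')."""
    params = {find_key: find_value}
    if sho:
        params["sho"] = sho
    # Keep commas readable; urlencode escapes everything else as needed.
    return BASE_URL + "?" + urlencode(params, safe=",")

def interpret(text):
    """Return None for the '000' (not found) response, else the text."""
    return None if text.strip() == "000" else text

def lookup(find_key, find_value, sho=None, timeout=10):
    """Fetch one record; needs network access, so it is not called here."""
    url = build_url(find_key, find_value, sho)
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return interpret(response.read().decode(charset, errors="replace"))

print(build_url("isb", "978-3-593-38817-5", "Tit,Year,Auth,Impr,Subj"))
```

An application would call lookup() and treat a result of None as
"no such record"; anything else is the data in the requested format.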

This service can be extended to allow more complex search commands
and the handling of result sets rather than single records. It may,
unbeknownst to the user, combine results from more than one database.
(I'm aware that the theoretical potential of Z39.50 is much higher, but
we have to ask why this potential has been exploited so poorly.)

-----------------------------------------------------------------------
(1) For example, according to the advice of Vitruvius. His original
terms were firmitas, utilitas, venustas (chp. I.3,2):
http://penelope.uchicago.edu/Thayer/E/Roman/Texts/Vitruvius/home.html