I'm not sure I've ever given a larger perspective on a lot of the
issues I raise on this list, so let me do that now.
Long story short, then:
I'm not a professional software developer, but rather a scholar who got
so frustrated with using Word and Endnote for my dissertation that I
looked around for alternatives. When I realized there really weren't
adequate ones, I decided to create one. I believe MODS (and perhaps
MADS) will provide an important standard to make this happen.
I'm now the co-project lead for the OpenOffice bibliographic project,
which will provide dramatically improved bibliographic support in the
most widely used free software suite. My goal is not only a free
alternative to Word and Endnote, but a superior one.
Where MODS fits in: we would like to standardize on MODS as the new
bibliographic representation, not only in OpenOffice the suite, but in
the file format on which it is based: OpenDocument.
OpenDocument is developed by an OASIS Technical Committee that includes
members from a variety of projects and companies, including IBM most
recently. The file format is headed for ISO approval.
Our proposal will be to define MODS as *the* supported bibliographic
format in OpenDocument. So if a user has citations in their document
and saves it, a modsCollection file (probably called
"bibliography.xml") will be embedded in the file wrapper directory. If
that user -- using, say, OpenOffice -- sends their document to a
colleague using some other supporting application, the idea is they
would be able to interoperate because the bib format is standardized
too.
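
As a rough illustration of the packaging idea (a sketch only, not the
actual OpenOffice or OpenDocument implementation), the Python below
writes a modsCollection into a zip wrapper as "bibliography.xml" and
reads it back the way a second application might. The function names
and the placeholder "content.xml" payload are hypothetical, and a real
OpenDocument package needs more than this (a mimetype entry and a
manifest, for example), which I've left out.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# MODS version 3 namespace.
MODS_NS = "http://www.loc.gov/mods/v3"


def build_mods_collection(titles):
    """Build a minimal modsCollection with one mods record per title."""
    root = ET.Element(f"{{{MODS_NS}}}modsCollection")
    for title in titles:
        mods = ET.SubElement(root, f"{{{MODS_NS}}}mods")
        title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
        ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    return ET.tostring(root, encoding="utf-8")


def write_package(stream, content_xml, bibliography_xml):
    """Write a simplified zip wrapper (hypothetical layout)."""
    with zipfile.ZipFile(stream, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("content.xml", content_xml)
        # The file name proposed above; the final name is not settled.
        pkg.writestr("bibliography.xml", bibliography_xml)


def read_bibliography_titles(stream):
    """What another supporting application might do on open."""
    with zipfile.ZipFile(stream) as pkg:
        root = ET.fromstring(pkg.read("bibliography.xml"))
    return [t.text for t in root.iter(f"{{{MODS_NS}}}title")]


if __name__ == "__main__":
    buf = io.BytesIO()
    write_package(buf, b"<doc/>", build_mods_collection(["First", "Second"]))
    print(read_bibliography_titles(buf))
```

The point is only that the receiving application needs nothing beyond
a zip reader and knowledge of MODS to get at the citations.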
This is why I remain worried about two primary issues:
1) Names.
If I cannot include more than one representation of a name in MODS,
then I cannot reasonably support authors who work across languages.
Using the extension element for this is not enough, because
interoperability then becomes much more fragile.
2) Markup within titles and abstracts.
The most basic example is titles-within-titles (which could be
supported with the addition of a single element like xhtml:span), but the
following more tricky example seems to me equally important long-term.
>> I'm going to continue lobbying the LoC to allow some kind of
>> additional markup in title fields, but it may be difficult because it
>> can be a can-of-worms (e.g. why not just allow MathML?).
>
> My question is why not support mathml instead of just hand selecting
> some of the features? I understand that it will be harder for you to
> implement but at least that's a well known standard albeit not widely
> spread.
>
> You see, my problem is that where i work professors write papers that
> sometimes contains some mathematics in the title. Usually nothing too
> fancy, for instance (in latex) $H_\infty$. However in the abstracts we
> have some more complicated math in there.
>
> Could you please take that into consideration?
I do realize this is a can-of-worms, but I dread the idea of having to
write and maintain our own schema just to be able to support this
(which has been supported in BibTeX since the beginning; if we won't
have an XML bibliographic representation that can do this, the Math and
Physics people will never have any interest in it).
One option is perhaps to do something like Atom does, where you define
a few elements that allow additional foreign markup, and indicate what
that markup is with a content attribute so supporting applications know
if they can display it. I suppose it could be embedded in the
extension element if necessary, but it would be something like:
<title content="application/xhtml+mathml">blah, blah, blah</title>
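
To show how a supporting application might act on such an attribute,
here is a small Python sketch. It is an assumption-laden illustration,
not anything MODS or Atom defines: the render_title function, the
"text/plain" default, and the sample span markup are all hypothetical.
If the application recognizes the declared content type it keeps the
embedded markup; otherwise it falls back to the plain text.

```python
import xml.etree.ElementTree as ET


def render_title(title_xml, supported_types):
    """Return the title's inner markup if its declared content type is
    supported, else a plain-text fallback with the markup stripped."""
    elem = ET.fromstring(title_xml)
    # Hypothetical default when no content attribute is present.
    content_type = elem.get("content", "text/plain")
    if content_type in supported_types and content_type != "text/plain":
        # Serialize the children (each with its trailing text) back to markup.
        inner = elem.text or ""
        inner += "".join(ET.tostring(child, encoding="unicode") for child in elem)
        return inner
    # Unknown or plain content: keep only the text.
    return "".join(elem.itertext())


if __name__ == "__main__":
    sample = ('<title content="application/xhtml+mathml">'
              'On <span>H</span> control</title>')
    print(render_title(sample, {"application/xhtml+mathml"}))
    print(render_title(sample, {"text/plain"}))
```

An application that doesn't understand the markup still gets a usable
title, which is the interoperability property I care about.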
Bruce