Thanks for the clarifications, guys!

On Mon, 14 Jul 2003 12:29:03 -0400, "Jerome McDonough"
<[log in to unmask]> said:

<snip>

> So, you can map MODS losslessly into an RDBMS, but if the
> main point of the exercise is to make it easy to search, you shouldn't
> bother; use an XML-based text retrieval engine.  If that's *not* the main point,
> and you're more concerned about data management, then an RDBMS may
> be the way to go.

<snip>

> Given all of the above, and your description of your project, I'd
> seriously consider just keeping the bibliographic data you're storing as a series
> of XML records in a single file.

I explained this in a private follow-up to Rick, but am thinking I should
probably do so here in case anyone might be interested.

I have big plans that involve replacing BibTeX and the binary data and
formatting files produced by commercial applications like EndNote with
XML data files and XSLT formatting files.  In other words, I want the
power of XML and of MODS available not just to librarians and such, but
to end users like me (a social scientist who does a fair bit of
non-standard citing for academic articles and books), to college
students, etc.

The most evolved project (well, really the only one!) that deals with
these needs using XML and XSLT is refdb:

http://refdb.sourceforge.net

It uses RDBMSs for storage, and currently supports MySQL, PostgreSQL, and
SQLite.  This itself offers some nice flexibility.  It supports
multi-user setups in large organizations that need them, while also
allowing embedded options that are easy to use and administer.
Because of the client-server design, it also in theory allows other
projects to use refdb as their engine.

At the moment, the Open Office project has a bibliographic subproject.
That project is currently in the design stages, and one participant (who
is now on this list) has been working on a data model for it.  I have
argued that he really needs to look at MODS, but we're a little unsure
about how to go about supporting it (I myself know nothing about DB
design).

What I want to see is that project adopt refdb as its data storage and
formatting engine.  I also want to see the refdb data model and the XSLT
files that format these records enhanced to use MODS (or something very
close to MODS) records.  Doing so will help create an open source
foundation to significantly enhance bibliographic support for students
and researchers.

What you all are saying is that there are tradeoffs here.  So which is
the least painful choice (in terms of development work needed -- because
there are very few contributors to these projects, me included, who can
actually program -- but also features, and user experience)?  I am
confident MODS + XSLT is the solution to some real problems, but the
question of data storage is still open.

I do think it's important for whatever next-generation bib tools emerge
to support a few thousand records gracefully, and to have an option
for multi-user access at some point at least.

So does your answer change given this?  Or should we be looking at XML
databases?  I also read somewhere there are a few Java tools that
simplify dealing with XML data in the context of RDBMSs, but I know
nothing more.
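
To make the tradeoff concrete, here is a minimal sketch of one possible
middle path: keep each MODS record intact as XML inside an embedded
RDBMS (SQLite), and copy only the title out into its own indexed column
for searching.  This is an illustration only, not refdb's actual schema
or API.  The table name "record" and the functions open_store,
add_record, and find_by_title are hypothetical, and the sample record is
made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

# MODS version 3 namespace, as published by the Library of Congress.
MODS_NS = {"m": "http://www.loc.gov/mods/v3"}

# A made-up record, for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>An Example Article</title></titleInfo>
  <name type="personal"><namePart>Doe, Jane</namePart></name>
</mods>"""


def open_store(path=":memory:"):
    """Open (or create) a hypothetical record store in SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS record ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT,"              # extracted copy, used only for searching
        " mods_xml TEXT NOT NULL)"  # the full MODS record, stored unchanged
    )
    conn.execute("CREATE INDEX IF NOT EXISTS record_title ON record(title)")
    return conn


def add_record(conn, mods_xml):
    """Store one MODS record and return its row id."""
    root = ET.fromstring(mods_xml)
    # Take the first title found; an empty string if the record has none.
    title = root.findtext("m:titleInfo/m:title", default="", namespaces=MODS_NS)
    cur = conn.execute(
        "INSERT INTO record (title, mods_xml) VALUES (?, ?)",
        (title, mods_xml),
    )
    conn.commit()
    return cur.lastrowid


def find_by_title(conn, fragment):
    """Return the full MODS XML of every record whose title contains fragment."""
    rows = conn.execute(
        "SELECT mods_xml FROM record WHERE title LIKE ?",
        (f"%{fragment}%",),
    )
    return [row[0] for row in rows]
```

Because the XML is stored unchanged, the records returned by
find_by_title could be handed straight to an XSLT formatting step.  The
cost is that anything not copied into a column (names, dates, and so on)
cannot be searched without parsing every record.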

If anyone has any further advice (or wants to help in some other way),
please keep it coming.

Thanks,
Bruce

