Thanks to Liz for being the person to finally put her neck forward and
say these things--and to Bill Landis, Jim Cross and others for getting it
started!

Clay Redding at AIP has established a message board where anyone
with programming skills who is interested in contributing to such a
project as Liz outlines--as I am--can get more information.  I've been in
touch with Clay and he will be posting the address soon.

Any takers on this project???

I personally think something along the lines of an "EAD Lite" would do the
profession a tremendous service.  As some of you know I beat the drum for
this at SAA, but it fits in well with Liz's suggestions.  It could be a
limited subset of the master DTD.  If it were fairly constrained along
the lines which Liz outlines, it would work hand in glove with an
application or applications specifically tailored to do EAD from markup to
publishing to searching.

Why should each institution have to reinvent the wheel for every
descriptive project?  The DTD is _way_ too complex for most uses and
actually impedes a reasonable level of ease of use and interoperability.
(Just as an aside, I am working on an OAI project, and it will be very
difficult to extract decent metadata due to the types of "tag abuse" Liz
mentioned.  This takes place even at the top level of the finding aid.
And even without tag abuse, it will be difficult to return good results
since people encode dates and names differently, among other problems.
As Kris Keisling has pointed out, EAD provides no shortage of metadata,
but it is extremely difficult to find the "hook" to actually catch it.)

Chris Prom
Assistant University Archivist
University of Illinois Archives
Room 19 Library (MC-522)
1408 W. Gregory Drive
Urbana, IL 61801

web:    http://www.library.uiuc.edu/ahx/
e-mail: [log in to unmask]
phone:  217 333 0798
fax:    217 333 2868

On Mon, 17 Sep 2001, Elizabeth Shaw wrote:

> As I watch the traffic on the listserv regarding XSL and the EAD
> cookbook, I am increasingly concerned that we are losing sight of what
> EAD could provide to the broader archival community as well as
> individual repositories.
>
> Before I start ranting about anything I would like to say that Michael
> provided a marvelous starting point with the EAD cookbook, giving
> assistance to get over a technological hurdle. I doubt he would disagree
> if I say that his work is a beginning and not the end.
>
> Ideally, EAD should be a means to provide structural and semantic
> markup of archival description. It has always concerned me that it is
> a loosely structured set of markup, trying to accommodate everyone's
> idea of the way description should be *presented*. And even as there
> are descriptions of what belongs within which tags, the interpretation
> across archival repositories varies. As a technologist, whose role it
> has been to manipulate markup, EAD's highly lax structure has made it
> more difficult to mine what could be a very rich descriptive
> structure. In fact, I would argue that its laxness has actually
> confounded people's ability to modify it to their own descriptive needs
> by inhibiting the very commonalities that it was developed to promote.
> Whatever you care to say about MARC/AACR2, you know what
> you are getting when you retrieve the 245 field.
>
> I don't believe that the archival community will ever be able to fully
> capitalize on the power of SGML/XML unless it can come to some more common
> and broadly held understandings of the nature of archival description. No
> matter how much markup is inserted into a descriptive document, the
> potential to fully exploit the markup will be limited without that common
> understanding. In addition, in some of the discussions of description to
> which I have been privy, I have heard a lack of distinction between what
> is commonly held to be the important elements of description and their
> final *presentation*. This has led to unfruitful arguments about
> description. It often seems from the perspective of this programmer that
> some of the arguments that led to a lax DTD have really been about what
> the presentation product (i.e., formatting) looks like rather than
> fundamental descriptive practice. You can make anything look like a
> "table" using XSLT - it may be more useful to capture the "meaning" of the
> information rather than its format. Then it can be shared across
> repositories.
>
> One of the most difficult hurdles in understanding the power of using
> XML as a document markup tool is that we can largely separate content
> from presentation/formatting. It was certainly a hurdle for
> me. Absorbing the idea that I could take information that was ordered
> one way in a document and rearrange it for a variety of displays, that
> I needn't worry about what was bold or italicized (that I should
> instead worry what the information was "about"), took a while.
>
>
> With the advent of numerous tools such as XSL(T) to manipulate XML,
> some of the laxness of the DTD that was built in to accommodate widely
> varying *formatting* practices is now irrelevant. From a single source
> document one can generate multiple versions of a finding aid. Indeed,
> one can rearrange the information contained within an EAD document in
> any order including putting the eadheader information at the very
> bottom of the document if one so desires. Allowing a loose structure
> actually confounds our ability to share documents across
> repositories. And without certain structural and markup commonalities
> it is more difficult to build commonly shared processing tools,
> including things such as stylesheets because of the infinite
> variations of the original documents. Were the descriptive and markup
> practices more constrained, building these tools with good user
> interfaces would be greatly simplified - thereby obviating the need for
> every archivist to learn the ins and outs of XSLT.
>
> With the development of manipulative tools, we could accommodate vastly
> different presentation styles (if we desire that) while sharing a
> common, consistent descriptive and encoding practice. Common encoding
> and description would also allow us to build search tools that can
> take full advantage of the rich information contained within finding
> aids across collections.
>
> On the other hand, this leads me to another observation. With increasing
> concern I have seen people writing their finding aids to accommodate
> Michael's stylesheets because they don't have the ability to modify
> them. I doubt that was his intent. And in fact, in at least one query
> that I have seen, it has led to what is called, in other SGML/XML
> communities, "tag abuse". This is the inappropriate use of
> tags (elements) in order to meet formatting or stylistic needs rather
> than encoding the meaning/semantics/structure of the document. If
> people start encoding their container lists so that they will look
> nice when using the cookbook's stylesheets, they have missed one of
> the most important opportunities of encoding the finding aids in EAD
> in the first place - that is to reflect the intellectual structure and
> hierarchy of the collection. If one's only purpose is to make a "good
> looking" finding aid for the web, one might as well skip the arduous
> process of encoding it in EAD and encode it in HTML.
>
> But clearly this misses the opportunity of EAD. XML can allow us to
> share description across collections. But it can also allow us, in
> individual repositories, to create single source documents, which,
> through manipulations such as an XSLT transformation to HTML (and
> XSL/FO to PDF), can provide multiple views of the same
> information.
>
> Indeed, were we to agree on some common descriptive/encoding practices we
> could build EAD specific tools, shared across repositories that would
> enable us to automatically generate MARC records, reading room
> versions of finding aids and a variety of other versions. These tools
> would simplify the management of description rather than make it more
> onerous. I currently see archives reproducing their descriptive
> information in a variety of forms.
>
> Indeed, I would argue that what the archival community should focus on
> is developing a common markup practice based on a common rich
> descriptive practice.  If repositories hold a common understanding of
> the content of the elements and could agree on a common markup
> practice the machine manipulation of the documents would be greatly
> simplified - indeed almost trivial. Tools that can be adapted, rather
> than blindly implemented, would be easier to build on a common set of
> markup practices. Each repository could display that information in
> its own unique way but rely on the common tools for things such as
> MARC transformations, searching across collections of finding aids,
> and to provide adaptable templates for display.
>
>
> I take to heart Bill's concern that we really don't understand what
> information is useful to our users. However, I would argue the
> opposite - that XSLT and other XML manipulation tools provide an
> incredible opportunity to discover precisely what we do not know about
> users. A good user study might take a richly encoded description of
> collections and display the same information in a variety of ways. An
> analysis of what patrons find most useful would lead to a better
> understanding of descriptive practice and presentation of
> information. So, in fact, XSL provides a wonderful opportunity in this
> arena.
>
> Finally, as someone who has worked with SGML/XML for several years on
> the programming end of things and someone who has trained many folks
> to encode finding aids, I have long been interested in building a
> suite of tools that would be EAD specific. They would make things such
> as creating and editing EAD instances and modifying XSLT stylesheets
> and XSL/FO more transparent and simpler for archivists who need to
> focus on describing collections rather than encoding their
> descriptions. I am not convinced that every archivist needs to
> understand all the complexities of encoding documents in the longer
> term. Dynamic web forms and GUI interfaces could be created that would
> enable the simplification of the process. Any effort to do this at
> this point will be repository specific because consistent encoding
> practices are needed in order to simply build such tools. There is no
> doubt that to effectively share tools across repositories would
> require that some idiosyncratic descriptive practices be retired. But
> that does not mean that we have to give up on idiosyncratic display
> and presentation!
>
> I, and others who have been thinking about these issues, have
> hesitated. We can build tools that meet our institutions' practices
> but they will be of little use to the larger community, if our own
> practices are idiosyncratic. And they require significant effort. The
> payoff would be much greater to everyone if we were assured that our
> tools would not be built on shifting sands. Building such tools would
> be significantly easier if the infinite possibilities presented in EAD
> were constrained. A series of easily adaptable tools would mean that
> fewer would have to resort to the "tag abuse" to fit the cookbook
> stylesheets. They would have their own "GUI" tools to easily modify
> the display.  I am not convinced that a stricter use of the DTD
> would significantly reduce an individual repository's ability to use
> EAD to represent the vast majority of its requirements.
>
> I personally am excited about the ability to use things like XSL(T)
> combined with other tools to:
>
>        - automatically generate MARC records in MARC communications
>        format for automated insertion into online catalogs
>        - create PDF versions of documents for reading rooms
>        - gain a greater understanding of our users' information needs by
>        providing alternate views of the information as a part of user
>        studies
>        - provide rich targeted cross collection searching for our end
>        users
>        - enhance the tag set to include collection management
>        information to enable implementation of a real single
>        source/multiple use document management system for archival
>        repositories.
>
> XML can be an extremely powerful tool. If all we ever expect to do
> with it is mount finding aids in HTML on the web, we are truly missing
> some marvelous opportunities.
>
> Finally, I would like to add that learning XSL may at first seem
> complex but if you are interested in capitalizing on the potential of XML
> then it is worth learning. In fact, I would argue that it can help all
> archivists to truly understand the distinctions between content and format
> about which I have been ranting. That can only help us to develop a
> common understanding of the potentials and limitations of EAD in this
> arena.
>
> Liz Shaw
> Lecturer
> School of Information Sciences
> University of Pittsburgh
>
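
A minimal sketch of the single-source, multiple-view idea described in the
quoted message, written in Python with only the standard library rather than
XSLT. The sample fragment is a hypothetical, heavily simplified stand-in for
an EAD instance (it is not a valid one), and the function names (describe,
extract, to_html, to_text) are made up for illustration. It assumes every
repository puts titles in unittitle and dates in unitdate, which is exactly
the kind of consistency the message argues for.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, heavily simplified EAD-like fragment (not a valid EAD instance).
SAMPLE = """\
<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Jane Doe Papers</unittitle>
      <unitdate>1901-1950</unitdate>
    </did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle><unitdate>1901-1920</unitdate></did>
      </c01>
      <c01 level="series">
        <did><unittitle>Diaries</unittitle><unitdate>1921-1950</unitdate></did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def describe(did):
    """Return (title, date) from a <did> element; missing parts become ''."""
    return (
        did.findtext("unittitle", default="").strip(),
        did.findtext("unitdate", default="").strip(),
    )


def extract(xml_text):
    """Pull collection-level and series-level descriptions out of the markup."""
    archdesc = ET.fromstring(xml_text).find("archdesc")
    collection = describe(archdesc.find("did"))
    series = [describe(c.find("did")) for c in archdesc.findall("dsc/c01")]
    return collection, series


def to_html(collection, series):
    """Web view: title as a heading, series as a bulleted list."""
    title, date = collection
    items = "".join(f"<li>{escape(t)} ({escape(d)})</li>" for t, d in series)
    return f"<h1>{escape(title)}, {escape(date)}</h1><ul>{items}</ul>"


def to_text(collection, series):
    """Reading-room view: same content, reordered so dates come first."""
    title, date = collection
    lines = [f"{title.upper()} [{date}]"]
    lines += [f"  {d}: {t}" for t, d in series]
    return "\n".join(lines)


if __name__ == "__main__":
    collection, series = extract(SAMPLE)
    print(to_html(collection, series))
    print(to_text(collection, series))
```

Both views are produced from the same extracted content; only the two
rendering functions differ. If a repository had encoded its series titles in
some other element to get a particular look, extract() would silently return
empty strings, which is the practical cost of "tag abuse".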