Good afternoon,
I am puzzled about the question of file size with stylesheets as I have not been able to reproduce the results cited below.
In finding aids with extensive lists of components, the HTML produced by the Cookbook stylesheets is actually as compact as the source EAD, or more so.
Consider this simple example from a finding aid and stylesheet used in the workshops Kris Kiesling and I teach.
This EAD markup
<c02>
<did>
<container type="Box">1</container>
<container type="Folder">3</container>
<unittitle>Gusdorf, Ida,</unittitle>
<unitdate>1942-1955.</unitdate>
</did>
</c02>
Produces this HTML
<tr>
<td>1</td>
<td>3</td>
<td colspan="8">Gusdorf, Ida, 1942-1955.</td>
</tr>
The EAD source is roughly 160 characters; the HTML is roughly 70 characters. In my experience this is probably typical of many finding aids. EAD can be verbose.
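For anyone who wants to check the comparison, here is a minimal Python sketch that counts the characters in the two snippets above, ignoring line breaks and indentation. The variable and function names are my own, and the exact totals depend on how whitespace is counted, so they may differ by a few characters from the figures I quoted.

```python
# Count characters in the two snippets, ignoring line breaks and indentation.
ead = """
<c02>
<did>
<container type="Box">1</container>
<container type="Folder">3</container>
<unittitle>Gusdorf, Ida,</unittitle>
<unitdate>1942-1955.</unitdate>
</did>
</c02>
"""

html = """
<tr>
<td>1</td>
<td>3</td>
<td colspan="8">Gusdorf, Ida, 1942-1955.</td>
</tr>
"""


def compact_length(markup: str) -> int:
    """Length of the markup with each line stripped and line breaks removed."""
    return sum(len(line.strip()) for line in markup.splitlines())


ead_len = compact_length(ead)
html_len = compact_length(html)
print(f"EAD:  {ead_len} characters")
print(f"HTML: {html_len} characters")
print(f"HTML is {html_len / ead_len:.0%} of the EAD size")
```

Counted this way, the HTML row comes out at less than half the size of the EAD component.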
When transformed, the resulting HTML from the full finding aid is actually marginally smaller than the source EAD document. It's not the use of tables that's driving the file size.
Other things may be contributing to the file size, such as extraneous whitespace (tabs, line feeds, etc.).
The other issue is that the output from the MSXML transformation engine appears to insert a fill character, 00, between each character in the HTML, which doubles the file size in the Windows FAT. But even that only slightly more than doubles the file size. It certainly is not a factor of 4.
I wonder if those extra characters could be stripped out? A Perl script comes to mind.
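Here is a sketch of the same idea in Python rather than Perl. A 00 byte after every character is what plain ASCII text looks like when it is stored as UTF-16, so this assumes the transformed output is UTF-16 and re-encodes it as UTF-8 instead of deleting bytes blindly. I have not confirmed that this is what is happening in the files in question, and the function name is hypothetical.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 bytes as UTF-8 (assumes the input really is UTF-16)."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the byte-order mark and drops it.
        text = data.decode("utf-16")
    else:
        # Assumption: no byte-order mark means little-endian, as is usual on Windows.
        text = data.decode("utf-16-le")
    return text.encode("utf-8")


# A one-row sample standing in for the transformed HTML file.
sample = "<td>Gusdorf, Ida, 1942-1955.</td>".encode("utf-16")
converted = utf16_to_utf8(sample)
print(len(sample), "bytes before,", len(converted), "bytes after")
```

To convert a real file, read it in binary mode, pass the bytes through the function, and write the result in binary mode. If the HTML declares its encoding in a meta tag, that declaration would need to be changed to match.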
There certainly are options to the use of tables if one wants to move away from columnar display of container numbers, locations and component descriptions.
Max Evans experimented with the use of relative font sizes rather than indentation to show hierarchy. For me, this works so long as one has only 2 or 3 levels of arrangement to convey.
Another approach is to move away from the line-by-line descriptions so common to finding aids, where each horizontal line in the display represents a component. Rather than using a columnar display, an indented "boxy" presentation like this visually presents each component, at least to my eye, as a more discrete unit. This display for a c02
Gusdorf, Ida, 1942-1955.
Correspondence with family members, primarily her mother.
Location: Box 1, Folder 3
could be handled with a CSS indent or margin property, thereby avoiding tables altogether.
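As a rough sketch of that idea, the Python below builds the HTML for one component as nested div elements and indents it with a CSS margin-left property. The class names, the 2em step, and the function name are my own choices for illustration, not anything taken from the Cookbook stylesheets.

```python
from html import escape

# Assumed CSS: a left margin does the indenting that padding cells did in the table.
CSS = """
.component { margin-left: 2em; margin-bottom: 0.75em; }
.component .scope, .component .location { margin-left: 2em; }
"""


def render_component(title: str, scope: str, box: str, folder: str) -> str:
    """Return a table-free HTML block for one component."""
    return (
        '<div class="component">\n'
        f'  <div class="title">{escape(title)}</div>\n'
        f'  <div class="scope">{escape(scope)}</div>\n'
        f'  <div class="location">Location: Box {escape(box)}, Folder {escape(folder)}</div>\n'
        '</div>'
    )


block = render_component(
    "Gusdorf, Ida, 1942-1955.",
    "Correspondence with family members, primarily her mother.",
    "1",
    "3",
)
print(f"<style>{CSS}</style>")
print(block)
```

If a child component's div is placed inside its parent's div, the margins accumulate, so each level of arrangement is indented further without any empty cells.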
Michael
________________________________
From: Encoded Archival Description List [[log in to unmask]] On Behalf Of Joyce Chapman [[log in to unmask]]
Sent: Sunday, September 06, 2009 12:18 PM
To: [log in to unmask]
Subject: Re: Revisiting <dsc> output in table format
Ethan, from doing a search here http://ead.lib.virginia.edu/vivaead/, it looks like you (actually all of Virginia Heritage) are not using tables at all and are using <ul> instead. Are you all using <ul> specifically for the reason you mentioned in your post, or did other factors contribute to that decision as well?
Thank you,
Joyce
On Sun, Sep 6, 2009 at 12:20 PM, Ethan Gruber <[log in to unmask]<mailto:[log in to unmask]>> wrote:
The largest flaw in the older nested table approach that we see in the EAD Cookbook, I think, is the fact that the HTML version of the finding aid is inflated to 400-500% of the file size of the source XML file. This is especially evident in larger, complex finding aids that have six or more component levels. A manageable 2MB XML file is transformed into a completely unusable 8MB HTML file.
Ethan Gruber
University of Virginia Library
On Sun, Sep 6, 2009 at 12:01 PM, Joyce Chapman <[log in to unmask]<mailto:[log in to unmask]>> wrote:
Probably many of us have been told by Web folk that a <dsc> output using empty table cells as a mechanism to control display (forcing indentation) is bad. Not only does it fail to use tables the way they are meant to be used (only as a mechanism to code tabular data, not for design/layout) but I've been told that all these empty padding cells are unfriendly to screen readers for the blind. In fact, I've been told that if you are funded with state money, you are required to be accessible to screen readers and definitely shouldn't be displaying the <dsc> info this way. I deal with this by outputting the tabular data of the <dsc> in a table with only two columns (and since it IS tabular data, 1-2 container columns and 1 content column are supposedly ok), and controlling indentation of embedded components in the content column with CSS. Which is easier for my brain to deal with than the many-columned approach anyway! So my questions:
1. How many of you are outputting the <dsc> without using a <table> at all?
2. How many of you are using a 2- or 3-column table with CSS to control embedded components?
3. How many of you have been told by your Web peeps that the many-columned approach should not be used?
Joyce
--
Joyce Chapman
NCSU Libraries Fellow
Metadata and Cataloging/
Digital Library Initiatives
[log in to unmask]<mailto:[log in to unmask]>