Thanks for the detailed breakdown. Your response is far clearer than
my original question :) Thanks also to others who sent suggestions and
1&2. Yes, I found it in the isolat2 character set which of course is
referenced in the ead.dtd, and yes we are using the utf-8 encoding as
declared in our EAD files, so we're all good there.
3. This is the nub of the problem for us -- our indexing software
apparently does not fully support utf-8. It appears to be happier with
characters from the isonum and isolat1 character sets, which cover the
vast majority of diacritics that we might need. So we're pondering the
4. This was a "duh!" moment for me. Yep, it can be done manually via
the View Plain Text option and entering (for example) &#x014D;. Switching
to View Tags then automatically renders it as the o with overbar.
Embarrassing that I didn't think of this; ten years ago I was doing *all*
my SGML editing in plain text and wouldn't have thought twice about it --
didn't realize how accustomed I had gotten to using the authoring tools,
a lesson for me indeed...
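
For anyone who wants to check a hand-typed reference outside the editor, here is a minimal sketch in Python using only the standard library. The element name and sample text are made up for illustration and are not taken from our finding aids; the point is only that a conforming XML parser resolves the numeric character reference to the single character U+014D.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; <persname> and the sample word are illustrative only.
fragment = "<persname>Sh&#x014D;gun</persname>"

# The parser resolves the numeric character reference while reading the text.
elem = ET.fromstring(fragment)
text = elem.text

# The third character is now U+014D, LATIN SMALL LETTER O WITH MACRON.
print(text)
print(hex(ord(text[2])))
```

The same check should work for any code point, since numeric references are defined by XML itself rather than by the entity sets declared in the DTD.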
Thanks again -
Michele R. Combs
Librarian for Manuscripts and Archives Processing
Special Collections Research Center
Syracuse University Library
222 Waverly Avenue
Syracuse, NY 13244
>>> [log in to unmask] 7/17/2007 11:37 AM >>>
Perhaps there are four issues here.
1. Is it permitted in EAD? EAD has nothing to say about character
2. Is this character valid in XML? Depends on the character encoding
3. What are the indexing implications? An issue that relates to every
diacritic character but is an implementation question.
4. Inserting a character reference in your editing software. I assume
that there is a code point in some character set that defines this
character. Seems like you need to associate the two: declaring the
character set in the XML declaration and inserting the character entity
reference in the EAD instance within the standard XML delimiters, the
ampersand and a semi-colon. With character references in XML, one is
certainly not limited to the set of characters that any particular
editor chooses to display in its character map. Editors seem to vary in
their ability to render different entities properly within the software.
Sometimes you see the character as intended for display, sometimes the
entity reference with the code point.
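
As a rough illustration of point 4, the sketch below (Python, standard library only) turns characters outside ASCII into numeric character references, so the file itself stays plain ASCII no matter what an editor's character map offers. The sample string is invented, and whether a given indexing package resolves such references is an assumption that would need testing against that package.

```python
# Hypothetical sample text containing U+014D (o with macron).
original = "Sh\u014dgun"

# The "xmlcharrefreplace" error handler replaces every character that the
# target encoding cannot represent with a decimal reference of the form &#NNN;
ascii_bytes = original.encode("ascii", "xmlcharrefreplace")
ascii_text = ascii_bytes.decode("ascii")

print(ascii_text)
```

Python writes the reference in decimal (333) rather than hexadecimal (x014D); both forms point to the same code point, so an XML parser treats them identically.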