Given the recent reports of problems using various SGML tools, I thought it
might be useful to post some general guidelines for configuring SGML tools
and generally resolving common problems:
1. Unlike with HTML, which is a (more or less) fixed set of elements,
SGML tools cannot know or divine what the presentation or processing
needs to be for arbitrary document types. For this reason, any SGML
tool that does more than simply parse a document and spit out the
results must be configured anew for each new document type it is to
work with.
This configuration usually takes the form of style sheets or
"output specifications". Today, every tool has its own way of defining
styles (even tools that use the DoD FOSI style mechanism do slightly
different things, although they can be made to work with each other's
FOSIs). This means that every editor and browser will require its own
unique style sheet and that, today, there are few, if any, facilities
for converting one tool's style sheet into another tool's style sheet.
Even if a style sheet is provided to you for a particular document type
and tool, you will often need to know how that particular tool
associates styles with documents. Every tool has some mechanism--the
challenge is figuring out what it is and how to use it. (For example,
for Panorama it's pretty clear, but for FrameMaker+SGML it's not at
all obvious.)
This situation will improve some when and if tools start to support the
recently-published DSSSL standard, ISO/IEC 10179:1996 (see www.jclark.com).
Somebody recently posted to comp.text.sgml an announcement of a Perl
script to convert Panorama style sheets to DSSSL specs--this is a step
in the right direction.
2. SGML's entity model for managing files means that you may have to do
a bit more work to make references to included files work. The
indirection provided by entities makes SGML more robust generally but
can make it more difficult to work with documents casually. (But note that
other tools have the same problem; they just tend to hide it better. For
example, if you have a Word document that refers to graphics or
subdocuments by reference, Word stores the whole filename in the document.
If you move the files to other places, Word won't be able to find them.)
SGML entities can be declared with either system identifiers ("filenames")
or public identifiers. If you use public identifiers, these must be mapped
to filenames by your processing application (editor, browser, etc.).
Until recently, each SGML tool defined its own way to do this mapping
(and every tool has one). If you are having problems pulling in entities,
including the ".DTD" files and the ISO character set entities (e.g.,
isolat1.ent/.sgm), first check the entity declarations (either in the
document prolog or in the .DTD file) to make sure that 1) if they use
system identifiers that they are correct for your system and 2) if they
use public identifiers that you have set up the mapping correctly.
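As a rough aid for the check just described, here is a minimal Python sketch that lists the external entity declarations in a prolog or .DTD file, so you can see which ones rely on public identifiers and which on system identifiers. The function names are hypothetical, and the regular expression is a simplification (it assumes quoted literals and does not skip declarations that sit inside comments or marked sections), so treat it as a quick diagnostic, not a parser.

```python
import os
import re

# Simplified pattern for external entity declarations such as:
#   <!ENTITY % ISOlat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN">
#   <!ENTITY chap1 SYSTEM "chap1.sgm">
# Assumption: every literal is quoted and contains no quote of its own kind.
ENTITY_DECL = re.compile(
    r'<!ENTITY\s+(?P<param>%\s+)?(?P<name>[^\s>]+)\s+'
    r'(?P<kind>PUBLIC|SYSTEM)'
    r'(?:\s+(?P<lit1>"[^"]*"|\'[^\']*\'))?'
    r'(?:\s+(?P<lit2>"[^"]*"|\'[^\']*\'))?',
    re.IGNORECASE,
)


def _strip_quotes(literal):
    """Drop the surrounding quotes from a matched literal (None stays None)."""
    return literal[1:-1] if literal else None


def external_entities(text):
    """Return (name, public_id, system_id) for each external entity found.

    Parameter entity names are reported with a leading '%'.
    Identifiers that are not present are reported as None.
    """
    found = []
    for m in ENTITY_DECL.finditer(text):
        name = ("%" if m.group("param") else "") + m.group("name")
        lit1 = _strip_quotes(m.group("lit1"))
        lit2 = _strip_quotes(m.group("lit2"))
        if m.group("kind").upper() == "PUBLIC":
            found.append((name, lit1, lit2))   # public id, optional system id
        else:
            found.append((name, None, lit1))   # SYSTEM: system id only
    return found


def missing_system_files(entities, base_dir="."):
    """List system identifiers that do not exist as files under base_dir."""
    return [sysid for _, _, sysid in entities
            if sysid and not os.path.exists(os.path.join(base_dir, sysid))]
```

Any entity reported with a public identifier and no system identifier depends entirely on your tool's mapping file; any system identifier returned by missing_system_files() needs to be corrected for your system.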
Unfortunately, few SGML tools today provide convenient user interfaces for
managing the mapping of public identifiers. You usually have to modify
a configuration file. For example, for Author/Editor this is the extid.map
file, normally in the /ae directory. For DynaText, this is the map.txt
file.
The SGML Open consortium has defined a standard for entity mapping files,
the SGML Open Entity Catalog specification. Many tools now support
SGML Open catalogs. The SGML Open catalog file is usually called
"catalog" (no extension). It is a text file that you can edit.
Most problems people have with configuring SGML tools, after style sheets,
involve entity resolution.
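To make the catalog idea concrete, here is a small Python sketch that reads the PUBLIC entries out of an SGML Open style catalog and builds the public-identifier-to-filename map a tool would use. It is a simplification under stated assumptions: it handles only PUBLIC entries, treats anything between a pair of "--" as a comment, and keeps the first entry when a public identifier appears twice. The function name and the sample identifiers are hypothetical.

```python
import re

# Tokens: a "-- ... --" comment, a quoted literal, or a bare word.
TOKEN = re.compile(r'--.*?--|"[^"]*"|\'[^\']*\'|\S+', re.DOTALL)


def _unquote(token):
    """Remove matching surrounding quotes, if any."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]
    return token


def parse_public_entries(catalog_text):
    """Map each public identifier to its system identifier (filename).

    Whitespace inside public identifiers is collapsed to single spaces
    before they are used as keys.
    """
    tokens = [t for t in TOKEN.findall(catalog_text)
              if not t.startswith("--")]          # drop comments
    mapping = {}
    i = 0
    while i < len(tokens):
        if tokens[i].upper() == "PUBLIC" and i + 2 < len(tokens):
            public_id = " ".join(_unquote(tokens[i + 1]).split())
            # Assumption: the first entry for an identifier wins.
            mapping.setdefault(public_id, _unquote(tokens[i + 2]))
            i += 3
        else:
            i += 1
    return mapping


sample_catalog = '''
-- ISO entity sets --
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN" "iso/isolat1.ent"
PUBLIC "-//Example//DTD Memo//EN" memo.dtd
'''

for public_id, filename in parse_public_entries(sample_catalog).items():
    print(public_id, "->", filename)
```

If a public identifier used in your document does not show up as a key in this map, the catalog (or its equivalent for your tool) is the first place to look.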
Also, if your problem is primarily with pulling in ISO entity sets,
one easy short-term solution is just to comment out the references
to the entity sets--it usually won't prevent the document from processing,
although some entities might not be resolved. This can at least get
a document up and going, letting you figure out how to pull in the entities
later.
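If you want to try that short-term workaround without hand-editing, the following Python sketch wraps the parameter entity references in SGML comments. It assumes the references appear at the top level of the prolog or .DTD file (not already inside a comment or inside another declaration), and the function name is hypothetical. Work on a copy of the file.

```python
import re


def comment_out_entity_sets(text, names):
    """Turn references such as %ISOlat1; into <!-- %ISOlat1; -->.

    Only the references are touched; the <!ENTITY ...> declarations
    themselves are left alone. References already wrapped by this
    function are skipped, so running it twice changes nothing more.
    """
    for name in names:
        # Parameter entity names are matched exactly as given.
        pattern = re.compile(r'(?<!<!-- )%' + re.escape(name) + r';')
        text = pattern.sub('<!-- %' + name + '; -->', text)
    return text


prolog = (
    '<!ENTITY % ISOlat1 PUBLIC '
    '"ISO 8879:1986//ENTITIES Added Latin 1//EN">\n'
    '%ISOlat1;\n'
)
print(comment_out_entity_sets(prolog, ["ISOlat1"]))
```

Remember that any entity from the disabled set that the document actually uses will then be undefined, as noted above.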
3. SGML provides a number of "markup minimization" features that make
it easier to create SGML documents by hand. However, these features
are optional and not all tools support them. For example, Panorama
does not support most forms of markup minimization.
For this reason, you need to be sure that you do not try to process
minimized documents with tools that don't support minimization. When you do, you
may get unexpected or confusing error messages (for example, Panorama
will usually die and report "element nesting too deep").
You can take any SGML document and "normalize" it using James Clark's
SPAM or SGMLNORM tools, both provided with the SP package (again,
see www.jclark.com). SGMLNORM takes as input an SGML document that
uses markup minimization and produces as output the same document with
all the omitted markup added in.
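If you need to normalize more than a document or two, the step can be scripted. This Python sketch simply runs sgmlnorm on a document and captures its standard output in a new file. It assumes the executable is named "sgmlnorm" and is on your PATH (some installations may name it differently), that it can already find your DTD and entities, and that no extra options are needed; the function names are hypothetical.

```python
import shutil
import subprocess


def sgmlnorm_command(document, executable="sgmlnorm"):
    """Build the command line: the executable followed by the document."""
    return [executable, document]


def normalize(document, output, executable="sgmlnorm"):
    """Write the normalized form of `document` to `output`.

    Raises FileNotFoundError if the executable is not on PATH, and
    subprocess.CalledProcessError if it exits with a nonzero status.
    """
    if shutil.which(executable) is None:
        raise FileNotFoundError(executable + " was not found on PATH")
    with open(output, "wb") as out:
        # The normalized document arrives on standard output.
        subprocess.run(sgmlnorm_command(document, executable),
                       stdout=out, check=True)


# Example (hypothetical file names):
#   normalize("memo.sgm", "memo-normalized.sgm")
```

Entity resolution problems of the kind described in item 2 will show up here too, since the document has to parse before it can be normalized.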
You will also get the same effect by importing documents into editors
that always produce normalized documents, such as Author/Editor
and ADEPT*Editor (and probably WordPerfect 7 as well) and then
saving them as SGML.
As I said in an earlier post, when you run into a tool that seems to be
making things harder than they should be (because it doesn't make it easy
to manage entity mappings, for example), complain to the vendor and offer
suggestions for how they could make your life easier.
Cheers,
Eliot
--
W. Eliot Kimber
Senior SGML Consultant and HyTime Specialist
Passage Systems, Inc., (512)339-1400
10596 N. Tantau Ave., Cupertino, CA 95014-3535 (408) 366-0300,
(408) 366-0320 (fax)
2608 Pinewood Terrace, Austin, TX 78757 (512) 339-1400 (fone/fax)
http://www.passage.com (work) http://www.drmacro.com (home)
"If I never had existed, would you still remember me?..."
--Austin Lounge Lizards, "1984 Blues"