On Mon, Jun 28, 2004 at 04:52:50PM +0200, Adam Dickmeiss wrote:
> All hex codes (when given as &#..) must be part of UNICODE charset. And
> ESC and some others in range 0x01-0x1f aren't valid.

I think they are. From the Unicode 3.0 spec, section 2.8:
> Control characters
> The Unicode Standard provides 65 code values for the representation of
> control characters. These ranges are U+0000..U+001F and U+007F..U+009F
> [...]
>
> Escape sequences
> In converting text containing escape sequences to the Unicode
> character encoding, text must be converted to the equivalent Unicode
> characters. Converting escape sequences into Unicode characters on a
> character-by-character basis (for instance ESC-A turns into U+001B
> ESCAPE, U+0041 LATIN CAPITAL LETTER A) allows the reverse conversion
> to be performed without forcing the conversion program to recognize the
> escape sequence as such.
>
> Control Code Sequences Encoding Additional Information about Text
> If a system uses sequences beginning with control codes to embed
> additional information about text (such as formatting attributes or
> structure), then such sequences form a higher-level protocol outside
> the scope of the Unicode Standard.
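
As a quick check of the two points quoted above, here is a minimal
Python sketch. It assumes the standard unicodedata module, whose tables
come from a newer Unicode version than 3.0. The helper name
to_code_points is made up for illustration. It counts the code points
with general category Cc and converts ESC-A character by character.

```python
import unicodedata

# Collect every code point whose general category is Cc (control).
controls = [cp for cp in range(0x110000)
            if unicodedata.category(chr(cp)) == "Cc"]

def to_code_points(text):
    """Hypothetical helper: format each character as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text]

# ESC-A as two bytes, converted character by character.
esc_a = b"\x1bA".decode("ascii")
code_points = to_code_points(esc_a)

# The reverse conversion needs no knowledge of the escape sequence.
round_trip = esc_a.encode("ascii")

print(len(controls), code_points, round_trip)
```

The count covers U+0000..U+001F and U+007F..U+009F, so ESC (U+001B) is
an ordinary member of the character set as far as Unicode is concerned.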



--
Heikki Levanto    heikki at indexdata dot dk   "In Murphy We Turst"