=================================================================
                        GENERAL ARTICLES
=================================================================
Training Golded to play UTF-8 Part 2
By Michiel van der Vlist, 2:280/5555
Golded has limited support for reading and writing messages in UTF-8.
There are two ways. One is using an external editor. The other way
is to use translation tables. Last week was about the first method.
This week is about using translation tables.
Using translation tables.
=========================
Golded uses 8-bit translation tables to convert one character set
into another. Oddly enough, Golded ignores the current code page
setting and always uses the code page in effect when the system was
booted. At least that is how it works in Windows. Linux or OS/2
may be different. So if the system boots up in CP850, all you will
ever see on the Golded screen are the characters in the CP850 set.
The character encoding scheme in the messages can be different, but
the glyphs are limited to those in the CP850 set.
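
To get a feel for what "limited to the CP850 set" means, here is a
minimal sketch in Python. It has nothing to do with Golded itself;
the function name split_by_codepage is made up for this example, and
it relies only on Python's built-in "cp850" codec. It splits a text
into the characters a given code page can represent and those it
cannot.

```python
def split_by_codepage(text, codepage="cp850"):
    """Return (shown, lost): the characters of text that the code page
    can represent, and the ones it cannot."""
    shown, lost = [], []
    for ch in text:
        try:
            ch.encode(codepage)      # raises if the code page lacks ch
            shown.append(ch)
        except UnicodeEncodeError:
            lost.append(ch)
    return "".join(shown), "".join(lost)


if __name__ == "__main__":
    shown, lost = split_by_codepage("é€ü")
    print("representable:", shown)
    print("not in CP850: ", lost)
```

The euro sign is a handy test case: it is not part of CP850, so no
translation table can make it appear on a screen that booted in
CP850.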
Add this to your golded.cfg:
XLATCHARSET CP850 UTF-8 850_UTF8.CHS
XLATCHARSET UTF-8 CP850 UTF8_850.CHS
GROUP UTF8
xlatimport utf-8
member utf-8
xlatexport utf-8
origin UTF-8 enthousiast
ENDGROUP
If you do not already have 850_UTF8.CHS and UTF8_850.CHS, you can get
them from my system by file request or from
http://www.vlist.eu/downloads/
Oddly enough, the Golded translation mechanism allows for translating
one byte into one or more bytes but not the other way around. As a
result, all characters in CP850 can be correctly translated into
UTF-8, but the translation from UTF-8 into CP850 is very, very limited.
Actually it only works for characters in the range U+00C0 - U+00FF.
This covers the accents and umlauts but not much more.
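
The asymmetry can be illustrated with a small Python sketch. This is
not how Golded does it internally, and it does not read or write .CHS
files; the function names and the "?" fallback character are
assumptions made for the example. The first function shows the easy
direction: one CP850 byte becomes two or three UTF-8 bytes. The second
mimics the limited direction by passing through only ASCII and the
code points U+00C0 - U+00FF.

```python
def cp850_to_utf8(byte):
    """Translate one CP850 byte into its UTF-8 byte sequence."""
    return bytes([byte]).decode("cp850").encode("utf-8")


def utf8_to_cp850(data, lo=0x00C0, hi=0x00FF, fallback="?"):
    """Translate UTF-8 bytes to CP850, keeping only ASCII and the
    code points lo..hi. Everything else becomes the fallback
    character (an assumption, not Golded's documented behaviour)."""
    out = []
    for ch in data.decode("utf-8"):
        if ord(ch) < 0x80 or lo <= ord(ch) <= hi:
            out.append(ch)
        else:
            out.append(fallback)
    return "".join(out).encode("cp850")


if __name__ == "__main__":
    print(cp850_to_utf8(0x82).hex())   # c3a9   (e with acute accent)
    print(cp850_to_utf8(0xC4).hex())   # e29480 (box drawing line)
    print(utf8_to_cp850("café €5".encode("utf-8")))
```

Note that every code point in U+00C0 - U+00FF is encoded in UTF-8 as
the byte 0xC3 followed by one more byte, and that CP850 happens to
contain all 64 of these characters.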
If your system's native character set is not CP850 you will need
translation tables to and from that character set.
The pros and cons of both methods.
==================================
External editor                     Translation tables

Pros

Support for any Unicode character   Easy installation.
that the external editor and the    Editing and formatting text
underlying OS support.              works as one is used to.

Cons

External editors normally only      The set of usable characters is
support entering text "lines"       limited to those in the codepage
separated by a line separator.      installed at startup of the
Fidonet text consists of            system (CP850 for Western
"paragraphs" separated by a CR.     Europe). Incoming non-ASCII is
The two methods do not mix well.    limited to U+00C0 - U+00FF.
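
The mismatch between editor "lines" and Fidonet "paragraphs" can be
sketched in Python as follows. This is only an illustration under
stated assumptions: the function name is made up, a blank line in the
editor output is taken to mean "end of paragraph", and the lines
within a paragraph are joined with a single space. A real conversion
would also have to deal with quoted text, kludge lines and
intentionally short lines, which is where things stop mixing well.

```python
def lines_to_paragraphs(text):
    """Join editor lines into paragraphs, each terminated by a CR.

    Assumption: one or more blank lines separate paragraphs."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "".join(p + "\r" for p in paragraphs)


if __name__ == "__main__":
    editor_text = "first line\nsecond line\n\nnew paragraph\n"
    print(repr(lines_to_paragraphs(editor_text)))
```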
-----------------------------------------------------------------
--- Azure/NewsPrep 3.0
* Origin: Home of the Fidonews (2:2/2.0)