Hello mark,
On Tuesday June 09 2020 20:25, you wrote to me:
MvdV>> Instead of an 'H' we see an 'i' with accent grave. The infamous
MvdV>> hex 8D in CP437.
Note the use of the word "infamous".
that particular character has problems several ways... in this case, though, it is known as the soft_cr...
I know that <sigh>
This is what I wrote in Fidonews Vol 29, nr 1, in an article titled "A PLEA FOR UTF-8 IN FIDONET Part 2" (Part one was published in Vol 28, nr 52):
=== quote ===
So what else would we need to make FidoNet work with UTF-8? As far as
the transport layer is concerned nothing really. FidoNet is fully 8
bit transparent except for the NULL as the terminating character for
strings. There is no conflict as in UTF-8 the NULL character has the
exact same meaning as in ASCII. Oh wait, there is this tiny little
snag: the archaic soft return. In their infinite wisdom, the founding
fathers decided that the character 0x8D had special meaning; that of
soft return. Probably a remnant from the Wordstar days. In hindsight
totally superfluous and a conflict with many code page schemes that
treat it as a printable character. It also conflicts with UTF-8. 0x8D
is a valid byte in a well formed UTF-8 string. Fortunately most bronze
age software allows configuring 0x8D as a printable character instead
of soft return, so this should no longer be a problem. Be sure however
to configure your tosser to not strip soft returns.
=== unquote ===
so if we strip it, then we break other languages... german is one,
German is hardly affected. The most used encodings in Germany are CP850 and Latin-1. In CP850 it is the 'i' with accent grave, which is not used in German. In Latin-1 it is the control code RI, whatever that means. The Russians are most affected, as in Cyrillic CP866 it is the capital letter pronounced as the 'N' in "Putin", as I already explained.
In UTF-8 it affects more than one language as it can occur in a well formed UTF-8 sequence.
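To make the above concrete, here is a minimal sketch in Python of what the single byte hex 8D turns into under the encodings mentioned. The codec names are Python's own, and the function names and sample letters are just illustrative picks, not taken from any FidoNet software.

```python
# How the single byte 0x8D reads under the code pages discussed above.
SOFT_CR = b"\x8d"

def decode_8d():
    """Return the character that 0x8D maps to in each code page."""
    pages = ("cp437", "cp850", "cp866", "latin-1")
    return {cp: SOFT_CR.decode(cp) for cp in pages}

def utf8_letters_with_8d(letters):
    """Return those letters whose UTF-8 encoding contains the byte 0x8D."""
    return [ch for ch in letters if SOFT_CR in ch.encode("utf-8")]

if __name__ == "__main__":
    for cp, ch in decode_8d().items():
        print(f"{cp:8} U+{ord(ch):04X} {ch!r}")
    # Sample letters (illustrative picks): c with caron, o with macron,
    # Cyrillic small e, and e with acute (the last one has no 0x8D).
    print(utf8_letters_with_8d("\u010d\u014d\u044d\u00e9"))
```

The first three sample letters all carry hex 8D as their second UTF-8 byte, so this is not a problem of one language only.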
IIRC... if we leave it, some of today's readers/editors will show it instead of word wrapping on it and not displaying it at all...
Those readers have long been phased out in this part of the world. Mainly because it is considered a printable character by most. GoldED has the option: DISPSOFTCR yes
Tossers should never strip it. Period. Although FMail still has an option to do so. Possibly for historic reasons.
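A small sketch of why stripping is harmful, again in Python. The function strip_soft_cr is a hypothetical stand-in for what a stripping tosser would do to the message body; it is not the code of FMail or any other real tosser.

```python
SOFT_CR = b"\x8d"

def strip_soft_cr(body: bytes) -> bytes:
    """Hypothetical tosser behaviour: drop every 0x8D byte from the body."""
    return body.replace(SOFT_CR, b"")

def survives_stripping(text: str) -> bool:
    """True if the UTF-8 text is unchanged and still decodable after stripping."""
    stripped = strip_soft_cr(text.encode("utf-8"))
    try:
        return stripped.decode("utf-8") == text
    except UnicodeDecodeError:
        # A lead byte lost its continuation byte: no longer well formed UTF-8.
        return False

if __name__ == "__main__":
    print(survives_stripping("Hello"))              # plain ASCII has no 0x8D
    print(survives_stripping("\u044d\u0442\u043e")) # Russian word with Cyrillic small e
    # In CP866 the capital letter simply vanishes from the word:
    print(strip_soft_cr("\u041d\u0435\u0442".encode("cp866")).decode("cp866"))
```

In UTF-8 the stripped message is no longer well formed, and in CP866 a letter silently disappears. Either way the text is damaged.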
this BBS editor has it as an option... i'll flip it after writing this reply and we'll see how my future messages look...
Your editor did not strià it or I would not have seen the 'i' with ccent gr ve in your mess ge.
But you re ignoring the àoint. Which w s that your softw re wrongly m rks your reply to my message as CP437. It should haven been CP866.
I h ve hidden, some more e ster eggs, this time not involving hex 8D.
Enjoy.
Cheers, Michiel
--- GoldED+/W32-MSVC 1.1.5-b20170303
 * Origin: http://www.vlist.eu (2:280/5555)