• UTF-8 question

    From August Abolins@2:221/1.58 to Martin Foster on Sun Dec 29 17:17:00 2019
    Hello Martin!

    I think we had a little chat about UTF-8 before (can't remember), but why
    is it that OXP doesn't display utf-8 chars all that well?

    A recent post in UTF-8 echo has a collection of the phrase "Happy New
    Year" from world languages. I wouldn't expect Arabic, Cantonese and some
    of the other "graphic" chars to render in OXP. But I would have thought
    that at least French, German, and even Spanish could be rendered in the
    DOS world of OXP just fine, but they are not.

    This is what I see when reading the post in UTF-8 echo:

    http://pics.rsh.ru/img/oxputf8_6d4jal7j.jpg
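
    One plausible cause of this kind of garbling (an assumption, not a
    confirmed description of how OXP works) is that the UTF-8 bytes are shown
    as if they were code page 437 bytes. The sketch below reproduces that
    effect and also checks whether a phrase could be stored in CP437 at all.
    The function names are hypothetical.

```python
# Assumption: the reader displays raw UTF-8 bytes as code page 437.
# This is an illustration only, not OpenXP's actual code path.

def show_as_cp437(text: str) -> str:
    """Encode text as UTF-8, then misread those bytes as CP437."""
    return text.encode("utf-8").decode("cp437")


def fits_cp437(text: str) -> bool:
    """Return True if every character has a slot in code page 437."""
    try:
        text.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for phrase in ["Bonne année", "Glückliches neues Jahr", "Feliz año nuevo"]:
        print(phrase, "->", show_as_cp437(phrase),
              "| fits CP437:", fits_cp437(phrase))
```

    Under this assumption, each accented letter turns into two CP437
    characters, even though the letter itself has a slot in CP437. A reader
    that converted UTF-8 to CP437 before display could show these phrases
    correctly.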




    ../|ug

    --- OpenXP 5.0.40
    * Origin: o,,,,o§ø`ø§o,,,,o§ø`ø§o,,,,o (2:221/1.58)
  • From Martin Foster@2:310/31.3 to August Abolins on Tue Dec 31 09:18:00 2019
    Hello August!

    * 29.12.19 at 17:17, August Abolins wrote to Martin Foster:

    I think we had a little chat about UTF-8 before (can't remember), but why is it that OXP doesn't display utf-8 chars all that well?

    Yes we did and the thread was started by your good self on 4th April
    2019 entitled "oxp: UTF-8". Some very interesting stuff came out of
    that thread and in particular, the developers comments which I posted
    in a message dated 5th April 2019.

    Regards,
    Martin

    --- OpenXP 5.0.42
    * Origin: Bitz-Box - Bradford - UK (2:310/31.3)
  • From August Abolins@2:221/1.58 to Martin Foster on Tue Dec 31 18:08:00 2019
    Hello Martin!

    ** 31.12.19 - 09:18, Martin Foster wrote to August Abolins:

    I think we had a little chat about UTF-8 before (can't remember),
    but why is it that OXP doesn't display utf-8 chars all that well?

    Yes we did and the thread was started by your good self on 4th April
    2019 entitled "oxp: UTF-8". Some very interesting stuff came out of
    that thread and in particular, the developers comments which I posted
    in a message dated 5th April 2019.

    Thank you for the reminder. I reviewed those messages.

    Being a DOS program, oxp's UTF-8 results are highly dependent on the font
    set in use and the limits of that font's support. I thought Lucida
    Console font would solve all the problems, but it does not.

    In other news, I am still baffled as to when oxp decides to use IBM437,
    when it uses UTF-8, and when it just uses US-ASCII for outgoing messages.

    Are key chars triggering the selection?
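
    A content-based selection of the kind asked about here could look like
    the sketch below. It is a guess at the general technique (try the
    narrowest charset first and fall back to wider ones), not OpenXP's
    documented behaviour. The function name is hypothetical.

```python
# Hypothetical sketch: pick the narrowest charset that can hold the body.
# Not taken from OpenXP; the order of candidates is an assumption.

def pick_outgoing_charset(body: str) -> str:
    """Return 'US-ASCII', 'IBM437' or 'UTF-8' depending on the characters used."""
    for name, codec in (("US-ASCII", "ascii"), ("IBM437", "cp437")):
        try:
            body.encode(codec)
            return name
        except UnicodeEncodeError:
            continue  # some character does not fit, try the next charset
    return "UTF-8"


if __name__ == "__main__":
    for sample in ["Happy New Year", "Feliz año nuevo", "С Новым годом"]:
        print(sample, "->", pick_outgoing_charset(sample))
```

    With a rule like this, a single accented letter would be enough to move
    a message from US-ASCII to IBM437, and a single character outside CP437
    would move it to UTF-8.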


    ../|ug

    --- OpenXP 5.0.42
    * Origin: o,,,,o§ø`ø§o,,,,o§ø`ø§o,,,,o (2:221/1.58)
  • From Mark Lewis@1:3634/12 to August Abolins on Wed Jan 1 08:40:04 2020
    Re: UTF-8 question
    By: August Abolins to Martin Foster on Tue Dec 31 2019 18:03:00


    Being a DOS program, oxp's UTF-8 results are highly dependent on the font set in use and the limits of that font's support. I thought Lucida
    Console font would solve all the problems, but it does not.

    remember that a lot of fonts still have only 256 slots in them... some few have 65535 slots and can hold a lot more character glyphs than the 256 slotted ones... the full UTF-8 character set contains 2,097,152 characters and that may not even be everything...

    this question and comments on stackoverflow may help...

    https://stackoverflow.com/questions/10229156/how-many-characters-can-utf-8-encode

    then there's also this wikipedia page...

    https://en.wikipedia.org/wiki/UTF-8
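
    For reference, the 2,097,152 figure corresponds to the 21 payload bits
    of a 4-byte UTF-8 sequence. UTF-8 as standardised today (RFC 3629) stops
    at U+10FFFF, so the usable range is smaller. A short sketch of the
    arithmetic, with variable names chosen only for illustration:

```python
import sys

# A 4-byte UTF-8 sequence looks like 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx,
# which leaves 3 + 6 + 6 + 6 payload bits.
payload_bits_4byte = 3 + 6 + 6 + 6
four_byte_patterns = 2 ** payload_bits_4byte

# Unicode code points run from U+0000 to U+10FFFF inclusive.
unicode_code_points = 0x10FFFF + 1

# Slot counts for an 8-bit and a 16-bit glyph index, for comparison.
slots_8bit = 2 ** 8
slots_16bit = 2 ** 16

if __name__ == "__main__":
    print(four_byte_patterns)    # 2097152
    print(unicode_code_points)   # 1114112
    print(slots_8bit, slots_16bit)
    print(sys.maxunicode == 0x10FFFF)
```

    Either way, both numbers are far beyond what a 256-slot or 16-bit font
    table can hold, which is the point being made about font limits.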

    hope this helps some...


    )\/(ark
    --- SBBSecho 3.10-Linux
    * Origin: SouthEast Star Mail HUB - SESTAR (1:3634/12)
  • From August Abolins@2:221/1.58 to Mark Lewis on Wed Jan 1 11:33:00 2020
    Hello mark!

    ** 01.01.20 - 08:35, mark lewis wrote to August Abolins:

    .. I thought Lucida Console font would solve all the problems, but it
    does not.

    remember that a lot of fonts still have only 256 slots in them... some
    few have 65535 slots and can hold a lot more character glyphs than
    the 256 slotted ones... the full UTF-8 character set contains
    2,097,152 characters and that may not even be everything...


    Thanks for the stackoverflow and wikipedia links.


    hope this helps some...

    Yes, it does! The whole science/math behind font sets impresses me.

    Having worked briefly on coding for UI for custom displays in the past was
    a lot of fun. At that time I had specific sets of pre-designed elements
    and rules to follow and could even design my own chars.

    In the case of Win32/DOS-based OpenXP, the limitation is the existing font set: Lucida Console. Lucida Console "supports" many of the extended chars and some foreign language chars. But OpenXP adds further limitations. :(

    OpenXP adds a CHARset kludge (ASCII 1, US-ASCII, IBMPC 2, ...), but it
    seems to select each one automatically based on the content it detects. ???

    It can be configured to interpret a subset of UTF-8 on incoming messages,
    but it never generates the UTF-8 kludge for outgoing messages/replies to match.
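
    For incoming mail, a reader can use the charset kludge to decide how to
    decode the body. The sketch below assumes the kludge is a line starting
    with SOH (0x01) followed by "CHRS:", that lines end with CR, and that the
    identifiers map to Python codecs as shown. All of that is an assumption
    made for illustration and is not taken from OpenXP.

```python
# Hypothetical sketch of kludge-driven decoding for an incoming message.
# The kludge layout and the identifier table are assumptions.

SOH = b"\x01"
KLUDGE = SOH + b"CHRS:"
CODECS = {"ASCII": "ascii", "IBMPC": "cp437", "CP437": "cp437", "UTF-8": "utf-8"}


def decode_body(raw: bytes, default: str = "cp437") -> str:
    """Decode a raw message body using its CHRS kludge, if one is present."""
    codec = default
    lines = raw.split(b"\r")
    for line in lines:
        if line.startswith(KLUDGE):
            parts = line[len(KLUDGE):].split()
            if parts:
                ident = parts[0].decode("ascii", errors="replace").upper()
                codec = CODECS.get(ident, default)
            break
    # Kludge lines are control information, so drop them from the visible text.
    visible = b"\r".join(l for l in lines if not l.startswith(SOH))
    return visible.decode(codec, errors="replace")


if __name__ == "__main__":
    print(decode_body(b"\x01CHRS: UTF-8 4\rBonne ann\xc3\xa9e"))
    print(decode_body(b"\x01CHRS: IBMPC 2\rBonne ann\x82e"))
```

    A matching outgoing path would write the same kludge before the encoded
    body, which is the half that appears to be missing for UTF-8.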

    OpenXP allows launching an external editor. Maybe I can explore its UTF-8 support with the famous multi-charset GoatEd (now gossiped?) or something.
    Is there a ready-made win32 version of it?

    I like the way Thunderbird can be configured to use a specific charset (it uses the term "character encoding"). There, the UTF-8 setting covers a
    broad range of characters for proper display.

    Take care.. Have a great day!


    ../|ug

    --- OpenXP 5.0.42
    * Origin: o,,,,o§ø`ø§o,,,,o§ø`ø§o,,,,o (2:221/1.58)
  • From August Abolins@2:221/1.58 to Mark Lewis on Wed Jan 1 11:51:00 2020
    Hello mark!

    ** 01.01.20 - 11:28, August Abolins wrote to mark lewis:

    OpenXP allows launching an external editor. Maybe I can explore its
    UTF-8 support with the famous multi-charset GoatEd (now gossiped?) or
    something. Is there a ready-made win32 version of it?


    Never mind. I read that gossiped needs to work with msg/opus, jam, or
    squish message bases directly.


    ../|ug

    --- OpenXP 5.0.42
    * Origin: o,,,,o§ø`ø§o,,,,o§ø`ø§o,,,,o (2:221/1.58)
  • From Martin Foster@2:310/31.3 to August Abolins on Sun Jan 5 08:58:00 2020
    Hello August!

    * 31.12.19 at 18:03, August Abolins wrote to Martin Foster:

    [snip]
    In other news, I am still baffled as to when oxp decides to use IBM437, when it uses UTF-8, and when it just uses US-ASCII for outgoing messages.

    Yeah, me too.

    Are key chars triggering the selection?

    No idea, sorry.

    Regards,
    Martin

    --- OpenXP 5.0.42
    * Origin: Bitz-Box - Bradford - UK (2:310/31.3)