Re: [dev] [st] UTF-8 not working from Страхиња Радић on 2022-04-28 (dev mail list archive)

From: Страхиња Радић <contact_AT_strahinja.org>
Date: Thu, 28 Apr 2022 19:44:44 +0200

On 22/04/28 06:48, Страхиња Радић wrote:
> May I ask what shell are you using inside st? The only problem I noticed so far
> with my script, which uses xdotool(1) to type characters, is when I am using it
> while st is specifically executing mksh as a shell. With bash, dash and zsh
> emoji are inserted correctly. This is undoubtedly some misconfiguration of mksh
> on my part, which I have yet to figure out in detail.

After some investigation, I discovered the following paragraph in the mksh FAQ[1]:

> The shell’s utf8-mode before mksh R60 supported only the BMP (Basic
> Multilingual Plane) of UCS and mapped raw (extended ASCII) octets, i.e. these
> which are not valid UTF-8 BMP codepoints) into the U+EF80‥U+EFFF range, which
> is allocated at the CSUR for this purpose. (It otherwise lies in the PUA;
> however, there is ambiguity if encountering those UTF-8-encoded, so it changed
> for R60.) The Arithmetic expressions and CAVEATS sections in mksh(1) contain
> more details about encoding and mapping.
>
> As of R60, utf8-mode maps “raw octets” to U-10000080‥U-100000FF, which is
> outside the UCS and therefore collision-free. There’s work underway to make the
> shell support the full 21-bit UCS range for R60.
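
To make the quoted ranges concrete, here is a rough Python sketch of the two
raw-octet mappings. It assumes a simple linear mapping of octets 0x80..0xFF
onto each range; that is my reading of the FAQ text, not something taken from
the mksh source, and the names map_raw_octet, PRE_R60_BASE and R60_BASE are
made up for illustration.

```python
# Sketch only: assumes raw octets 0x80..0xFF map linearly onto the
# ranges quoted from the FAQ. Not taken from the mksh source.
PRE_R60_BASE = 0xEF80      # U+EF80..U+EFFF, inside the BMP private use area
R60_BASE = 0x10000080      # U-10000080..U-100000FF, beyond the UCS limit
UCS_MAX = 0x10FFFF         # highest valid Unicode code point


def map_raw_octet(octet, base):
    """Return the stand-in value for a raw octet under the given base."""
    if not 0x80 <= octet <= 0xFF:
        raise ValueError("only octets 0x80..0xFF are mapped")
    return base + (octet - 0x80)


# Pre-R60: the stand-in for octet 0x80 is also a code point that valid
# UTF-8 input can encode, hence the ambiguity the FAQ mentions.
old = map_raw_octet(0x80, PRE_R60_BASE)
print(hex(old), chr(old).encode("utf-8").hex())

# R60: the stand-in lies above U+10FFFF, so no valid UTF-8 text can
# collide with it.
new = map_raw_octet(0x80, R60_BASE)
print(hex(new), new > UCS_MAX)
```

The first line printed shows that U+EF80 has an ordinary three-byte UTF-8
encoding, which is where the collision comes from; the R60 value cannot be
produced by well-formed UTF-8 at all.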

Since I'm currently using mksh R59, whose utf8-mode supports only the BMP, and
most emoji lie outside the BMP (above U+FFFF), that part of the mystery is
solved as well.
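
For anyone who wants to check whether a given character falls inside the BMP,
a small Python sketch follows. The helper name in_bmp and the three sample
characters are my own picks for illustration; the only fact relied on is that
the BMP covers U+0000 through U+FFFF.

```python
def in_bmp(ch):
    """True if the single character ch lies in the BMP (U+0000..U+FFFF)."""
    return ord(ch) <= 0xFFFF


samples = [
    "\u0448",      # CYRILLIC SMALL LETTER SHA
    "\u20ac",      # EURO SIGN
    "\U0001F600",  # GRINNING FACE (an emoji)
]

for ch in samples:
    # Print the code point, BMP membership and UTF-8 length in octets.
    print(f"U+{ord(ch):04X} bmp={in_bmp(ch)} octets={len(ch.encode('utf-8'))}")
```

Characters outside the BMP take four octets in UTF-8, and those are the ones a
BMP-only utf8-mode cannot represent.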

**Definitive conclusion: st does not need GNOME, ibus or other bloat (aside from
good old native X.Org bloat itself) to support UTF-8 input/output.**

[1]: http://www.mirbsd.org/mksh-faq.htm#posix-mode

Received on Thu Apr 28 2022 - 19:44:44 CEST