Re: [dev] [libgrapheme] announcement

From: Laslo Hunhold <dev_AT_frign.de>
Date: Sat, 28 Mar 2020 00:41:51 +0100

On Sat, 28 Mar 2020 00:32:24 +0100
Mattias Andrée <maandree_AT_kth.se> wrote:

Dear Mattias,

> This sounds absolutely horrible. Non-pre-composed characters are not
> widely well supported and are often rendered terribly; some software
> (like the Linux VT) cannot even render them.

yes, the Linux VT is a good example. To do the rendering properly, you
need a font library that has essentially unlimited context when drawing
characters.
This is not possible in a terminal, but you can at least "reserve" one
block for one grapheme cluster.
To put it another way, though: it is not the application's problem but
the font renderer's, and the complications in TTF only make it harder.
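
As a rough illustration of reserving one block per grapheme cluster,
here is a minimal Python sketch. It is only an approximation: it merely
attaches combining marks to the preceding base character, whereas the
real segmentation rules (Unicode Standard Annex #29) cover many more
cases. The function name rough_clusters is made up for this example.

```python
import unicodedata


def rough_clusters(text):
    """Group each base character with the combining marks that follow it.

    Approximation only: real grapheme cluster rules (UAX #29) also
    handle Hangul syllables, emoji sequences and more.
    """
    clusters = []
    for ch in text:
        # A non-zero combining class means "attach to the previous character".
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "a" followed by COMBINING DIAERESIS (not pre-composed), then "b", "c".
decomposed = "a\u0308bc"
cells = rough_clusters(decomposed)

# Four code points, but only three blocks would be reserved.
print(len(decomposed), len(cells))
```

How the combined content of each block is then drawn remains the font
renderer's job.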

> Why is the kernel even getting into encoding issues? That should be
> an application issue, not a kernel issue. A kernel should only know
> bytes. Is it really a security issue?

I like to compare it to IDN homograph attacks, where you replace
characters like the letters a and e with homographs, in this case those
from the Cyrillic alphabet, а and е (they are not the same characters,
even though they look identical!).
It doesn't take much creativity to see that it's enough to register a
domain
        https://аmаzon.com/
and trick people into visiting it. You can tell Firefox to always
display such URLs with the non-ASCII parts expanded to ASCII
(Punycode), in this case as
        https://xn--mzon-43db.com/
and I can only recommend that to people.
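
The expansion can be reproduced with a minimal Python sketch using only
the standard library. The helper to_ascii is made up for this example
and is a simplification: it Punycode-encodes each non-ASCII label and
skips the normalisation and validation steps a full IDNA implementation
performs.

```python
import unicodedata


def to_ascii(hostname):
    """Expand non-ASCII labels to their "xn--" Punycode form.

    Simplified on purpose: no case folding, normalisation or validation.
    """
    labels = []
    for label in hostname.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


# U+0430 is CYRILLIC SMALL LETTER A, which looks like the Latin "a".
spoofed = "\u0430m\u0430zon.com"
genuine = "amazon.com"

expanded = to_ascii(spoofed)
print(expanded)            # xn--mzon-43db.com
print(to_ascii(genuine))   # amazon.com

# The first word of each character name reveals the mixed scripts.
scripts = [unicodedata.name(ch).split()[0] for ch in "\u0430m\u0430zon"]
print(scripts)
```

The two host names compare unequal as strings, which is exactly why the
expanded form makes the difference visible.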

In the case of file systems, I would probably take a comparable
approach: work only byte-wise in the kernel, but when listing a
directory that contains two equivalent file names, print both in such
an expanded form. This would keep both sides happy.
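
A minimal Python sketch of that listing idea follows. It rests on
several assumptions of mine: "equivalent" is taken to mean "equal after
NFC normalisation", the "expanded form" is Python's unicode_escape
notation, the file names are valid UTF-8, and display_names is a
made-up helper, not an existing tool.

```python
import unicodedata
from collections import defaultdict


def display_names(raw_names):
    """Return one printable string per byte-wise file name.

    Names that are distinct as bytes but equal after NFC normalisation
    are shown in escaped form so the difference becomes visible.
    """
    decoded = [name.decode("utf-8") for name in raw_names]

    # Group the names by their normalised form to find collisions.
    groups = defaultdict(list)
    for name in decoded:
        groups[unicodedata.normalize("NFC", name)].append(name)

    shown = []
    for name in decoded:
        if len(groups[unicodedata.normalize("NFC", name)]) > 1:
            # Ambiguous: print the code points instead of the glyphs.
            shown.append(name.encode("unicode_escape").decode("ascii"))
        else:
            shown.append(name)
    return shown


names = [
    b"\xc3\xa9.txt",    # pre-composed U+00E9
    b"e\xcc\x81.txt",   # "e" followed by combining acute U+0301
    b"plain.txt",
]
shown = display_names(names)
for line in shown:
    print(line)
```

The names themselves stay untouched byte strings; only the listing
changes how the ambiguous ones are printed.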

With best regards

Laslo