Hi again,
I dug a bit deeper into the latest Unicode, and it's even worse than I
thought.
To deal with real Unicode input/output and to split it into "extended grapheme
clusters" (a Unicode "char"), you need a finite state machine (I guess that's
what Lalso was referring to). And it's the same for the handling of line
breaks.
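To give an idea of the shape of such a state machine, here is a minimal sketch,
not a real implementation. It assumes the input is already decoded into code
points, and it covers only a toy subset of the UAX #29 rules (CR LF, controls,
a few combining ranges). The names classify() and cluster_len() are made up for
this mail; Hangul, emoji ZWJ sequences, regional indicators and so on are all
missing, and each of them adds more classes and states.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy break classes; the real property table is far larger. */
enum gclass { G_OTHER, G_CR, G_LF, G_CONTROL, G_EXTEND };

static enum gclass
classify(uint32_t cp)
{
	if (cp == 0x0D)
		return G_CR;
	if (cp == 0x0A)
		return G_LF;
	if (cp < 0x20 || (cp >= 0x7F && cp <= 0x9F))
		return G_CONTROL;
	/* Assumption: only these ranges count as "extend" in this sketch. */
	if ((cp >= 0x0300 && cp <= 0x036F) || cp == 0x200D ||
	    (cp >= 0xFE00 && cp <= 0xFE0F))
		return G_EXTEND;
	return G_OTHER;
}

/* Number of code points in the first cluster of cp[0..len). */
size_t
cluster_len(const uint32_t *cp, size_t len)
{
	size_t i;
	enum gclass prev, cur;

	if (len == 0)
		return 0;
	prev = classify(cp[0]); /* the whole "state" is the previous class */
	for (i = 1; i < len; i++) {
		cur = classify(cp[i]);
		if (prev == G_CR && cur == G_LF) {
			prev = cur;     /* never break inside CR LF */
			continue;
		}
		if (prev == G_CR || prev == G_LF || prev == G_CONTROL)
			break;          /* always break after a control */
		if (cur == G_EXTEND) {
			prev = cur;     /* combining marks stick to the base */
			continue;
		}
		break;                  /* anything else starts a new cluster */
	}
	return i;
}
```

With this, "e" followed by U+0301 counts as one cluster of two code points, and
CR LF counts as one cluster too, which is why line break handling ends up in
the same machinery.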
Additionally, Unicode NFC normalization (the one chosen for the web) is kind of
useless: since no new pre-combined characters have been accepted for a long
time, you end up implementing the NFD stuff anyway (that move was obviously
malicious).
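A small illustration of that point, as a sketch only. The table below is a
hand-picked toy, not the real Unicode composition data, and compose_pair() is a
name I made up. "e" + U+0301 has a pre-combined form, while "q" + U+0303 has
none, so after NFC the combining mark is still there and the code has to cope
with it.

```c
#include <stddef.h>
#include <stdint.h>

struct pair {
	uint32_t base, mark, composed;
};

/* Toy excerpt; the real composition data has many more entries. */
static const struct pair pairs[] = {
	{ 0x0065, 0x0301, 0x00E9 }, /* e + combining acute     */
	{ 0x0061, 0x0308, 0x00E4 }, /* a + combining diaeresis */
	{ 0x006E, 0x0303, 0x00F1 }, /* n + combining tilde     */
};

/* Returns the pre-combined code point, or 0 if the table has none. */
uint32_t
compose_pair(uint32_t base, uint32_t mark)
{
	size_t i;

	for (i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
		if (pairs[i].base == base && pairs[i].mark == mark)
			return pairs[i].composed;
	return 0;
}
```

compose_pair(0x65, 0x0301) gives 0xE9, but compose_pair(0x71, 0x0303) gives 0:
the sequence stays decomposed whatever the normalization form.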
So, the real culprits are actually written languages: they suck. Namely, you
cannot write suckless code for tons of written languages. On top of that,
since the handling of simple written languages has been generalized together
with some of the most complex ones, handling those simple written languages
properly means using the same complex/generalized definitions and mechanisms.
On the rendering side, those complex mechanisms spare font designers a good
chunk of work: the one required for pre-combined glyphs. Expect fewer and
fewer pre-combined glyphs in fonts, each with a unique Unicode code point
mapping to it, and that even for simple written languages. And expect lighter
font files.
It means there is no really good middle ground (a good middle ground for the
web would be basic XHTML without JavaScript).
And where does st fit in all that?
Do like the Linux line discipline drivers? Namely, handle only UTF-8 encoded
Unicode code points (no extended grapheme clusters), and actually work properly
only on ASCII?
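Roughly what that code-point-only handling looks like for an erase key, as a
sketch and not the actual kernel code. utf8_erase_last() is a hypothetical
helper: it drops trailing UTF-8 continuation bytes (10xxxxxx) plus the lead
byte before them, so it removes exactly one code point and knows nothing about
clusters.

```c
#include <stddef.h>

/*
 * Returns the new length of buf after erasing the last code point.
 * Assumes buf holds valid UTF-8; no validation is done here.
 */
size_t
utf8_erase_last(const unsigned char *buf, size_t len)
{
	while (len > 0) {
		len--;
		if ((buf[len] & 0xC0) != 0x80)
			break; /* reached a lead byte or an ASCII byte */
	}
	return len;
}
```

On "e" + U+0301 (bytes 65 CC 81) one erase leaves the bare "e": the accent goes,
the base letter stays. That is fine for ASCII and for pre-combined characters,
and wrong for everything else.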
For suckless, as a consistent whole, it means:
- It becomes an ASCII-only framework (Anselm seems to like this), and will be
kind of useless for any text-interacting application going beyond ASCII
(i.e. no more mutt with non-ASCII email, no more lynx with non-ASCII web
pages...). A zero-i18n framework. In the case of a Wayland st: its own
ASCII bitmap fonts and its own font renderer.
- suckless gets its own Unicode handling code (a libicu/freetype+harfbuzz
look-alike implementation).
--
Sylvain
Received on Thu Sep 27 2018 - 21:40:06 CEST