Re: [dev] [sbase][RFC] Add a simplistic version of tr

From: Thorsten Glaser <tg_AT_mirbsd.de>
Date: Tue, 26 Nov 2013 22:40:20 +0000 (UTC)

Strake dixit:

>On 26/11/2013, Silvan Jegen <s.jegen_AT_gmail.com> wrote:
>> If you would rather not take this version, what approach would
>> you take for the character set mapping when using UTF-8?
>
>On Linux, one can easily make a sparse array with 1-page granularity
>with mmap, and so simply use a (wchar_t []) or (Rune []), but I'm not
>sure how portable this is.

Pretty portable, and 2²¹ * sizeof(wchar_t) is at most 2²³ bytes, or 8 MiB,
with a 4-byte wchar_t, so this would even work.
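
A minimal sketch of the mmap variant, assuming an anonymous private
mapping is available (MAP_ANONYMOUS, or MAP_ANON on the BSDs). The names
mapnew, mapset and mapget are made up for illustration and not part of
sbase; entries are stored as target + 1 so that the zero-filled pages
the kernel hands out read as "unmapped":

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

#define NRUNES 0x110000UL       /* Unicode code points U+0000..U+10FFFF */

/* Reserve a zero-filled table; pages are only backed once written to. */
static uint32_t *
mapnew(void)
{
        void *p = mmap(NULL, NRUNES * sizeof(uint32_t), PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? NULL : p;
}

/* Record "from -> to"; stored as to + 1 so that 0 means "unmapped". */
static void
mapset(uint32_t *map, uint32_t from, uint32_t to)
{
        assert(from < NRUNES && to < NRUNES);
        map[from] = to + 1;
}

/* Look up a code point; unmapped ones translate to themselves. */
static uint32_t
mapget(const uint32_t *map, uint32_t c)
{
        assert(c < NRUNES);
        return map[c] ? map[c] - 1 : c;
}
```

Only the pages that mapset actually touches should end up resident, which
is the sparseness Strake describes.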

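But the common approach, for Unicode, is to split the table into planes
(here, blocks of 0x100 code points, reached through an array of pointers):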

struct {
        wchar_t foo[0x100];
} *repl[0x1100];

Do note that wchar_t may be only 16 bits wide, and that the OS’ own
representation of wchar_t may not be Unicode, so the type would
be semantically wrong.

You might want to use uint32_t there.
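
A rough sketch of how such a two-level table could be used with uint32_t,
with blocks allocated on first write. The helper names replset and
replget and the target + 1 encoding are assumptions made for this
example, not something from the patch under discussion:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLKSZ 0x100             /* code points per block */
#define NBLK  0x1100            /* 0x1100 * 0x100 = 0x110000 code points */

struct blk {
        uint32_t to[BLKSZ];     /* stored as target + 1; 0 means unmapped */
};

static struct blk *repl[NBLK];  /* NULL until a block is first written */

/* Record "from -> to"; returns 0 on success, -1 if out of memory. */
static int
replset(uint32_t from, uint32_t to)
{
        struct blk **b;

        assert(from < NBLK * BLKSZ && to < NBLK * BLKSZ);
        b = &repl[from >> 8];
        if (*b == NULL && (*b = calloc(1, sizeof(**b))) == NULL)
                return -1;
        (*b)->to[from & 0xFF] = to + 1;
        return 0;
}

/* Look up a code point; unmapped ones translate to themselves. */
static uint32_t
replget(uint32_t c)
{
        const struct blk *b;

        assert(c < NBLK * BLKSZ);
        b = repl[c >> 8];
        return (b && b->to[c & 0xFF]) ? b->to[c & 0xFF] - 1 : c;
}
```

The pointer array itself is small, and a typical tr invocation would
presumably only ever allocate a handful of 1 KiB blocks.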

bye,
//mirabilos
-- 
"Well, somehow you are ALWAYS right. A railcar labelled
"Ostdeutsche Eisenbahn" just trundled through Wuppertal here. Sometimes
I can't believe it…"					-- Natureshadow, via SMS
"Help me think for a moment"			-- Natureshadow, IRL, 2x
Received on Tue Nov 26 2013 - 23:40:20 CET
