Strake dixit:
>On 26/11/2013, Silvan Jegen <s.jegen_AT_gmail.com> wrote:
>> If you would rather not take this version, what approach would
>> you take for the character set mapping when using UTF-8?
>
>On Linux, one can easily make a sparse array with 1-page granularity
>with mmap, and so simply use a (wchar_t []) or (Rune []), but I'm not
>sure how portable this is.
Pretty portable, and 2²¹ * sizeof(wchar_t) is at most 2²³ bytes,
or 8 MiB, with a 32-bit wchar_t, so this would even work.
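
A minimal sketch of the mmap variant, assuming a system that offers
MAP_ANONYMOUS (Linux, the BSDs) and using uint32_t slots so that the
width is fixed. The names sparse_alloc, sparse_get and sparse_free are
made up for illustration, and treating 0 as "no mapping" is my
assumption, not part of the proposal. On Linux the pages are only
backed by real memory once they are written to.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON  /* older BSD spelling */
#endif

#define NCODEPOINTS 0x110000u  /* U+0000 .. U+10FFFF */
#define MAPSIZE (NCODEPOINTS * sizeof(uint32_t))

/* Hypothetical helper: one zero-filled slot per code point. */
static uint32_t *
sparse_alloc(void)
{
	void *p = mmap(NULL, MAPSIZE, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Assumption: a slot holding 0 means "unmapped", so return cp itself. */
static uint32_t
sparse_get(const uint32_t *map, uint32_t cp)
{
	assert(cp < NCODEPOINTS);
	return map[cp] ? map[cp] : cp;
}

static void
sparse_free(uint32_t *map)
{
	if (map != NULL)
		munmap(map, MAPSIZE);
}
```

Setting a mapping is then just map[cp] = to; the price of this layout
is that a mapping to U+0000 cannot be expressed.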
But the common approach for Unicode is to split the table up, along
the lines of the planes, and allocate only the blocks that are in use:
    struct {
        wchar_t foo[0x100];
    } *repl[0x1100];
Do note that wchar_t may be only 16 bits wide (it is on Windows), and
that the OS’ own representation of wchar_t may not be Unicode (that is
only guaranteed when __STDC_ISO_10646__ is defined), so the type would
be semantically wrong.
You might want to use uint32_t there.
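
A rough sketch of such a two-level table with uint32_t entries, in case
it helps. The names repl_get and repl_set are hypothetical, and filling
a fresh block with the identity mapping is my assumption about what a
character mapping wants. Callers are assumed to pass valid code points
only (hence the asserts).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CP_MAX     0x10FFFFu
#define BLOCK_BITS 8
#define BLOCK_SIZE (1u << BLOCK_BITS)            /* 0x100 entries */
#define NBLOCKS    ((CP_MAX + 1) >> BLOCK_BITS)  /* 0x1100 blocks */

struct block {
	uint32_t map[BLOCK_SIZE];
};

static struct block *repl[NBLOCKS];  /* NULL = block not in use */

/* Look up cp; code points in unused blocks map to themselves. */
static uint32_t
repl_get(uint32_t cp)
{
	const struct block *b;

	assert(cp <= CP_MAX);
	b = repl[cp >> BLOCK_BITS];
	return b != NULL ? b->map[cp & (BLOCK_SIZE - 1)] : cp;
}

/* Map cp to "to"; returns 0 on success, -1 if malloc fails. */
static int
repl_set(uint32_t cp, uint32_t to)
{
	struct block *b;
	uint32_t i, base;

	assert(cp <= CP_MAX);
	b = repl[cp >> BLOCK_BITS];
	if (b == NULL) {
		if ((b = malloc(sizeof(*b))) == NULL)
			return -1;
		/* start the new block out as the identity mapping */
		base = cp & ~(uint32_t)(BLOCK_SIZE - 1);
		for (i = 0; i < BLOCK_SIZE; i++)
			b->map[i] = base + i;
		repl[cp >> BLOCK_BITS] = b;
	}
	b->map[cp & (BLOCK_SIZE - 1)] = to;
	return 0;
}
```

With 8-byte pointers the top level costs 0x1100 * 8 bytes, about 34 KiB,
and each block that is actually touched adds 1 KiB.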
bye,
//mirabilos
--
“So somehow you are ALWAYS right. A railcar with "Ostdeutsche Eisenbahn"
written on it just trundled through Wuppertal. Sometimes I can hardly
believe it…” -- Natureshadow, via SMS
“Help me think for a moment” -- Natureshadow, IRL, 2x
Received on Tue Nov 26 2013 - 23:40:20 CET