On Sun, 21 May 2006 22:26:10 +0200
"Anselm R. Garbe" <garbeam_AT_wmii.de> wrote:
> For 0000 0000 - 0000 07FF you use a 16bit Rune,
> for 0000 0800 - 0010 FFFF you use two 16bit Runes, that is what
> I mean with 16bit types. (I only looked at the libutf9
> implementation until now).
So libutf9 uses UTF-16 internally? (That is what many, if not most,
implementations do: they convert whatever UTF they are given into
UTF-16 and work with that. It is easier to handle than UTF-8,
especially for advanced things like character class lookups and
character/string mappings, yet not as hard on memory as UTF-32.)
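
For reference, here is a minimal sketch of how UTF-16 splits a code
point into 16-bit units. It is written in Python purely for
illustration. It is not libutf9 code, and the names utf16_units,
utf8_len and bytes_per_code_point are made up here. Note that in UTF-16
every code point up to U+FFFF (outside the surrogate range) fits in one
unit, and only U+10000 - U+10FFFF needs two. The 07FF boundary quoted
above is where UTF-8 goes from two bytes to three.

```python
# Illustrative sketch only; these helper names are hypothetical and
# do not come from libutf9 or any other library.

def utf16_units(cp):
    """Return the list of 16-bit UTF-16 code units for one code point."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp <= 0xFFFF:
        # Basic Multilingual Plane: a single 16-bit unit.
        return [cp]
    # Supplementary planes: split the 20-bit offset into a surrogate pair.
    v = cp - 0x10000
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


def utf8_len(cp):
    """Return the number of bytes UTF-8 needs for one code point."""
    if cp <= 0x7F:
        return 1
    if cp <= 0x7FF:
        return 2
    if cp <= 0xFFFF:
        return 3
    return 4


def bytes_per_code_point(cp):
    """Return (UTF-8, UTF-16, UTF-32) storage sizes in bytes."""
    return (utf8_len(cp), 2 * len(utf16_units(cp)), 4)
```

With these definitions, bytes_per_code_point(0x20AC) gives the sizes
for the euro sign in the three encodings, which is the memory
trade-off mentioned above: UTF-16 never needs more than UTF-32, and
lookups only have to special-case surrogate pairs.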