Re: [dev] [st] wide characters

From: Thorsten Glaser <tg_AT_mirbsd.de>
Date: Mon, 15 Apr 2013 19:54:03 +0000 (UTC)

random832_AT_fastmail.us dixit:

>Those systems aren't using wchar_t *or* wint_t for unicode, though.

Do not assume that.

tg_AT_blau:~ $ echo '__STDC_ISO_10646__ / __WCHAR_TYPE__ , __WCHAR_MAX__' | cc -E -
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "<stdin>"
200009L / short unsigned int , 65535U

>The main reason for wint_t's existence is that wchar_t isn't guaranteed
>to be able to represent a WEOF value distinct from all valid character

Right.
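
In code, that boils down to keeping the result in a wint_t until
it has been compared against WEOF, and only then narrowing it to
wchar_t. A minimal sketch, using the standard btowc(); the helper
name widen_byte() is made up for this example:

```c
#include <assert.h>
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* wint_t, WEOF, btowc */

/*
 * widen_byte() is a hypothetical helper for this sketch.
 * Returns 1 and stores the wide character on success, 0 on failure.
 */
static int widen_byte(int c, wchar_t *out)
{
	wint_t w;

	assert(out != NULL);
	w = btowc(c);      /* keep the result in wint_t ... */
	if (w == WEOF)     /* ... so WEOF is checked before narrowing */
		return 0;
	*out = (wchar_t)w;
	return 1;
}
```

Had the return value been stored straight into a wchar_t, the
comparison against WEOF would no longer be guaranteed to work.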

>You can use #if __STDC_ISO_10646__ to test whether the implementation
>uses unicode for wchar_t (most modern systems do, though some may not
>define this constant)

I think most do not define this constant…

> - if so, then wchar_t is, naturally, guaranteed to
>be able to represent at least the range 0 to 0x10FFFF

Nope. But systems with a 16-bit wchar_t may not raise the constant
past 200009L, even if they otherwise support newer Unicode stuff.
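
For code that cares, a rough sketch of treating "is it Unicode"
and "does it reach 0x10FFFF" as two separate questions. All names
here (wc_kind, classify_wchar, fits_in_wchar, to_wchar) are
invented for illustration, and the range check is done at run
time simply to keep the example short:

```c
#include <assert.h>
#include <wchar.h>   /* wchar_t, WCHAR_MAX */

enum wc_kind { WC_KIND_UNKNOWN, WC_KIND_BMP_ONLY, WC_KIND_FULL };

/* What the implementation promises about wchar_t, as far as we can tell. */
static enum wc_kind classify_wchar(void)
{
#if defined(__STDC_ISO_10646__)
	/* Unicode values, but the width still has to be checked. */
	return ((unsigned long long)WCHAR_MAX >= 0x10FFFFULL)
	    ? WC_KIND_FULL : WC_KIND_BMP_ONLY;
#else
	/* Macro missing: the encoding of wchar_t is not advertised. */
	return WC_KIND_UNKNOWN;
#endif
}

/* Nonzero if the code point cp can be stored in a single wchar_t. */
static int fits_in_wchar(unsigned long cp)
{
	return cp <= 0x10FFFFUL &&
	    (unsigned long long)cp <= (unsigned long long)WCHAR_MAX;
}

/* Narrow a code point that the caller has already range-checked. */
static wchar_t to_wchar(unsigned long cp)
{
	assert(fits_in_wchar(cp));
	return (wchar_t)cp;
}
```

On the system shown above, classify_wchar() would give
WC_KIND_BMP_ONLY and fits_in_wchar(0x10000UL) would be 0.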

A 16-bit wchar_t works very well, by the way: (wchar_t)-1 and
(wchar_t)-2, i.e. 0xFFFF and 0xFFFE, aren’t Unicode characters
anyway, and it allows for relatively easy conversion of legacy
software, such as BSD tr (which uses tables), to Unicode.

I should know, I implemented it for this purpose ;-)
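
To give an idea of why tables stay practical at 16 bits, here is
a rough sketch of a tr-like translation table. It is not taken
from any real tr source: the names are hypothetical, uint16_t
stands in for a 16-bit wchar_t, and using 0xFFFF as a "delete"
marker is just one possible use of a noncharacter value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BMP_SIZE   0x10000u  /* one slot per 16-bit value: 128 KiB in total */
#define BMP_DELETE 0xFFFFu   /* (uint16_t)-1, a noncharacter, marks "drop" */

struct bmp_table {
	uint16_t map[BMP_SIZE];
};

/* Start with the identity mapping. */
static void bmp_table_init(struct bmp_table *t)
{
	uint32_t i;

	assert(t != NULL);
	for (i = 0; i < BMP_SIZE; i++)
		t->map[i] = (uint16_t)i;
}

/* Translate every "from" into "to". */
static void bmp_table_map(struct bmp_table *t, uint16_t from, uint16_t to)
{
	t->map[from] = to;
}

/* Drop every "from" from the output. */
static void bmp_table_delete(struct bmp_table *t, uint16_t from)
{
	t->map[from] = BMP_DELETE;
}

/*
 * Translate buf in place and return the new length.
 * Input 0xFFFF maps to itself and is therefore dropped as well.
 */
static size_t bmp_translate(const struct bmp_table *t, uint16_t *buf, size_t n)
{
	size_t i, out = 0;

	for (i = 0; i < n; i++) {
		uint16_t m = t->map[buf[i]];

		if (m != BMP_DELETE)
			buf[out++] = m;
	}
	return out;
}
```

With a 32-bit wchar_t, the same flat table would need an entry for
every value up to 0x10FFFF, which is where table-driven legacy
code starts to need restructuring.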

bye,
//mirabilos
-- 
Sometimes they [people] care too much: pretty printers [and syntax highligh-
ting, d.A.] mechanically produce pretty output that accentuates irrelevant
detail in the program, which is as sensible as putting all the prepositions
in English text in bold font.	-- Rob Pike in "Notes on Programming in C"
Received on Mon Apr 15 2013 - 21:54:03 CEST
