Re: [dev] [st utf8 3/4] Change internal character representation.

From: suigin <suigin_AT_national.shitposting.agency>
Date: Mon, 27 Apr 2015 06:51:32 -0700

On Mon, Apr 27, 2015 at 09:58:37AM +0200, Roberto E. Vargas Caballero wrote:
> Uhmmm, so do you propose don't use long arrays ever? because in
> some implementations long may be 4, but in others may be
> 8. We also should forbid int arrays for the same reason.

I would say it depends on the context. If they're small arrays, or if
you have a few ``long'' fields in a struct of which there's only a
handful of instances, it's not a big deal to use ``long''. Likewise,
if you have a few automatic storage variables of type ``long''
(function parameters or locals created on the stack), it's not a big
deal either. But if you have a large array of such values, as in the
case of a terminal screen buffer, you want to use types as small as
possible if your goal is to conserve memory.
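
To make the screen buffer case concrete, here is a rough sketch. The
two cell structs and buffer_bytes() are hypothetical names made up for
illustration, not st's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical screen cell storing the code point as a long. */
struct cell_long {
	long u;
	unsigned short mode;
};

/* Same hypothetical cell, using the smallest type of at least 32 bits. */
struct cell_u32 {
	uint_least32_t u;
	unsigned short mode;
};

/* Bytes needed for a rows x cols buffer of cells of the given size. */
size_t
buffer_bytes(size_t rows, size_t cols, size_t cell_size)
{
	/* Guard against size_t overflow in the multiplications. */
	assert(cols == 0 || rows <= SIZE_MAX / cols);
	assert(cell_size == 0 || rows * cols <= SIZE_MAX / cell_size);
	return rows * cols * cell_size;
}
```

Comparing buffer_bytes(rows, cols, sizeof(struct cell_long)) against
the same call with sizeof(struct cell_u32) shows the difference. Where
long is 8 bytes, the first struct is typically twice the size of the
second once padding is counted, and that factor applies to every cell
in the buffer and in any scrollback.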

> Maybe we should send a proposal to the C commite with this
> proposal.

From Section 5.2.4.2.1 of the C99 language standard, on integer limits:

>The values given below shall be replaced by constant expressions
>suitable for use in #if preprocessing directives.

>Moreover, except for CHAR_BIT and MB_LEN_MAX, the following shall
>be replaced by expressions that have the same type as would an
>expression that is an object of the corresponding type converted
>according to the integer promotions.

>Their implementation-defined values shall be equal or greater in
>magnitude (absolute value) to those shown, with the same sign.

>...

>- maximum value for an object of type int
> INT_MAX +32767 // 2^15-1

>...

>- maximum value for an object of type long int
> LONG_MAX +2147483647 // 2^31-1

So the standard only guarantees minimum ranges, and the actual sizes
vary between implementations. The reasons for this are convention and
history.
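
The guaranteed minimums quoted above can be relied on, but nothing more
than that. A small sketch of what that means in practice follows.
long_exceeds_minimum() and check_minimum_ranges() are hypothetical
helpers written for this illustration.

```c
#include <assert.h>
#include <limits.h>

/*
 * The limits are usable in #if directives. On a conforming
 * implementation neither of these checks can ever trigger.
 */
#if INT_MAX < 32767
#error "INT_MAX is below the C99 minimum"
#endif

#if LONG_MAX < 2147483647
#error "LONG_MAX is below the C99 minimum"
#endif

/* Returns 1 if long holds more than the guaranteed 2^31-1, else 0. */
int
long_exceeds_minimum(void)
{
	return LONG_MAX > 2147483647L;
}

/* Runtime restatement of the same guarantees. */
void
check_minimum_ranges(void)
{
	assert(INT_MAX >= 32767);
	assert(LONG_MAX >= 2147483647L);
}
```

Whether long_exceeds_minimum() returns 0 or 1 depends entirely on the
implementation, which is exactly why ``long'' is a poor fit for data
whose size matters.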

As koneu mentioned, that's why the C language committee came up with the
fixed-width types in stdint.h. Using uint32_t or uint_least32_t for
large arrays of UTF-32/UCS-4 characters in C99 is probably the
recommended way forward here. If you're using C11, then you would use
char32_t from uchar.h. If you need to hold sizes of arrays or other data
structures, then you should use size_t instead of long. If you want to
be able to address offsets on disk or other external storage, you would
probably consider using long long or intmax_t/uintmax_t from stdint.h
(which inttypes.h also includes).
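
As a minimal C99 sketch of the above: the Rune typedef and row_fill()
are hypothetical names chosen for this example, not taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for a UTF-32 code point, at least 32 bits wide. */
typedef uint_least32_t Rune;

/*
 * Fill a row of n cells with code point c.
 * Lengths and indices use size_t rather than long.
 * Returns the number of cells written.
 */
size_t
row_fill(Rune *row, size_t n, Rune c)
{
	size_t i;

	assert(row != NULL || n == 0);
	for (i = 0; i < n; i++)
		row[i] = c;
	return n;
}
```

For example, declaring Rune row[80] and calling
row_fill(row, 80, 0x20AC) sets every cell to U+20AC. On a C11 compiler
the typedef could be swapped for char32_t from uchar.h without touching
the rest.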


Received on Mon Apr 27 2015 - 15:51:32 CEST
