Re: [hackers] [libgrapheme][PATCH] Simplify cp_decode and be more strict

From: Mattias Andrée <>
Date: Thu, 28 May 2020 13:48:18 +0200

Hi Laslo,

Looks good, and I especially like the simplification it brings for
using partially loaded strings. However, I have three minor comments:

I preferred `lut[off].mask` over `(lut[off].upper - lut[off].lower)`.
It is clearer what it means, and storing the mask in `lut` doesn't even
increase its size since it is padded anyway because `mincp` is
(atleast on x86-64 and i386) aligned to 4 bytes. An alternative,
is to use `~lut[off].lower` which I think is clearer than
`(lut[off].upper - lut[off].lower)`, but again, I prefer `lut[off].mask`.
You could also write
        *cp = s[0] - lut[off].lower;
I think this alternative is about as clear as using `lut[off].mask`.

In POSIX (but not Linux) `1 << 16` can be either 0, 1, or 2¹⁶,
since `1` is an `int` which minimum width is 16, not 32. Similarly,
`0x10FFFF` could overflow to 0xFFFF.

I think `(s[i] & 0xC0) != 0x80` is clearer than `!BETWEEN(s[i], 0x80, 0xBF)`,
but since you changed this I assume you disagree.

Mattias Andrée

On Thu, 28 May 2020 13:07:21 +0200
Laslo Hunhold <> wrote:

> On Wed, 27 May 2020 15:22:35 +0200
> Mattias Andrée <> wrote:
> Dear Mattias,
> >
> I refactored the decoder on a deeper level to improve the readability.
> It should be more or less clear from the code what is happening, and it
> was my fault back then that I wrote it more as "write-only" code to
> save time.
> Please let me know if you find any issues with the refactored (it was
> basically rewritten).
> With best regards
> Laslo
Received on Thu May 28 2020 - 13:48:18 CEST

This archive was generated by hypermail 2.3.0 : Thu May 28 2020 - 14:00:36 CEST