Re: [hackers] [st][patch] replace utf8strchr with wcschr

From: Lauri Tirkkonen <lotheac_AT_iki.fi>
Date: Fri, 15 Mar 2019 08:27:56 +0200

On Thu, Mar 14 2019 11:44:18 +0100, Laslo Hunhold wrote:
> On Thu, 14 Mar 2019 11:17:28 +0100
> Jules Maselbas <jmaselbas_AT_kalray.eu> wrote:
> > What about having an array of Rune to store worddelimiters and have a
> > simple search function such as:
> >
> > Rune *
> > utf8strchr(Rune *s, Rune u)
> > {
> >         for (; *s; s++)
> >                 if (*s == u)
> >                         return s;
> >         return NULL;
> > }
> >
> > The worddelimiters definition will become:
> >
> > Rune worddelimiters[] = { ' ', 0 };
> >
> > Which will allow adding Unicode codepoints from wide char literals.
> > Even if wchar_t is only 16 bits wide, the constant will be stored
> > into a Rune, which I believe is a 32-bit constant, and should work
> > fine.
>
> This would just be less efficient than the current solution, given
> you'd have to convert everything to a Rune.

I don't understand your logic. The current solution *is* converting
everything to a Rune.

        static char *utf8strchr(char *, Rune);

worddelimiters is char *, but utf8strchr() calls utf8decode() on it to
obtain Runes (to compare to the second argument). While I don't think
efficiency actually matters a lot here since this is only called when
you double-click to select something, Jules' solution is quite similar
to mine in that the worddelimiters string needs no conversion at
runtime, and is therefore more efficient than the current one.
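
To make the comparison concrete, here is a minimal sketch of the
Rune-array variant. It assumes Rune is a typedef for uint_least32_t;
the names runestrchr() and isdelim() and the extra U+2502 delimiter are
made up for illustration and are not part of st. The lookup is a plain
integer comparison per element, with no utf8decode() call at runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: Rune is an (at least) 32-bit unsigned integer. */
typedef uint_least32_t Rune;

/* Illustrative delimiter list: a space and U+2502, terminated by 0. */
static const Rune worddelimiters[] = { ' ', 0x2502, 0 };

/*
 * Hypothetical helper: return a pointer to the first occurrence of u
 * in the 0-terminated Rune array s, or NULL if it is not present.
 */
static const Rune *
runestrchr(const Rune *s, Rune u)
{
	for (; *s; s++)
		if (*s == u)
			return s;
	return NULL;
}

/* Hypothetical helper: nonzero if u is one of the word delimiters. */
static int
isdelim(Rune u)
{
	return u && runestrchr(worddelimiters, u) != NULL;
}
```

The u != 0 check in isdelim() keeps the terminator itself from being
treated as a delimiter.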

> Now, to clear it up: A Rune literally is only a codepoint and just a
> typedef for an (at least) 32-bit-integer.

Yes, and yet Rune values are still being passed to wcwidth() in the
current code. You objected to wchar_t on grounds of portability, but
the current code is already broken on platforms where wchar_t is
narrower than 32 bits or where its values do not match Unicode
codepoints. I hope you
will not suggest replacing wcwidth() with an application-local character
width table.
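
As a rough sketch of the condition I mean: passing a Rune to wcwidth()
is only meaningful when the value fits in wchar_t and the
implementation defines __STDC_ISO_10646__ (meaning wchar_t values are
Unicode codepoints). The helper name rune_is_wchar_safe() is
hypothetical, and the Rune typedef is the same assumption as before.

```c
#include <stdint.h>
#include <wchar.h>

/* Assumption: Rune is an (at least) 32-bit unsigned integer. */
typedef uint_least32_t Rune;

/*
 * Hypothetical helper: nonzero if u can be cast to wchar_t and still
 * denote the same Unicode codepoint on this platform.
 */
static int
rune_is_wchar_safe(Rune u)
{
#ifdef __STDC_ISO_10646__
	/* wchar_t holds Unicode codepoints; only the range can fail. */
	return (uintmax_t)u <= (uintmax_t)WCHAR_MAX;
#else
	/* No guarantee that wchar_t values are Unicode codepoints. */
	(void)u;
	return 0;
#endif
}
```

With a 16-bit wchar_t this rejects everything above U+FFFF, which is
exactly the range the current code hands to wcwidth() unchecked.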

-- 
Lauri Tirkkonen | lotheac _AT_ IRCnet
Received on Fri Mar 15 2019 - 07:27:56 CET
