Re: [hackers] [st][patch] replace utf8strchr with wcschr

From: Laslo Hunhold <>
Date: Thu, 14 Mar 2019 11:44:18 +0100

On Thu, 14 Mar 2019 11:17:28 +0100
Jules Maselbas <> wrote:

Dear Jules,

> What about having an array of Rune to store worddelimiters and have a
> simple search function such as:
> Rune *
> utf8strchr(Rune *s, Rune u)
> {
> 	for (; *s; s++)
> 		if (*s == u)
> 			return s;
> 	return NULL;
> }
> The worddelimiters definition will become:
> Rune worddelimiters[] = { ' ', 0 };
> Which will allow adding Unicode codepoints from wide char literals.
> Even if wchar_t is only 16 bits wide, the constant will be stored
> in a Rune, which I believe is a 32-bit type, and should work
> fine.

This would just be less efficient than the current solution, given
you'd have to convert everything to a Rune.
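
To make the conversion cost concrete, here is a rough sketch of what a
Rune-array lookup implies when the input is UTF-8 text: every character
has to be decoded before the comparison loop from the quoted proposal
can run. The names decode_utf8 and runestrchr are hypothetical, the
decoder does only minimal validation (no overlong or surrogate checks),
and Rune is assumed to be uint_least32_t.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint_least32_t Rune; /* assumption: an integer of at least 32 bits */

/*
 * Hypothetical minimal decoder: stores the codepoint in *u and returns
 * the number of bytes consumed, or 0 if the sequence is malformed.
 */
static size_t
decode_utf8(const char *s, Rune *u)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t i, len;

	if (p[0] < 0x80) {
		*u = p[0];
		return 1;
	} else if ((p[0] & 0xE0) == 0xC0) {
		*u = p[0] & 0x1F;
		len = 2;
	} else if ((p[0] & 0xF0) == 0xE0) {
		*u = p[0] & 0x0F;
		len = 3;
	} else if ((p[0] & 0xF8) == 0xF0) {
		*u = p[0] & 0x07;
		len = 4;
	} else {
		return 0;
	}

	for (i = 1; i < len; i++) {
		/* a NUL terminator is not a continuation byte, so this
		 * check also stops at the end of a truncated string */
		if ((p[i] & 0xC0) != 0x80)
			return 0;
		*u = (*u << 6) | (p[i] & 0x3F);
	}
	return len;
}

/* The linear search from the quoted proposal, over a 0-terminated array. */
static const Rune *
runestrchr(const Rune *s, Rune u)
{
	for (; *s; s++)
		if (*s == u)
			return s;
	return NULL;
}
```

With a delimiter array such as { ' ', 0x2502, 0 }, a caller would first
run decode_utf8 on each input character and only then call runestrchr.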

Now, to clear it up: A Rune is literally just a codepoint, a typedef
for an integer of at least 32 bits. If we at any point decide to
support grapheme clusters (which can consist of multiple codepoints) in
st, we would have to implement worddelimiters as an array of arrays of

This is why I proposed the offset idea, because you don't have to
juggle codepoints or Runes at runtime. We should in some way
leverage the power UTF-8 gives us in this regard.
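
One way to read "leveraging UTF-8" is sketched below. It is not
necessarily the offset idea referred to above, whose details are not
restated in this message. The sketch keeps worddelimiters as a plain
UTF-8 string and compares encoded byte sequences directly, so nothing
is decoded into a Rune. The helper names utf8seqlen and isdelim_utf8
are hypothetical, and the input character is assumed to be one complete,
valid UTF-8 sequence of len bytes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length of a UTF-8 sequence as announced by its
 * lead byte, or 0 if the byte cannot start a sequence.
 */
static size_t
utf8seqlen(unsigned char c)
{
	if (c < 0x80)
		return 1;
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	if ((c & 0xF8) == 0xF0)
		return 4;
	return 0;
}

/*
 * Returns 1 if the encoded character c (len bytes) occurs in the
 * NUL-terminated UTF-8 string delims, 0 otherwise. The walk moves
 * from lead byte to lead byte, so only whole characters are compared.
 */
static int
isdelim_utf8(const char *delims, const char *c, size_t len)
{
	const char *p = delims;
	size_t rem = strlen(delims), n;

	while (rem > 0) {
		n = utf8seqlen((unsigned char)*p);
		if (n == 0 || n > rem)
			n = 1; /* malformed or truncated: step over one byte */
		if (n == len && memcmp(p, c, len) == 0)
			return 1;
		p += n;
		rem -= n;
	}
	return 0;
}
```

For example, with delims set to " \xe2\x94\x82" (a space and U+2502),
isdelim_utf8(delims, "\xe2\x94\x82", 3) matches on the raw bytes alone.
A grapheme cluster would simply be a longer byte sequence, though this
sketch only compares single encoded characters.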

With best regards


Laslo Hunhold <>

Received on Thu Mar 14 2019 - 11:44:18 CET
