Re: [dev] [libgrapheme] Some questions about libgrapheme

From: Thomas Oltmann <>
Date: Fri, 2 Sep 2022 21:19:46 +0200

Hi atrtarget,

I thought I'd chip in my two cents.

1. Regarding stepping backwards through the graphemes:

As Laslo explained, trying to find the starting point of the previous
grapheme is simply not possible.
In your situation, if scanning from the front of the string is too
inefficient for you, you could try keeping a bitfield in addition to
the string, with one bit for each char of the string. A 1 in the
bitfield means 'this char is the start of a new grapheme', a 0 means
it is not. Every time the string changes, the bitfield is recomputed.
This way, moving the cursor left or right in a text editor is just a
matter of finding the next or previous set bit in the bitfield, which
is extremely cheap.
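
To make the bitfield idea a bit more concrete, here is a rough sketch in C.
All names (breakmap_*, cluster_len_fn, codepoint_len) are made up for this
mail and are not part of libgrapheme. The fill routine takes a callback that
returns the byte length of the first cluster in a buffer; codepoint_len is
only a stand-in that treats every UTF-8 code point as its own cluster, so you
would swap in a real segmentation routine there.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* One bit per byte of the string; bit i set = "a grapheme starts at byte i". */
struct breakmap {
	unsigned char *bits;
	size_t len; /* number of bytes in the string */
};

/* Hypothetical callback shape: byte length of the first cluster in s. */
typedef size_t (*cluster_len_fn)(const char *s, size_t len);

static int breakmap_init(struct breakmap *m, size_t len)
{
	/* +1 so that an empty string still gets a valid allocation */
	m->bits = calloc((len + CHAR_BIT - 1) / CHAR_BIT + 1, 1);
	m->len = len;
	return m->bits ? 0 : -1;
}

static void breakmap_set(struct breakmap *m, size_t i)
{
	m->bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

static int breakmap_get(const struct breakmap *m, size_t i)
{
	return (m->bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

/* Fill a freshly initialised (all-zero) map by walking from the front. */
static void breakmap_fill(struct breakmap *m, const char *s, cluster_len_fn f)
{
	size_t off = 0, n;

	while (off < m->len) {
		breakmap_set(m, off);
		n = f(s + off, m->len - off);
		off += n ? n : 1; /* never stall on a zero return */
	}
}

/* Cursor right: next set bit after pos, or len if there is none. */
static size_t breakmap_next(const struct breakmap *m, size_t pos)
{
	if (pos >= m->len)
		return m->len;
	for (pos++; pos < m->len; pos++)
		if (breakmap_get(m, pos))
			return pos;
	return m->len;
}

/* Cursor left: previous set bit before pos, or 0 if there is none. */
static size_t breakmap_prev(const struct breakmap *m, size_t pos)
{
	if (pos > m->len)
		pos = m->len;
	while (pos > 0)
		if (breakmap_get(m, --pos))
			return pos;
	return 0;
}

/* Stand-in only: one UTF-8 code point per "cluster", NOT real graphemes. */
static size_t codepoint_len(const char *s, size_t len)
{
	size_t n = 1;

	while (n < len && ((unsigned char)s[n] & 0xC0) == 0x80)
		n++;
	return len ? n : 0;
}
```

The linear scans in breakmap_next and breakmap_prev are the simplest thing
that works; for long lines you could skip whole zero bytes at a time, but for
a left/right arrow key I doubt it matters.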

2. Regarding the avoidance of terminal linewrap:

AFAIK there's no proper way to query the display width of a character.
It definitely depends on the font though. I guess the only robust
approach is to render the character on the terminal, and then read
back by how much the cursor was advanced. So perhaps you could try to
render the whole line, detect when a line overflow happens in the
terminal based on the cursor position, and then react accordingly.
It would be interesting to know how (or even if!) other software such
as tmux or vim has solved this issue.
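
For the "read back the cursor" part, a sketch of what I have in mind, in C.
It relies on the ECMA-48 cursor position query (ESC [ 6 n), to which a
terminal replies with ESC [ row ; col R. The function names are made up, and
actually writing the query and reading the reply (raw mode, timeouts) is left
out, since that depends on how your editor talks to the tty.

```c
#include <stdio.h>

/* ECMA-48 "Device Status Report" with parameter 6: ask for cursor position. */
#define CURSOR_QUERY "\x1b[6n"

/*
 * Parse a reply of the form ESC [ row ; col R (both 1-based).
 * Returns 0 on success, -1 if the reply is malformed.
 */
static int parse_cursor_report(const char *reply, int *row, int *col)
{
	int r, c;
	char end;

	if (sscanf(reply, "\x1b[%d;%d%c", &r, &c, &end) != 3 || end != 'R')
		return -1;
	if (r < 1 || c < 1)
		return -1;
	*row = r;
	*col = c;
	return 0;
}

/*
 * Columns the cursor moved between two reports, or -1 if the row changed,
 * which I would take as "the terminal wrapped (or scrolled)".
 */
static int columns_advanced(int row0, int col0, int row1, int col1)
{
	return row0 == row1 ? col1 - col0 : -1;
}
```

So the loop would be: query, print one grapheme, query again, compare. One
caveat I am not sure about: I believe many terminals defer the wrap when a
character lands in the last column, so the row may only change on the
following character. That would need testing on the terminals you care about.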


On Fri, Sep 2, 2022 at 7:08 PM <> wrote:
> Thank you a lot for spending some time answering!
> > The problem with this heuristic is that the algorithm can become very
> > inefficient, especially when you have long preceding segments. If n is
> > the offset-length, the worst-case runtime could be O((n-1)!) for a
> > segment that is in fact of length n-1, because of the single backsteps
> > it has to take.
> Quite inefficient really, but I guess it's fine since my usage would be
> only user input (left arrow)
> > The proper way to solve the column-problem is to render each grapheme
> > cluster and see how wide the font-rendering-library renders it, given
> > it depends on the font. I know that this isn't satisfactory, but that's
> > how it is.
> In the case of a terminal would this mean asking for the position of the
> cursor after every character I print? My usage would be to avoid
> terminal-induced soft-wraps in a text editor.
> Anyway, thanks again for the help!
Received on Fri Sep 02 2022 - 21:19:46 CEST
