Re: [dev] [libgrapheme] Some questions about libgrapheme
 
On Fri, Sep 02, 2022 at 02:08:03PM -0300, atrtarget_AT_cock.li wrote:
> Quite inefficient really, but I guess it's fine since my usage would be
> only user input (left arrow)
If efficiency is not a concern, you can use something like this (just a
quick prototype; I haven't verified that it's correct):
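        #include <assert.h>
        #include <stddef.h>

        #include <grapheme.h>

        /*
         * Returns the offset into `s` of the last grapheme cluster that
         * begins before byte offset `off` (0 if `off` is 0).
         */
        static size_t
        prev_char_offset(const char *s, size_t slen, size_t off)
        {
                assert(s != NULL);
                assert(slen > 0);
                assert(off <= slen);

                size_t ret = 0;
                const char *const end = s + slen;
                while (s < end) {
                        /* byte length of the cluster starting at `s` */
                        size_t n = grapheme_next_character_break_utf8(s, (size_t)(end - s));
                        if (ret + n >= off)
                                return ret;
                        ret += n;
                        s += n;
                }
                return 0; /* not reached if off <= slen and n > 0 */
        }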
If I were expecting a decent amount of non-ASCII input, I would use the
bitvector approach described by Thomas Oltmann. An overhead of 1 bit per
byte should be fine for most use cases.
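
For illustration, here is a rough sketch of what such a bitvector could
look like. The details are my assumption, not necessarily what Thomas
described: bit i is set when a cluster starts at byte i, so stepping left
is a short backwards scan. The names break_fn, codepoint_break,
breaks_build and breaks_prev are made up for this example, and
codepoint_break is only a stand-in that splits on UTF-8 code points so the
sketch builds without libgrapheme.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Same shape as grapheme_next_character_break_utf8(). */
typedef size_t (*break_fn)(const char *, size_t);

/*
 * Stand-in segmenter (hypothetical): splits on UTF-8 code points,
 * NOT on grapheme clusters. Returns the byte length of the first unit.
 */
static size_t
codepoint_break(const char *s, size_t len)
{
        size_t n = (len > 0);
        /* swallow continuation bytes (10xxxxxx) */
        while (n < len && ((unsigned char)s[n] & 0xC0) == 0x80)
                n++;
        return n;
}

/*
 * Builds a bitmap in which bit i is set iff a cluster starts at byte i.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static unsigned char *
breaks_build(const char *s, size_t len, break_fn next_break)
{
        unsigned char *bits = calloc(len / CHAR_BIT + 1, 1);
        if (bits == NULL)
                return NULL;
        for (size_t off = 0; off < len; ) {
                bits[off / CHAR_BIT] |= 1u << (off % CHAR_BIT);
                size_t n = next_break(s + off, len - off);
                off += (n > 0) ? n : 1; /* guard against a zero step */
        }
        return bits;
}

/* Offset of the nearest cluster start strictly before `off`, or 0. */
static size_t
breaks_prev(const unsigned char *bits, size_t off)
{
        while (off > 0) {
                off--;
                if (bits[off / CHAR_BIT] & (1u << (off % CHAR_BIT)))
                        return off;
        }
        return 0;
}
```

With libgrapheme you would pass grapheme_next_character_break_utf8 as the
third argument to breaks_build() instead of the stand-in. The bitmap has
to be rebuilt (or patched) whenever the text changes, which is the price
for making the left-arrow lookup cheap.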
- NRK
Received on Fri Sep 02 2022 - 22:21:45 CEST