Re: [hackers] [lchat] use libgrapheme instead of libutf || Jan Klemkow

From: Jan Klemkow <j.klemkow_AT_wemelug.de>
Date: Mon, 3 Oct 2022 00:27:33 +0200

Hi Laslo,

On Sun, Oct 02, 2022 at 02:37:12AM +0200, Laslo Hunhold wrote:
> On Sun, 2 Oct 2022 02:01:34 +0200 (CEST)
> git_AT_suckless.org wrote:
>
> > commit dbc8751dc6c034967d2b3133a58a627834992e8c
> > Author: Jan Klemkow <j.klemkow_AT_wemelug.de>
> > AuthorDate: Sun Oct 2 00:59:19 2022 +0200
> > Commit: Jan Klemkow <j.klemkow_AT_wemelug.de>
> > CommitDate: Sun Oct 2 01:00:03 2022 +0200
> >
> > use libgrapheme instead of libutf
>
> thanks for putting forward the trust and using libgrapheme for your
> application!

This task was on my list for some time. I was just too lazy to do it
until now :)

> I am currently in the process of heavy refactorization in preparation
> of version 2 (I want to put the code on much more formally-verifiable
> footing), but version 1 is generally stable and there are no known
> bugs.

I just ported libgrapheme-1 to OpenBSD. I already have an OK to commit
it after the 7.2 release [1]. Thus, libgrapheme will be available in
OpenBSD 7.3. If you release a newer version, I will update the port.

[1]: https://marc.info/?l=openbsd-ports&m=166409311518680&w=2

> Until now, I refactored the case-, character- and line-functions and
> they are working perfectly, which the unit-tests reflect. The word- and
> sentence-functions have more complex state-handling that requires me to
> think of a fitting data-structure, and I know of some edge-cases where
> they might fail (e.g. NUL-terminated strings) given the iffy
> index-jiggling.
>
> But given you're only using character-break-checks, you're safe. There
> will however be an API-change with version 2 where
>
> grapheme_next_character_break()
>
> is renamed to
>
> grapheme_next_character_break_utf8().
>
> I know that such changes are always a bad thing and I gave it a lot of
> thought, but it's better to change now, where only very few projects
> use the library, instead of having to carry this as legacy cruft into
> the future.

I don't worry about an API change. But why do you make the function
names so long? And why add the "_utf8" suffix? Function names in C are
generally much shorter. For instance, grph_nxt_char_brk() would be
handier to use.
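
Either way, a caller can keep the rename down to a one-line change by
hiding the library name behind a local macro. The following is only a
minimal sketch: next_break(), stub_next_break() and count_segments() are
hypothetical names, not part of lchat or libgrapheme. It assumes that
both library functions take a byte buffer plus its length and return the
byte offset of the next break (check grapheme.h for the exact
prototypes). The stub only skips one UTF-8 code point so the sketch
builds without the library; it does not implement real grapheme cluster
rules.

```c
#include <stddef.h>

/*
 * Stand-in for the library call (assumption, for illustration only).
 * It skips exactly one UTF-8 code point and does NOT implement Unicode
 * grapheme cluster segmentation. With the real library, drop this
 * function and include <grapheme.h> instead.
 */
static size_t
stub_next_break(const char *str, size_t len)
{
	size_t i;

	if (len == 0)
		return 0;
	/* skip the lead byte, then any continuation bytes (10xxxxxx) */
	for (i = 1; i < len && ((unsigned char)str[i] & 0xC0) == 0x80; i++)
		;
	return i;
}

/*
 * Single point of change: all callers use next_break(). Point it at
 * grapheme_next_character_break() for version 1 or at
 * grapheme_next_character_break_utf8() for version 2.
 */
#define next_break(s, n) stub_next_break((s), (n))

/* Count the segments that next_break() reports in a byte buffer. */
static size_t
count_segments(const char *str, size_t len)
{
	size_t off, step, n = 0;

	for (off = 0; off < len; off += step) {
		step = next_break(str + off, len - off);
		if (step == 0)
			break;	/* defensive: never loop forever */
		n++;
	}
	return n;
}
```

With this in place, only the #define line has to change when the
version 2 name arrives; the loop in count_segments() stays untouched.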

bye,
Jan
Received on Mon Oct 03 2022 - 00:27:33 CEST
