[hackers] [libgrapheme] Add a remark on standard conformance in README || Laslo Hunhold

From: <git_AT_suckless.org>
Date: Wed, 22 Dec 2021 15:20:49 +0100 (CET)

commit 42e58c7d3a921540f5d901b80a0cc75e234b02e9
Author: Laslo Hunhold <dev_AT_frign.de>
AuthorDate: Wed Dec 22 15:20:27 2021 +0100
Commit: Laslo Hunhold <dev_AT_frign.de>
CommitDate: Wed Dec 22 15:20:27 2021 +0100

    Add a remark on standard conformance in README
    
    Signed-off-by: Laslo Hunhold <dev_AT_frign.de>

diff --git a/README b/README
index 3b82a29..4e6ee44 100644
--- a/README
+++ b/README
@@ -7,6 +7,13 @@ up of user-perceived characters (so-called "grapheme clusters") that are
 made up of one or more Unicode codepoints, which in turn are encoded in
 one or more bytes in an encoding like UTF-8.
 
+There is a widespread misconception that it is enough to simply
+determine codepoints in a string and treat them as user-perceived
+characters to be Unicode compliant. While this may work in some cases,
+this assumption quickly breaks, especially for non-Western languages and
+decomposed Unicode strings where user-perceived characters are usually
+represented using multiple codepoints.
+
 Despite the complicated multilevel structure of Unicode strings,
 libgrapheme provides methods to work with them at the byte-level (i.e.
 UTF-8 ‘char’ arrays) while also providing codepoint-level methods.
@@ -28,6 +35,19 @@ Afterwards enter the following command to build and install libgrapheme
 
         make install
 
+Conformance
+-----------
+The libgrapheme library is compliant with the Unicode 14.0.0
+specification (September 2021).
+
+To ensure conformance, libgrapheme includes hundreds of tests, among
+them all tests from the test data provided with the standard, which is
+parsed automatically. The tests can be run with
+
+ make test
+
+to check standard conformance.
+
 Usage
 -----
 Include the header grapheme.h in your code and link against libgrapheme
diff --git a/man/grapheme_decode_utf8.3 b/man/grapheme_decode_utf8.3
index 2536e72..0ca91eb 100644
--- a/man/grapheme_decode_utf8.3
+++ b/man/grapheme_decode_utf8.3
@@ -1,4 +1,4 @@
-.Dd 2021-12-19
+.Dd 2021-12-22
 .Dt GRAPHEME_DECODE_UTF8 3
 .Os suckless.org
 .Sh NAME
diff --git a/man/grapheme_encode_utf8.3 b/man/grapheme_encode_utf8.3
index 5e51ac2..cf90c5b 100644
--- a/man/grapheme_encode_utf8.3
+++ b/man/grapheme_encode_utf8.3
@@ -1,4 +1,4 @@
-.Dd 2021-12-17
+.Dd 2021-12-22
 .Dt GRAPHEME_ENCODE_UTF8 3
 .Os suckless.org
 .Sh NAME
diff --git a/man/grapheme_is_character_break.3 b/man/grapheme_is_character_break.3
index 507842c..f50eee3 100644
--- a/man/grapheme_is_character_break.3
+++ b/man/grapheme_is_character_break.3
@@ -1,4 +1,4 @@
-.Dd 2021-12-18
+.Dd 2021-12-22
 .Dt GRAPHEME_IS_CHARACTER_BREAK 3
 .Os suckless.org
 .Sh NAME
diff --git a/man/grapheme_next_character_break.3 b/man/grapheme_next_character_break.3
index 962b2ce..9e0245b 100644
--- a/man/grapheme_next_character_break.3
+++ b/man/grapheme_next_character_break.3
@@ -1,4 +1,4 @@
-.Dd 2021-12-18
+.Dd 2021-12-22
 .Dt GRAPHEME_NEXT_CHARACTER_BREAK 3
 .Os suckless.org
 .Sh NAME
diff --git a/man/libgrapheme.7 b/man/libgrapheme.7
index 2d33112..5d96e49 100644
--- a/man/libgrapheme.7
+++ b/man/libgrapheme.7
@@ -1,4 +1,4 @@
-.Dd 2021-12-19
+.Dd 2021-12-22
 .Dt LIBGRAPHEME 7
 .Os suckless.org
 .Sh NAME
@@ -18,11 +18,22 @@ see
 that are made up of one or more Unicode codepoints, which in turn
 are encoded in one or more bytes in an encoding like UTF-8.
 .Pp
+There is a widespread misconception that it is enough to simply
+determine codepoints in a string and treat them as user-perceived
+characters to be Unicode compliant.
+While this may work in some cases, this assumption quickly breaks,
+especially for non-Western languages and decomposed Unicode strings
+where user-perceived characters are usually represented using multiple
+codepoints.
+.Pp
 Despite this complicated multilevel structure of Unicode strings,
 .Nm
 provides methods to work with them at the byte-level (i.e. UTF-8
 .Sq char
 arrays) while also offering codepoint-level methods.
+.Pp
+Every documented function's manual page provides a self-contained
+example illustrating the possible usage.
 .Sh SEE ALSO
 .Xr grapheme_decode_utf8 3 ,
 .Xr grapheme_encode_utf8 3 ,
Received on Wed Dec 22 2021 - 15:20:49 CET
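
The README remark in this commit, that codepoints cannot simply be treated as user-perceived characters, can be illustrated with a minimal sketch. It deliberately does not use the libgrapheme API. The helper count_codepoints() and the two sample strings are hypothetical names introduced only for this illustration, and the helper assumes well-formed UTF-8 input.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of libgrapheme: count the codepoints
 * in a well-formed UTF-8 string by counting every byte that is not a
 * continuation byte (bit pattern 10xxxxxx).
 */
size_t
count_codepoints(const char *str, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		if (((unsigned char)str[i] & 0xC0) != 0x80) {
			n++;
		}
	}

	return n;
}

/* precomposed form: U+00E9 (e with acute), encoded in 2 bytes */
const char precomposed[] = "\xC3\xA9";

/* decomposed form: U+0065 (e) followed by U+0301 (combining acute), 3 bytes */
const char decomposed[] = "e\xCC\x81";
```

Both strings are displayed as one and the same user-perceived character, yet count_codepoints() yields 1 for the precomposed form and 2 for the decomposed form. Code that treats each codepoint as a character would therefore miscount the second string, which is the case the README remark warns about.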
