[hackers] [libgrapheme] Make lg_utf8_*() NULL-agnostic || Laslo Hunhold
commit 08b2c8e4e5222c04f3304595720d195a98ac7e8a
Author: Laslo Hunhold <dev_AT_frign.de>
AuthorDate: Tue Dec 14 14:06:23 2021 +0100
Commit: Laslo Hunhold <dev_AT_frign.de>
CommitDate: Tue Dec 14 14:06:23 2021 +0100
Make lg_utf8_*() NULL-agnostic
The special cases of NULL buffers and allocated zero-length buffers
(malloc(0) does not necessarily return NULL!) can be gracefully
handled:
    lg_grapheme_nextbreak(NULL) -> 0
    lg_grapheme_isbreak(cp1, cp2, NULL) -> run without state
    lg_utf8_decode(NULL, 0, &cp) -> 0, cp=invalid (we consumed nothing
        and the cp is invalid)
    lg_utf8_encode(cp, NULL, 0) -> number of bytes needed (good for a
        dry-run!)
While the lg_grapheme_*() functions already handled these cases well,
this commit amends the lg_utf8_*() functions to do the same.
Signed-off-by: Laslo Hunhold <dev_AT_frign.de>
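
The dry-run mentioned in the commit message could be used roughly as
sketched below. This is only an illustration: sketch_utf8_encode() is a
hypothetical stand-in written for this example, not libgrapheme's
implementation, and encode_alloc() is an invented helper. The stand-in
only mimics the contract described above (return the number of bytes
needed, write nothing when the buffer is NULL or too small); replacing
invalid codepoints with U+FFFD is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in, NOT libgrapheme's code: encodes cp as UTF-8
 * into s (capacity n) and returns the number of bytes the sequence
 * needs. Nothing is written if s is NULL or n is too small.
 */
static size_t
sketch_utf8_encode(uint_least32_t cp, uint8_t *s, size_t n)
{
	static const uint8_t lead[] = { 0x00, 0x00, 0xC0, 0xE0, 0xF0 };
	size_t len, i;

	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
		cp = 0xFFFD; /* assumption: invalid input becomes U+FFFD */
	}
	len = (cp < 0x80) ? 1 : (cp < 0x800) ? 2 : (cp < 0x10000) ? 3 : 4;

	if (s == NULL || len > n) {
		return len; /* dry-run or buffer too small */
	}
	if (len == 1) {
		s[0] = (uint8_t)cp;
		return len;
	}
	/* continuation bytes, filled from the last one backwards */
	for (i = len - 1; i > 0; i--) {
		s[i] = (uint8_t)(0x80 | (cp & 0x3F));
		cp >>= 6;
	}
	/* lead byte: length marker plus the remaining high bits */
	s[0] = (uint8_t)(lead[len] | cp);

	return len;
}

/* Hypothetical helper: size the buffer with a dry-run, then encode. */
static uint8_t *
encode_alloc(uint_least32_t cp, size_t *len)
{
	uint8_t *buf;

	*len = sketch_utf8_encode(cp, NULL, 0); /* dry-run: size only */
	if ((buf = malloc(*len)) == NULL) {
		return NULL;
	}
	sketch_utf8_encode(cp, buf, *len);

	return buf;
}
```

The caller owns the returned buffer and has to free() it. The same
two-call pattern should carry over to lg_utf8_encode() as changed by
this commit, given the behaviour described in the commit message.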
diff --git a/src/utf8.c b/src/utf8.c
index fe75eaa..b21c920 100644
--- a/src/utf8.c
+++ b/src/utf8.c
@@ -52,10 +52,10 @@ lg_utf8_decode(const uint8_t *s, size_t n, uint_least32_t *cp)
 {
 	size_t off, i;
-	if (n == 0) {
+	if (s == NULL || n == 0) {
 		/* a sequence must be at least 1 byte long */
 		*cp = LG_CODEPOINT_INVALID;
-		return 1;
+		return 0;
 	}
 	/* identify sequence type with the first byte */
@@ -145,8 +145,12 @@ lg_utf8_encode(uint_least32_t cp, uint8_t *s, size_t n)
 			break;
 		}
 	}
-	if (1 + off > n) {
-		/* specified buffer is too small to store sequence */
+	if (1 + off > n || s == NULL || n == 0) {
+		/*
+		 * specified buffer is too small to store sequence or
+		 * the caller just wanted to know how many bytes the
+		 * codepoint needs by passing a NULL-buffer.
+		 */
 		return 1 + off;
 	}
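
To illustrate why the decode hunk returns 0 instead of 1 for empty
input, here is a minimal sketch of a loop that walks a buffer by the
number of bytes consumed. sketch_utf8_decode() is a hypothetical,
simplified stand-in (no overlong or surrogate checks), not
libgrapheme's decoder; SKETCH_CODEPOINT_INVALID and count_codepoints()
are likewise invented for this example, and the value 0xFFFD is an
assumption. In a loop written like this one, a return of 0 ends the
iteration cleanly, whereas a return of 1 for empty input would move the
offset past the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CODEPOINT_INVALID UINT32_C(0xFFFD) /* assumed value */

/*
 * Hypothetical, simplified stand-in (NOT libgrapheme's decoder).
 * Returns the number of bytes consumed; 0 means there was no input.
 */
static size_t
sketch_utf8_decode(const uint8_t *s, size_t n, uint_least32_t *cp)
{
	size_t len, i;

	if (s == NULL || n == 0) {
		*cp = SKETCH_CODEPOINT_INVALID;
		return 0; /* nothing consumed */
	}
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xE0) == 0xC0) {
		len = 2;
	} else if ((s[0] & 0xF0) == 0xE0) {
		len = 3;
	} else if ((s[0] & 0xF8) == 0xF0) {
		len = 4;
	} else {
		*cp = SKETCH_CODEPOINT_INVALID;
		return 1; /* stray continuation or illegal lead byte */
	}

	/* payload bits of the lead byte, then 6 bits per continuation */
	*cp = s[0] & (0xFF >> (len + 1));
	for (i = 1; i < len; i++) {
		if (i >= n || (s[i] & 0xC0) != 0x80) {
			*cp = SKETCH_CODEPOINT_INVALID;
			return i; /* truncated or broken sequence */
		}
		*cp = (*cp << 6) | (s[i] & 0x3F);
	}

	return len;
}

/* Count decoded codepoints; stops once the decoder consumes nothing. */
static size_t
count_codepoints(const uint8_t *s, size_t n)
{
	uint_least32_t cp;
	size_t off = 0, ret, count = 0;

	while ((ret = sketch_utf8_decode(s ? s + off : NULL,
	                                 n - off, &cp)) > 0) {
		off += ret;
		count++;
	}

	return count;
}
```

With this contract, callers do not need a separate NULL or zero-length
check before entering the loop.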
Received on Tue Dec 14 2021 - 15:13:00 CET