Re: [hackers] [PATCH 1/1] paste: Support -d '\0'

From: Michael Forney <mforney_AT_mforney.org>
Date: Sat, 21 Mar 2020 14:43:35 -0700

On 2020-03-09, Richard Ipsum <richardipsum_AT_vx21.xyz> wrote:
> POSIX specifies that -d '\0' sets the delimiter to an empty string.

Hi Richard,

Sorry for the delay on the review. This mostly looks good. Just a few
questions/comments.


> diff --git a/libutf/utf.c b/libutf/utf.c
> index 897c5ef..cf46e57 100644
> --- a/libutf/utf.c
> +++ b/libutf/utf.c
> @@ -62,6 +62,18 @@ utfnlen(const char *s, size_t len)
> return i;
> }
>
> +size_t
> +utfmemlen(const char *s, size_t len)
> +{
> + const char *p = s;
> + size_t i;
> + Rune r;
> +
> + for(i = 0; p - s < len; i++)
> + p += chartorune(&r, p);
> + return i;
> +}
> +
> char *
> utfrune(const char *s, Rune r)
> {
> diff --git a/libutf/utftorunestr.c b/libutf/utftorunestr.c
> index 005fe8a..5da9d5f 100644
> --- a/libutf/utftorunestr.c
> +++ b/libutf/utftorunestr.c
> @@ -11,3 +11,15 @@ utftorunestr(const char *str, Rune *r)
>
> return i;
> }
> +
> +int
> +utfntorunestr(const char *str, size_t len, Rune *r)
> +{
> + int i, n;
> + const char *p = str;
> +
> + for(i = 0; (n = chartorune(&r[i], p)) && p - str < len; i++)
> + p += n;
> +
> + return i;
> +}

I have a slight concern here (and in utfmemlen): chartorune takes no
length argument, so if the string ends with a partial UTF-8 sequence,
or if len == 0, we may read past the end of the buffer. Perhaps we
should use charntorune here?
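
For illustration, a bounded loop could look roughly like the sketch
below. It is not the patch's code: seqlen_bounded and count_bounded are
hypothetical names, and seqlen_bounded is only a stand-in for a
length-aware decoder such as charntorune, assumed here to return 0 when
the remaining bytes cannot hold a complete sequence. It inspects the
lead byte only and does not validate continuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a length-aware decoder: return the byte
 * length of the UTF-8 sequence starting at s, or 0 if fewer than that
 * many bytes remain. Only s[0] is ever read.
 */
static size_t
seqlen_bounded(const char *s, size_t len)
{
	unsigned char c;
	size_t n;

	if (len == 0)
		return 0;	/* cannot even look at s[0] */
	c = (unsigned char)s[0];
	if (c < 0x80)
		n = 1;		/* ASCII, including an embedded NUL */
	else if ((c & 0xE0) == 0xC0)
		n = 2;
	else if ((c & 0xF0) == 0xE0)
		n = 3;
	else if ((c & 0xF8) == 0xF0)
		n = 4;
	else
		n = 1;		/* invalid lead byte: consume one byte */
	return n <= len ? n : 0;
}

/* Count the sequences in s[0..len) without reading past s + len. */
size_t
count_bounded(const char *s, size_t len)
{
	size_t i, off, n;

	assert(s != NULL);
	for (i = 0, off = 0; (n = seqlen_bounded(s + off, len - off)) > 0; i++)
		off += n;
	return i;
}
```

With this shape, len == 0 returns 0 without touching the buffer, and a
trailing partial sequence ends the loop instead of being decoded past
the end.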

> diff --git a/libutil/unescape.c b/libutil/unescape.c
> index d8ed2a2..deca948 100644
> --- a/libutil/unescape.c
> +++ b/libutil/unescape.c
> @@ -21,7 +21,8 @@ unescape(char *s)
> ['n'] = '\n',
> ['r'] = '\r',
> ['t'] = '\t',
> - ['v'] = '\v'
> + ['v'] = '\v',
> + ['0'] = '\0'

I think this is not necessary. It should be handled by the octal
escape handling below.
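
To illustrate why a table entry should not be needed, here is a rough
sketch of generic octal escape parsing. It is not the code in
unescape.c: parse_octal is a hypothetical helper, and the three-digit
limit and the clamp to 255 are assumptions. It is meant to be called on
the text that follows the backslash, so the escape \0 reaches it as the
single digit 0.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse up to three octal digits at s, store the
 * resulting byte in *out, and return the number of digits consumed.
 * Returns 0 and leaves *out untouched if s does not start with an
 * octal digit.
 */
size_t
parse_octal(const char *s, unsigned char *out)
{
	unsigned int v = 0;
	size_t n = 0;

	assert(s != NULL && out != NULL);
	while (n < 3 && s[n] >= '0' && s[n] <= '7') {
		v = v * 8 + (unsigned int)(s[n] - '0');
		n++;
	}
	if (n > 0)
		*out = (unsigned char)(v > 255 ? 255 : v);	/* clamp to one byte */
	return n;
}
```

Under these assumptions a lone 0 parses as one digit with value 0, the
same result the extra ['0'] entry would give.
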
Received on Sat Mar 21 2020 - 22:43:35 CET
