Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

From: <random832_AT_fastmail.us>
Date: Fri, 09 Jan 2015 17:55:04 -0500

On Fri, Jan 9, 2015, at 17:48, FRIGN wrote:
> Did you read what I said? I explicitly went away from POSIX in this
> regard, because no human would write "tr '\303\266o' 'o\303\266'".

POSIX doesn't require people to write it; it just requires that it
works. POSIX has no problem with also allowing a literally typed
multibyte character to refer to itself. It's basically saying that if
someone _does_ write '\303\266o' 'o\303\266', you have to treat it the
same as 'öo' 'oö', and not as the individual bytes.
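
To illustrate the behaviour described above, here is a minimal Python
sketch (not sbase code, and not a full tr(1)). The names parse_operand
and tr are made up for this example, and it assumes a UTF-8 locale and
only handles plain characters and \NNN octal escapes. The point is the
order of operations: escapes are expanded to bytes first, and only then
is the whole operand decoded into characters.

```python
import re

# Hypothetical helper: expand \NNN octal escapes to raw bytes, then
# decode the complete operand as UTF-8 (assumption: UTF-8 locale).
def parse_operand(operand: str) -> str:
    raw = bytearray()
    i = 0
    while i < len(operand):
        m = re.match(r"\\([0-7]{1,3})", operand[i:])
        if m:
            # One escape yields one byte; values above 0o377 raise
            # ValueError in this sketch.
            raw.append(int(m.group(1), 8))
            i += m.end()
        else:
            # A literally typed character contributes its own bytes.
            raw.extend(operand[i].encode("utf-8"))
            i += 1
    # Adjacent bytes such as 0xC3 0xB6 become one character here.
    return raw.decode("utf-8")

# Hypothetical, simplified tr: maps character N of set1 to character N
# of set2. Ranges, classes and set padding are deliberately left out.
def tr(text: str, set1: str, set2: str) -> str:
    s1 = parse_operand(set1)
    s2 = parse_operand(set2)
    table = {ord(a): b for a, b in zip(s1, s2)}
    return text.translate(table)
```

With this, tr("öo", r"\303\266o", r"o\303\266") and tr("öo", "öo", "oö")
give the same result, because both spellings parse to the same two
characters.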

> The reason why POSIX prohibits collating elements is only because they
> are inhibited by their own overload of different character sets and
> locales. Given assuming a UTF-8-locale is a very sane way to go (see
> Plan 9), this limit can easily be thrown off and makes life easier.

I don't think you understand the difference between multi-character
collating elements and multibyte characters.

Multi-character collating elements are things like "ch" in some Spanish
locales. They have nothing to do with UTF-8.
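
A small Python sketch of the distinction, under stated assumptions: the
function collating_elements and its default table ("ch",) are
hypothetical stand-ins for what a locale's collation data would
provide, not a real locale API.

```python
# A multibyte character: ONE character that happens to need two bytes
# in UTF-8.
o_umlaut = "ö"
char_count = len(o_umlaut)                  # 1 character
byte_count = len(o_umlaut.encode("utf-8"))  # 2 bytes

# A multi-character collating element: TWO characters that a locale
# treats as one unit for sorting. The encoding plays no role.
def collating_elements(word, multi=("ch",)):
    """Split word into collating elements (hypothetical table 'multi')."""
    out = []
    i = 0
    while i < len(word):
        for m in multi:
            if word.startswith(m, i):
                out.append(m)
                i += len(m)
                break
        else:
            # No multi-character element starts here: take one character.
            out.append(word[i])
            i += 1
    return out
```

Here collating_elements("chico") yields "ch", "i", "c", "o": four
elements from five characters, all of them single-byte in UTF-8.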
Received on Fri Jan 09 2015 - 23:55:04 CET
