Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

From: FRIGN <dev_AT_frign.de>
Date: Sat, 10 Jan 2015 00:39:16 +0100

On Fri, 09 Jan 2015 18:24:46 -0500
random832_AT_fastmail.us wrote:

> Even if octal values could be more than three digits, I have no idea
> what you think 50102 is. Its decimal value is 20546. Its hex value is
> 0x5042. I have no idea what it has to do with character U+00F6 whose
> UTF-8 representation is 0xC3 0xB6..... I just realized what you're
> doing, 0xC3B6 has the _decimal_ value 50102, I have no idea why you
> would think _that_ is a representation people would want to use. If
> you're so pro-unicode, make it accept \u00F6 - that's a valid extension.
> But reusing the syntax POSIX uses for three-digit octal literals, for
> arbitrarily long decimal literals that aren't even unicode code points,
> makes no sense at all. In what universe is that intuitive?

0xC3B6 is the UTF-8 encoding of 'ö', so it makes sense to allow
specifying it as \50102 (in the pure UTF-8 sense, of course; this has
nothing to do with collating).
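
For reference, the numbers in this exchange can be checked with a few
lines of Python. This is only a sketch of the arithmetic, not anything
taken from the patch; the variable names are made up for illustration.

```python
# Illustration only: relate 'ö' to the values discussed in this thread.
ch = "ö"

code_point = ord(ch)                          # Unicode code point U+00F6
utf8 = ch.encode("utf-8")                     # UTF-8 bytes 0xC3 0xB6
utf8_as_int = int.from_bytes(utf8, "big")     # the two bytes read as one integer
octal_reading = int("50102", 8)               # "50102" read as an octal literal

print(hex(code_point))      # 0xf6
print(utf8.hex())           # c3b6
print(utf8_as_int)          # 50102
print(octal_reading)        # 20546
print(hex(octal_reading))   # 0x5042
```

So \50102 is the decimal value of the UTF-8 byte sequence read as one
big-endian integer, while the same digits read as octal give 20546
(0x5042), which is the reading the quoted mail objects to.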

> Collating elements = POSIX forbids them = You don't want them anyway.
> Multibyte characters = POSIX allows/requires them = You like them too.
> What is the problem?
> I don't know what you want to do that you think POSIX doesn't allow.

Well, I probably misunderstood the matter. Sometimes this stuff goes
over my head. ;)
At the end of the day, you want software to work as expected:

GNU tr:
$ echo ελληνική | tr [α-ω] [Α-Ω]
®®®®®®®®®

our tr:
$ echo ελληνικη | ./tr [α-ω] [Α-Ω]
ΕΛΛΗΝΙΚΗ
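
The behaviour shown for our tr is what you get when ranges are expanded
per code point rather than per byte. A minimal sketch of that idea in
Python follows; it is not the sbase implementation, and translate_range
is a hypothetical helper invented for this example.

```python
# Sketch: translate a code-point range onto another range of equal length.
# translate_range is a made-up helper, not part of sbase or any library.
def translate_range(text, src_lo, src_hi, dst_lo, dst_hi):
    lo, hi = ord(src_lo), ord(src_hi)
    dlo, dhi = ord(dst_lo), ord(dst_hi)
    if hi - lo != dhi - dlo:
        raise ValueError("ranges must have equal length")
    # Each code point in [lo, hi] maps to the same offset in [dlo, dhi].
    table = {cp: dlo + (cp - lo) for cp in range(lo, hi + 1)}
    return text.translate(table)

result = translate_range("ελληνικη", "α", "ω", "Α", "Ω")
print(result)
```

Note that a plain offset mapping like this leaves code points outside
the range untouched (for example the accented ή, U+03AE), and it maps
the final sigma ς (U+03C2) to U+03A2, which is unassigned.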

Cheers

FRIGN

-- 
FRIGN <dev_AT_frign.de>
Received on Sat Jan 10 2015 - 00:39:16 CET