Re: [hackers] [sbase] [PATCH] tr: Fix multiple ranges with different lengths

From: Michael Forney <>
Date: Sat, 22 Oct 2016 11:16:13 -0700

On 10/22/16, Evan Gates <> wrote:
> On Oct 22, 2016 02:41, "Michael Forney" <> wrote:
>> This also fixes range expressions in the form [a-z], which get encoded as four
>> ranges '[', 'a'..'z', ']', causing all a-z characters to get mapped to ']'. This
>> form is occasionally used in shell scripts, including the script
>> used to build linux.
> Can you provide an example of what you mean? Brackets are not special
> unless they are used for character classes or equivalence classes. An
> argument of the form '[a-z]' means a set containing characters left
> bracket, a to z, right bracket, and almost certainly the author meant just
> 'a-z'. Argument '[:lower:]' is a character class. There is often a
> misconception that tr takes opening and closing brackets due to their use
> in regex, but the arguments to tr are supposed to only be what would go
> inside those brackets in the regex, the brackets themselves should be
> omitted.

POSIX explains this:

On historical System V systems, a range expression requires enclosing
square-brackets, such as:

  tr '[a-z]' '[A-Z]'

However, BSD-based systems did not require the brackets, and this
convention is used here to avoid breaking large numbers of BSD scripts:

  tr a-z A-Z

The preceding System V script will continue to work because the
brackets, treated as regular characters, are translated to themselves.
However, any System V script that relied on "a-z" representing the
three characters 'a', '-', and 'z' have to be rewritten as "az-".

The fact that the brackets are not special when they aren't used for a
character or equivalence class is precisely what allows the legacy
uses to continue to work.
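
A quick way to check this behaviour is sketched below. It is only an
illustration, not taken from POSIX or from tr.c: the sample strings and
variable names are made up, and it assumes a tr that conforms to the
text quoted above. LC_ALL=C is set so that a-z covers just the ASCII
lowercase letters.

```shell
# Sketch, assuming a POSIX-conforming tr. Sample input is hypothetical.
input='hello [world]'

# System V spelling: '[' and ']' are ordinary characters that map to themselves.
sysv=$(printf '%s\n' "$input" | LC_ALL=C tr '[a-z]' '[A-Z]')

# BSD/POSIX spelling: no brackets needed.
bsd=$(printf '%s\n' "$input" | LC_ALL=C tr a-z A-Z)

# To translate the literal characters 'a', '-' and 'z', put '-' last ("az-").
literal=$(printf '%s\n' 'a-z m' | LC_ALL=C tr 'az-' 'AZ_')

printf '%s\n' "$sysv" "$bsd" "$literal"
```

With a conforming tr, the first two results should be identical
(letters uppercased, brackets untouched), and the third should change
only the 'a', '-' and 'z' characters.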

The bracket form is also specifically mentioned in a tr.c comment.
Received on Sat Oct 22 2016 - 20:16:13 CEST
