Re: [hackers] [sbase] [PATCH] tr: Fix multiple ranges with different lengths
On 10/22/16, Evan Gates <evan.gates_AT_gmail.com> wrote:
> On Oct 22, 2016 02:41, "Michael Forney" <mforney_AT_mforney.org> wrote:
>> This also fixes range expressions in the form [a-z], which get encoded as four
>> ranges '[', 'a'..'z', ']', causing all a-z characters to get mapped to ']'. This
>> form is occasionally used in shell scripts, including the syscalltbl.sh script
>> used to build linux.
>
> Can you provide an example of what you mean? Brackets are not special
> unless they are used for character classes or equivalence classes. An
> argument of the form '[a-z]' means a set containing characters left
> bracket, a to z, right bracket, and almost certainly the author meant just
> 'a-z'. Argument '[:lower:]' is a character class. There is often a
> misconception that tr takes opening and closing brackets due to their use
> in regex, but the arguments to tr are supposed to only be what would go
> inside those brackets in the regex, the brackets themselves should be
> omitted.
POSIX explains this:
"""
On historical System V systems, a range expression requires enclosing
square-brackets, such as:
tr '[a-z]' '[A-Z]'
However, BSD-based systems did not require the brackets, and this
convention is used here to avoid breaking large numbers of BSD
scripts:
tr a-z A-Z
The preceding System V script will continue to work because the
brackets, treated as regular characters, are translated to themselves.
However, any System V script that relied on "a-z" representing the
three characters 'a', '-', and 'z' have to be rewritten as "az-".
"""
The fact that the brackets are not special when they aren't used for a
character or equivalence class is precisely what allows the legacy
uses to continue to work.
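As a quick sanity check, something along these lines should show both
forms giving the same result. This is only a sketch: it assumes a tr
that follows the POSIX behavior quoted above (GNU or BSD tr, for
example), and the sample input and variable names are made up for
illustration.

```shell
# Sample input is invented for illustration; any bracketed lowercase word works.
# System V style: '[' and ']' are ordinary set members that map to themselves.
sysv=$(printf '[hello]\n' | tr '[a-z]' '[A-Z]')

# BSD/POSIX style: the same translation with the brackets omitted,
# so the brackets in the input are simply not part of either set.
bsd=$(printf '[hello]\n' | tr 'a-z' 'A-Z')

printf '%s\n%s\n' "$sysv" "$bsd"
```

With a conforming tr both lines should read [HELLO]: in the first case
the brackets are translated to themselves, in the second they are left
untouched.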
The bracket form is also specifically mentioned in a tr.c comment.
Received on Sat Oct 22 2016 - 20:16:13 CEST