Re: [hackers] [sbase][PATCH] Support -- in all utilities except echo(1)

From: Laslo Hunhold <dev_AT_frign.de>
Date: Mon, 1 Jul 2019 08:15:02 +0200

On Sun, 30 Jun 2019 21:20:43 -0700
Michael Forney <mforney_AT_mforney.org> wrote:

Dear Michael,

> I'm okay with switching to getopt(3), but also note that the current
> arg.h is probably more complicated than it needs to be. Here's a
> version I rewrote that I've been using in my own projects:
>
> https://git.sr.ht/~mcf/samurai/blob/master/arg.h
>
> I agree that getopt(3) would probably be better at handling the corner
> cases. Yesterday I was planning to check that tools behaved correctly
> with argv = { NULL }. The first thing I tried was rm(1), which tried
> to remove files corresponding to my environment (i.e.
> "SHELL=/bin/ksh"). Yikes!
>
> I am also not sure how getopt(3) could be used to handle the tricky
> cases I mentioned, like printf(1). It doesn't have any options, but
> still needs to support `--` and treat arguments that start with `-` as
> operands rather than printing a usage message.

as an addendum, my own research has yielded some points against getopt.

 - before returning '?' on an invalid option, it prints a bloody error
   message. You will have to set the global variable "opterr" to 0
   before calling getopt to prevent stupid error messages from flooding
   stderr.
 - The behaviour of returning ':' or '?' for a missing argument is a bit
   convoluted. Read the POSIX page to see what I mean, but in short
   getopt only returns ':' at all if the first character of the
   optstring is ':'; otherwise a missing argument yields '?', just like
   an invalid option. So we can get rid of the distinction and simplify
   the switch a bit; see below.
 - getopt(3) is not thread-safe: it is not reentrant, and it is
   difficult to reset, as modifying its global state is undefined
   behaviour.
 - One aspect I totally missed is that argc won't be modified either.
   Instead, getopt() sets a global variable "optind" to the index of
   the first non-option argument. So the port I made in the previous
   mail would actually have to look like below.

        int c;

        opterr = 0;
        while ((c = getopt(argc, argv, "abc:")) != -1) {
                switch (c) {
                case 'a':
                        aflag = 1;
                        break;
                case 'b':
                        bflag = 1;
                        break;
                case 'c':
                        cflag = 1;
                        carg = optarg;
                        break;
                default:
                        usage();
                }
        }
        argv += optind;
        argc -= optind;

The getopt(3) loop above should correspond 1:1 to this arg.h usage:

        ARGBEGIN {
        case 'a':
                aflag = 1;
                break;
        case 'b':
                bflag = 1;
                break;
        case 'c':
                cflag = 1;
                carg = EARGF(usage());
                break;
        default:
                usage();
        } ARGEND

What we can really notice is that, because the cases '?' and ':' are
merged into "default", the two switches are nearly identical, modulo
the uses of EARGF(). It would thus be possible to redefine the ARGBEGIN
and ARGEND macros in terms of getopt(3). To "contain" the local
variable 'c', instead of a bare block, I would wrap it all into a

        do {int c; ...} while (false)

so c has a local context and the macro is relatively safe. ARGEND would
then just be a

        } while (false)

What do the others think?

With best regards

Laslo

-- 
Laslo Hunhold <dev_AT_frign.de>

Received on Mon Jul 01 2019 - 08:15:02 CEST
