Re: [hackers] [sbase][PATCH] Minor optimizations for 'yes'

From: aidanwillie0317 <aidanwillie0317_AT_protonmail.com>
Date: Wed, 26 Jun 2019 09:14:15 +0000

On Wednesday, June 26, 2019 4:04:50 AM EDT you wrote:
> What's the motivation for this? Are you running into performance
> issues with yes(1), or is this just for fun to increase the throughput
> for some benchmark?
One of yes(1)'s uses (besides the intended use of making interactive programs
work noninteractively) is to generate a high volume stream of data, as /dev/
urandom will only go up around 10MiB/s on most devices. I was doing some
testing for a personal program to see how well it could handle an influx of
data, and I noticed that the sbase yes is much slower than the GNU and FreeBSD
implementations. So, I patched the sbase one quickly and got it to run faster
than many implementations (GNU yes being a major exception, running at ~4GiB/s
on my machine).

> I'd be in favor of changing yes if it increases readability (while
> terse, the existing logic is a bit hard to follow), but I don't think
> this patch does that.
I agree, and this patch doesn't do that, no.
> On 2019-06-25, AGitBoy <aidanwillie0317_AT_protonmail.com> wrote:
> > ---
> >
> > yes.c | 17 ++++++++++++-----
> > 1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/yes.c b/yes.c
> > index dd97ea6..49f5ed6 100644
> > --- a/yes.c
> > +++ b/yes.c
> > _AT_@ -6,13 +6,20 @@
> >
> > int
> > main(int argc, char *argv[])
> > {
> >
> > - char **p;
> > + if (argc > 1) {
> > + char **p;
> >
> > - argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
> > + argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
> >
> > - for (p = argv; ; p = (*p && *(p + 1)) ? p + 1 : argv) {
> > - fputs(*p ? *p : "y", stdout);
> > - putchar((!*p || !*(p + 1)) ? '\n' : ' ');
> > + for (p = argv; ; p = (*p && *(p + 1)) ? p + 1 : argv) {
> > + fputs(*p, stdout);
> > + putchar((!*p || !*(p + 1)) ? '\n' : ' ');
> > + }
>
> In this case, since we know there is at least one argument, *p is
> always true, and I think some simplifications fall out of that.
As you said before, the code is hard to follow, so I didn't mess with that
part that much. However, in retrospect, the code can be simplfied to this:
                for (p = argv; ; p = *(p + 1) ? p + 1 : argv) {
                        fputs(*p, stdout);
                        putchar(!*(p + 1) ? '\n' : ' ');
                }

> > + } else {
> > + while (1) {
> > + fputc('y', stdout);
> > + fputc('\n', stdout);
> > + }
>
> I think I'd prefer
>
> for (;;)
> puts("y");
>
> here.
Originally, I had a puts(3) call there, but this decreased the performance
from the unpatched sbase, so I changed it to some fputc calls. Calling
write(2) didn't help either. I researched why and it's due to some buffering,
but adding buffering would overcomplicate the code.
> > }
> >
> > return 1; /* not reached */
> >
> > --
> > 2.21.0
I applied some of your suggestions. The updated diff is below, and some of the code looks much less obfuscated now, as an added benefit.
---
 yes.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/yes.c b/yes.c
index dd97ea6..e1fb4bd 100644
--- a/yes.c
+++ b/yes.c
_AT_@ -6,13 +6,20 @@
 int
 main(int argc, char *argv[])
 {
-       char **p;
+       if (argc > 1) {
+               char **p;
-       argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+               argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
-       for (p = argv; ; p = (*p && *(p + 1)) ? p + 1 : argv) {
-               fputs(*p ? *p : "y", stdout);
-               putchar((!*p || !*(p + 1)) ? '\n' : ' ');
+               for (p = argv; ; p = *(p + 1) ? p + 1 : argv) {
+                       fputs(*p, stdout);
+                       putchar(!*(p + 1) ? '\n' : ' ');
+               }
+       } else {
+               for (;;) {
+                       fputc('y', stdout);
+                       fputc('\n', stdout);
+               }
        }
        return 1; /* not reached */
--
2.21.0
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, June 26, 2019 4:04 AM, Michael Forney <mforney_AT_mforney.org> wrote:
> What's the motivation for this? Are you running into performance
> issues with yes(1), or is this just for fun to increase the throughput
> for some benchmark?
>
> I'd be in favor of changing yes if it increases readability (while
> terse, the existing logic is a bit hard to follow), but I don't think
> this patch does that.
>
> On 2019-06-25, AGitBoy aidanwillie0317_AT_protonmail.com wrote:
>
> > yes.c | 17 ++++++++++++-----
> > 1 file changed, 12 insertions(+), 5 deletions(-)
> > diff --git a/yes.c b/yes.c
> > index dd97ea6..49f5ed6 100644
> > --- a/yes.c
> > +++ b/yes.c
> > _AT_@ -6,13 +6,20 @@
> > int
> > main(int argc, char *argv[])
> > {
> >
> > -   char **p;
> >
> > -   if (argc > 1) {
> > -       char **p;
> >
> >
> >
> > -   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
> >
> > -       argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
> >
> >
> >
> > -   for (p = argv; ; p = (*p && *(p + 1)) ? p + 1 : argv) {
> > -       fputs(*p ? *p : "y", stdout);
> >
> >
> > -       putchar((!*p || !*(p + 1)) ? '\\n' : ' ');
> >
> >
> >
> > -       for (p = argv; ; p = (*p && *(p + 1)) ? p + 1 : argv) {
> >
> >
> > -       	fputs(*p, stdout);
> >
> >
> > -       	putchar((!*p || !*(p + 1)) ? '\\n' : ' ');
> >
> >
> > -       }
> >
> >
>
> In this case, since we know there is at least one argument, *p is
> always true, and I think some simplifications fall out of that.
>
> > -   } else {
> > -       while (1) {
> >
> >
> > -       	fputc('y', stdout);
> >
> >
> > -       	fputc('\\n', stdout);
> >
> >
> > -       }
> >
> >
>
> I think I'd prefer
>
> for (;;)
> puts("y");
>
> here.
>
> > }
> >
> > return 1; /* not reached */
> >
> > ----------------------------
> >
> > 2.21.0
Received on Wed Jun 26 2019 - 11:14:15 CEST

This archive was generated by hypermail 2.3.0 : Wed Jun 26 2019 - 11:24:23 CEST