Re: [hackers] [PATCH sbase] libutil/recurse: Split into two functions

From: Michael Forney <mforney_AT_mforney.org>
Date: Tue, 23 Jun 2020 11:26:27 -0700

Thanks for testing this out, Richard.

On 2020-06-23, Richard Ipsum <richardipsum_AT_vx21.xyz> wrote:
> I don't feel qualified to criticise the overall design, but do we not still
> need a way to specify whether traversal should be pre-order or post-order?
> I figure this is what DIRFIRST was for right?

The effective traversal order can be controlled by structuring your
recursor function to call recursedir() at the beginning or end.

DIRFIRST was only useful because a top-level directory was handled
specially by recurse. Normally, recurse() does not "visit" its
argument at all, only its children. For D/f, the call structure was
something like this for DIRFIRST:

        recurse(D)
                stat(D)
                add to history
                r->fn(D)
                        do something with D
                        recurse(D)
                                stat(D)
                                prune since we've already seen D
                        ...
                r->fn(D/entry1)
                        do something with D/entry1
                        recurse(D/entry1)
                                ...
                        ...
                r->fn(D/entry2)
                        do something with D/entry2
                        recurse(D/entry2)
                                ...
                        ...
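
The "prune since we've already seen D" step above can be sketched as a lookup in a list of (device, inode) pairs. This is only a minimal illustration, not the actual recurse() code: the names `struct history`, `seen()` and `enter()` are made up for the example, and it assumes that identifying a directory by st_dev and st_ino is enough.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical history list: one entry per directory already entered. */
struct history {
	struct history *prev;
	dev_t dev;
	ino_t ino;
};

/* Returns 1 if (dev, ino) is already recorded in the list, 0 otherwise. */
static int
seen(const struct history *h, dev_t dev, ino_t ino)
{
	for (; h; h = h->prev)
		if (h->dev == dev && h->ino == ino)
			return 1;
	return 0;
}

/*
 * stat() the path and add it to the history.
 * Returns 1 if the path is new, 0 if it was pruned, -1 on error.
 */
static int
enter(struct history **hist, const char *path)
{
	struct stat st;
	struct history *h;

	if (stat(path, &st) < 0)
		return -1;
	if (seen(*hist, st.st_dev, st.st_ino))
		return 0; /* prune since we've already seen it */
	if (!(h = malloc(sizeof(*h))))
		return -1;
	h->prev = *hist;
	h->dev = st.st_dev;
	h->ino = st.st_ino;
	*hist = h;
	return 1;
}
```

With this, the first enter() on D returns 1 and the nested one returns 0, which is why the inner recurse(D) in the trace does nothing.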

Without DIRFIRST, it is:

        recurse(D)
                stat(D)
                add to history
                r->fn(D/entry1)
                        recurse(D/entry1)
                                ...
                        ...
                        do something with D/entry1
                r->fn(D/entry2)
                        recurse(D/entry2)
                                ...
                        ...
                        do something with D/entry2

                r->fn(D)
                        do something with D
                        recurse(D)
                                stat(D)
                                prune since we've already seen D
                        ...
                                
As you can see, the DIRFIRST flag is only used at the top level for
bootstrapping. Everywhere else it is ignored, since the function is
just structured differently depending on whether it needs the children
processed first or last:

        fn(D)
                do something with D
                recurse(D)

or

        fn(D)
                recurse(D)
                do something with D
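
To make the two shapes concrete, here is a small self-contained sketch in C. It walks a hypothetical in-memory tree rather than a real directory, and none of the names (`struct node`, `walkchildren()`, `record()`, `preorder()`, `postorder()`) are sbase's API. Here walkchildren() plays the role of recurse(): it only visits the children of its argument, never the argument itself.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory tree standing in for a directory hierarchy. */
struct node {
	const char *name;
	const struct node *child; /* array of children, or NULL */
	size_t nchild;
};

typedef void visitor(const struct node *, char *, size_t);

/* Like recurse(): calls fn on each child of n, but not on n itself. */
static void
walkchildren(const struct node *n, visitor *fn, char *out, size_t outlen)
{
	size_t i;

	for (i = 0; i < n->nchild; i++)
		fn(&n->child[i], out, outlen);
}

/* "do something with D": append the name to the output buffer. */
static void
record(const struct node *n, char *out, size_t outlen)
{
	size_t len = strlen(out);

	snprintf(out + len, outlen - len, "%s ", n->name);
}

/* Pre-order: handle the node, then descend. */
static void
preorder(const struct node *n, char *out, size_t outlen)
{
	record(n, out, outlen);
	walkchildren(n, preorder, out, outlen);
}

/* Post-order: descend, then handle the node. */
static void
postorder(const struct node *n, char *out, size_t outlen)
{
	walkchildren(n, postorder, out, outlen);
	record(n, out, outlen);
}

static const struct node leaves[] = {
	{ "D/a/f", NULL, 0 },
};
static const struct node entries[] = {
	{ "D/a", leaves, 1 },
	{ "D/b", NULL, 0 },
};
static const struct node root = { "D", entries, 2 };
```

Calling preorder(&root, ...) on an empty buffer records D before its entries, while postorder(&root, ...) records D last. The only difference between the two visitors is where the call to walkchildren() sits.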

>> -	if (dirfd == AT_FDCWD)
>> -		pathlen = estrlcpy(r->path, name, sizeof(r->path));
>
> Now that we no longer do this, r->path is not being reset between
> separate calls, so we have:
>
> % mkdir D
> % echo 'hello world' > f
> % du D f
> 4 D
> 4 f
> % ~/sbase/du D f
> 4 D
> 4 D

Ah, good catch. In recurse(), I still do the initialization, but I
changed the condition to `if (!r->pathlen)` thinking that would be
slightly better than `if (dirfd == AT_FDCWD)` in case the caller were
to pre-initialize r->path itself and pass a special directory FD. But
nothing actually does this, so it's not worth worrying about.

I think a better solution is to just unconditionally initialize r->path
in recurse().
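
Roughly, the difference looks like the sketch below. This is a simplified model, not the patch itself: `struct recursor` is cut down to the two relevant fields, and `setpath()`, `start_if_empty()` and `start_always()` are hypothetical names. Unlike a real implementation, setpath() silently truncates over-long names to keep the example short.

```c
#include <string.h>

enum { PATHSZ = 4096 }; /* assumed buffer size for the example */

/* Hypothetical, cut-down stand-in for the recursor state. */
struct recursor {
	char path[PATHSZ];
	size_t pathlen;
};

/* Copy name into r->path and remember its length. */
static void
setpath(struct recursor *r, const char *name)
{
	size_t len = strlen(name);

	if (len >= sizeof(r->path))
		len = sizeof(r->path) - 1; /* truncate; for brevity only */
	memcpy(r->path, name, len);
	r->path[len] = '\0';
	r->pathlen = len;
}

/* Conditional version: skipped if an earlier call left pathlen set. */
static void
start_if_empty(struct recursor *r, const char *name)
{
	if (!r->pathlen)
		setpath(r, name);
}

/* Unconditional version: every top-level call starts from name. */
static void
start_always(struct recursor *r, const char *name)
{
	setpath(r, name);
}
```

Using the same recursor for "D" and then "f", start_if_empty() leaves r->path as "D" for the second call, which matches the duplicated "4 D" line in the du output above. start_always() resets it to "f".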

-Michael
Received on Tue Jun 23 2020 - 20:26:27 CEST
