Re: [hackers] [PATCH][sbase] tar: use bigger buffer size to increase performance

From: Michael Forney <mforney_AT_mforney.org>
Date: Sat, 9 Sep 2017 15:03:43 -0700

Make sure to CC Jim in replies; he is not subscribed to the list.

On Sat, Sep 9, 2017 at 1:31 PM, Jim Beveridge <jimbe_AT_google.com> wrote:
> Several comments to clear up some points about my tar code:

Thanks, Jim. I still have a few questions.

I also have to insist that these issues are tackled one-by-one rather
in one large commit.

> The CHUNK changes were not intended to optimize the general case, that was
> just a happy side effect. The changes were intended to address severe
> performance problems on platforms (like Fuchsia, at least at the moment)
> that don't do write-behind and/or read-ahead caching.

Yes, this needs to be addressed, however, I think we should do this
with a copy buffer size define in util.h, since libutil/concat.c
should use a larger buffer as well.

> My code supports short reads. I specifically had a test bed that was capable
> of reproducing short reads from a pipe and I spent quite a bit of time
> making that case work properly. Perhaps I'm reading too much into your
> comments, but I don't see the point of waiting on support for short write
> support to commit these changes supporting short reads.

Yes, but I think adding a utility function to deal with the case of
reading an entire amount (like plan9's readn or go's io.ReadFull)
would simplify this.

> Regarding your comment about, "If you tar a file within another directory,
> you end up with just one entry. This check means that the parent directory
> won't be created when trying to extract this file. Other tar implementations
> are able to extract such an archive." I completely agree, but I'm not sure
> what's driving your comment? My implementation supports this case.

Sorry, I should have tested and provided an example. You're right,
that case is supported by your patch. However, the following case is
not:

$ mkdir -p a/b
$ tar -cf a.tar a/b
$ rm -r a
$ tar -xf a.tar
tar: mkdir a/b: No such file or directory

> The
> previous implementation had issues because it never allowed the directory
> creation switch case to do its job, which caused loss of permissions on
> directory creation and unnecessary user-facing warnings in some cases.

Can you provide a concrete example of one of these cases? As far as I
understand it, if all the directories are present in the archive, they
will be created by the directory creation switch case with the
appropriate permissions. mkdirp should be harmless on a parent
directory that already exists since it never removes directories or
changes their mode.

Which warnings are you talking about? mkdirp should not warn if the
call to mkdir fails with EEXIST, and since we call it with the dirname
of fname, it should not be creating directories for the tar entry
itself.

I also wonder if this recent commit solves your problem, but even if
it does, I'm curious about the behavior you were observing:

https://git.suckless.org/sbase/commit/libutil/mkdirp.c?id=6ac5f01cc94b2a6f7b6406ddd151e7b4d8fb1d7d

> Finally, st should not be added into the if clause. That was a missing merge
> in my code. I fixed this in my pull request to be:
> if (r->depth && S_ISDIR(st->st_mode))

Okay, thanks for confirming.
Received on Sun Sep 10 2017 - 00:03:43 CEST

This archive was generated by hypermail 2.3.0 : Sun Sep 10 2017 - 00:12:21 CEST