Re: [dev] [sbase] [tar] some errors

From: willy <willy_AT_mailoo.org>
Date: Fri, 3 Feb 2017 14:28:10 +0100

Michael Forney wrote:
> On 12/24/16, Cág <caoc_AT_riseup.net> wrote:
> > Markus Wichmann wrote:
> >
> >> Well, that looks like it might be problematic, doesn't it? Especially
> >> when you find out, that the size of h->name there is 100 bytes. path
> >> contains, of course, the entire file path relative to the starting
> >> directory. In short, you will get this error message whenever trying to
> >> package files with a total relative path length of more than 100
> >> characters.
> >
> > Indeed, I've just tried to compress an extracted Linux kernel
> > (that doesn't have .git folder), it went without errors. Thanks for
> > pointing out.
> >
> > But when I tried to extract it, it still said "malformed tar archive".
> > Here's the part with it: http://git.suckless.org/sbase/tree/tar.c#n404
>
> Fixing up tar bugs in sbase has been on my TODO list for a while. It's
> the one tool I'm not using from sbase.
>
> One thing that might be related is that in various places, it uses
> eread(..., BLKSIZE), expecting that exactly BLKSIZE bytes are read
> (skipblk, xt, unarchive). But, if you are extracting from a pipe
> hooked up to a decompression program, it may be less than that. When
> this happens, it calls chktar on a random piece of data from the
> archive, which fails the checksum check.
>
> I think we should add a readall function to libutil, similar to the
> writeall function I sent in a patch set a few weeks ago.
>
> I think I have a pending patch to make the "malformed tar archive"
> errors more specific, but it's on a different computer and I am
> visiting family for the holidays. If you want to try and debug the
> error, I'd start with trying to figure out why it thinks it is
> malformed. I suspect it is due to a bad checksum.

Bumping this thread because I've been testing this a bit. I attached a
patch to make the error more descriptive. It shows that the most common
error (for me) is a bad magic, which is certainly because the magic is
checked before the checksum. Sometimes the magic check passes, but the
checksum check then fails.
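
To illustrate the check order described above, here is a rough sketch
of a header check that reports which test failed. It is not the code
from tar.c or from the attached patch: checkhdr, the hdrstatus values
and the offset constants are hypothetical names, and the offsets
(checksum field at byte 148, magic at byte 257) are taken from the
ustar header layout.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed ustar header layout; the names are made up for this sketch. */
enum { BLKSIZE = 512, CHKOFF = 148, CHKLEN = 8, MAGICOFF = 257 };

enum hdrstatus { HDR_OK, HDR_BADMAGIC, HDR_BADCHKSUM };

/*
 * Hypothetical helper: check the magic first, then the checksum, and
 * tell the caller which of the two tests failed.
 */
enum hdrstatus
checkhdr(const unsigned char *blk)
{
	char field[CHKLEN + 1];
	unsigned long want, sum = 0;
	int i;

	if (memcmp(blk + MAGICOFF, "ustar", 5) != 0)
		return HDR_BADMAGIC;

	/* The stored checksum is an octal string. */
	memcpy(field, blk + CHKOFF, CHKLEN);
	field[CHKLEN] = '\0';
	want = strtoul(field, NULL, 8);

	/* Sum every header byte, counting the checksum field as spaces. */
	for (i = 0; i < BLKSIZE; i++)
		sum += (i >= CHKOFF && i < CHKOFF + CHKLEN) ? ' ' : blk[i];

	return sum == want ? HDR_OK : HDR_BADCHKSUM;
}
```

With this order, a block that is really file data almost always fails
the magic test first, which would match the errors I am seeing.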

This is indeed due to a short read from a pipe: if you print the
offending "bad magic", you end up with a chunk of the previously
extracted file. I will look into this "readall" function and submit
a patch.
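
For reference, a minimal sketch of what such a "readall" could look
like. This is only an assumption about its shape, not the actual
libutil code: the name, signature and return convention below are
hypothetical, and it is built on nothing but plain read(2).

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read(2) until len bytes have
 * arrived, EOF is reached, or an error occurs. Returns the number of
 * bytes read (less than len only at EOF), or -1 on error.
 */
ssize_t
readall(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = read(fd, p + done, len - done);
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, try again */
			return -1;
		}
		if (n == 0)
			break; /* EOF before len bytes arrived */
		done += n;
	}
	return done;
}
```

A caller reading a header would then treat any return value other than
BLKSIZE as a truncated archive instead of passing a partial block to
chktar.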

Something worth noting is that I only encountered this bug with bzip2
when it's compiled against musl. I couldn't reproduce it with glibc.

Received on Fri Feb 03 2017 - 14:28:10 CET