Re: [dev] Miscellaneous sbase issues

From: Dimitris Papastamos
Date: Mon, 27 Apr 2015 11:02:56 +0100

On Sun, Apr 26, 2015 at 06:24:18PM -0700, Michael Forney wrote:
> tar
> ---
> Since fb1595a69c091a6f6a9303b1fab19360b876d114, tar calls remove(3) on
> directories before extracting them. I'm not sure that it is reasonable
> for tar to do this because users may want to re-extract archives, or
> extract archives on top of a directory structure that already exists.
> Additionally, it is fairly common to find tar archives containing the
> "." directory (possibly with a trailing '/'), which were constructed
> using "tar -cf foo.tar .".

Yeah, that makes sense, I suppose. Some things that need to be done
for tar:

- Investigate the aforementioned remove vs unlink issue.
- When we archive a file whose name is longer than 100 chars, we need
  to split it across both the name and prefix header fields.
- Strip the leading / from filenames, and dangerous components such as
  ../../.
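
A rough sketch of the last two items, to make the discussion concrete.
It assumes the ustar layout (100-byte name field, 155-byte prefix field,
joined by an implicit '/'). The helper names splitpath and safepath are
made up for illustration and are not existing sbase functions. Note that
safepath rejects any path with a ".." component rather than rewriting it,
which is only one possible policy.

```c
#include <stddef.h>
#include <string.h>

enum { NAMELEN = 100, PREFIXLEN = 155 };

/*
 * Split path into ustar name and prefix fields (hypothetical helper).
 * name must hold NAMELEN + 1 bytes, prefix PREFIXLEN + 1 bytes; both
 * are NUL-terminated here for convenience.
 * Returns 0 on success, -1 if the path cannot be represented.
 */
int
splitpath(const char *path, char *name, char *prefix)
{
	size_t len = strlen(path);
	const char *p;

	prefix[0] = '\0';
	if (len <= NAMELEN) {
		memcpy(name, path, len + 1);
		return 0;
	}
	/* try each '/' as the split point; the slash itself is not stored */
	for (p = strchr(path, '/'); p; p = strchr(p + 1, '/')) {
		size_t plen = (size_t)(p - path);
		size_t nlen = len - plen - 1;

		if (plen > PREFIXLEN)
			break; /* later split points only make the prefix longer */
		if (plen == 0 || nlen == 0 || nlen > NAMELEN)
			continue;
		memcpy(prefix, path, plen);
		prefix[plen] = '\0';
		memcpy(name, p + 1, nlen + 1);
		return 0;
	}
	return -1;
}

/*
 * Skip leading slashes and refuse paths with a ".." component
 * (hypothetical helper). Returns a pointer into path, or NULL.
 */
const char *
safepath(const char *path)
{
	const char *p;

	while (*path == '/')
		path++;
	for (p = path; *p; ) {
		size_t n = strcspn(p, "/");

		if (n == 2 && p[0] == '.' && p[1] == '.')
			return NULL;
		p += n;
		while (*p == '/')
			p++;
	}
	return path;
}
```

With this, safepath("/etc/passwd") yields "etc/passwd", and a path made of
60 'a' characters, a '/', and 60 'b' characters is split at the slash.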

> cat, tee
> --------
> These utilities read from stdin using fread(3) into a buffer of size
> BUFSIZ. However, fread will read until it fills up the entire buffer (or
> hits EOF) before returning, causing noticeable delay when the input
> comes from other programs or scripts.
> To demonstrate this problem, compare the output of these commands:
> for i in $(seq 500) ; do printf 0123456789abcdef ; sleep 0.005 ; done
> for i in $(seq 500) ; do printf 0123456789abcdef ; sleep 0.005 ; done | cat
> for i in $(seq 500) ; do printf 0123456789abcdef ; sleep 0.005 ; done | tee
> I considered fixing this by making the concat function take an fd
> instead and make a single call to read(2), but this causes problems for
> sponge, which uses a FILE * obtained from tmpfile(3) as both output and
> input for concat. We could also use mkstemp(3) to return a file
> descriptor, and use a FILE * from fdopen for writing, and the file
> descriptor for reading, but this seems unclean to me.

We should avoid mixing file stream I/O and raw I/O on the same file.
Check out section 2.5.1, 'Interaction of File Descriptors and Standard
I/O Streams', in POSIX.
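
For reference, the fd-only variant described in the quoted text might
look roughly like the sketch below. It uses raw I/O on both ends, so it
does not mix the two interfaces, but callers such as sponge would have
to stop using a FILE * from tmpfile(3). The names concatfd and writeall
are hypothetical, not the current sbase concat interface.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Write all of buf, retrying on short writes and EINTR. */
static int
writeall(int fd, const char *buf, size_t len)
{
	ssize_t n;

	while (len > 0) {
		n = write(fd, buf, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Copy everything from in to out. Each read(2) returns as soon as
 * some data is available, so output is not held back until a full
 * BUFSIZ has accumulated.
 */
int
concatfd(int in, int out)
{
	char buf[BUFSIZ];
	ssize_t n;

	for (;;) {
		n = read(in, buf, sizeof(buf));
		if (n == 0)
			return 0; /* EOF */
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (writeall(out, buf, (size_t)n) < 0)
			return -1;
	}
}
```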

> Another option would be to use fgetc and fputc for the concat
> implementation, and let libc take care of the buffering. I'm not sure if
> this has any performance implications.

Sounds about right.
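
A minimal sketch of what that could look like, assuming concat keeps
taking two FILE pointers (concatc is a made-up name). Output still goes
through stdio buffering, so whether the delay actually goes away depends
on how the output stream is buffered; that part is not verified here.

```c
#include <stdio.h>

/*
 * Copy in to out one character at a time and let libc handle the
 * buffering. Returns 0 on success, -1 on a read or write error.
 */
int
concatc(FILE *in, FILE *out)
{
	int c;

	while ((c = fgetc(in)) != EOF)
		if (fputc(c, out) == EOF)
			return -1;
	if (ferror(in))
		return -1;
	return fflush(out) == EOF ? -1 : 0;
}
```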
Received on Mon Apr 27 2015 - 12:02:56 CEST
