Re: [dev] coreutils / moreutils - DC a directory counter

From: Markus Wichmann <nullplan_AT_gmx.net>
Date: Thu, 18 Jul 2013 17:33:32 +0200

On Wed, Jul 17, 2013 at 01:23:33PM -0400, Calvin Morrison wrote:
> Hi guys,
>
> I came up with a utility[0] that I think could be useful, and I sent
> it to the moreutils page, but maybe it might fit better here. All it
> does is give a count of files in a directory.
>
> I was sick of ls | wc -l being so damned slow on large directories, so
> I thought a more direct solution would be better.
>
> calvin_AT_ecoli:~/big_folder> time ls file2v1dir/ | wc -l
> 687560
>
> real 0m7.798s
> user 0m7.317s
> sys 0m0.700s
>
> calvin_AT_ecoli:~/big_folder> time ~/bin/dc file2v1dir/
> 687560
>
> real 0m0.138s
> user 0m0.057s
> sys 0m0.081s
>
> What do you think?
> Calvin
>
> [0] https://github.com/mutantturkey/dc

Some comments on the utility: I don't think it is very useful, as I
usually don't deal with directories that big; splitting things up is
part of what directories are for. If you honestly have more than 100k
files in one directory and can't divide them further into
subdirectories, you probably want a different directory structure.

Apart from that, ls takes a long time to complete because it also
calls stat() on every directory entry. (Having had a look at sbase's
ls: you actually only need stat() if -l or -t is set. GNU ls of course
has a whole other ton of features, so it basically always needs to
call stat().)
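
To illustrate, here is a rough sketch (not your actual code) of
counting with readdir() alone, which never touches stat() and is why
such a tool stays fast on huge directories:

#include <dirent.h>
#include <stdio.h>

/* Count directory entries with readdir() only; no stat() per entry.
 * Skips dotfiles so the count matches plain "ls | wc -l". */
static long
count_entries(const char *path)
{
	DIR *d = opendir(path);
	struct dirent *ent;
	long n = 0;

	if (!d)
		return -1;
	while ((ent = readdir(d)))
		if (ent->d_name[0] != '.')
			n++;
	closedir(d);
	return n;
}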

On the code: you can drop the whole getcwd() call by just opening "."
if no other place is given. Also, there is already a constant for the
maximum path length of a file: it's called PATH_MAX and it already
includes the NUL byte.
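
Roughly what I mean, as a sketch rather than a patch, reusing
count_entries() from above:

int
main(int argc, char *argv[])
{
	/* Default to "." instead of building a path with getcwd():
	 * a relative open works fine, so no separate buffer is needed. */
	const char *path = argc > 1 ? argv[1] : ".";
	long n = count_entries(path);

	if (n < 0) {
		perror(path);
		return 1;
	}
	printf("%ld\n", n);
	return 0;
}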

Ciao,
Markus