Re: [dev] coreutils / moreutils - DC a directory counter

From: Calvin Morrison <mutantturkey_AT_gmail.com>
Date: Thu, 18 Jul 2013 12:13:36 -0400

On 18 July 2013 11:33, Markus Wichmann <nullplan_AT_gmx.net> wrote:
> On Wed, Jul 17, 2013 at 01:23:33PM -0400, Calvin Morrison wrote:
>> Hi guys,
>>
>> I came up with a utility[0] that I think could be useful, and I sent
>> it to the moreutils page, but maybe it might fit better here. All it
>> does is give a count of files in a directory.
>>
>> I was sick of ls | wc -l being so damned slow on large directories, so
>> I thought a more direct solution would be better.
>>
>> calvin_AT_ecoli:~/big_folder> time ls file2v1dir/ | wc -l
>> 687560
>>
>> real 0m7.798s
>> user 0m7.317s
>> sys 0m0.700s
>>
>> calvin_AT_ecoli:~/big_folder> time ~/bin/dc file2v1dir/
>> 687560
>>
>> real 0m0.138s
>> user 0m0.057s
>> sys 0m0.081s
>>
>> What do you think?
>> Calvin
>>
>> [0] https://github.com/mutantturkey/dc
>
> Some comments on the utility: I don't think it is very useful as I
> usually don't deal with directories that big. That's part of what
> directories are for. If you can honestly have more than 100k files in
> the same directory without being able to further divide them into
> directories, maybe you want another directory structure.
>
> Apart from that, ls takes a long time to complete because it also
> calls stat() on every directory entry. (Having had a look at sbase's
> ls: you actually only need stat() if -l or -t is set. GNU ls of course
> has a whole ton of other features, so it needs to call stat() basically
> always.)
>
> On the code: You can drop the whole getcwd() by just opening "." if no
> other place is given. Also, there already is a constant for the maximum
> path length of a file. It's called PATH_MAX and already includes the NUL
> byte.

Seems like a much simpler solution! I don't know why I didn't think of that.

> Ciao,
> Markus
>
Received on Thu Jul 18 2013 - 18:13:36 CEST
