Re: [dev] Re: coreutils / moreutils - DC a directory counter

From: Calvin Morrison <mutantturkey_AT_gmail.com>
Date: Wed, 17 Jul 2013 17:02:45 -0400

On 17 July 2013 16:58, Christian Neukirchen <chneukirchen_AT_gmail.com> wrote:
> Calvin Morrison <mutantturkey_AT_gmail.com> writes:
>
>> On 17 July 2013 16:32, Christian Neukirchen <chneukirchen_AT_gmail.com> wrote:
>>> Calvin Morrison <mutantturkey_AT_gmail.com> writes:
>>>
>>>> Hi guys,
>>>>
>>>> I came up with a utility[0] that I think could be useful, and I sent
>>>> it to the moreutils page, but maybe it might fit better here. All it
>>>> does is give a count of files in a directory.
>>>>
>>>> I was sick of ls | wc -l being so damned slow on large directories, so
>>>> I thought a more direct solution would be better.
>>>>
>>>> calvin_AT_ecoli:~/big_folder> time ls file2v1dir/ | wc -l
>>>> 687560
>>>>
>>>> real 0m7.798s
>>>> user 0m7.317s
>>>> sys 0m0.700s
>>>>
>>>> calvin_AT_ecoli:~/big_folder> time ~/bin/dc file2v1dir/
>>>> 687560
>>>>
>>>> real 0m0.138s
>>>> user 0m0.057s
>>>> sys 0m0.081s
>>>>
>>>> What do you think?
>>>> Calvin
>>>
>>> What's the bottleneck here?
>>
>> Looking up the filenames and reading them, printing them to standard
>> out and then wc parsing for all the \n characters.
>>
>>> (Or is your dc only faster because the directory index is in cache now...)
>>
>> No, that's not why:
>>
>> calvin_AT_ecoli:~/big_folder> time ls 2v1 | wc -l
>> 687560
>>
>> real 0m7.678s
>> user 0m7.313s
>> sys 0m0.579s
>>
>> calvin_AT_ecoli:~/big_folder> time dc 2v1
>> 687560
>>
>> real 0m0.138s
>> user 0m0.055s
>> sys 0m0.082s
>>
>> calvin_AT_ecoli:~/big_folder> time ls 2v1 | wc -l
>> 687560
>>
>> real 0m7.672s
>> user 0m7.310s
>> sys 0m0.580s
>
> How fast is find 2v1 -printf x | wc -c ?
>
> --
> Christian Neukirchen <chneukirchen_AT_gmail.com> http://chneukirchen.org
>
>

time find 2v1 -printf x | wc -c
687561

real 0m0.531s
user 0m0.264s
sys 0m0.271s


time ls 2v1 > /dev/null

real 0m7.642s
user 0m7.265s
sys 0m0.375s

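So it seems nearly all of that time is spent in ls itself, not in the
pipe or in wc. (The find count is one higher because find also counts
2v1 itself.)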
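
For illustration, here is a rough sketch of what a "more direct" count
could look like: read the directory entries once and count them, with
no sorting and no per-file output. This is not dc's actual source. It is
written in Python only to keep it short, and count_entries is a made-up
name. It also assumes hidden files should be counted, which plain ls
does not do, so the two can disagree on directories with dotfiles.

```python
import os
import sys


def count_entries(path):
    """Count the entries in `path` without sorting or stat-ing them.

    Hypothetical helper for illustration; not taken from dc.
    """
    count = 0
    # os.scandir yields entries as they are read from the directory,
    # in no particular order, and never yields "." or "..".
    with os.scandir(path) as entries:
        for _ in entries:
            count += 1
    return count


if __name__ == "__main__":
    # Default to the current directory when no argument is given.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(count_entries(target))
```

Saved as, say, count.py, it would be run as "python3 count.py 2v1".
Whether it gets anywhere near the dc timings above is untested.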
Received on Wed Jul 17 2013 - 23:02:45 CEST
