On Sat, Jun 12, 2010 at 03:18:57PM +0100, David Tweed wrote:
>I just know I'm going to regret getting involved in this but...
Probably not. You seem reasonable. I only flame trolls.
>My understanding is that on Linux at least, reading causes the data to
>be moved into the kernel's page cache (which I believe has a page
>level granularity even if you "read only a byte"), and then a copy is
>made from the page cache into the process's memory space. Mmapping it
>means your process gets the page cache page mapped into its address
>space, so the data is only in memory once rather than an average of
>1.x times where x depends on pagecache discard policy. So IF you are
>genuinely moving unpredictably around accessing a truly huge file,
>mmapping it means that you can fit more of it in memory rather than
>having both your program and the page cache trying to figure out which
>bits to discard in an attempt to keep memory usage down. This effect
>is actually much more important with huge files than smaller files
>where the page cache duplication doesn't have as much effect on system
>memory usage as a whole.
You may be right. I don't know very much about Linux's buffer
cache. On the other hand, even so, I'd consider read the better
option in most use cases I can think of. There are probably
cases where mmap would be more efficient, but I rather expect
that the gains in efficiency depend on the programmer knowing to
a fairly high degree of detail when and why. It doesn't mean
that mmap should be used instead of read wherever possible.
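
For anyone who wants to poke at the difference, here is a minimal
sketch in Python. It is not from this thread, and the function names
are made up for illustration. It assumes a Unix-like system (os.pread
is not available everywhere) and a non-empty file.

```python
import mmap
import os


def byte_via_read(path, offset):
    """Fetch one byte with pread().

    The kernel brings the containing page into the page cache and then
    copies the requested byte into a buffer owned by this process.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, 1, offset)
    finally:
        os.close(fd)


def byte_via_mmap(path, offset):
    """Fetch one byte through a read-only mapping.

    The mapping exposes the page cache pages in our address space.
    Slicing still copies the byte into a Python bytes object, but no
    separate read buffer for the file data is kept around.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + 1]
```

Both functions return the same byte for the same offset. Which one is
cheaper for a given access pattern is exactly the kind of detail that
has to be measured rather than assumed.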
-- 
Kris Maglione

It is a farce to call any being virtuous whose virtues do not
result from the exercise of its own reason.
	--Mary Wollstonecraft

Received on Sat Jun 12 2010 - 14:52:40 UTC