On Feb 4, 2010, at 11:17 PM, Joel Reymont wrote:
On Feb 5, 2010, at 6:19 AM, Michael Smith wrote:
Actually, no. Reading from an F_NOCACHE file into a properly aligned, allocated block will be significantly more efficient than using mmap.
How so?
Because you're not taking page faults incrementally as you touch pages in the block. One system call, maximally-sized I/O -> minimal overhead.
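For illustration, a minimal sketch of that approach in C. The helper name read_file_uncached is hypothetical, and the code assumes a regular file small enough to read in one go. F_NOCACHE is a macOS fcntl() command, so it is guarded with #ifdef and the code falls back to an ordinary cached read elsewhere. The read() loop normally completes in a single call; it only iterates on a short read or EINTR.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): read a whole file into a
 * page-aligned buffer, asking the kernel not to cache the data where
 * F_NOCACHE is available. On success returns the buffer (caller frees)
 * and stores the byte count in *len_out. On failure returns NULL with
 * errno set. */
void *read_file_uncached(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

#ifdef F_NOCACHE
    /* macOS only: best-effort request to bypass the buffer cache. */
    (void)fcntl(fd, F_NOCACHE, 1);
#endif

    struct stat st;
    if (fstat(fd, &st) != 0) {
        int e = errno;
        close(fd);
        errno = e;
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    long page = sysconf(_SC_PAGESIZE);
    size_t pagesz = page > 0 ? (size_t)page : 4096;

    /* Round the allocation up to whole pages (at least one page). */
    size_t cap = ((len + pagesz - 1) / pagesz) * pagesz;
    if (cap == 0)
        cap = pagesz;

    void *buf = NULL;
    int rc = posix_memalign(&buf, pagesz, cap);
    if (rc != 0) {
        close(fd);
        errno = rc;
        return NULL;
    }

    size_t done = 0;
    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            int e = errno;          /* a real I/O error, reported to us */
            free(buf);
            close(fd);
            errno = e;
            return NULL;
        }
        if (n == 0)
            break;                  /* file shrank underneath us */
        done += (size_t)n;
    }

    close(fd);
    *len_out = done;
    return buf;
}
```

Note that an I/O failure here shows up as an ordinary return value with errno set, at a point where the caller expects it, rather than as a fault on some later memory access.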
mmap() is largely evil, and should be avoided wherever possible.
Can you please elaborate?
There are a bunch of reasons. The worst of them, IMO:
• There is no error handling mechanism. If the volume containing the file goes away (removable drive, network share, etc.) then you're typically going to be terminated as soon as you touch any page in the mapped range. At the very least, you're going to have to work out how to catch the exception, decode it to work out which mapped region you took a fault on, and then deal with the fact that any unique data you had in the mapped region is gone.
• It's not deterministic; any time you touch a page in the range, you can stall waiting for I/O.
• Because I/O for mapped ranges is reactive, rather than proactive, you have limited options for hinting to the system what your future access patterns are going to be like. If your behaviour doesn't line up with the VM's guesses about activity - and remember that they are tuned for generic workloads - you can find yourself in pathological corners.
• It's not transactional, meaning that maintaining coherency in the file vs. sudden termination, application bugs, etc. is very difficult.
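To make the last point concrete, here is a sketch of one common way to get all-or-nothing file updates with explicit I/O: write a temporary file, fsync() it, then rename() it over the original. This is an illustration, not something proposed in the thread. The helper name replace_file_atomically and the ".tmp" suffix are hypothetical, and how durable fsync() really is varies by platform and filesystem.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): replace the contents of
 * `path` with `len` bytes from `data`. A reader of `path` sees either
 * the old contents or the new ones, never a partial write.
 * Returns 0 on success, -1 with errno set on failure. */
int replace_file_atomically(const char *path, const void *data, size_t len)
{
    char tmp[1024];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = data;
    size_t done = 0;
    int saved;

    while (done < len) {
        ssize_t w = write(fd, p + done, len - done);
        if (w < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            goto fail;
        }
        done += (size_t)w;
    }

    /* Push the new data to stable storage before it becomes visible. */
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;

    /* rename() swaps the name over in a single step. */
    if (rename(tmp, path) != 0)
        goto fail;
    return 0;

fail:
    saved = errno;                  /* keep the original error */
    if (fd >= 0)
        close(fd);
    unlink(tmp);                    /* never leave a half-written temp */
    errno = saved;
    return -1;
}
```

With a writable mapping there is no equivalent commit point: dirty pages can reach the file in any order and at any time, so a crash part-way through an update can leave it half-applied.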
That'll do for a start. 8)
The bottom line is that unless you are *certain* that you *need* to map a file, you're generally going to be better off not doing it.
= Mike