Re: mmap [was: 10.6.2 issues with F_NOCACHE?]
- Subject: Re: mmap [was: 10.6.2 issues with F_NOCACHE?]
- From: Jens Alfke <email@hidden>
- Date: Fri, 5 Feb 2010 08:33:07 -0800
On Feb 4, 2010, at 11:39 PM, Michael Smith wrote:
>>> Actually, no. Reading from an F_NOCACHE file into a properly aligned, allocated block will be significantly more efficient than using mmap.
>> How so?
> Because you're not taking pagefaults incrementally as you touch pages in the block. One system call, maximally-sized I/O -> minimal overhead.
Agreed. But to clarify, this is true if you need to access the entire file ASAP (and the file is small compared to the amount of RAM.) Which is the OP's situation, to be sure.
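For reference, a minimal sketch of the "one big read" approach discussed above, assuming a POSIX system. The function name read_whole_file_uncached is hypothetical, and F_NOCACHE is applied only where the platform defines it (Darwin); elsewhere the code falls back to an ordinary cached read.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read an entire file into a page-aligned buffer.
 * Returns a buffer the caller must free(), or NULL on error. */
void *read_whole_file_uncached(const char *path, size_t *out_len)
{
    assert(out_len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

#ifdef F_NOCACHE
    /* Darwin only: ask the kernel not to keep this file's data in the
     * buffer cache. Advisory, so a failure here is ignored. */
    (void)fcntl(fd, F_NOCACHE, 1);
#endif

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size < 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t cap = ((len + page - 1) / page) * page; /* round up to whole pages */
    if (cap == 0)
        cap = page;

    void *buf = NULL;
    if (posix_memalign(&buf, page, cap) != 0) {
        close(fd);
        return NULL;
    }

    /* Ideally one read() covers the whole file; the loop only handles
     * short reads and EINTR. */
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            free(buf);
            close(fd);
            return NULL;
        }
        if (n == 0)
            break; /* file shrank underneath us */
        got += (size_t)n;
    }

    close(fd);
    *out_len = got;
    return buf;
}
```

Errors show up as an ordinary NULL return here, which is part of the appeal compared with a mapping.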
The different use case I'm interested in (and I know it's different, so I'm forking this thread) is more database-like: essentially random access to an arbitrarily-large file. In this case I think mmap makes more sense (modulo your arguments below.)
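To make the database-like case concrete, here is a rough sketch of a read-only mapping with random access to fixed-size records. The names mapped_file, map_file_readonly, record_at and unmap_file are all hypothetical, and the fixed-size record layout is an assumption for illustration only; a real database file would have its own page and node format.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for a read-only mapped file. */
typedef struct {
    const unsigned char *base;
    size_t len;
} mapped_file;

/* Map the whole file read-only. Returns 0 on success, -1 on error
 * (including an empty file, which cannot be mapped). */
int map_file_readonly(const char *path, mapped_file *out)
{
    assert(out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    out->base = p;
    out->len = (size_t)st.st_size;
    return 0;
}

/* Return a pointer to record `index`, or NULL if it is out of range.
 * No I/O happens here; the first touch of the page may fault it in. */
const void *record_at(const mapped_file *mf, size_t record_size, size_t index)
{
    if (record_size == 0 || index >= mf->len / record_size)
        return NULL;
    return mf->base + index * record_size;
}

void unmap_file(mapped_file *mf)
{
    munmap((void *)mf->base, mf->len);
    mf->base = NULL;
    mf->len = 0;
}
```

Only the pages that actually get touched are read from disk, which is the attraction when the file is much larger than what any one lookup needs.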
> • There is no error handling mechanism. If the volume containing the file goes away (removable drive, network share, etc.) then you're typically going to be terminated as soon as you touch any page in the mapped range.
Yes, this is a bitch. My rationalization has been that it's OK to mmap a file that's either on the boot volume or in the user's home directory, because if either of those goes away, you're basically hosed anyway.
> • It's not deterministic; any time you touch a page in the range, you can stall waiting for I/O.
True. But as you get higher up in layers of abstraction, determinism tends to evaporate anyway. For example, in CoreData (or any other object/database mapping) accessing any property of any persistent object might cause a db query involving I/O.
> • Because I/O for mapped ranges is reactive, rather than proactive, you have limited options for hinting to the system what your future access patterns are going to be like.
Also true. But in something like a database (unless it's an insanely-tightly-tuned one like Oracle that doesn't use a filesystem anyway) the access patterns are very difficult to predict at all. An index lookup involves a random walk through a chain of B-tree nodes that are each located at essentially-random places in the file.
> • It's not transactional, meaning that maintaining coherency in the file vs. sudden termination, application bugs, etc. is very difficult.
If I understand what you mean correctly, this is something that applies only to writeable mappings, which I agree are very hard to manage. (Although coherency is really hard to maintain even with traditional file I/O, viz. all the problems with database corruption in earlier versions of SQLite.)
--Jens
Darwin-dev mailing list (email@hidden)