On Feb 5, 2010, at 8:33 AM, Jens Alfke wrote:

On Feb 4, 2010, at 11:39 PM, Michael Smith wrote:

Actually, no. Reading from an F_NOCACHE file into a properly aligned, allocated block will be significantly more efficient than using mmap.
How so?
Because you're not taking pagefaults incrementally as you touch pages in the block. One system call, maximally-sized I/O -> minimal overhead.
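A minimal sketch of what that pattern looks like on Darwin, assuming a 4 KB page size and glossing over error handling; the function name and path handling are just for illustration:

#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Read an entire file with the cache bypassed: one descriptor, one
 * maximally-sized read into a page-aligned buffer.  Short reads and
 * most error paths are glossed over to keep the sketch small. */
void *read_whole_file(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    fcntl(fd, F_NOCACHE, 1);            /* bypass the unified buffer cache */

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return NULL; }
    size_t len = (size_t)st.st_size;

    /* Page-aligned allocation, rounded up to a page multiple, so the
     * uncached I/O can run at full size. */
    void *buf = NULL;
    size_t alloc = (len + 4095) & ~(size_t)4095;
    if (posix_memalign(&buf, 4096, alloc ? alloc : 4096) != 0) { close(fd); return NULL; }

    ssize_t got = pread(fd, buf, len, 0);   /* one system call */
    close(fd);
    if (got != (ssize_t)len) { free(buf); return NULL; }

    *out_len = len;
    return buf;
}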
Agreed. But to clarify, this is true if you need to access the entire file ASAP (and the file is small compared to the amount of RAM.) Which is the OP's situation, to be sure.
It's not that simple. Even if you don't need to access the entire file, you are frequently in a much better position than the VM to know how much you need and when you will need it.

The different use case I'm interested in (and I know it's different, so I'm forking this thread) is more database-like: essentially random access to an arbitrarily-large file. In this case I think mmap makes more sense (modulo your arguments below.)
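A rough sketch of that mmap-based random-access pattern; the names are illustrative and error handling is omitted:

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the file once, then treat it as memory; the first touch of each
 * page faults it in on demand.  MAP_FAILED and other errors are not
 * checked here. */
uint8_t *map_readonly(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    uint8_t *base = mmap(NULL, (size_t)st.st_size, PROT_READ,
                         MAP_FILE | MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping keeps the file alive */

    *out_len = (size_t)st.st_size;
    return base;                    /* index into this at arbitrary offsets */
}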
Only if you honestly believe that you can't cache the file better than the VM does.

• There is no error handling mechanism. If the volume containing the file goes away (removable drive, network share, etc.) then you're typically going to be terminated as soon as you touch any page in the mapped range.
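That termination arrives as SIGBUS. In principle a single-threaded reader can intercept it with sigsetjmp/siglongjmp, though doing so around every access is exactly the kind of clumsiness at issue; a sketch, assuming one thread and ignoring handler re-installation details:

#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static sigjmp_buf fault_return;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(fault_return, 1);
}

/* Returns 0 and fills *out on success, -1 if touching the page raised
 * SIGBUS (e.g. the backing volume disappeared). */
int read_mapped_byte(const volatile uint8_t *mapped, size_t offset, uint8_t *out)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigaction(SIGBUS, &sa, NULL);

    if (sigsetjmp(fault_return, 1) == 0) {
        *out = mapped[offset];      /* may fault if the file went away */
        return 0;
    }
    return -1;                      /* surfaced as an error, not a crash */
}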
Yes, this is a bitch. My rationalization has been that it's OK to mmap a file that's either on the boot volume or in the user's home directory, because if either of those goes away, you're basically hosed anyway.
Not a given, especially if you are writing into the file. You need to consider what happens if the problem is transient (e.g. the user reboots, reconnects, etc.) - is the file going to be in a recoverable state?

• It's not deterministic; any time you touch a page in the range, you can stall waiting for I/O.
True. But as you get higher up in layers of abstraction, determinism tends to evaporate anyway. For example, in CoreData (or any other object/database mapping) accessing any property of any persistent object might cause a db query involving I/O.
Now you're changing the subject. Let's stick to file data, since that's what this conversation is about.

• Because I/O for mapped ranges is reactive, rather than proactive, you have limited options for hinting to the system what your future access patterns are going to be like.
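Concretely, the hinting paths differ: madvise(2) can advise a mapped range but it is only advice, while Darwin's F_RDADVISE fcntl asks for an explicit read-ahead of a byte range through a descriptor. A sketch with placeholder offsets:

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hinting for a mapped range: purely advisory, the VM may ignore it. */
void hint_mapped(void *base, size_t len)
{
    madvise(base, len, MADV_RANDOM);      /* access will be scattered */
    madvise(base, len, MADV_WILLNEED);    /* please start paging it in */
}

/* Hinting for a plain descriptor on Darwin: F_RDADVISE asks the kernel
 * to read a specific byte range ahead of time. */
void hint_descriptor(int fd, off_t offset, int count)
{
    struct radvisory ra;
    ra.ra_offset = offset;
    ra.ra_count  = count;
    fcntl(fd, F_RDADVISE, &ra);
}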
Also true. But in something like a database (unless it's an insanely-tightly-tuned one like Oracle that doesn't use a filesystem anyway) the access patterns are very difficult to predict at all. An index lookup involves a random walk through a chain of B-tree nodes that are each located at essentially-random places in the file.
Again, it's a question of who knows what. In the case you describe here, the database has some pretty good ideas about its access patterns for file pages containing index nodes vs. those that contain row data, and that's something that it can do something about if it's implementing direct I/O, but not if it's just mapped the database.

• It's not transactional, meaning that maintaining coherency in the file vs. sudden termination, application bugs, etc. is very difficult.
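The transactional point is about ordering: with explicit I/O you can force a log record to stable storage (F_FULLFSYNC on Darwin) before the change it describes, whereas dirty pages in a writable mapping are written back whenever the pager decides, with msync giving only coarse control. A sketch with invented names:

#include <fcntl.h>
#include <unistd.h>

/* Append a log record and make it durable before touching the main file,
 * so a crash in between leaves the file recoverable from the log.
 * apply_change, log_fd and data_fd are invented names for illustration. */
int apply_change(int log_fd, int data_fd,
                 const void *log_rec, size_t log_len,
                 const void *new_data, size_t data_len, off_t data_off)
{
    if (write(log_fd, log_rec, log_len) != (ssize_t)log_len)
        return -1;

    /* fsync() alone can leave the data in the drive's write cache;
     * F_FULLFSYNC asks the device itself to flush.  Fall back to fsync()
     * where F_FULLFSYNC isn't supported. */
    if (fcntl(log_fd, F_FULLFSYNC) != 0 && fsync(log_fd) != 0)
        return -1;

    /* Only now modify the main file in place. */
    if (pwrite(data_fd, new_data, data_len, data_off) != (ssize_t)data_len)
        return -1;

    return 0;
}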
If I understand what you mean correctly, this is something that applies only to writeable mappings, which I agree are very hard to manage. (Although coherency is really hard to maintain even with traditional file I/O, viz. all the problems with database corruption in earlier versions of sqlite.)
The difference here, IMO, is between "hard" and "really not practical", which is still a big deal.
= Mike
-- Excellence in any department can be attained only by the labor of a lifetime; it is not to be purchased at a lesser price -- Samuel Johnson