On Sat, Jul 10, 2004 at 10:11:40AM +0400, Nikita Danilov wrote:
> What are the reasons for the XNU kernel using a separate buffer cache
> (storing struct nfsbuf's) for NFS? It seems that all buffer cache
> operations go through the file system's VOP_STRATEGY(), at which point
> NFS could initiate network I/O.
>
> There should be some important reason I am missing, because in the
> current state of affairs there is significant code duplication (hashing,
> async I/O daemon, the whole of nfs_bio.c), NFS doesn't enjoy the
> advantages of vfs_cluster.c, etc.

The NFS buffer cache code was added in 10.3. Before that (10.2 and
earlier), NFS used the standard buffer cache. So, if you'd like to see
what the code looked like when NFS did use the standard buffer cache,
check out the 10.2 code.

The primary reasons for NFS using its own buffer cache are multi-page
buffer support and better unstable write support:

* The standard buffer cache code doesn't handle multi-page buffers well.
  struct buf has only a single bit (B_WASDIRTY) to reflect the
  clean/dirty status of the buffer/page in the VM. In order to increase
  NFS I/O performance in 10.3, we needed to be able to increase the
  sizes of the NFS I/O requests and also the buffers. With only one bit
  to reflect the clean/dirty status of the buffer in the VM, you can't
  (correctly) handle multi-page buffers being mmap()ed.

* The standard buffer cache and cluster code don't understand NFSv3's
  notion of unstable writes. The NFS client can ask the NFS server to
  respond to a write immediately without waiting for the data to make it
  to stable storage. Later, the NFS client can make one "commit" request
  to force all that data to stable storage (if it isn't there already).
  During the time between the completion of the unstable write and the
  commit, the buffer on the NFS client needs to be marked as "needs
  commit", which essentially means that this data was written, so we
  should only need to send a commit later, but if the server goes down
  we may also need to write that data to the server again.

  Before 10.3, NFS performance suffered whenever the bcleanbuf thread
  would try to clean up a bunch of "dirty" buffers. bcleanbuf would pick
  out a single buffer to try to clean and issue a write for it. The NFS
  code would see that the buffer only needed to be committed, so it
  would simply send a commit request for that buffer. Whenever the
  system ran low on buffers, this caused the NFS code to send a single
  commit request for each buffer instead of allowing the code to
  coalesce the "needs commit" buffers into one commit request.

Hope that helps.

--macko
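
To make the two points above concrete (per-page dirty state, and
coalescing "needs commit" buffers into one commit), here is a minimal C
sketch. It is not the XNU code and does not reproduce struct nfsbuf:
struct sketch_buf, the sketch_* functions, the 4 KB page size and the
32-page limit are all assumptions invented for illustration. Clearing
the dirty bits when an unstable write completes is likewise a
simplification of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_MAX_PAGES 32u    /* one bit per page in a 32-bit mask */

/* Hypothetical multi-page buffer; NOT the real struct nfsbuf. */
struct sketch_buf {
    uint64_t offset;       /* file offset of the buffer's first byte */
    uint32_t npages;       /* number of pages the buffer covers */
    uint32_t dirty_pages;  /* bit i set => page i is dirty */
    bool     needs_commit; /* written unstably, not yet committed */
};

/* Byte range [start, end) that a single commit request would cover. */
struct sketch_commit_range {
    uint64_t start;
    uint64_t end;
};

/* Record that one page of the buffer was modified. */
static void sketch_mark_page_dirty(struct sketch_buf *bp, uint32_t page)
{
    assert(bp->npages <= SKETCH_MAX_PAGES && page < bp->npages);
    bp->dirty_pages |= (uint32_t)1 << page;
}

/* Per-page query that a single buffer-wide dirty bit cannot answer. */
static bool sketch_page_is_dirty(const struct sketch_buf *bp, uint32_t page)
{
    assert(bp->npages <= SKETCH_MAX_PAGES && page < bp->npages);
    return ((bp->dirty_pages >> page) & 1u) != 0;
}

/* The server acknowledged an unstable write: the data was sent, but it
 * may not be on stable storage yet, so the buffer now needs a commit.
 * (Clearing the dirty bits here is an assumption of this sketch.) */
static void sketch_unstable_write_done(struct sketch_buf *bp)
{
    bp->dirty_pages = 0;
    bp->needs_commit = true;
}

/* Compute one byte range spanning every "needs commit" buffer, so that
 * one commit request can replace one request per buffer. Returns the
 * number of buffers covered; *out is only meaningful if that is > 0. */
static size_t sketch_coalesce_commits(const struct sketch_buf *bufs,
                                      size_t nbufs,
                                      struct sketch_commit_range *out)
{
    size_t covered = 0;

    out->start = 0;
    out->end = 0;
    for (size_t i = 0; i < nbufs; i++) {
        if (!bufs[i].needs_commit)
            continue;
        uint64_t lo = bufs[i].offset;
        uint64_t hi = lo + (uint64_t)bufs[i].npages * SKETCH_PAGE_SIZE;
        if (covered == 0 || lo < out->start)
            out->start = lo;
        if (covered == 0 || hi > out->end)
            out->end = hi;
        covered++;
    }
    return covered;
}
```

A caller would run sketch_coalesce_commits() over all of a file's
buffers and, if it returns non-zero, issue one commit for the resulting
range. The range may also span buffers that did not need a commit; the
sketch accepts that in exchange for sending a single request.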