Re: fsync while appending to a file
- Subject: Re: fsync while appending to a file
- From: Terry Lambert <email@hidden>
- Date: Mon, 25 Oct 2010 16:27:20 -0700
On Oct 25, 2010, at 2:27 PM, Joel Reymont wrote:
> So how does this work when it works?
>
> How can you be writing to the buffers that you are trying to sync to disk?
>
> Thanks, Joel
If you were trying to modify data in the underlying page, the page could be "busy" and delay your write, or it could be in the middle of being written and delay your fsync(). Data is either committed or it isn't, depending on who gets in first, and you take your chances, unless you do advisory file locking, or the descriptor in the other process was passed via a UNIX domain socket or inherited as a result of a fork()/posix_spawn() (in which case the two descriptors will share a fileglob, offset pointer, and so on, but maybe still not lock against each other). You could also be avoiding the buffer cache entirely by using direct I/O (F_NOCACHE; there are odd rules under which data that is already cached does not actually bypass the cache in this case).
Otherwise, your fd's are distinct and the fileproc that the fd points to has a separate fileglob, but a shared vnode. The fg_data in the fileglob, when it refers to a vnode, will point to the same vnode for every open instance of the same file (the file system will always vend the same vnode, if there is an existing reference already). For HFS, the vnode's v_data will point to the same cnode, and as I previously noted, it will ensure that the hfs_lock() is held over the call.
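The difference between a shared and a separate fileglob can be observed from user space through the file offset. The following is a minimal sketch, not kernel code: it assumes only POSIX calls, and the helper names shares_offset() and compare_dup_and_open() and the temporary file path are made up for illustration. A dup()'ed descriptor stands in for the inherited/passed case, and a second open() stands in for an unrelated process opening the same file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if moving the offset of `a` also moves
 * the offset of `b`, 0 if it does not, and -1 on error. */
int shares_offset(int a, int b)
{
    assert(a >= 0 && b >= 0);
    if (lseek(a, 0, SEEK_SET) < 0 || lseek(b, 0, SEEK_SET) < 0)
        return -1;
    if (lseek(a, 4, SEEK_SET) < 0)
        return -1;
    return lseek(b, 0, SEEK_CUR) == 4;
}

/* Hypothetical helper: compares a dup()'ed descriptor and a second open()
 * of the same file against the original. Returns 0 on success. */
int compare_dup_and_open(int *dup_shares, int *open_shares)
{
    char path[] = "/tmp/fsync_demo_XXXXXX";   /* illustrative path */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    int fd_dup = dup(fd);              /* same open file, same offset   */
    int fd_open = open(path, O_RDWR);  /* new open file, same file data */
    int ok = (fd_dup >= 0 && fd_open >= 0);
    if (ok) {
        *dup_shares = shares_offset(fd, fd_dup);
        *open_shares = shares_offset(fd, fd_open);
    }

    if (fd_dup >= 0)
        close(fd_dup);
    if (fd_open >= 0)
        close(fd_open);
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Calling compare_dup_and_open() should report that the dup()'ed descriptor follows the original's offset while the separately opened one does not, even though both refer to the same file contents.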
Again, this is a property of HFS, and other file systems might not make as strong guarantees (HFS might not always act this way, either), so portable code should not depend on this behaviour.
In any case, since you are racing one process into the write() for the append, and racing the other process into the fsync(), which one wins the race is going to determine whether the new (dirty) pages have been attached to the file yet. Any dirty pages that are attached to the file before the fsync() will be committed to the disk (that is, to the drive's cache), and the fsync() will not return until they are done being committed. If you need them committed to stable storage, you'd need to do an FUA to the drive, which you can do using an fcntl(fd, F_FULLFSYNC), which is considerably more expensive.
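As a rough sketch of the F_FULLFSYNC call mentioned above: the helper name flush_to_stable_storage() is made up, and the fallback to plain fsync() is my assumption about reasonable behaviour when the fcntl is unavailable or fails on a given file system, not something the kernel does for you.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, -1 on error (errno is set). */
int flush_to_stable_storage(int fd)
{
    assert(fd >= 0);
#ifdef F_FULLFSYNC
    /* Darwin: also ask the drive to flush its own cache. */
    if (fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
    /* Assumption: if the request fails here, settle for fsync(). */
#endif
    /* Only guarantees the data has been handed to the drive. */
    return fsync(fd);
}
```

Given the cost, this is probably something to call at commit points rather than after every append.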
Either way, unless you do locking, you are still racing to see who gets there first, and as I said previously, you shouldn't assume that the fsync() call is a serialization barrier for the write() call or vice versa, since they can race. That's FS implementation defined. Also, for very large writes, the vn_rdwr() can loop, breaking the write() operation up into smaller chunks in any case, so there's no strict guarantee of atomicity, only idempotence for the overall write(). In simple terms, if it broke the write up into three I/Os, then you could fsync() when anywhere from 0..3 of them had been written, depending on how far along in the loop it was when the other process raced in and fsync()'ed.
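If you do want the append and the fsync() to be ordered with respect to each other, one option is advisory locking with flock(). This is a minimal sketch under assumptions: both processes open the file themselves, the writer opens it with O_APPEND, both cooperate by using these wrappers, and the names locked_append() and locked_fsync() are made up. Note that flock() locks belong to the open file, so descriptors that were dup()'ed or inherited across fork() share one lock and will not exclude each other this way.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: appends len bytes while holding an exclusive
 * advisory lock, looping over short writes. Returns 0 or -1. */
int locked_append(int fd, const void *buf, size_t len)
{
    assert(fd >= 0);
    if (flock(fd, LOCK_EX) != 0)
        return -1;

    const char *p = buf;
    int rc = 0;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry */
            rc = -1;
            break;
        }
        p += n;
        len -= (size_t)n;
    }

    int saved = errno;             /* keep the write() error, if any */
    flock(fd, LOCK_UN);
    errno = saved;
    return rc;
}

/* Hypothetical helper: fsync() under the same lock, so it runs either
 * before the whole append or after it, never in the middle. */
int locked_fsync(int fd)
{
    assert(fd >= 0);
    if (flock(fd, LOCK_EX) != 0)
        return -1;
    int rc = fsync(fd);
    int saved = errno;
    flock(fd, LOCK_UN);
    errno = saved;
    return rc;
}
```

The lock only helps between processes that both take it; a writer that calls write() directly still races with the fsync() as described above.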
-- Terry