Re: FSSetForkSize writing massive amounts of data over network?
- Subject: Re: FSSetForkSize writing massive amounts of data over network?
- From: James Bucanek <email@hidden>
- Date: Wed, 2 Feb 2011 22:39:40 -0700
Mark Day <email@hidden> wrote (Wednesday, February 2, 2011 5:09 PM -0800):

> On Feb 2, 2011, at 3:36 PM, James Bucanek wrote:
>> I thought it might be an unusual file system or server, but they
>> assure me that it's a PowerMac and Mini, both running 10.6, over
>> ethernet. The Mini is connected via FireWire to an Iomega RAID (0)
>> formatted HFS+.
> It may have something to do with the network protocol being used
> (especially AFP vs. NFS or SMB or other).
This is AFP: OS X 10.6 client to OS X 10.6 file sharing.
>> I confirmed that this is happening with a similar configuration
>> here, which really surprises me. On local volumes, an FSSetForkSize
>> is as fast as a delete, and simply assigns free blocks to the file.
> Sort of. If you then closed the fork on the local volume (HFS), the
> close would take a long time as the kernel was busy writing out the
> gigabytes of zeroes.
Interestingly, I seem to have dodged that bullet. If my "reserve" file
isn't needed, its EOF is set back to 0 and then closed, so it's never
closed with some huge EOF and therefore never gets filled.
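For reference, the reserve/release dance looks roughly like this. It's
a simplified sketch: the function and variable names are placeholders,
and error handling is omitted.

    #include <CoreServices/CoreServices.h>

    // Sketch only: reserve space by setting a large logical EOF, and
    // always shrink it back to 0 before closing so the fork is never
    // closed with a huge unwritten range to zero-fill.
    static void ReserveAndRelease(const FSRef *reserveRef, SInt64 reserveBytes)
    {
        HFSUniStr255 forkName;
        FSIORefNum   forkRef;

        FSGetDataForkName(&forkName);
        FSOpenFork(reserveRef, forkName.length, forkName.unicode,
                   fsRdWrPerm, &forkRef);

        // Reserve: on local HFS+ this just assigns free blocks.
        FSSetForkSize(forkRef, fsFromStart, reserveBytes);

        /* ... if the reserve turns out not to be needed ... */

        // Release: EOF back to 0 *before* the close, so nothing fills.
        FSSetForkSize(forkRef, fsFromStart, 0);
        FSCloseFork(forkRef);
    }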
> On Mac OS X, HFS follows the POSIX behavior of requiring that all
> unwritten ranges of a file must return zeroes when read back.
That's news to me. I guess the documentation is long overdue for an
update (I'll file a bug report). From the FSSetForkSize documentation:

    Discussion
    ... If the fork’s new size is greater than the fork’s current
    size, then the additional bytes, between the old and new size,
    will have an undetermined value.

This led me to believe that no data was actually written during an
FSSetForkSize, and that any new blocks allocated to the file would
simply contain the blocks' previous contents.
On further reflection, the documented behavior (exposing a block's
previous contents) is probably a security concern.
> The File Manager APIs tend to have somewhat different implementations
> for HFS-like file systems and the rest. I'll bet that you're seeing
> that extra writing of zeroes because the File Manager can't tell
> whether the network file system (and the file system on the server
> itself) supports sparse files, and because it is trying to be
> compatible with the historic behavior of FSSetForkSize on HFS, where
> it not only sets the size of the fork, but also allocates all of that
> new space.
Again, this is HFS+ on OS X to OS X via the Apple File Protocol.
What's really weird (to me) is that the client is the one doing the
writing. In exactly the same configuration, the client OS X system can
determine that the destination AFP file server is capable of copying
files locally (a feature I use often) and will off-load the entire
file duplication to the server. But it can't figure out that the
server supports something as simple as "set EOF", so it has to write a
billion zeros over the network?
>> Next I tried to use FSAllocateFork on the network volume and it
>> returns almost instantly, but it also doesn't appear to do anything.
>> Testing the free space on the volume before and after the
>> FSAllocateFork call returns the same value. Setting the fork size
>> after the FSAllocateFork call produces the same results as before.
>> So it doesn't appear that anything was actually allocated by
>> FSAllocateFork.
> From the documentation for FSAllocateFork:
>
>     The FSAllocateFork function attempts to allocate requestCount
>     bytes of physical storage starting at the offset specified by the
>     positionMode and positionOffset parameters. For volume formats
>     that support preallocated space, you can later write to this
>     range of bytes (including extending the size of the fork) without
>     requiring an implicit allocation.
> Notice the "attempts to allocate" wording. HFS is one of the few file
> systems that is capable of having the physical file size be
> significantly larger than the logical size (more than just rounding
> up to a multiple of an allocation block). Most file systems cannot
> physically allocate extra space without also changing the logical
> EOF. In the case of the network volume, I would expect the
> actualCount output to be zero, indicating that it didn't actually
> allocate any additional physical space.
The actualCount value returned for the network volume is the same as
the requested value, indicating a successful allocation. In my test
cases those were values between 100MB and 160GB. In every test, the
free space reported by the volume remained the same.

On the local drive, the actualCount returned was also the same as the
requested value, but the free space on the volume dropped accordingly.
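For the record, this is roughly how I measured it (placeholder names,
error checks omitted):

    #include <CoreServices/CoreServices.h>

    // Compare the volume's free bytes before and after FSAllocateFork.
    static void TestAllocation(const FSRef *fileRef, FSIORefNum forkRef,
                               UInt64 requestBytes)
    {
        FSCatalogInfo catInfo;
        FSVolumeInfo  volInfo;
        UInt64        freeBefore, freeAfter, actualCount = 0;

        // Find the volume the file lives on.
        FSGetCatalogInfo(fileRef, kFSCatInfoVolume, &catInfo,
                         NULL, NULL, NULL);

        FSGetVolumeInfo(catInfo.volume, 0, NULL, kFSVolInfoSizes,
                        &volInfo, NULL, NULL);
        freeBefore = volInfo.freeBytes;

        FSAllocateFork(forkRef, kFSAllocDefaultFlags, fsFromStart, 0,
                       requestBytes, &actualCount);

        FSGetVolumeInfo(catInfo.volume, 0, NULL, kFSVolInfoSizes,
                        &volInfo, NULL, NULL);
        freeAfter = volInfo.freeBytes;

        // Local HFS+: actualCount == requestBytes, freeAfter drops.
        // Over AFP:   actualCount == requestBytes, freeAfter unchanged.
    }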
> Sorry, but with the wide variety of file systems out there, it's
> pretty much impossible to guarantee that you have set aside space for
> a file to grow without actually writing to the file. If you were
> going through the POSIX APIs instead, you could potentially set the
> file size without incurring I/O, but then the file might be sparse
> and a subsequent write could fail because you ran out of space.
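For anyone following along, that POSIX route would be something like
the sketch below. The helper name, path handling, and sizes are mine,
purely illustrative.

    #include <fcntl.h>
    #include <unistd.h>

    // ftruncate() grows the logical size without writing anything, so
    // on a file system that supports sparse files no space is actually
    // set aside -- a later write() into the range can still fail with
    // ENOSPC.
    static int posix_reserve(const char *path, off_t bytes)
    {
        int fd = open(path, O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return -1;
        int result = ftruncate(fd, bytes);  // instant: no zeroes here
        close(fd);
        return result;
    }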
I'm willing to accept that limitation. With other file systems, I know
that my application might have to physically write a bunch of bytes.
But on OS X talking to OS X using an HFS+ filesystem, I'd like to know

(a) why the fill data isn't written locally by the server, and
(b) why FSAllocateFork doesn't seem to do anything.
> We've generally advised applications to use FSAllocateFork (or the
> equivalent at other API layers) to try to reserve the space up front,
> then just write to the file when they need to, and handle errors
> appropriately. When you happen to be using HFS, you end up with the
> behavior you desire. For other file systems, avoiding the
> FSSetForkSize prevents the gigantic write of zeroes that you've
> observed, but may lead to out-of-space errors as you write the file.
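If I understand the advice, the pattern would be roughly this (again a
sketch, with placeholder names and abbreviated error handling):

    #include <CoreServices/CoreServices.h>

    // Treat the up-front reservation as best-effort, and handle
    // out-of-space errors on the actual writes.
    static OSErr WriteWithReservation(FSIORefNum forkRef,
                                      UInt64 neededBytes,
                                      const void *buffer,
                                      ByteCount bufferSize)
    {
        UInt64 actualCount = 0;
        OSErr  err;

        // On HFS+ this sets aside physical space; on other file
        // systems it may quietly do nothing.
        FSAllocateFork(forkRef, kFSAllocDefaultFlags, fsFromStart, 0,
                       neededBytes, &actualCount);

        // Write as usual, and deal with dskFulErr if the
        // "reservation" turned out to be advisory.
        err = FSWriteFork(forkRef, fsFromStart, 0,
                          bufferSize, buffer, NULL);
        if (err == dskFulErr) {
            // Out of space despite the reservation: report/clean up.
        }
        return err;
    }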
But I can't use FSAllocateFork, because it's broken on file servers. ;)
And this isn't my first run-in with FSAllocateFork. It was horribly
broken in 10.4, was also broken in 10.5 file sharing, and is still
broken in whatever version of AFP ships with the AirPort Base Station
and Time Capsule (bug #4843619). In that bug, the problem is that
FSAllocateFork miscalculates the size of the allocation by always
adding the size of the existing file to the allocation. This is moot
if the file length is zero, but not if the file already contains data.

My workaround was to use FSSetForkSize instead. Oh, the irony...
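With made-up numbers, the bug and the workaround look something like
this (forkRef is a placeholder; the sizes are examples, not from the
bug report):

    #include <CoreServices/CoreServices.h>

    // Say the file already holds 10 MB and I ask for 100 MB of
    // physical space from the start of the fork.
    static void IllustrateBug4843619(FSIORefNum forkRef)
    {
        UInt64 actualCount = 0;

        FSAllocateFork(forkRef, kFSAllocDefaultFlags, fsFromStart, 0,
                       100ULL * 1024 * 1024, &actualCount);
        // Affected AFP servers allocate ~110 MB: the existing fork
        // size gets added to the request. Harmless on an empty file,
        // wrong otherwise.

        // The workaround: set the logical size instead, which
        // allocates exactly what's asked for...
        FSSetForkSize(forkRef, fsFromStart, 100LL * 1024 * 1024);
        // ...and is precisely the call that zero-fills over AFP.
    }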
--
James Bucanek