Re: Problem with fcntl/F_LOG2PHYS (was: Re: How to read files from disk directly?)
- Subject: Re: Problem with fcntl/F_LOG2PHYS (was: Re: How to read files from disk directly?)
- From: Stan Sieler <email@hidden>
- Date: Thu, 07 Jul 2011 11:41:14 -0700
Mark had some very good comments:
> That looks suspiciously like the block number it got back from the file system was -1. A block number of -1 from VNOP_BLOCKMAP means a "hole" -- that portion of the file is sparse. HFS and HFS Plus don't implement sparse files. But they do have two features which might trigger this behavior.
>
That's my suspicion, too. Mark mentioned:
> The first is delayed allocation. If there is sufficient free space, it won't bother immediately trying to figure out
and
> The second is delayed zero filling. If you grow the size of a file without writing to that new space, then we need to
I then realized that all the odd files were relatively new ... so I modified my
test app to do fsync() (thanks, Mark!) before the fcntl/F_LOG2PHYS.
Success!
The big file (about 730 MB) took about 30 seconds to fsync ... when it was done,
fcntl/F_LOG2PHYS said it occupies a single chunk of contiguous disk space of
1,073,741,824 bytes. ("ls -l" still says it's 738,197,504 bytes in size.)
(Why it's 1 GB is another question ... I presume/hope that it's some kind of "soft" allocation, and that
if the file system fills up (and the file is closed) that the extra 300 MB would be used for other
allocations.)
All seven files with the 0xfffffffffffff000 now report valid disk addresses.
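
For readers who want to try the same workaround, here is a minimal sketch of the fsync()-then-F_LOG2PHYS sequence. It is not the actual test app: the function name and its out-parameters are hypothetical, and reading the contiguous-chunk size from l2p_contigbytes is an assumption based on the figure reported above. F_LOG2PHYS maps the file's current offset, so the sketch seeks first. The #ifdef keeps it compilable on systems that lack F_LOG2PHYS, where it simply fails.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush the file, then ask where file_offset lives
 * on the device. Returns 0 on success, -1 on any failure (including
 * platforms without F_LOG2PHYS).
 */
int physical_offset_after_fsync(int fd, off_t file_offset,
                                long long *dev_offset, long long *contig)
{
    assert(dev_offset != NULL && contig != NULL);
#ifdef F_LOG2PHYS
    struct log2phys l2p = {0};

    /* Force delayed allocation / zero filling to happen now.
     * This can be slow on large, recently grown files. */
    if (fsync(fd) == -1)
        return -1;

    /* F_LOG2PHYS reports on the current file offset. */
    if (lseek(fd, file_offset, SEEK_SET) == (off_t)-1)
        return -1;

    if (fcntl(fd, F_LOG2PHYS, &l2p) == -1)
        return -1;

    *dev_offset = (long long)l2p.l2p_devoffset;
    /* Assumption: the contiguous size is returned in this field. */
    *contig = (long long)l2p.l2p_contigbytes;
    return 0;
#else
    (void)fd;
    (void)file_offset;
    return -1;
#endif
}
```

A caller would open the file read-only, pass the descriptor and an offset of 0, and print the two values on success.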
> zero fill it for security purposes. But if you're just going to write to it a little later, writing all of those zeroes is a big waste. So we keep track of newly allocated, but not yet written, ranges within the file. Basically, we
(A concept I know as "virgin free space", introduced to me by MPE XL around 1984 ...
good to hear that the Mac has it, thanks!)
> temporarily pretend like the file is sparse. And in fact, we return block number -1 from VNOP_BLOCKMAP in this case so that other parts of the OS will automatically return zeroes to user space. (Note: those zeroes *do* get written eventually.)
(A seldom used feature of MPE XL (later MPE/iX) is that the file creator could specify the fill character ...
and it defaulted to zero for binary files, and ASCII blank for ASCII files.)
>
> As a workaround, you might try calling fsync() on the file descriptor before you call fcntl(... F_LOG2PHYS...). I think that will force the zero filling to happen immediately, and you should then get the real on-disk location. Note that doing so could cause a performance problem.
It's quick for fully-allocated files, and slow (as I found out :) on some sparsely allocated large files.
(So, I wouldn't do it for normal use.)
>
> Reads from sparse areas of a file do not cause those areas to be allocated. You just get zeroes back. You have to write to the sparse areas to cause them to be allocated. But as I mentioned above, HFS does not support sparse files persistently.
>
Good to know ... my prior experience with sparse files didn't have an allocation difference between reading
a non-allocated page vs. writing to a non-allocated page. (I'd submit an enhancement request for MPE/iX, but it would be years too late :)
>> The kernel might be doing:
>> which might imply that "bn" (from VOP_BMAP) is bad.
>
> Those VOP_xxx names indicate you're using a really old version of the source. We switched to VNOP_xxx routines in Mac OS X 10.4.
I don't know where the current user-accessible source is ... I was using an older source I found via Google
mostly to indicate a probable source for the strange data (that "bn" was -1). It seemed likely that the current code, which I hadn't seen, was probably similar. I mostly included it as something that might jog someone's memory.
In summary, the bug is in the documentation for fcntl F_LOG2PHYS, which should probably have text added like:
In cases where the file's current offset does not have disk space allocated, the value
0xfffffffffffff000 will be returned in the l2p_devoffset field. Note that this is
fairly rare. If you wish to ensure that all in-use pages of a file have been allocated
disk storage, call fsync() on the file prior to fcntl ( , F_LOG2PHYS) (note: this may
have performance implications).
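
As a rough illustration of the proposed text, a caller could test for the "no disk space allocated" value like this. This is a sketch under assumptions: the constant is simply the value observed in this thread, not a documented symbol, and both function names are hypothetical. The first helper shows one plausible origin of the value, a block number of -1 scaled by a 4096-byte block size; the block size is a guess that happens to match the observed number.

```c
#include <assert.h>
#include <stdint.h>

/* Value observed in l2p_devoffset for unallocated ranges (from this thread). */
#define L2P_UNALLOCATED_DEVOFFSET UINT64_C(0xfffffffffffff000)

/* Byte offset that results from scaling a "hole" block number (-1)
 * by block_size, viewed as an unsigned 64-bit quantity. */
uint64_t hole_block_to_byte_offset(uint32_t block_size)
{
    assert(block_size > 0);
    int64_t hole_block = -1;
    return (uint64_t)(hole_block * (int64_t)block_size);
}

/* Nonzero if dev_offset is the "not yet allocated" value. */
int devoffset_is_unallocated(uint64_t dev_offset)
{
    return dev_offset == L2P_UNALLOCATED_DEVOFFSET;
}
```

With a 4096-byte block, -1 * 4096 is -4096, which has the bit pattern 0xfffffffffffff000 in 64 bits. A caller that sees this value can fsync() the file and query again.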
I'll submit it ... done, Bug ID 9738333.
Thanks for hitting the nail on the head, Mark!
Stan