Re: Calling all Filesystem guru's: Help!
- Subject: Re: Calling all Filesystem guru's: Help!
- From: Brian Bergstrand <email@hidden>
- Date: Tue, 22 Jun 2004 21:35:51 -0500
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Well, thanks to Sam, I've found this bug and it is in the kernel:
Reported as bug 3707337: "Crash/Data loss when FS block size is < 4KB
because of bug in cluster_io()."
Summary:
I found a bug in cluster_io() when writing sparse files to my
filesystem (Ext2 for OS X:
http://sourceforge.net/projects/ext2fsx/):
My own bug report can be found at:
http://sourceforge.net/tracker/index.php?func=detail&aid=965119&group_id=64713&atid=508406
Basically, cluster_io() will overflow the var it uses to track the size of
the IO when a sparse file is read or written to a filesystem with a block
size < PAGE_SIZE (4096 in all current Darwin releases). This causes
continuous calls to VOP_CMAP() and it can also cause data to be lost
during a write.
I suspect this same bug would happen with Apple UFS if it allowed < 4KB
block sizes, but it currently doesn't.
Steps to reproduce:
Write any sparse file to a filesystem with a block size < 4KB.
Details:
I made an internal copy of cluster_io(), set a breakpoint on VOP_CLOSE(),
and traced into my cluster_io() copy.
On a 1KB block filesystem, I wrote 17 bytes at offset 0, and then the
same 17 bytes at offset 2048. This created a 2065 byte file and left a 1
block hole in the file:
i_size = 2065
i_db = {547, 0, 548, 0, 0, 0, 0, 0, 0, 0, 0, 0},
i_ib = {0, 0, 0},
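For anyone who wants to create the same file layout from user space, here is a minimal sketch. The helper name make_sparse_file and the payload bytes are made up for illustration; only the offsets and the 17-byte length come from the description above. Whether block 1 is really left unallocated depends on the filesystem and its block size.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes the same 17 bytes at offsets 0 and 2048 and
 * returns the resulting file size, or -1 on any error. */
off_t make_sparse_file(const char *path)
{
    static const char payload[17] = "sparse-file-test"; /* 16 chars + NUL */
    struct stat st;
    off_t result = -1;

    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The second write starts past EOF, so bytes 17..2047 are never written. */
    if (pwrite(fd, payload, sizeof payload, 0) == (ssize_t)sizeof payload &&
        pwrite(fd, payload, sizeof payload, 2048) == (ssize_t)sizeof payload &&
        fstat(fd, &st) == 0)
        result = st.st_size;

    if (close(fd) != 0)
        result = -1;
    return result;
}
```

On success the helper should return 2065, matching i_size above. On a 1KB block filesystem the unwritten range covers all of logical block 1, which is the hole.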
cluster_push(vp=0x29ae570)
cluster_try_push (vp=0x29ae570, EOF=2065, can_delay=0, push_all=1)
cluster_push_x (vp=0x29ae570, EOF=2065, first=0, last=1,
can_delay=0)
cluster_io (vp=0x29ae570, upl=0x29a4d80, upl_offset=0, f_offset=0,
non_rounded_size=2065, devblocksize=512, flags=38, real_bp=0x0,
iostate=0x0)
// flags = CL_COMMIT | CL_AGE | CL_ASYNC
388 size = (non_rounded_size + (devblocksize - 1)) & ~(devblocksize - 1);
(gdb) p size
$36 = 2560
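The rounding on line 388 can be checked in isolation. This is only a sketch of the same expression; the function name round_up_pow2 is mine, and it assumes devblocksize is a power of two (512 here).

```c
#include <assert.h>

/* Rounds n up to the next multiple of align. The mask trick only works
 * when align is a power of two, so that is asserted. */
unsigned int round_up_pow2(unsigned int n, unsigned int align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + (align - 1)) & ~(align - 1);
}

/* Reproduces the value from the gdb session: 2065 rounded up to 512. */
void check_rounded_size(void)
{
    assert(round_up_pow2(2065, 512) == 2560);
}
```

So the 2065 byte file is pushed as 2560 bytes, i.e. five 512 byte device blocks.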
In the while loop on line 443 (vfs_cluster.c, xnu-517.7.7), the first
time through the loop, VOP_CMAP() replies with:
(gdb) p blkno
$40 = 1094
(gdb) p io_size
$41 = 1024
And we get to the end of the loop with:
(gdb) p upl_offset
$55 = 1024
(gdb) p f_offset
$56 = 1024
(gdb) p size
$57 = 1536
This is all fine.
Now the second time through the loop, we are on the sparse hole and
VOP_CMAP() returns:
(gdb) p blkno
$61 = -1
(gdb) p io_size
$62 = 1024
So now the following is true, and we enter that condition block:
if ( (!(flags & CL_READ) && (long)blkno == -1) || io_size == 0) {
....
482 upl_offset += PAGE_SIZE_64;
(gdb)
483 f_offset += PAGE_SIZE_64;
(gdb)
484 size -= PAGE_SIZE_64;
(gdb)
485 continue;
}
And this is where the trouble happens:
(gdb) p upl_offset
$66 = 5120
(gdb) p f_offset
$67 = 5120
(gdb) p size
$68 = 4294964736
So not only does the size overflow, but the block after the hole is
skipped!
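The odd looking size value is just 32-bit unsigned wraparound. A small sketch of the arithmetic, assuming size is a 32-bit unsigned quantity (which the gdb output suggests); the names here are illustrative and not taken from vfs_cluster.c:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u /* stand-in for PAGE_SIZE_64 */

/* Subtracts one page from the remaining byte count, as line 484 does.
 * The result wraps modulo 2^32 whenever size < 4096. */
uint32_t remaining_after_page_step(uint32_t size)
{
    return size - PAGE_BYTES;
}

/* Reproduces the values printed by gdb after the hole is skipped. */
void check_gdb_values(void)
{
    assert(remaining_after_page_step(1536) == 4294964736u); /* 2^32 - 2560 */
    assert(1024u + PAGE_BYTES == 5120u);                    /* both offsets */
}
```

1536 - 4096 is -2560, which as an unsigned 32-bit value is 2^32 - 2560 = 4294964736.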
If io_size were used instead of PAGE_SIZE_64, then everything would be correct:
upl_offset += io_size == 2048
f_offset += io_size == 2048
size -= io_size == 512
In fact, this is what is done in the READ case:
510 if ((flags & CL_READ) && (long)blkno == -1) {
...
575 upl_offset += io_size;
576 f_offset += io_size;
577 size -= io_size;
}
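To compare the two behaviours side by side, here is a rough user-space simulation of the loop for the example file. It is a sketch, not the kernel code: walk_write, block_map and the iteration cap are invented for illustration, io_size is simplified to min(size, 1024), offsets past the end of the file are treated as unmapped, and the device block number 1096 for the third block is derived from i_db (548 * 2) rather than observed.

```c
#include <assert.h>
#include <stdint.h>

#define FS_BLOCK   1024u /* filesystem block size in the example */
#define PAGE_BYTES 4096u /* PAGE_SIZE on the releases discussed */
#define NBLOCKS    3u

/* Block map of the example file; -1 marks the hole at logical block 1. */
static const long block_map[NBLOCKS] = { 1094, -1, 1096 };

struct walk_result {
    unsigned mapped_blocks; /* non-hole blocks the loop reached */
    unsigned iterations;    /* loop passes before stopping */
    uint32_t final_size;    /* remaining size when the loop stopped */
};

/* Walks the file the way the write path does. On a hole it advances by a
 * full page (page_step != 0) or by io_size (page_step == 0). max_iter
 * bounds the runaway case so the simulation terminates. */
struct walk_result walk_write(uint32_t size, int page_step, unsigned max_iter)
{
    struct walk_result r = { 0, 0, 0 };
    uint64_t f_offset = 0;

    while (size != 0 && r.iterations < max_iter) {
        uint64_t lblk = f_offset / FS_BLOCK;
        long blkno = lblk < NBLOCKS ? block_map[lblk] : -1;
        uint32_t io_size = size < FS_BLOCK ? size : FS_BLOCK;

        r.iterations++;
        if (blkno == -1) {
            uint32_t step = page_step ? PAGE_BYTES : io_size;
            f_offset += step;
            size -= step; /* wraps when step > size */
            continue;
        }
        r.mapped_blocks++;
        f_offset += io_size;
        size -= io_size;
    }
    r.final_size = size;
    return r;
}
```

With size = 2560, walk_write(2560, 0, 16) finishes in three passes having reached both mapped blocks, while walk_write(2560, 1, 16) reaches only the first block and is still running when the cap is hit.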
Isolation:
I've also reproduced this bug on 10.3.3, and 10.2.8.
On Jun 20, 2004, at 1:19 PM, Sam Vaughan wrote:
> Hi Brian,
>
> My first guess would be that you're underflowing the size variable in
> the while loop in cluster_io(). It's fairly easy to do, trust me ;-)
> If you're returning a blkno of -1 for a hole and the size has
> underflowed, you end up in a loop with size working its way down from
> 2^32 one page at a time and f_offset working its way up just as you
> described.
>
> I can't seem to get web access working in this hotel room right now,
> so I can't follow the links you posted. Send me an email with some
> more details attached if you like.
>
> Sam
>
> > Okay, I'm at my wits end on a bug in the ext2 filesystem for OS X
> > <http://sourceforge.net/projects/ext2fsx>:
> >
> > When writing a sparse file (even one with just one hole), the fs can
> > hang or panic (with a Disk Image) the system.
> >
> > I've tracked this down to cluster_io() repeatedly calling VOP_CMAP
> > with higher and higher offsets (way past the file size) that eventually
> > cause our logical block address to overflow (this is much more likely
> > on <4KB block FS), thus making a negative offset which VOP_BMAP treats
> > as an indirect block address and I assume causes a weird page to be
> > read and passed to the storage driver by cluster_io/VOP_STRATEGY.
> >
> > Anyway, I can't figure why this happens. I've gone through everything
> > I can think of:
> >
> > <http://sourceforge.net/tracker/index.php?func=detail&aid=965119&group_id=64713&atid=508406>
Brian Bergstrand <http://www.bergstrand.org/brian/>, AIM: triryche206
PGP Key: <http://www.bergstrand.org/brian/misc/public_key.txt>
A society that will trade a little liberty for a little order will lose
both, and deserve neither. - Thomas Jefferson
As of 09:32:00 PM, iTunes is playing "Cry Freedom" from "Crash" by
"Dave Matthews Band"
-----BEGIN PGP SIGNATURE-----
Version: PGP 8.1
iQA/AwUBQNjeeXnR2Fu2x7aiEQLY1wCfcwsp/UajJX4rz3JBNf2RN/bLNjEAn1fp
9cqSvNFN3ANDAHbxg3DPL3hp
=urtE
-----END PGP SIGNATURE-----
_______________________________________________
darwin-kernel mailing list | email@hidden