Re: Calling all Filesystem guru's: Help!

23 Jun 2004

      -----BEGIN PGP SIGNED MESSAGE-----

Hash: SHA1

Well, thanks to Sam, I've found this bug and it is in the kernel:

Reported as bug  3707337: "Crash/Data loss when FS block size is < 4KB

because of bug in cluster_io()."

Summary:

I found a bug in cluster_io() when writing sparse files to my

filesystem (Ext2 for OS X: http://sourceforge.net/projects/ext2fsx/):

My own bug report can be found at:

http://sourceforge.net/tracker/index.php?func=detail&aid=965119&group_id=64713&atid=508406

Basically, cluster_io() will overflow (wrap below zero) the var it uses

to track the size of the IO when a sparse file is read or written to a

filesystem with a block size < PAGE_SIZE (4096 in all current Darwin

releases). This causes continuous calls to VOP_CMAP() and it can also

cause data to be lost during a write.

I suspect this same bug would happen with Apple UFS if it allowed < 4KB

block sizes, but it currently doesn't.

Steps to reproduce:

Write any sparse file to a filesystem with a block size smaller than 4KB.

Details:

I made an internal copy of cluster_io(), and put a bp on VOP_CLOSE()

and traced into my cluster_io() copy.

On a 1KB block filesystem, I wrote 17 bytes at offset 0, and then the

same 17 bytes at offset 2048. This created a 2065 byte file and left a 1

block hole in the file:

i_size = 2065

i_db = {547, 0, 548, 0, 0, 0, 0, 0, 0, 0, 0, 0},

i_ib = {0, 0, 0},
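For anyone who wants to recreate this layout, here is a minimal sketch of the two writes described above. The helper name make_sparse_file() and the payload are hypothetical, not taken from my test setup. Whether the unwritten range really becomes a hole depends on the filesystem and its block size.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes the same 17 bytes at offsets 0 and 2048,
 * leaving bytes 17..2047 unwritten. On a 1KB-block filesystem that
 * leaves logical block 1 unallocated (a hole).
 * Returns the resulting file size, or -1 on error. */
static off_t make_sparse_file(const char *path)
{
    /* 16 characters plus the terminating NUL = 17 bytes */
    static const char payload[17] = "sparse test data";
    struct stat st;
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);

    if (fd < 0)
        return -1;

    if (pwrite(fd, payload, sizeof(payload), 0) != (ssize_t)sizeof(payload) ||
        pwrite(fd, payload, sizeof(payload), 2048) != (ssize_t)sizeof(payload) ||
        fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    close(fd);
    return st.st_size;
}
```

Calling make_sparse_file() with a path on the volume under test should report a size of 2065, matching i_size above.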

cluster_push(vp=0x29ae570)

cluster_try_push (vp=0x29ae570, EOF=2065, can_delay=0, push_all=1)

cluster_push_x (vp=0x29ae570, EOF=2065, first=0, last=1,

can_delay=0)

cluster_io (vp=0x29ae570, upl=0x29a4d80, upl_offset=0, f_offset=0,

non_rounded_size=2065, devblocksize=512, flags=38, real_bp=0x0,

iostate=0x0)

// flags = CL_COMMIT | CL_AGE | CL_ASYNC

388      size = (non_rounded_size + (devblocksize - 1)) & ~(devblocksize - 1);

(gdb) p size

$36 = 2560
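As a sanity check on that value, here is the same rounding expression as a standalone sketch. The function name round_up_to_block() is hypothetical, and the expression only works when the block size is a power of two, as 512 is here.

```c
#include <stdint.h>

/* Round size up to the next multiple of blocksize.
 * Assumes blocksize is a power of two (512 in the trace above),
 * so that ~(blocksize - 1) is a valid alignment mask. */
static uint32_t round_up_to_block(uint32_t size, uint32_t blocksize)
{
    return (size + (blocksize - 1)) & ~(blocksize - 1);
}
```

With size 2065 and blocksize 512 this gives 2560, which is what gdb printed.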

In the while loop on line 443 (vfs_cluster.c - xnu-517.7.7)

The first time through the loop, VOP_CMAP() replies with:

(gdb) p blkno

$40 = 1094

(gdb) p io_size

$41 = 1024

And we get to the end of the loop with:

(gdb) p upl_offset

$55 = 1024

(gdb) p f_offset

$56 = 1024

(gdb) p size

$57 = 1536

This is all fine.

Now the second time through the loop, we are on the sparse hole and

VOP_CMAP() returns:

(gdb) p blkno

$61 = -1

(gdb) p io_size

$62 = 1024

So now the following is true, and we enter that condition block:

if ( (!(flags & CL_READ) && (long)blkno == -1) || io_size == 0) {

	....

482                            upl_offset += PAGE_SIZE_64;

(gdb)

483                            f_offset   += PAGE_SIZE_64;

(gdb)

484                            size       -= PAGE_SIZE_64;

(gdb)

485                            continue;

}

And this is where the trouble happens:

(gdb) p upl_offset

$66 = 5120

(gdb) p f_offset

$67 = 5120

(gdb) p size

$68 = 4294964736

So not only does the size overflow, but the block after the hole is

skipped!
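The size value printed above is plain unsigned wraparound. A minimal sketch, assuming size is a 32-bit unsigned quantity (which the gdb output suggests); the names here are illustrative only:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* PAGE_SIZE on the Darwin releases discussed */

/* Subtract one page from a 32-bit unsigned size, the way the hole
 * branch does. If size is smaller than a page the result wraps
 * modulo 2^32 instead of going negative. */
static uint32_t skip_one_page(uint32_t size)
{
    return size - SKETCH_PAGE_SIZE;
}
```

With 1536 bytes left, skip_one_page() yields 2^32 - 2560 = 4294964736, so the loop condition still sees a huge amount of IO remaining.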

If io_size was used instead, then everything would be correct:

upl_offset += io_size == 2048

f_offset += io_size == 2048

size -= io_size == 512

In fact, this is what is done in the READ case:

510   if ((flags & CL_READ) && (long)blkno == -1) {

...

575            upl_offset += io_size;

576            f_offset   += io_size;

577            size       -= io_size;

}
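To compare the two ways of stepping over a hole, here is a toy model of the write loop. It is not the kernel code: sk_cmap() is a hypothetical stand-in for VOP_CMAP(), the block map imitates i_db with 0 marking a hole, io_size is simply capped at one 1KB block, and max_iter exists only so the wrapped size cannot spin forever.

```c
#include <stdint.h>

#define SK_PAGE_SIZE  4096u
#define SK_BLOCK_SIZE 1024u

struct sk_result {
    uint32_t blocks_written;  /* mapped blocks the loop actually reached */
    uint32_t size_left;       /* value of size when the loop stopped */
    uint32_t f_offset;        /* file offset when the loop stopped */
};

/* Stand-in for VOP_CMAP(): returns -1 for a hole or past the map. */
static long sk_cmap(const uint32_t *db, uint32_t ndb, uint32_t f_offset)
{
    uint32_t lbn = f_offset / SK_BLOCK_SIZE;

    if (lbn >= ndb || db[lbn] == 0)
        return -1;
    return (long)db[lbn];
}

/* skip_by_io_size == 0: step over holes by a whole page (reported bug).
 * skip_by_io_size != 0: step over holes by io_size (as the READ case does). */
static struct sk_result sk_write_loop(const uint32_t *db, uint32_t ndb,
                                      uint32_t size, int skip_by_io_size,
                                      int max_iter)
{
    struct sk_result r = { 0, size, 0 };
    int iter = 0;

    while (r.size_left > 0 && iter++ < max_iter) {
        uint32_t io_size = SK_BLOCK_SIZE;
        uint32_t step;

        if (io_size > r.size_left)
            io_size = r.size_left;
        step = io_size;

        if (sk_cmap(db, ndb, r.f_offset) == -1) {
            if (!skip_by_io_size)
                step = SK_PAGE_SIZE;   /* may exceed size_left and wrap */
        } else {
            r.blocks_written++;
        }

        r.f_offset  += step;
        r.size_left -= step;
    }
    return r;
}
```

Run against the map {547, 0, 548} with size 2560, the page-sized step stops after two iterations at f_offset 5120 with a wrapped size and never reaches block 548. The io_size step visits both mapped blocks and ends with size 0.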

Isolation:

I've also reproduced this bug on Mac OS X 10.3.3 and 10.2.8.

On Jun 20, 2004, at 1:19 PM, Sam Vaughan wrote:
> Hi Brian,
>
> My first guess would be that you're underflowing the size variable in
> the while loop in cluster_io().  It's fairly easy to do, trust me ;-)
> If you're returning a blkno of -1 for a hole and the size has
> underflowed, you end up in a loop with size working its way down from
> 2^32 one page at a time and f_offset working its way up just as you
> described.
>
> I can't seem to get web access working in this hotel room right now,
> so I can't follow the links you posted.  Send me an email with some
> more details attached if you like.
>
> Sam
>
>> Okay, I'm at my wits end on a bug in the ext2 filesystem for OS X
>> <http://sourceforge.net/projects/ext2fsx>:
>>
>> When writing a sparse file (even one with just one hole), the fs can
>> hang or panic (with a Disk Image) the system.
>>
>> I've tracked this down to cluster_io() repeatedly calling VOP_CMAP
>> with higher and higher offsets (way past the file size) that
>> eventually cause our logical block address to overflow (this is much
>> more likely on <4KB block FS), thus making a negative offset which
>> VOP_BMAP treats as an indirect block address and I assume causes a
>> weird page to be read and passed to the storage driver by
>> cluster_io/VOP_STRATEGY.
>>
>> Anyway, I can't figure why this happens. I've gone through everything
>> I can think of:
>>
>> <http://sourceforge.net/tracker/index.php?func=detail&aid=965119&group_id=64713&atid=508406>

Brian Bergstrand <http://www.bergstrand.org/brian/>, AIM: triryche206

PGP Key: <http://www.bergstrand.org/brian/misc/public_key.txt>

A society that will trade a little liberty for a little order will lose

both, and deserve neither. - Thomas Jefferson

As of 09:32:00 PM, iTunes is playing "Cry Freedom" from "Crash" by

"Dave Matthews Band"

-----BEGIN PGP SIGNATURE-----

Version: PGP 8.1

iQA/AwUBQNjeeXnR2Fu2x7aiEQLY1wCfcwsp/UajJX4rz3JBNf2RN/bLNjEAn1fp

9cqSvNFN3ANDAHbxg3DPL3hp

=urtE

-----END PGP SIGNATURE-----

_______________________________________________

darwin-kernel mailing list | darwin-kernel@lists.apple.com
