How can I access mnt_devblocksize from user space?


Subject: How can I access mnt_devblocksize from user space?
From: Sam Vaughan <email@hidden>
Date: Tue, 23 Sep 2008 16:19:41 +1000

To get direct I/O running as fast as possible, it's important to align the file offset of every request so that the kernel doesn't have to call cluster_copy_upl_data to uiomove everything. The performance penalty of that copy is very high and should be easily avoidable.

I wrote a very simple C program to experiment with. It opens a file, sets F_NOCACHE on it and issues 1MB preads starting from an offset passed in on the command line, stopping when it hits EOF. The destination buffer for all the reads is always page aligned.
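A minimal sketch of a test routine along the lines described above, not the original program. The function name `read_uncached` and the `READ_SIZE` constant are hypothetical, and the `F_NOCACHE` call is guarded so the code still compiles on systems that lack it:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_SIZE (1024 * 1024)  /* 1MB per pread, as in the description */

/* Hypothetical helper: read from `start` to EOF using uncached 1MB preads.
 * Returns the total number of bytes read, or -1 on error. */
static long long read_uncached(const char *path, off_t start)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

#ifdef F_NOCACHE
    /* Mac OS X: ask the kernel not to cache data for this descriptor. */
    if (fcntl(fd, F_NOCACHE, 1) == -1) {
        close(fd);
        return -1;
    }
#endif

    /* The destination buffer is page aligned; only the file offset varies. */
    void *buf = NULL;
    if (posix_memalign(&buf, (size_t)sysconf(_SC_PAGESIZE), READ_SIZE) != 0) {
        close(fd);
        return -1;
    }

    long long total = 0;
    off_t offset = start;
    for (;;) {
        ssize_t n = pread(fd, buf, READ_SIZE, offset);
        if (n < 0) {            /* read error */
            total = -1;
            break;
        }
        if (n == 0)             /* EOF */
            break;
        total += n;
        offset += n;
    }

    free(buf);
    close(fd);
    return total;
}
```

Calling `read_uncached(path, 0)` and then `read_uncached(path, 3)` under `time` or dtrace is one way to compare an aligned starting offset with a deliberately misaligned one.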

Running the test tool with the 'time' Bash built-in or monitoring it with Shark or dtrace quickly shows the problem. If the initial offset is zero, the reads are fast and the kernel CPU usage is very low. If the initial offset is something nasty, cluster_copy_upl_data gets involved, kernel CPU usage shoots up and the reads are slow.

For a long time I'd simply assumed that as long as the memory was page aligned and the disk offset was aligned to a 512-byte sector, no copies would ever be needed. Then about a year ago I was working on code to read 2K uncompressed video and I discovered that on many RAIDs, file offsets need to be aligned to 4k to avoid the copies.

What I'd like to know is whether this alignment requirement for any given volume is easily accessible from user space, because I'd like to set it dynamically.

Empirical testing using my little C program shows that my build machine's local disk only requires 512-byte alignment to avoid the copies, but my laptop, my home machine's software RAID and my test machine's hardware RAID all require 4k alignment.
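The alignment arithmetic involved can be sketched as below. The helper names are hypothetical, and the code assumes the alignment is a power of two, which holds for the 512-byte and 4k values mentioned here:

```c
#include <stdint.h>

/* Round `offset` down to a multiple of `align` (a power of two). */
static uint64_t align_down(uint64_t offset, uint64_t align)
{
    return offset & ~(align - 1);
}

/* Round `value` up to a multiple of `align` (a power of two). */
static uint64_t align_up(uint64_t value, uint64_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Nonzero if `offset` is already a multiple of `align`. */
static int is_aligned(uint64_t offset, uint64_t align)
{
    return (offset & (align - 1)) == 0;
}
```

For example, an offset of 4608 is aligned for a volume that needs 512-byte alignment but not for one that needs 4k, where a request would have to start at `align_down(4608, 4096)`, which is 4096.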

I've been using a dtrace script to detect calls to cluster_copy_upl_data because the backtraces in Shark (and indeed the output from a call to stack() in dtrace) seem so untrustworthy. They both claim that cluster_read_ext calls cluster_pageout, for instance! Anyway, after reading some of the vfs_cluster code I added a line to my dtrace script to save off vp->v_mount->mnt_devblocksize when cluster_read_ext is entered. Sure enough, it contains the correct magic value wherever I run my test. (dtrace really is awesome :o)

Looking in stat, statfs and getattrlist, I haven't been able to find a field that exposes this value to user space. Browsing through xnu in cscope, the getvolattrlist function looks promising, but it turns out that it will only return mnt_devblocksize if the user asked for f_bsize and the file system doesn't support that attribute.
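For comparison, here is a small sketch that reads the f_bsize field statfs does expose. The helper name `fs_block_size` is hypothetical. Note that this is the file system's block size, which by the reasoning above is not necessarily the same as mnt_devblocksize, so it may not match the alignment found empirically:

```c
#include <sys/param.h>
#ifdef __APPLE__
#include <sys/mount.h>   /* statfs on Mac OS X and the BSDs */
#else
#include <sys/vfs.h>     /* statfs on Linux */
#endif

/* Hypothetical helper: return f_bsize for the volume containing `path`,
 * or -1 if statfs fails. */
static long fs_block_size(const char *path)
{
    struct statfs st;
    if (statfs(path, &st) != 0)
        return -1;
    return (long)st.f_bsize;
}
```

Comparing `fs_block_size(path)` with the value observed via dtrace on the same volume shows whether the two happen to agree for that file system.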

I'm wondering if I've missed something obvious in the above APIs, or whether there's a better way to get at the mnt_devblocksize field of a mount_t structure from user space. Has anyone tried to do this before, or is the general idea to simply go with 4k alignment and leave it at that?

Thanks in advance for any ideas,

Sam

Filesystem-dev mailing list      (email@hidden)


Follow-Ups:

Re: How can I access mnt_devblocksize from user space?
From: Jim Luther <email@hidden>

