Re: Request For Comments On File Insert API
- Subject: Re: Request For Comments On File Insert API
- From: Brian Pinkerton <email@hidden>
- Date: Fri, 21 Mar 2008 13:11:13 -0700
It's an interesting idea, and it's pretty easy to see how it could
give some applications a big boost. But for the IO-intensive
applications that I work on (mostly full-text indexing) the alignment
constraints would be too coarse-grained to be useful. Also, with this
scheme, the portability of a file across file systems would depend on
it using the max common alignment unit. If a file used something
smaller, an application might have to rewrite the file to achieve good
performance on a file system with different alignment constraints.
Another concern is compatibility across file systems on the Mac.
Already today we have situations where a Mac application will work
fine on an HFS file system but fail on, say, a ZFS file system on the
same machine. Network filesystems add another set of constraints. If
it were likely (and I imagine it would be) that this new API would not
be supported on all file systems, then I can imagine developers making
the sensible choice to avoid it. In other words, if you have to
design your file formats and algorithms to achieve good performance
without this feature, then it's hard to imagine that it's worth
writing the special case for file systems that do have it.
bri
On Mar 20, 2008, at 3:40 AM, Quinn wrote:
Greetings All
Mac OS X file system engineering is considering a new API for
inserting data in the middle of a file. Before they make concrete
plans, they'd like to solicit feedback from the API's intended
audience, that is, you. Some information and questions from file
system engineering are pasted in below. Please send your response
to me and I'll collate and forward them.
Share and Enjoy
Quinn
* * *
In the traditional file system concept, data in a file is treated as
a byte stream that can only be updated, appended or truncated. I
would like to know what benefit applications will get if this
concept is extended a little, so that chunks of data can be
efficiently inserted into the beginning or middle of the file.
These chunks of data can only be inserted at offsets that conform to
some alignment unit, and the size of the inserted data must also be
a multiple of the alignment unit. The alignment unit value may vary
depending on the OS type, the file system, and the options used when
formatting the file system. Typically it would be the Least Common
Multiple of the VM page size (4KB) and the file system's minimum
allocation size. For
example, the alignment unit would be 4KB for commonly formatted HFS
Plus volumes. But, on a MS-DOS/FAT volume with a cluster size of
16KB, the alignment unit would be 16KB.
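To make the arithmetic concrete, here is a minimal sketch of how an application might compute such an alignment unit and round a size up to it. The helper names (gcd_u64, alignment_unit, round_up) are hypothetical and are not part of the proposed API; the proposal says the real value would be reported through statfs.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor, by Euclid's algorithm. */
static uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical helper: the alignment unit as the least common multiple
 * of the VM page size and the file system's minimum allocation size. */
static uint64_t alignment_unit(uint64_t vm_page_size, uint64_t fs_alloc_size) {
    assert(vm_page_size > 0 && fs_alloc_size > 0);
    return vm_page_size / gcd_u64(vm_page_size, fs_alloc_size) * fs_alloc_size;
}

/* Hypothetical helper: round n up to the next multiple of unit, as a
 * caller would have to do before requesting an aligned insert. */
static uint64_t round_up(uint64_t n, uint64_t unit) {
    assert(unit > 0);
    return (n + unit - 1) / unit * unit;
}
```

With a 4KB page size, a 4KB allocation size gives a 4KB unit and a 16KB cluster size gives a 16KB unit, matching the two examples above.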
The API might look like:
// inserts size bytes of zeros into the file at offset off
// both off and size must be on the specific alignment boundaries of
// the file system
// use statfs to query the alignment unit of given file system.
// A reserved field in struct statfs will be used to return the
// alignment unit size
int finsert_np(int fd, off_t off, off_t size);
In order to make the implementation at the file system level less
tricky, the following three restrictions are expected:
1. The file must be opened with O_EXLOCK or locked exclusively with
flock.
2. No part of the file can be memory mapped.
3. During the insert call, all cached file data whose offsets were
larger than the insertion point will be flushed to disk (if they are
dirty) and discarded from memory.
The fd's file offset is not changed after the insert. Since data
shifts after the insert, the application should be careful when
using the file offset.
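For comparison, here is a rough sketch of what an application has to do today, with existing POSIX calls, to get the same visible result: grow the file, copy the tail toward the end, and zero the gap. The function name emulated_finsert and its unit parameter are assumptions made for illustration; this is not the proposed implementation, which would presumably avoid copying the data at all. It does not enforce the locking or memory-mapping restrictions, and it treats short reads and writes as errors for brevity.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for the proposed finsert_np().
 * Inserts size zero bytes at offset off; both must be multiples of unit.
 * Returns 0 on success, -1 on failure with errno set. */
static int emulated_finsert(int fd, off_t off, off_t size, off_t unit) {
    struct stat st;

    if (unit <= 0 || off < 0 || size <= 0 ||
        off % unit != 0 || size % unit != 0) {
        errno = EINVAL;
        return -1;
    }
    if (fstat(fd, &st) != 0)
        return -1;
    if (off > st.st_size) {
        errno = EINVAL;
        return -1;
    }

    char *buf = malloc((size_t)unit);
    if (buf == NULL)
        return -1;

    /* Grow the file first so the shifted tail has somewhere to land. */
    if (ftruncate(fd, st.st_size + size) != 0) {
        free(buf);
        return -1;
    }

    /* Copy the tail back to front so no unread data is overwritten.
     * pread/pwrite leave the fd's file offset untouched. */
    off_t end = st.st_size;
    while (end > off) {
        off_t n = (end - off < unit) ? (end - off) : unit;
        off_t src = end - n;
        if (pread(fd, buf, (size_t)n, src) != n ||
            pwrite(fd, buf, (size_t)n, src + size) != n) {
            free(buf);
            return -1;
        }
        end = src;
    }

    /* Fill the newly opened gap with zeros, one unit at a time. */
    memset(buf, 0, (size_t)unit);
    for (off_t pos = off; pos < off + size; pos += unit) {
        if (pwrite(fd, buf, (size_t)unit, pos) != unit) {
            free(buf);
            return -1;
        }
    }

    free(buf);
    return 0;
}
```

The cost of this emulation grows with the amount of data after the insertion point, which is the work an in-file-system insert would be expected to save.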
I recognise that this API is not POSIX compliant and won't be
available on all systems. However, if this API is highly valuable
to certain applications, it's something that we'd consider doing.
I would appreciate your comments about the following questions:
1. Would this API benefit your application or any application you
know?
2. Do the above three restrictions pose a serious limitation on your
use of this API?
3. We could also implement an fdelete_np API, to delete chunks of an
aligned size at an aligned offset. Would this benefit your
application or any application you know? Do the above three
restrictions pose a serious limitation on your use of such an API?
* * *
--
Quinn "The Eskimo!" <http://www.apple.com/developer/>
Apple Developer Relations, Developer Technical Support, Core OS/Hardware
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Filesystem-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden