Re: How to read files from disk directly?
- Subject: Re: How to read files from disk directly?
- From: Shantonu Sen <email@hidden>
- Date: Mon, 4 Jul 2011 08:57:41 -0700
Have you proven that seek penalty is your bottleneck?
For example, if you copy a filesystem hierarchy to a freshly formatted filesystem, the files will be allocated on disk in the order you copied them. After rebooting (to remove caching effects), if you process every file in the target filesystem in the same order that you copied them, what is the performance?
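One way to run that measurement is sketched below. It is only an illustration, not a definitive benchmark: `time_ordered_read` and `walk_order` are hypothetical helper names, and the sorted directory walk is an assumption standing in for "the order you copied them". If the copy tool used a different order, pass in a list of paths recorded at copy time instead.

```python
import os
import time


def walk_order(root):
    """Yield files under `root` in a deterministic, sorted walk order.

    Assumption: the copy was performed in this same order. If not,
    supply a recorded list of paths instead of using this helper.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort steers os.walk's descent order
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def time_ordered_read(paths, chunk_size=1 << 20):
    """Read every file in `paths`, in order; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    for path in paths:
        # buffering=0 avoids an extra userspace buffer; the OS cache is
        # still in play, which is why the test should follow a reboot.
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return total, time.perf_counter() - start
```

Comparing the elapsed time for the copy order against the same files in a shuffled order should give a rough idea of how much of the cost is seek-related.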
Also note that accessing the raw device in a coherent fashion requires unmounting the filesystem, which might make such a solution unviable depending on your use case.
Shantonu
On Jul 4, 2011, at 5:16 AM, Eric Gorr wrote:
>
> On Jul 3, 2011, at 11:51 PM, Shantonu Sen wrote:
>
>> Defragmenting applications use the volume format described at <http://developer.apple.com/library/mac/#technotes/tn/tn1150.html>
>
> Thank you.
>
>> This is not the same thing as the logical filesystem that is exposed to userspace. For instance, filesystem compression (introduced in Mac OS X 10.6 Snow Leopard) can prevent you from being able to reconstruct a file's contents by accessing the raw device.
>
> Interesting. Is the filesystem compression optional or is it always there for everyone?
>
> If it were optional, then I assume the situation could be detected and the slow route of using the standard APIs to open and read the files could be the fallback.
>
>> What are you really trying to do?
>
> It's fairly easy to describe. For each individual file:
>
> 1. Open the file
> 2. Read the data from the file
> 3. Do some simple processing on that data.
> 4. Close the file
>
> The issue is that there are more than 1 million files and the vast majority of them are small (only a few KB). Based on my tests, the overhead of opening and reading the data for every single file is significant. I figured that, if it were practical, it would be nice to be able to read a few hundred of them in one shot and send them off to a worker thread for processing.
>
> But, it sounds like what you are saying is that this is impossible or at least so impractical that it should not be attempted.
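
For reference, the open/read/process/close loop from the quoted message, with the "read a few hundred and hand them to a worker" idea layered on top of the standard file APIs, might look like the sketch below. Everything here is an assumption for illustration: `process` is a placeholder for the unspecified "simple processing" step, and the batch size and worker count are arbitrary. It does not remove the per-file open cost; it only overlaps reading with processing.

```python
from concurrent.futures import ThreadPoolExecutor


def read_batch(paths):
    """Open, read, and close each file; return a list of (path, data)."""
    out = []
    for path in paths:
        with open(path, "rb") as f:
            out.append((path, f.read()))
    return out


def process(data):
    """Hypothetical stand-in for the 'simple processing' step."""
    return len(data)


def process_batch(batch):
    """Run `process` over one batch on a worker thread."""
    return [(path, process(data)) for path, data in batch]


def process_in_batches(paths, batch_size=256, workers=4):
    """Read files in batches on the calling thread, process on workers."""
    paths = list(paths)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        for i in range(0, len(paths), batch_size):
            batch = read_batch(paths[i:i + batch_size])
            futures.append(pool.submit(process_batch, batch))
        for fut in futures:
            results.update(fut.result())
    return results
```

With a million files, a real version would likely want to cap the number of batches in flight so that unread results do not pile up in memory.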