Re: How to read files from disk directly?
- Subject: Re: How to read files from disk directly?
- From: Toby Thain <email@hidden>
- Date: Mon, 04 Jul 2011 08:38:47 -0400
On 04/07/11 8:16 AM, Eric Gorr wrote:
>
> On Jul 3, 2011, at 11:51 PM, Shantonu Sen wrote:
>
>> Defragmenting applications use the volume format described in <http://developer.apple.com/library/mac/#technotes/tn/tn1150.html>
>
> Thank you.
>
>> This is not the same thing as the logical filesystem that is exposed to userspace. For instance, filesystem compression (introduced in Mac OS X 10.6 Snow Leopard) can prevent you from being able to reconstruct a file's contents by accessing the raw device.
>
> Interesting. Is the filesystem compression optional or is it always there for everyone?
>
> If it was optional, then I assume the situation could be detected and the slow route of using the standard APIs to open and read the files could be the fallback.
>
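As a rough illustration of how that detection might look: on Mac OS X, files stored with filesystem compression carry the UF_COMPRESSED flag in the st_flags field returned by stat. The sketch below assumes that this flag is a sufficient signal for choosing the fallback path; the function name is hypothetical, and on platforms without st_flags it simply reports False.

```python
import os
import stat


def is_hfs_compressed(path):
    """Return True if the file carries the UF_COMPRESSED flag.

    Hypothetical helper: treats the flag as the signal that the raw
    on-disk bytes will not match the logical file contents.
    """
    st = os.lstat(path)
    # st_flags exists on Mac OS X and the BSDs, not on every platform.
    flags = getattr(st, "st_flags", 0)
    return bool(flags & stat.UF_COMPRESSED)
```

A caller would use the standard open/read APIs for any file where this returns True.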
>> What are you really trying to do?
>
> It's fairly easy to describe. For each individual file:
>
> 1. Open the file
> 2. Read the data from the file
> 3. Do some simple processing on that data.
> 4. Close the file
>
> The issue is that there are more than 1 million files and the vast majority of them are small (only a few KB). Based on my tests, the overhead of opening and reading the data for every single file is significant. I figured that, if it were practical, it would be nice to be able to read a few hundred of them in one shot and send them off to a worker thread for processing.
>
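For comparison, the four steps and the batching idea quoted above can be sketched with the standard file APIs alone. This is only a minimal sketch under assumptions: the batch size, the worker count, and the process() placeholder are all made up for illustration, and it does nothing about seek order on disk.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(paths, size):
    """Yield successive lists of at most `size` paths."""
    it = iter(paths)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process(data):
    # Placeholder for the "simple processing" step: just count bytes.
    return len(data)


def handle_batch(chunk):
    # Steps 1-4 for every file in one batch: open, read, process, close.
    results = {}
    for path in chunk:
        with open(path, "rb") as f:
            results[path] = process(f.read())
    return results


def process_tree(root, batch_size=256, workers=4):
    """Walk `root` and hand batches of files to a pool of worker threads."""
    paths = (
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(root)
        for name in files
    )
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(handle_batch, batches(paths, batch_size)):
            results.update(partial)
    return results
```

Timing process_tree() against a plain sequential loop over the same directory would show how much of the cost is per-file open/read overhead and how much is the disk itself.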
Have you considered putting the files on an SSD or RAM disk?
Assuming the files have to live on spinning storage, most of the
difference between streaming and random access is in seek latency.
Your idea of streaming the raw filesystem (to RAM, presumably) and THEN
making random-like accesses to the individual files would help, but the
devil is in the details, and it may involve a lot of work. What if you
changed the underlying medium in the first place?
I'm not sure you have told us enough about the bigger picture yet.
--Toby
> But, it sounds like what you are saying is that this is impossible or at least so impractical that it should not be attempted.
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Filesystem-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden