Re: How to read files from disk directly?

  • Subject: Re: How to read files from disk directly?
  • From: Ken Hornstein <email@hidden>
  • Date: Tue, 05 Jul 2011 12:53:56 -0400

>Reading the catalog would require unmounting the filesystem, with all the
>caveats that have previously been mentioned. The only savings would be that
>you wouldn't have to call fcntl(F_LOG2PHYS) on every allocation block in the
>file. But the tradeoff is that you would need to read the extents overflow
>file and stitch together extents for fragmented files.

I guess I was thinking (since the files are small) you'd just sort based on
the first extent, which I think is just in the catalog.  And I fully admit
that unless you were willing to live with some inconsistency you'd have
to unmount the filesystem to get a consistent view of the catalog.  And
I ALSO fully admit that this completely falls in the category of "crazy-assed
bad ideas" :-)

>/.vol has more overhead than full paths. The kernel converts file ID lookups
>to their full path so that permissions/ACL for each path component can be
>applied.

Shows what I know!

You know ... the more I think about this, the more I realize ... are things
really this bad?

I have a development tree here on my desktop; it's got approximately
220,000 files in it, with a total size of 2.5 GB.  So, that's an average
of roughly 11K per file.

The _first_ ls -lR takes some time, but subsequent ones take around 13-14
seconds.  If I skip the -l and just do "ls -1R" then the time drops to
under a second.

On this tree, I did this:

% find . -type f -print | xargs cat > /dev/null

And that completed in 229 seconds (this data wasn't cached, it was the first
time I had used this tree in a long time).
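
For anyone who wants to repeat this measurement without the shell pipeline, here is a rough Python sketch of the same test: walk a tree, read every regular file, and time the whole pass. The function name `read_tree` and the 1 MiB chunk size are illustrative assumptions, not anything from this thread, and as with the run above the number only means something when the data is not already cached.

```python
import os
import stat
import time


def read_tree(root, chunk_size=1 << 20):
    """Read every regular file under root.

    Returns (files_read, bytes_read, elapsed_seconds).
    """
    n_files = 0
    n_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are skipped, like "find -type f".
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        n_bytes += len(chunk)
            except OSError:
                # Unreadable or vanished file: skip it and keep going.
                continue
            n_files += 1
    return n_files, n_bytes, time.perf_counter() - start
```

Calling `read_tree(".")` from the top of the tree gives the file count, byte count, and elapsed time in one go, which makes it easier to compare runs across machines than eyeballing the output of `time`.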

So 229 seconds to read about 2.5 GB from 220,000 files?  That sounds
okay to me.  But that's nowhere near the 20-25 minutes to read 4 GB
from 190,000 files that Eric quoted earlier.  Obviously there are a ton
of differences between our systems, but that makes it seem like
something else is going on.  But ... hm.  Reading 2.5 GB takes me 29
seconds, so the per-file read (229 seconds) is slightly less than a
factor of 8 slower?  I guess that's pretty close to what you're seeing
(but 4 GB taking 3 minutes seems too long, assuming some kind of modern
hardware).
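
The arithmetic behind those comparisons can be checked with a few lines of Python. This is only a sketch: it treats 1 GB as 10**9 bytes (an assumption, the thread does not say which unit was meant), and it reads "4 GB taking 3 minutes" as Eric's time for a plain bulk read, which is an interpretation of the text above rather than something stated outright.

```python
GB = 10**9  # assumption: decimal gigabytes

# Figures from this message.
files = 220_000
tree_bytes = 2.5 * GB
per_file_secs = 229   # find | xargs cat over the whole tree
bulk_secs = 29        # the 29-second figure for reading 2.5 GB

avg_file_kb = tree_bytes / files / 1000            # average file size, KB
per_file_mb_s = tree_bytes / per_file_secs / 1e6   # throughput, file by file
bulk_mb_s = tree_bytes / bulk_secs / 1e6           # throughput, bulk read
slowdown = per_file_secs / bulk_secs               # per-file vs. bulk

# Figures attributed to Eric: 4 GB from 190,000 files in 20-25 minutes,
# versus 3 minutes for 4 GB read in bulk (see the interpretation above).
eric_slowdown_low = (20 * 60) / (3 * 60)
eric_slowdown_high = (25 * 60) / (3 * 60)

print(f"average file size: {avg_file_kb:.1f} KB")
print(f"per-file read:     {per_file_mb_s:.1f} MB/s")
print(f"bulk read:         {bulk_mb_s:.1f} MB/s")
print(f"slowdown here:     {slowdown:.1f}x")
print(f"slowdown (Eric):   {eric_slowdown_low:.1f}x to {eric_slowdown_high:.1f}x")
```

Under those assumptions the slowdown on this machine (about 7.9x) falls inside the range implied by Eric's numbers (about 6.7x to 8.3x), even though the absolute throughput differs a lot.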

--Ken
References: 
 >Re: How to read files from disk directly? (From: Shantonu Sen <email@hidden>)
