Re: [Q] large flat files usage
- Subject: Re: [Q] large flat files usage
- From: Mike Laster <email@hidden>
- Date: Thu, 20 Jun 2002 20:33:29 -0400
On 6/19/02 4:13 PM, "Phillip Morelock" <email@hidden> wrote:

> Hello List,
>
> In the days of wayback, I worked with flat files for CGI websites. Since
> dynamic content was very light in those days, flat files were fine, even
> though they were slow. They just never got big enough to be a problem.
>
> Now I'm in a situation where I have some enormous flat files to work with
> (we're talking 5, 10 megs, etc.). I am looking for some ways (either
> Unix-based or even Cocoa-based) of dealing with that kind of data on the
> "back end" of a local application, so to speak (not a web server).
>
> Basically, I want what appears to be the holy grail of flat-file usage:
> 1) minimize the in-memory footprint -- holding a 10 meg flat file entirely
> in memory is not really cool;
> 2) make both read and write speeds as unnoticeable as possible;
> 3) be able to display records in, say, an NSTableView that would have only
> the "visible" rows calculated/resident (to conserve resources), or some
> similar scheme;
> 4) have those visible records editable, with the changes written out
> reliably and quickly.
>
> I admit that I'm at the end of my rope experience-wise in dealing with
> this, so I would love some man pages or articles to get started with. I
> could potentially build some sort of byte-offset indexing scheme, since
> the files are sorted (mostly ASCII-betically, I think). That would let me
> use fopen() and friends to reach a given offset in the file without using
> too much memory, correct?
>
> Of course I am hoping there's some "dyno-MITE" Cocoa way of doing this,
> but a little messy C library stuff is just fine with me. Maybe I should
> just wrap a Perl program (though that wouldn't do much for my memory
> requirements...).
>
> As you can see, I'm in need of some general direction. I've tried Googling
> and poking around "man fopen" and the like, but searches such as "working
> with large flat files" mostly lead to commercial "marketeching" pages.
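To the byte-offset question: yes, exactly -- an index of record offsets lets
you seek straight to any record without holding the file in memory. A rough
sketch (purely illustrative; it assumes newline-delimited records and invents
its own names):

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct {
    off_t  *offsets;    /* byte offset where each record starts */
    size_t  count;
} RecordIndex;

/* Scan the file once, remembering where every record begins. If the
 * file ends in a newline, the final entry points at EOF and doubles
 * as an end marker. */
RecordIndex BuildIndex(FILE *fp)
{
    RecordIndex idx;
    size_t capacity = 1024;
    off_t  pos = 0;
    int    ch;

    idx.offsets = malloc(capacity * sizeof(off_t));
    idx.count = 0;
    idx.offsets[idx.count++] = 0;          /* first record starts at 0 */

    while ((ch = fgetc(fp)) != EOF) {
        pos++;
        if (ch == '\n') {
            if (idx.count == capacity) {   /* grow the index as needed */
                capacity *= 2;
                idx.offsets = realloc(idx.offsets, capacity * sizeof(off_t));
            }
            idx.offsets[idx.count++] = pos;
        }
    }
    return idx;
}

Record n can then be fetched with fseeko(fp, idx.offsets[n], SEEK_SET)
followed by fgets(), so only the rows actually visible in the table view
ever need to be resident.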
I still do it the "messy C" way myself, and I have two approaches. My
"small" (341MB) file is read-only, so I map it into memory with mmap(). You
could probably use +[NSData dataWithContentsOfMappedFile:], but when I
originally wrote the code on OS X Server 1.2, that method didn't actually
memory-map :-) Be careful about how you arrange your internal structures:
you can't rely on pointers anywhere, since there's no guarantee the file
will be mapped at the same address every time, so I do everything with
relative offsets. If you want to modify the data, memory mapping may not be
the road for you, since I think it is implicitly read-only.
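A stripped-down sketch of that read-only setup (the Record layout is
hypothetical, just to show relative offsets standing in for pointers):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    off_t nextRecord;   /* offset from the start of the file, NOT a pointer */
    char  key[32];
} Record;

/* Map the whole file read-only; error handling omitted for brevity. */
static const char *MapFile(const char *path)
{
    int fd = open(path, O_RDONLY);
    struct stat st;
    const char *base;

    fstat(fd, &st);
    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);          /* the mapping remains valid after close() */
    return base;
}

/* Chase a "link" by adding the stored offset to the mapping base. */
static const Record *NextRecord(const char *base, const Record *r)
{
    return (const Record *)(base + r->nextRecord);
}

Because nextRecord is an offset rather than a pointer, the structure
survives being mapped at a different base address on every launch.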
My more complex structure allows both reading and writing. Basically, it's
like a big NSDictionary (key-value pairs) where the value is a fixed-size C
structure. Due to filesystem bugs in OS X Server 1.2, I had to split the
logical file into multiple physical files, because data beyond 4GB would get
truncated. So my logical file is about 7.7GB, split across an index file (a
hash table) and N data files guaranteed never to grow larger than 4GB each.
Access boils down to calls to fseek()/fread()/fwrite(), with the ugliness
hidden in a nice Objective-C wrapper.
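The chunking arithmetic is the easy part; something along these lines, where
the 4GB cap and the names are just for illustration:

#include <sys/types.h>

#define kMaxChunkSize ((off_t)1 << 32)   /* 4GB cap per physical file */

/* Translate a logical offset into (which data file, offset within it). */
static void LocateChunk(off_t logical, int *outFileIndex, off_t *outLocal)
{
    *outFileIndex = (int)(logical / kMaxChunkSize);
    *outLocal     = logical % kMaxChunkSize;
}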
You can hide it behind an API like:
- (void)readBuffer:(void *)inBuffer
            length:(unsigned long)inLength
      fromLocation:(off_t)inLocation;

- (void)writeBuffer:(void *)inBuffer
             length:(unsigned long)inLength
         toLocation:(off_t)inLocation;
And implement everything in terms of this API.
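For illustration, the read half might look roughly like this inside the
wrapper (the _dataFiles array of open FILE * handles and the LocateChunk()
helper above are my own assumptions; error handling and reads that straddle
a chunk boundary are omitted):

- (void)readBuffer:(void *)inBuffer
            length:(unsigned long)inLength
      fromLocation:(off_t)inLocation
{
    int   fileIndex;
    off_t localOffset;

    /* Figure out which physical file holds this logical location. */
    LocateChunk(inLocation, &fileIndex, &localOffset);

    /* _dataFiles holds one open FILE * per chunk, boxed in NSValues. */
    FILE *fp = [[_dataFiles objectAtIndex:fileIndex] pointerValue];
    fseeko(fp, localOffset, SEEK_SET);
    fread(inBuffer, 1, inLength, fp);
}

-writeBuffer:length:toLocation: is symmetrical with fwrite(), and everything
above this layer can then deal in records instead of file offsets.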