Re: MemoryMapping Large Files - ???
- Subject: Re: MemoryMapping Large Files - ???
- From: Kurt Revis <email@hidden>
- Date: Sat, 12 Jan 2002 08:52:54 -0800
Christopher Holland <email@hidden> sez:
> I am having a bit of a problem mapping a rather "large" file to an
> NSData object. I can map smaller files without problem, but the
> larger one pukes with the following error:
>
> -[NSConcreteData initWithBytes:length:copy:freeWhenDone:bytesAreVM:]:
> absurd length: 1866240000
>
> Of course the length of the file is almost 2 gigs, so I realized that
> it might have problems before I got started. OS X is supposed to be
> able to handle files larger than 2 gigs, correct?
Reading and writing them in pieces, yes, but trying to load all that
into memory at once may not work.
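For instance, here is a minimal sketch of working through the file in
fixed-size pieces with NSFileHandle. The 16 MB chunk size and the
processChunk() function are just placeholders for whatever you actually
need to do with each piece:

#import <Foundation/Foundation.h>

/* Stand-in for whatever work you need to do on each piece. */
static void processChunk(NSData *chunk, unsigned long long offset)
{
    /* ... look at [chunk bytes] ... */
}

void processLargeFile(NSString *path)
{
    NSFileHandle *handle = [NSFileHandle fileHandleForReadingAtPath:path];
    unsigned long long offset = 0;

    if (!handle)
        return;

    while (1) {
        /* Use a local pool so each chunk is released before the next
           one is read; otherwise the chunks pile up in memory. */
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        NSData *chunk = [handle readDataOfLength:16 * 1024 * 1024];
        unsigned length = [chunk length];

        if (length > 0)
            processChunk(chunk, offset);
        offset += length;
        [pool release];

        if (length == 0)
            break;    /* end of file */
    }
    [handle closeFile];
}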
> I'm using the following code:
>
> bigData = [[NSData alloc] initWithContentsOfMappedFile:bigDataPath];
>
> I've tried using 'initWithContentsOfFile' also...just to see if it
> was the memory mapping doing it.....no go there. Should I use the BSD
> 'mmap' function instead of using the 'NSData' methods above?
You certainly could, but if you try to map the whole file at once you
will probably run into the same problem. Fortunately, mmap lets you
specify a range of the file to map, which the NSData methods do not.
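For example, here is a bare-bones sketch of mapping just one slice of
the file (note that the offset you hand to mmap must be a multiple of
the page size, and most error handling is left out):

#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

/* Map 'length' bytes of the file starting at 'offset' (which must be
   page-aligned), rather than mapping the whole thing. */
void *mapFileRange(const char *path, off_t offset, size_t length)
{
    void *bytes;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;

    bytes = mmap(NULL, length, PROT_READ, MAP_FILE | MAP_PRIVATE,
                 fd, offset);
    close(fd);    /* the mapping stays valid after the fd is closed */

    return (bytes == MAP_FAILED) ? NULL : bytes;
}

/* ...and when you are done with that slice: munmap(bytes, length); */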
The problem, essentially, is that the system uses 32-bit addressing, so
there are at most 4 gigabytes of memory that can be accessed by your
program at once. (This is independent of the physical memory in the
system. The VM system is responsible for mapping this address space into
ranges of physical memory, and swapping when necessary.) And part of
this address space is already taken up by your code, the system
libraries' code, and other memory needed by the runtime. Usually your
code and static data load at the bottom of this address space (0 and
up), and currently the system libraries load in around 0x70000000 and
0x80000000. This splits the available address range into two parts,
neither of which can hold as much memory as you're asking for.
(Yes, it's fragmentation! And we thought we'd never see these problems
on the Mac again. I don't know why it was decided to load system
libraries into the middle of the address space; probably there is a good
reason.)
Moving to a processor and VM system which can handle 64-bit addresses
would solve this problem (for the moment); maybe that will happen
someday. For the vast majority of programs, the available address space
is big enough, which is why 32-bit systems are still in use.
If you *really* need random access to any of that two gigs of data at
one time, to make it work well, you're probably going to end up writing
your own "virtual memory" system, using mmap to swap pieces of the file
in and out of physical memory, along with some clever caching scheme.
Since you know more about how you'll be accessing the data than the
system VM does, you can probably write something that will give you
better performance than the general-purpose VM would.
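Here is a very rough sketch of that idea, assuming you go through a
single sliding window: keep one mmap()ed window into the file and
remap whenever a request falls outside it. FileWindow and
bytesAtOffset() are names made up for this example; a real version
would want smarter caching (more than one window, for a start), write
support, and better error handling:

#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    int    fd;           /* open descriptor for the big file */
    void  *bytes;        /* currently mapped window, or NULL */
    off_t  windowStart;  /* file offset where the window begins */
    size_t windowSize;   /* how much of the file to map at a time */
} FileWindow;

/* Return a pointer to the byte at 'offset', remapping the window if
   that offset falls outside the piece that is currently mapped. */
const void *bytesAtOffset(FileWindow *w, off_t offset)
{
    if (w->bytes == NULL
        || offset < w->windowStart
        || offset >= w->windowStart + (off_t)w->windowSize) {
        off_t pageSize = getpagesize();

        if (w->bytes)
            munmap(w->bytes, w->windowSize);

        /* Start the new window on a page boundary at or below 'offset'. */
        w->windowStart = (offset / pageSize) * pageSize;
        w->bytes = mmap(NULL, w->windowSize, PROT_READ,
                        MAP_FILE | MAP_PRIVATE, w->fd, w->windowStart);
        if (w->bytes == MAP_FAILED) {
            w->bytes = NULL;
            return NULL;
        }
    }
    return (const char *)w->bytes + (offset - w->windowStart);
}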
(VM gurus: Yes, I know, this is a vast oversimplification, and please
pipe up if I'm completely wrong.)
Hope this helps.
--
Kurt Revis
email@hidden