Re: Is NSMutableArray slow?
Re: Is NSMutableArray slow?
- Subject: Re: Is NSMutableArray slow?
- From: Dietmar Planitzer <email@hidden>
- Date: Thu, 2 Oct 2003 13:39:55 +0200
On Wednesday, October 1, 2003, at 11:11 PM, Chaz McGarvey wrote:
Hello:
If I had an operation that produced a lot of integers (>100000)
quickly and I wanted to have the integers when the operation finished,
would putting each integer into an NSNumber and then putting the
NSNumber onto the end of an NSMutableArray for each integer going to
be a bottleneck?
Yes, there is a good chance that this would become a bottleneck.
Because this kind of question comes up every few weeks, let's take a
closer look at what it means to (ab)use an NSArray as a data storage
for primitive data types like integers, floats & co.
The following assumes that we want to store and manipulate 100.000
32bit integers.
A) Memory requirements
------------------------
1) NSMutableArray:
In this case each single integer must be wrapped in an NSNumber object
which is then added to an NSMutableArray. The array itself requires a
few bytes for its header plus, more importantly, 4 bytes per integer,
because it must store a pointer to an NSNumber instance in each slot.
This makes 400.000 bytes for the NSMutableArray itself.
Each NSNumber instance requires 4 bytes for the class pointer, 4 bytes
for the reference count (assuming that it is an inline ref. count -
things would be worse for an external ref. count) plus 4 bytes for the
actual integer value. We assume that the type information is implicitly
encoded as part of the concrete NSNumber subclass so that it isn't
necessary to store it explicitly.
Overall a single NSNumber instance requires 12 bytes. However, the
MacOS X memory allocator has an allocation granularity of 8 bytes. This
means that a memory block always has a real size which is a multiple of
8 bytes. Therefor it follows that each NSNumber instance actually
consumes 16 bytes of memory.
Overall this makes 400.000 + 16*100.000 = 2.000.000 bytes or roughly 2
MBytes for storing 100.000 integers.
2) Simple C style integer array:
We just need 4 bytes per integer in this case, which results in a
storage requirement of 400.000 bytes, or less than 0.5 Mbytes.
Note, that the difference between these two storage models will be
larger on a G5 as soon as MacOS X gains full support for 64bit address
spaces. There, each pointer will consume 8 bytes rather than just 4,
which means for (1): 8 bytes per NSArray slot + 8 bytes NSNumber class
pointer => 800.000 + (8+4+4)*100.000 = 800.000 + 16*100.000 = 2.400.000
or roughly 2.4 MBytes (!) vs. 0.5 MBytes for solution #2.
B) Performance overhead for element addition/access
------------------------------------------------------
1) NSMutableArray:
Adding an integer requires the creation of a new NSNumber instance.
This implies at least two ObjC message send operations (+alloc plus
-init) and a memory allocation (allocating the NSNumber instance).
Adding the new NSNumber to our NSArray requires another message send
operation (-addObject:) plus a realloc() operation every n'th integer.
Overall this results in 2*(1+1+1) = 6 subroutine calls (objc_msgSend +
method implementation) plus one call to malloc() plus one call to
realloc() every n'th integer.
Accessing an integer means sending -objectAtIndex: to the NSArray plus
-intValue to the NSNumber object: 4 subroutine calls.
2) Simple C style integer array:
Adding an integer requires a single call to realloc() in the worst case
(can be amortized over many additions with a capacity schema). Storing
the integer requires just 2 or 3 machine instructions.
Overall just one subroutine call for ever n'th integer.
Accessing an integer requires 2 or 3 machine instructions and 0
subroutine calls.
----
Another thing which should be kept in mind here is that solution #1
will strew the integers all across the process' address space, while
solution #2 keeps all integers together in one continuous address
range. This is why an algorithm which wants to process the data in
sequence has a much better chance of taking advantage of the CPU's 1st
and 2nd level caches than is the case with solution #1.
Further #2 allows us to take advantage of AltiVec instructions to
process up to four integers at once with a single machine instruction.
Also, G5 double word instructions could be used to process up to two
integers with a single machine instruction. Neither of which is
possible with #1.
In my opinion, primitive data types should never be stored in a Cocoa
collection. Those collection classes were designed for objects and are
suboptimal when it comes to storing primitive data types.
Regards,
Dietmar Planitzer
_______________________________________________
cocoa-dev mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/cocoa-dev
Do not post admin requests to the list. They will be ignored.