• Open Menu Close Menu
  • Apple
  • Shopping Bag
  • Apple
  • Mac
  • iPad
  • iPhone
  • Watch
  • TV
  • Music
  • Support
  • Search apple.com
  • Shopping Bag

Lists

Open Menu Close Menu
  • Terms and Conditions
  • Lists hosted on this site
  • Email the Postmaster
  • Tips for posting to public mailing lists
Re: Repetitive Appending of Strings
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Repetitive Appending of Strings


  • Subject: Re: Repetitive Appending of Strings
  • From: Adam P Jenkins <email@hidden>
  • Date: Tue, 12 Feb 2008 16:07:07 -0500

I agree with you for appending millions of chars. My geekiness got the better of my and I wrote a little test program to time how long various methods of building a million digit numeric string 1 char at a time took. On my computer the results were as follows:

0.42 seconds: Using NSMutableString's appendFormat:, starting with a string with 0 capacity
0.41 seconds: Using NSMutableString's appendFormat:, starting with capacity 1 million
0.032 seconds: Using NSMutableData appendBytes:length: took ~ 0.032 seconds
0.0039 seconds: Assigning directly into a char array


Notice that for NSMutableString appendFormat: it doesn't make much difference whether the string is initially created with enough capacity to hold the whole string or not, which shows that it does in fact use a smart allocation strategy like I described below to amortize the cost of growing the string.

The fact that there's an order of magnitude difference between using NSMutableData appendBytes and assigning directly into an array also show that method call cost can be noticeable when we're talking millions of invocations a second.

On the flip side though I would note that all these times are so small that it's really not worth worrying about this sort of thing in cases where the number of appends isn't so high.

I've appended my test program in case you're interested. If you create a Foundation Tool project in XCode, and paste this in as the main method, then you should be able to build and run it. On my machine, a MacBook Pro with 2.33 Ghz CPU, the output is:

String append with initial capacity 0 took 0.419497 seconds
String append with initial capacity 1000000 took 0.406551 seconds
NSMutableData appendBytes took 0.033865 seconds
Assigning into char array took 0.003881 seconds


On Feb 12, 2008, at 2:48 PM, Andrew Merenbach wrote:

Hi, Adam,

I understand what you (and others) are saying about premature optimization. My feelings were only hunches, really, which is why Shark might be a good idea. A test of the buffer idea, however -- with an NSZoneCalloc'ed buffer of unichars -- and calculating a quotient out to a million digits, yielded what was something like a three-second time for the actual calculation -- compared with ten seconds when using the -appendFormat: method.

I can't imagine a case where a person would actually want more than a thousand -- or even more than a hundred, really -- digits for their equations, but I would like to make the program capable of this, if only for completeness (now, there's also a practical limit, I suppose, on how many bytes of data an NSTextView can hold, but I don't know what that is). Using a buffer seems to speed things up immensely -- in fact, the slowest part seems now to be the actual updating of the text view.

Cheers,
	Andrew

On Feb 12, 2008, at 10:53 AM, Adam P Jenkins wrote:

I agree that the method invocation itself is probably not a big problem. However appending to a string repeatedly could be very inefficient depending on how it's implemented. In the worst case, append always makes a new internal copy of its character buffer with just the right amount of extra space at the end to hold the appended content, so building up a string by appending one character at a time becomes a O(n^2) operation. I assume this is what the original poster was worried about.

You can easily avoid this worst case behavior by using the +stringWithCapacity: method of NSMutableString, passing in the max length of your digit string, so that your string already has an internal buffer preallocated to the max number of characters the string will grow to. Then you can append one char at a time up to the internal capacity without causing any reallocate/copies to occur. This seems more straightforward to me than using an explicit char array which you have to manage and null-terminate yourself.

Also, I don't know about NSMutableString specifically, but other string classes whose internals I've examined actually use a heuristic allocation strategy to avoid this worst case append behavior, at the expense of possibly using more memory than necessary. When an append is performed that causes the internal char buffer to need to be extended, rather than just extending it only as much as necessary to hold the new characters, they double the size of the internal buffer. This optimizes for the common usage of strings, where strings are built up from many small appends. The amortized cost of building up a string from n 1 character appends then becomes O(n) rather than O(n^2).

The upshot of all this is, don't prematurely optimize. It may be just fine to use -appendFormat: repeatedly if NSMutableString uses a heuristic like I described above. And in any case you can use +stringWithCapacity: to preallocate the internal buffer if you know how big your string can grow to. So that leaves you with only method call cost to worry about.

Adam

#import <Foundation/Foundation.h>

int main (int argc, const char * argv[]) {
    NSAutoreleasePool * pool = [NSAutoreleasePool new];

NSDate *tic, *toc;
NSMutableString *theString;
int niters = 1000000;
int i;

// start the string off at 0 capacity and let appendFormat expand it
tic = [NSDate date];
theString = [NSMutableString stringWithCapacity:0];
for (i = 0; i < niters; i++)
[theString appendFormat:@"%d", (i % 10)];
toc = [NSDate date];
printf("String append with initial capacity 0 took %f seconds\n",
[toc timeIntervalSinceDate:tic]);

[pool release];
pool = [NSAutoreleasePool new];

// start the string off with enough capacity to hold the whole string, so no
// reallocations are necessary
tic = [NSDate date];
theString = [NSMutableString stringWithCapacity:niters];
for (i = 0; i < niters; i++)
[theString appendFormat:@"%d", (i % 10)];
toc = [NSDate date];
printf("String append with initial capacity %d took %f seconds\n",
niters, [toc timeIntervalSinceDate:tic]);

[pool release];
pool = [NSAutoreleasePool new];


	// try using a buffer, using appendBytes:length
	tic = [NSDate date];
	NSMutableData *buffer = [NSMutableData dataWithCapacity:niters];
	for (i = 0; i < niters; i++) {
		char ch = (i % 10) + '0';
		[buffer appendBytes:&ch length:1];
	}
	toc = [NSDate date];
	printf("NSMutableData appendBytes took %f seconds\n",
		   [toc timeIntervalSinceDate:tic]);

	[pool release];
	pool = [NSAutoreleasePool new];

	// finally, try assigning directly to a memory buffer
	tic = [NSDate date];
	buffer = [NSMutableData dataWithCapacity:niters];
	char *chars = (char*)[buffer mutableBytes];
	for (i = 0; i < niters; i++) {
		char ch = (i % 10) + '0';
		chars[i] = ch;
	}
	toc = [NSDate date];
	printf("Assigning into char array took %f seconds\n",
		   [toc timeIntervalSinceDate:tic]);

	[pool release];
    return 0;
}

_______________________________________________

Cocoa-dev mailing list (email@hidden)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden


  • Follow-Ups:
    • Re: Repetitive Appending of Strings
      • From: John Stiles <email@hidden>
References: 
 >Repetitive Appending of Strings (From: Andrew Merenbach <email@hidden>)
 >Re: Repetitive Appending of Strings (From: "Michael Ash" <email@hidden>)
 >Re: Repetitive Appending of Strings (From: Adam P Jenkins <email@hidden>)
 >Re: Repetitive Appending of Strings (From: Andrew Merenbach <email@hidden>)

  • Prev by Date: IOKit.framework - The Destruction of my Project
  • Next by Date: Re: NSString ASCII filter
  • Previous by thread: Re: Repetitive Appending of Strings
  • Next by thread: Re: Repetitive Appending of Strings
  • Index(es):
    • Date
    • Thread