Re: NSXMLParser memory consumption
- Subject: Re: NSXMLParser memory consumption
- From: George King <email@hidden>
- Date: Sun, 5 Apr 2009 09:35:41 -0700
I hit a stumbling block when passing large files (multi-GB) to
NSXMLParser.
Are you doing this in 64 bit?
Yes, I switched to building x86_64 because NSXMLParser was refusing
files over 4GB.
It appears that NSXMLParser's initWithContentsOfURL: method loads
the contents of the entire file into memory, which causes
virtual memory thrashing at file sizes approaching my physical
RAM (2 GB in this case, so I start seeing performance issues at
around 1.3 GB). After reading the CFXMLParser documentation, I
suspect that Core Foundation does the same thing.
Yes, probably. Have you tried initializing it with a memory-mapped
NSData instead of an NSURL?
Thank you for the suggestion; I was unaware of
initWithContentsOfMappedFile:. This worked to a certain extent, in
that it kept memory consumption to within the bounds of available
physical memory, but it still consumed all the memory available. This
caused a good deal of thrashing when I tried running the test and
working at the same time.
Can somebody suggest an alternative API for parsing XML whose
memory requirements at initialization are not linear in file
size? Given the event-driven design, I originally imagined
that the parser would read through a file incrementally, without
loading it all into memory.
My Objective-XML might help, though I haven't tried it for files
quite that large yet. The largest I tried was a couple of hundred
MB, which worked fine. For one, it uses significantly less memory
than NSXMLParser (and is faster), trying very hard to touch as
little memory as possible and keep as little of it around as
possible. It also actually does incremental loading of URLs, though
it will detect file-URLs and then load them directly (and use
dataWithContentsOfURL:, which will likely also do a read() instead
of an mmap()).
It has an NSXMLParser-compatible SAX API as well as a more
convenient MAX API.
Current download is at:
http://www.metaobject.com/downloads/Objective-C/Objective-XML-5.1.tgz
I just tried it with a 190 MB XML file, which took around 7s to
parse on my MacBook Pro. RPRVT stayed at 600KB the whole time,
RSHRD was also not affected. RSIZE did go to 191MB, reflecting the
fact that more and more of the mapped file's memory gets mapped into
the process in question.
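
To illustrate the mapped-file behaviour described above, here is a rough
POSIX sketch in C. It is not taken from Objective-XML or NSData; the
helper name count_byte_mapped is hypothetical. It maps a file read-only
and scans it front to back. The touched pages are file-backed, so they
count toward resident size while being scanned, but the kernel can
normally discard them without writing anything to swap.

```c
#define _POSIX_C_SOURCE 200809L  /* expose posix_madvise() under strict -std modes */

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library mentioned in this thread):
 * counts occurrences of `needle` in the file at `path` by scanning a
 * read-only memory mapping. Returns 0 on success, -1 on error. */
int count_byte_mapped(const char *path, unsigned char needle, size_t *out_count)
{
    assert(path != NULL && out_count != NULL);
    *out_count = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    void *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (base == MAP_FAILED)
        return -1;

    /* Advisory only: hint that the pages are read sequentially. */
    (void)posix_madvise(base, len, POSIX_MADV_SEQUENTIAL);

    const unsigned char *p = base;
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] == needle)
            count++;
    }

    munmap(base, len);
    *out_count = count;
    return 0;
}
```

Calling count_byte_mapped(path, '<', &n) on an XML file gives a crude
count of tag openings. The point is only that no buffer proportional to
the file size is ever allocated by the process itself.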
Thanks for the link - I will investigate. Yesterday I got into
libxml2, and the xmlReader API provides functionality equivalent to
NSXMLParser without the memory consumption.
_______________________________________________
Cocoa-dev mailing list (email@hidden)