Re: CSV parsing (large files)
- Subject: Re: CSV parsing (large files)
- From: Stephen Hoffman <email@hidden>
- Date: Wed, 30 Jul 2008 10:42:18 -0400
- Organization: HoffmanLabs LLC
From: Jacob Bandes-Storch
I've got several large CSV files (about 1.25 million lines in total,
averaging maybe 10 columns) that I want to parse into a 2D array. I
found some parsing code that uses NSScanner, and it works fine with
small files, but it's very resource-intensive and slow with large
files. Should I try to optimize it further, try using C instead of
Objective-C for parsing, or something else?
The bottleneck could be at the other end of the parsing (building and
traversing the array), or there could be some sort of leak in the
parser code.
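If the parser runs in one tight loop, a common culprit is autoreleased
temporaries piling up across a million-plus iterations. A minimal
sketch (not the poster's code; the path, the naive comma split, and the
drain interval are all placeholders) of draining an autorelease pool
periodically while building the rows:

// Read the file, split into lines, and drain a pool every 10k rows so
// NSScanner/NSString temporaries don't accumulate for the whole file.
NSString *contents = [NSString stringWithContentsOfFile:path
                                               encoding:NSUTF8StringEncoding
                                                  error:NULL];
NSArray *lines = [contents componentsSeparatedByString:@"\n"];
NSMutableArray *rows = [NSMutableArray arrayWithCapacity:[lines count]];

NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
NSUInteger i = 0;
for (NSString *line in lines) {
    if ([line length] == 0) continue;
    // Naive split: does NOT handle quoted fields containing commas.
    [rows addObject:[line componentsSeparatedByString:@","]];
    if (++i % 10000 == 0) {
        [pool drain];
        pool = [[NSAutoreleasePool alloc] init];
    }
}
[pool drain];

Note that the naive -componentsSeparatedByString: split above ignores
quoting and embedded newlines, which is part of why a real CSV library
is worth considering.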
Given you're doing this in production (why else worry about
optimization?), there are open source libraries around that can deal
with the various foibles of this format.
Try libcsv, for instance. (That C code is quite portable.)
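For illustration, a sketch of libcsv's callback-driven API as I recall
it (csv_init / csv_parse / csv_fini / csv_free from the 3.x releases);
this only counts fields and rows rather than building the 2D array:

/* Feed the file through libcsv in chunks; cb_field fires once per
   field, cb_row once per record. */
#include <csv.h>
#include <stdio.h>
#include <stdlib.h>

struct counts { unsigned long fields, rows; };

static void cb_field(void *s, size_t len, void *data)
{
    ((struct counts *)data)->fields++;   /* s/len: one dequoted field */
}

static void cb_row(int terminator, void *data)
{
    ((struct counts *)data)->rows++;     /* end of one record */
}

int main(int argc, char **argv)
{
    struct csv_parser p;
    struct counts c = {0, 0};
    char buf[64 * 1024];
    size_t n;
    FILE *fp = fopen(argv[1], "rb");

    if (fp == NULL || csv_init(&p, 0) != 0)
        return EXIT_FAILURE;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        if (csv_parse(&p, buf, n, cb_field, cb_row, &c) != n) {
            fprintf(stderr, "parse error: %s\n",
                    csv_strerror(csv_error(&p)));
            return EXIT_FAILURE;
        }

    csv_fini(&p, cb_field, cb_row, &c);
    csv_free(&p);
    fclose(fp);
    printf("%lu fields in %lu rows\n", c.fields, c.rows);
    return EXIT_SUCCESS;
}

Since Objective-C is a superset of C, the callbacks could just as
easily append NSString objects to an NSMutableArray row instead of
counting.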
Or extricate yourself from this particular format; CSV tends to be a
"just fixing one more bug" file format, with no widespread agreement on
quoting and escaping.
There are tools and formats which are suited for larger data sets.