Re: Importing/parsing CSV files
Re: Importing/parsing CSV files
- Subject: Re: Importing/parsing CSV files
- From: T&B <email@hidden>
- Date: Fri, 08 Sep 2006 00:56:56 +1000
Thanks for the reply Mark (and Emmanuel and Yvan).
I would not advise writing your own CSV parser
The need for one just keeps coming up, and now it's unavoidable. For instance, I've needed it to import CSV data from on-line bank statements, supplier price lists and on-line shopping carts, through my scripts (which manipulate the data in other ways) into my own or clients' AppleWorks or FileMaker or MYOB databases. I also need it for a new database program I'm writing which uses SQLite, and takes data from CSV, AppleWorks and other data sources.
unless you just like a challenge for its own sake.
Although I admit that the "challenge" has consumed me beyond reason in the past couple of days, it's not the only reason I'm doing it.
The format is surprisingly tricky, with the mixture of quoted and unquoted values, embedded commas, and most especially with the possibility of embedded newlines (which Excel happily outputs).
Yes, exactly. It's the allowance for quotes, and commas and newlines (linefeeds) within those quotes that makes the parsing difficult. If it wasn't for the quotes, I could simply:
property comma : ","
property linefeed : ASCII character 10
on CsvToList(csvText)
set tableListList to {}
set oldDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed
set rowList to text items in csvText
set AppleScript's text item delimiters to comma
repeat with rowItem in rowList
set end in tableListList to text items in rowItem
end repeat
set AppleScript's text item delimiters to oldDelimiters
return tableListList
end CsvToList
Many popular parsers, like Perl's Text::CSV, fail on such values (though Text::xSV handles them)
So is the latter reliable? Is it built into Mac OS X? Is there a way to tie it into a script such as using do shell script?
and the possibility changes the basic parsing model by destroying the one line == one record correspondence.
Yes, I have to allow for linefeeds (ie multiple lines) within a single value (embedded within quotes), so the one line == one record does not necessarily hold true.
If there's not already an OSAX or something
There may be one, but I don't know about it. And I would like to keep this free of OSAXen.
I would recommend using AppleScript to control an app that already understands the CSV format, like Excel or AppleWorks or Filemaker etc.
In at least one of the uses I have, I am replacing AppleWorks and FileMaker (and avoiding Excel), so I especially don't want to count on them being there.
It's funny you should mention AppleWorks, though. It's built in features don't support importing CSV. But I actually wrote a script a long time ago that does it. It even handles quotes, and commas within quotes, but not linefeeds within quotes. It is the "Import CSV" menu item, installed by the "Enhancement Pack for AppleWorks":
http://www.tandb.com.au/appleworks/import/
I also wrote another script a few months back that handles the full CSV spec (including commas and linefeeds within quotes), but it steps though reading each character, so is slow as molasses.
It seems to me that the power of AppleScript's text item delimiters is more than up to the task. Hasn't anyone created a CSV script? How have the rest of you dealt with the need to import CSV files? It seems to be a very common requirement.
Thanks,
Tom
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Applescript-users mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden