Re: Importing/parsing CSV files
- Subject: Re: Importing/parsing CSV files
- From: kai <email@hidden>
- Date: Wed, 13 Sep 2006 00:30:09 +0100
On 12 Sep 2006, at 05:42, T&B wrote:
> In the absence of anyone else replying with a solution...
Late to the party, Tom - but I've had a brief look at one or two
tid-based (text item delimiter) ways to do this.
The following handler assumes that the CSV data is plain text
(formatted as output from Excel) and contains none of the control
characters ASCII 0 to 2, which it uses internally as placeholders:
-----------------------
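on |csv as list| from t
	set d to text item delimiters -- save the caller's delimiters
	-- placeholder characters assumed not to occur in the data
	set q to ASCII character 0 -- stands in for an escaped quote ("")
	set p to ASCII character 1 -- marks a row break
	set c to ASCII character 2 -- marks a field break
	-- replace each doubled quote with the placeholder q
	set text item delimiters to "\"\""
	set t to t's text items
	set text item delimiters to q
	set t to t as string
	-- split on the remaining quotes: odd items lie outside quoted fields
	set text item delimiters to "\""
	script o
		property l : t's text items
	end script
	-- outside quotes only, turn commas into c and line breaks into p
	repeat with i from 1 to count o's l by 2
		set text item delimiters to ","
		set t to text items of o's l's item i
		set text item delimiters to c
		set t to t as string
		set text item delimiters to p
		set o's l's item i to t's paragraphs as string
	end repeat
	-- rejoin the pieces, dropping the enclosing quotes
	set text item delimiters to ""
	set t to o's l as string
	-- restore escaped quotes as single quote characters
	set text item delimiters to q
	set o's l to t's text items
	set text item delimiters to "\""
	set t to o's l as string
	-- split into rows, then split each row into fields
	set text item delimiters to p
	set o's l to t's text items
	set text item delimiters to c
	repeat with i from 1 to count o's l
		set o's l's item i to text items of o's l's item i
	end repeat
	set text item delimiters to d -- restore the caller's delimiters
	o's l
end |csv as list|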
-----------------------
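The handler above leans on one idiom throughout: split the text on one delimiter, then rejoin it with another. Here is a minimal sketch of that idiom in isolation. The handler name replace_text and its parameter names are made up for illustration and are not part of the handler above.

```applescript
-- Hypothetical helper (an illustration only): replace every occurrence
-- of one string with another by way of text item delimiters.
on replace_text(sourceText, searchString, replacementString)
	set savedDelimiters to text item delimiters
	set text item delimiters to searchString
	set parts to sourceText's text items -- split on the search string
	set text item delimiters to replacementString
	set resultText to parts as text -- rejoin with the replacement
	set text item delimiters to savedDelimiters -- restore the caller's setting
	return resultText
end replace_text

replace_text("a,b,c", ",", ";") -- result: "a;b;c"
```

Each "set text item delimiters ... / text items / as string" run in the handler is one such replacement, written inline rather than factored out.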
This next (longer and slightly slower) variation should process
either plain or Unicode text - and also checks whether the value
separator is a comma or semicolon:
-----------------------
on |csv as list| from t
	set d to text item delimiters -- save the caller's delimiters
	-- double a run of colons until it no longer occurs anywhere in the text
	set u to ":"
	set text item delimiters to u
	repeat until (count t's text items) is 1
		set u to u & u
		set text item delimiters to u
	end repeat
	-- build three placeholders from the unused run
	set q to "q" & u -- stands in for an escaped quote ("")
	set p to "p" & u -- marks a row break
	set c to "c" & u -- marks a field break
	-- replace each doubled quote with the placeholder q
	set text item delimiters to "\"\""
	set t to t's text items
	set text item delimiters to q
	tell t to set t to beginning & ({""} & rest) -- join the items without an explicit "as string" coercion
	-- split on the remaining quotes: odd items lie outside quoted fields
	set text item delimiters to "\""
	script o
		property l : t's text items
		property saved_tids : d -- kept as a property so value_separator can restore it
		on value_separator(e)
			repeat with i from 1 to e by 2
				tell l's item i to if (count) > 0 then considering case
					if "," is in it then
						return ","
					else if ";" is in it then
						return ";"
					end if
				end considering
			end repeat
			set text item delimiters to saved_tids
			error "No value separator was identified."
		end value_separator
	end script
	set e to count o's l
	set s to o's value_separator(e)
	-- outside quotes only, turn separators into c and line breaks into p
	repeat with i from 1 to e by 2
		set text item delimiters to s
		set t to text items of o's l's item i
		set text item delimiters to c
		tell t to set t to beginning & ({""} & rest)
		set text item delimiters to p
		tell t's paragraphs to set o's l's item i to beginning & ({""} & rest)
	end repeat
	-- rejoin the pieces, dropping the enclosing quotes
	set text item delimiters to ""
	tell o's l to set t to beginning & ({""} & rest)
	-- restore escaped quotes as single quote characters
	set text item delimiters to q
	set o's l to t's text items
	set text item delimiters to "\""
	tell o's l to set t to beginning & ({""} & rest)
	-- split into rows, then split each row into fields
	set text item delimiters to p
	set o's l to t's text items
	set text item delimiters to c
	repeat with i from 1 to count o's l
		set o's l's item i to text items of o's l's item i
	end repeat
	set text item delimiters to d -- restore the caller's delimiters
	o's l
end |csv as list|
-----------------------
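The second variation picks its placeholders by doubling a run of colons until the run no longer occurs in the data. Here is a minimal sketch of that step on its own. The handler name unused_marker is hypothetical, and the sketch tests with "contains" rather than counting text items as the handler does, so treat it as an illustration of the idea rather than a drop-in replacement.

```applescript
-- Hypothetical helper (an illustration only): return a run of colons
-- that does not occur anywhere in the given text.
on unused_marker(sourceText)
	set marker to ":"
	-- keep doubling the run while the text still contains it
	repeat while sourceText contains marker
		set marker to marker & marker
	end repeat
	return marker
end unused_marker

unused_marker("10:30::45") -- result: "::::"
```

Prefixing such a run with distinct letters, as the handler does with "q", "p" and "c", yields three placeholders that cannot occur in the data, because each one contains the unused run.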
When tested here on a 1000-row CSV file, these handlers generally
ran up to 40 times faster than the character-by-character method.
(Whichever method is used, processing Unicode text takes
substantially longer than plain text - although the speed ratio
between the two approaches stays roughly the same.) YMMV, of course.
But the comparison might be useful...
---
kai