Re: Importing/parsing CSV files
- Subject: Re: Importing/parsing CSV files
- From: T&B <email@hidden>
- Date: Thu, 7 Dec 2006 03:29:48 +1100
Hi all (especially Kai),
Following up:
Then there's the separate question of some CSV files where the
unquoted items are meant to be parsed as values (eg numbers, null,
true, false) and only the quoted items are meant to be parsed as
literals (ie strings).
For simplicity (ie avoiding escape characters like \"), in examples,
I'll use a single quote ' as the quote character.
So, for example, this raw text:
1,2,3,NULL,'5th item'
6,'7th','8th''s item',9.2,-10
11,,'','14th'
'15th','16th',17
should parse as this list of values and strings:
{{1, 2, 3, null, "5th item"},
{6, "7th", "8th's item", 9.2, -10},
{11, null, "", "14th"},
{"15th", "16th", 17}}
Notice that the parsed numbers and null are values, not strings, in
the output.
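As a rough sketch of that rule for unquoted items, something like the following could classify a single item on its own. The handler name classifyUnquoted is made up for illustration and is not part of the draft handler; falling back to the original string when the number coercion fails is also an assumption, since the draft simply lets that error through.

```applescript
-- Hypothetical helper: turn one unquoted CSV item into a value.
on classifyUnquoted(rawItem)
	if rawItem is "" or rawItem is "NULL" then
		return null
	else if rawItem is "TRUE" then
		return true
	else if rawItem is "FALSE" then
		return false
	end if
	try
		return rawItem as number
	on error
		-- assumption: keep text that isn't a number as a string
		return rawItem
	end try
end classifyUnquoted

set classifiedItems to {}
repeat with sampleItem in {"3", "-10", "NULL", "", "true", "FALSE"}
	set end of classifiedItems to classifyUnquoted(contents of sampleItem)
end repeat
classifiedItems
```

Since AppleScript string comparisons ignore case by default, "true" and "TRUE" are treated alike here.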
Below is a draft handler. It works, but I haven't optimized it. I've
moved the ASCII character call into a property declaration so the OSAX
isn't called each time the handler runs. I've also made separatorString,
quoteString and newLineString arguments so they can be varied with each
call (and each can be more than one character long).
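To illustrate both points in isolation, here is a minimal sketch, assuming a made-up handler name (splitText) and a made-up property (crlf). As far as I know, a property's initial value is worked out when the script is compiled, so the ASCII character calls are not repeated on each call, and a text item delimiter can be longer than one character, such as a CR+LF pair.

```applescript
-- Hypothetical names; this is not part of the draft handler.
-- The property's initial value is set up once, not on every call.
property crlf : (ASCII character 13) & (ASCII character 10)

on splitText(sourceText, delimiterString)
	set oldDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delimiterString
	set splitItems to text items of sourceText
	set AppleScript's text item delimiters to oldDelimiters
	return splitItems
end splitText

-- a two-character row delimiter
splitText("1,2" & crlf & "3,4", crlf) -- {"1,2", "3,4"}
```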
So, using the four-row example text above, I run the following, which
gives the desired output.
Anyone got a better way of doing this, or an improvement?
Thanks,
Tom
set csvText to "1,2,3,NULL,'5th item'
6,'7th','8th''s item',9.2,-10
11,,'','14th'
'15th','16th',17"
set separatorString to ","
set quoteString to "'"
set newLineString to ASCII character 10 -- linefeed
CsvToListClassed(csvText, separatorString, quoteString, newLineString)
property quoteSubstitute : ASCII character 1
on CsvToListClassed(csvText, separatorString, quoteString, newLineString)
	set parsedList to {}
	set valueList to {}
	set oldDelimiters to text item delimiters
	set text item delimiters to quoteString & quoteString
	set textItems to text items in csvText
	set text item delimiters to quoteSubstitute
	set csvText to textItems as string
	set text item delimiters to quoteString
	script scriptObject
		property unquotedItems : text items in csvText
	end script
	set unquotedCount to count unquotedItems of scriptObject
	repeat with unquotedN from 1 to unquotedCount
		set unquotedItem to item unquotedN in unquotedItems of scriptObject
		if unquotedN mod 2 is 0 then -- this item is in quotes
			set text item delimiters to quoteSubstitute
			set textItems to text items in unquotedItem
			set text item delimiters to quoteString
			set previousString to textItems as text
		else -- these items aren't in quotes
			set text item delimiters to newLineString
			set lineItems to text items in unquotedItem
			set lineCount to count lineItems
			repeat with lineN from 1 to lineCount
				set lineItem to item lineN in lineItems
				set doEndRow to true
				set text item delimiters to separatorString
				set separatedItems to text items in lineItem
				set separatedCount to count separatedItems
				repeat with separatedN from 1 to separatedCount
					set separatedItem to item separatedN in separatedItems
					if separatedItem is "" then
						if separatedN is 1 and lineN is 1 then
							-- 1st item after a quoted string
							set separatedItem to previousString
							set previousString to null
						else if separatedN is separatedCount and lineN is lineCount ¬
							and unquotedN is not unquotedCount then
							set doEndRow to false
						else
							-- just an empty between commas or is last item in text
							set separatedItem to null
						end if
					else if separatedItem is "NULL" then
						set separatedItem to null
					else if separatedItem is "TRUE" then
						set separatedItem to true
					else if separatedItem is "FALSE" then
						set separatedItem to false
					else if separatedItem is quoteSubstitute then
						set separatedItem to ""
					else -- coerce it to a number
						set separatedItem to separatedItem as number
					end if
					if doEndRow then
						set end in valueList to separatedItem
					end if
					set item separatedN in separatedItems to separatedItem
				end repeat -- with separatedN
				if doEndRow then
					set end in parsedList to valueList
					set valueList to {}
				end if
			end repeat -- with lineN
		end if -- item was or wasn't quoted
	end repeat -- with unquotedN
	set text item delimiters to oldDelimiters
	return parsedList
end CsvToListClassed
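
The doubled-quote handling at the top of the handler can be tried on its own. Below is a minimal sketch of just that step, run on one row of the example. The names replaceText and maskCharacter are made up for illustration; only the idea of masking each doubled quote, splitting on the quote, then restoring the quote comes from the handler above.

```applescript
-- Hypothetical names throughout; only the masking idea is from the handler.
property maskCharacter : ASCII character 1

on replaceText(sourceText, searchString, replacementString)
	set oldDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchString
	set splitItems to text items of sourceText
	set AppleScript's text item delimiters to replacementString
	set joinedText to splitItems as text
	set AppleScript's text item delimiters to oldDelimiters
	return joinedText
end replaceText

set rawLine to "6,'7th','8th''s item',9.2,-10"
-- 1. hide each doubled quote behind the mask character
set maskedLine to replaceText(rawLine, "''", maskCharacter)
-- 2. split on the single quote: even-numbered pieces were inside quotes
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to "'"
set linePieces to text items of maskedLine
set AppleScript's text item delimiters to savedDelimiters
-- 3. put the literal quote back inside each quoted piece
set quotedStrings to {}
repeat with pieceN from 2 to (count linePieces) by 2
	set end of quotedStrings to replaceText(item pieceN of linePieces, maskCharacter, "'")
end repeat
quotedStrings -- {"7th", "8th's item"}
```

Note that an empty quoted string ('') is also a doubled quote, so it gets masked as well. That is why the handler treats an unquoted item equal to quoteSubstitute as "".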