Re: Importing/parsing CSV files
- Subject: Re: Importing/parsing CSV files
- From: T&B <email@hidden>
- Date: Mon, 11 Dec 2006 15:44:36 +1100
Here's a slight correction to my previous script. It wasn't handling
text where the first item was in quotes. I fixed this by changing:
if separatedN is 1 and lineN is 1 then
to:
if separatedN is 1 and lineN is 1 and unquotedN is not 1 then
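A small sketch of why the extra test matters, using a made-up helper name (splitOnQuotes is only for illustration and is not part of the script). When the text is split on the quote character, a line that starts with a quoted item yields an empty first piece, and that piece does not follow any quoted string, so there is no previousString to substitute yet:

```applescript
-- Hypothetical helper: split text on the double-quote character.
on splitOnQuotes(sampleText)
	set oldDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "\""
	set pieces to text items of sampleText
	set AppleScript's text item delimiters to oldDelimiters
	return pieces
end splitOnQuotes

-- Quoted item in the middle: the first piece is ordinary unquoted text.
splitOnQuotes("1,\"a,b\",2") -- returns {"1,", "a,b", ",2"}

-- Quoted item first: piece 1 is an empty string (the unquotedN is 1 case).
splitOnQuotes("\"a,b\",2") -- returns {"", "a,b", ",2"}
```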
I've also added a nullReplacement argument, so I can set what replaces
,NULL, and ,, items. I'd usually set this to an empty string "" if
displaying the results in a table (e.g. in Xcode's Interface Builder),
or to the null class if I need to distinguish non-entries from empty
data.
Here are some timings on a test file of 324 rows by 37 columns. I also
tried using various lists as script object properties (rather than
simple local variables).
  1.1s  CSVToList -- older routine that just parses all as text items, not classed
  2.3s  CSVToListClassed -- script object properties: unquotedItems
  2.4s  CSVToListClassed -- script object properties: unquotedItems, separatedItems
  2.5s  CSVToListClassed -- script object properties: unquotedItems, lineItems
  55s   CSVToListClassed -- script object properties: none
As you can see, the best speed comes from using a script property only
for the biggest, one-off calculated list, unquotedItems. Using script
properties for the other lists just slows it down.
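For anyone who hasn't seen the trick, here is a minimal sketch of the pattern on its own. The handler and property names (sumViaScriptProperty, wrapper, theItems) are made up for illustration. The list is stored as a property of a script object and every access goes through that object instead of through a plain local variable:

```applescript
-- Sum a list of numbers, reading the items through a script object property.
on sumViaScriptProperty(theList)
	script wrapper
		-- Initialised from the handler's parameter when the script object is created.
		property theItems : theList
	end script
	set total to 0
	repeat with i from 1 to count theItems of wrapper
		set total to total + (item i of theItems of wrapper)
	end repeat
	return total
end sumViaScriptProperty

sumViaScriptProperty({1, 2, 3, 4}) -- returns 10
```

On a four-item list there is nothing to gain, of course. Going by the timings above, the difference only shows up on lists with many items.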
I hope this helps someone else. Thanks to Kai for providing an
original script that prompted me to try a list as a script property
and a method of using text item delimiters. Thanks to Yvan for
suggesting some options to try. Any other suggestions are welcome.
Here's the script:
on CsvToListClassed(csvText, separatorString, quoteString, newLineString, nullReplacement)
    set parsedList to {}
    set valueList to {}
    -- Assumption: quoteSubstitute is not defined anywhere in this post, so a
    -- placeholder that should never occur in the CSV text is set here.
    set quoteSubstitute to ASCII character 1
    set oldDelimiters to text item delimiters
    -- swap each doubled quote for the placeholder
    set text item delimiters to quoteString & quoteString
    set textItems to text items in csvText
    set text item delimiters to quoteSubstitute
    set csvText to textItems as string
    -- split on the remaining quotes: even-numbered items were in quotes
    set text item delimiters to quoteString
    script scriptObject
        property unquotedItems : text items in csvText
    end script
    set unquotedCount to count unquotedItems of scriptObject
    repeat with unquotedN from 1 to unquotedCount
        set unquotedItem to item unquotedN in unquotedItems of scriptObject
        if unquotedN mod 2 is 0 then -- this item is in quotes
            -- put back the quotes that were doubled inside the quoted string
            set text item delimiters to quoteSubstitute
            set textItems to text items in unquotedItem
            set text item delimiters to quoteString
            set previousString to textItems as text
        else -- these items aren't in quotes
            set text item delimiters to newLineString
            set lineItems to text items in unquotedItem
            set lineCount to count lineItems
            repeat with lineN from 1 to lineCount
                set lineItem to item lineN in lineItems
                set doEndRow to true
                set text item delimiters to separatorString
                set separatedItems to text items in lineItem
                set separatedCount to count separatedItems
                repeat with separatedN from 1 to separatedCount
                    set separatedItem to item separatedN in separatedItems
                    if separatedItem is "" then
                        if separatedN is 1 and lineN is 1 and unquotedN is not 1 then -- 1st item after a quoted string
                            set separatedItem to previousString
                        else if separatedN is separatedCount and lineN is lineCount and unquotedN is not unquotedCount then -- this item precedes a quoted string
                            set doEndRow to false
                        else -- just an empty between commas or is last item in text
                            set separatedItem to nullReplacement
                        end if
                    else if separatedItem is "NULL" then
                        set separatedItem to nullReplacement
                    else if separatedItem is "TRUE" then
                        set separatedItem to true
                    else if separatedItem is "FALSE" then
                        set separatedItem to false
                    else if separatedItem is quoteSubstitute then -- was an empty quoted string
                        set separatedItem to ""
                    else -- coerce it to a number
                        set separatedItem to separatedItem as number
                    end if
                    if doEndRow then
                        set end in valueList to separatedItem
                    end if
                    set item separatedN in separatedItems to separatedItem
                end repeat -- with separatedN
                if doEndRow then
                    set end in parsedList to valueList
                    set valueList to {}
                end if
            end repeat -- with lineN
        end if -- item was or wasn't quoted
    end repeat -- with unquotedN
    set text item delimiters to oldDelimiters
    return parsedList
end CsvToListClassed
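
The text item delimiters method used at the top of the handler (swapping every doubled quote for a placeholder) can be tried on its own. This is only a sketch: the handler name replaceText and the "*" placeholder are made up for illustration and are not part of the script above.

```applescript
-- Replace every occurrence of searchString in sourceText with replacementString.
on replaceText(sourceText, searchString, replacementString)
	set oldDelimiters to AppleScript's text item delimiters
	-- Split on the search string...
	set AppleScript's text item delimiters to searchString
	set textItems to text items of sourceText
	-- ...then join the pieces back together with the replacement.
	set AppleScript's text item delimiters to replacementString
	set resultText to textItems as text
	set AppleScript's text item delimiters to oldDelimiters
	return resultText
end replaceText

replaceText("say \"\"hi\"\"", "\"\"", "*") -- returns "say *hi*"
```

Restoring the old delimiters before returning matters, because text item delimiters are a global setting and any later coercion of a list to text would otherwise pick up the replacement string.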