Re: Importing/parsing CSV files

  • Subject: Re: Importing/parsing CSV files
  • From: T&B <email@hidden>
  • Date: Wed, 13 Sep 2006 00:30:18 +1000

Following up my mention of my first script:

> I also wrote another script a few months back that handles the full CSV spec (including commas and linefeeds within quotes), but it steps through reading each character, so is slow as molasses.
>
> I tested a CSV file containing 324 records, 37 fields, with some quoted values containing linefeeds etc. My old script (character by character) took 10 seconds (on a 2GHz dual Intel) to convert it to a list of lists.

It turns out that this original script isn't as slow as I'd thought, compared to the alternatives. It actually processes my test CSV text in about 5 seconds, which is only about twice as long as my much more complicated newer script takes (that one is too long to post here).

This script steps through the text character by character. It builds up queueText one character at a time until it reaches a comma delimiter, then flushes queueText as a new field value at the end of queueRow. When it reaches a return and/or linefeed, it also flushes queueRow as a new row in queueTable. It uses the inQuotes boolean to track whether the current character is within quotes. While inQuotes is true, it treats commas and newlines as more text to add to queueText, and a doubled quote ("") as a single literal quote (").

I welcome any bug reports or tweaks that speed it up. I tried optimizing the if/then tests and so on, but that made no difference to speed and only reduced readability.
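
For anyone who wants to compare timings of their own tweaks, here is a rough sketch of a timing harness. It assumes the CsvToList handler from this post is in the same script; the names csvFile, csvText, csvTable and elapsedSeconds are only illustrative, and "current date" only resolves to whole seconds, so it suits tests that run for several seconds.

```applescript
-- Rough timing sketch; assumes the CsvToList handler from this post is in the same script.
-- The test file is chosen interactively, so any CSV file can be used.
set csvFile to choose file with prompt "Choose a CSV file to time:"
set csvText to read csvFile

set startTime to current date
set csvTable to CsvToList(csvText)
set elapsedSeconds to (current date) - startTime -- whole seconds only

-- Report how many records were parsed and how long it took
return {rowCount:(count of csvTable), elapsed:elapsedSeconds}
```

Running it a few times and comparing the elapsed value before and after a change should be enough to tell whether a tweak helps.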

You're welcome to use the script, just please keep my URL comment in it.

Thanks,
Tom
T&B

property quot : "\"" -- Because earlier Mac OS doesn't know quote constant
property lf : ASCII character 10 -- Named lf because later AppleScript versions define linefeed themselves

on CsvToList(csvText)
   -- 2006.09.12 T&B http://www.tandb.com.au/applescript/
   set queueText to "" -- characters of the field being built
   set queueRow to {} -- fields of the record being built
   set queueTable to {} -- completed records
   set inQuotes to false -- true while inside a quoted value
   set previousChar to ""
   set csvTextLength to length of csvText
   repeat with charN from 1 to csvTextLength
      set thisChar to character charN of csvText
      if thisChar is quot then
         if not inQuotes and previousChar is quot then
            -- doubled quote within quotes, so actually use a quote
            set queueText to queueText & thisChar
         end if
         set inQuotes to not inQuotes
      else if inQuotes then
         -- commas and line breaks inside quotes are ordinary text
         set queueText to queueText & thisChar
      else if thisChar is "," then
         -- literal "," used so no separate comma definition is needed
         set end of queueRow to queueText
         set queueText to ""
      else if thisChar is return or thisChar is lf then
         if previousChar is return and thisChar is lf then
            -- do nothing since new record already created by the return
         else
            set end of queueRow to queueText
            set queueText to ""
            set end of queueTable to queueRow
            set queueRow to {}
         end if
      else
         set queueText to queueText & thisChar
      end if
      set previousChar to thisChar
   end repeat
   -- flush a final record that has no trailing line break,
   -- including one that ends with a comma or a closing quote
   if queueText is not "" or queueRow is not {} or previousChar is quot then
      set end of queueRow to queueText
      set end of queueTable to queueRow
   end if
   return queueTable
end CsvToList
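
A small usage sketch may help when checking the quoting rules. It assumes the lines below are pasted into the same script as the CsvToList handler; sampleCsv and csvTable are just illustrative names. The sample covers a comma inside quotes, a doubled quote, and a line break inside quotes, and each record ends with a return.

```applescript
-- Usage sketch; assumes the CsvToList handler above is in the same script.
-- Record 2 has a quoted comma and doubled quotes; record 3 has a quoted line break.
set sampleCsv to "name,note" & return & ¬
   "\"Smith, J\",\"said \"\"hi\"\"\"" & return & ¬
   "Lee,\"two" & return & "lines\"" & return

set csvTable to CsvToList(sampleCsv)

-- Stepping through the handler by hand, csvTable should hold three records:
--   {"name", "note"}
--   {"Smith, J", "said \"hi\""}
--   {"Lee", "two" & return & "lines"}
return csvTable
```

Individual values can then be read with ordinary list references, for example "item 1 of item 2 of csvTable" for the first field of the second record.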
