Re: Importing/parsing CSV files


Subject: Re: Importing/parsing CSV files
From: kai <email@hidden>
Date: Wed, 13 Sep 2006 05:02:07 +0100


On 13 Sep 2006, at 04:09, T&B wrote:

> Following up, here are some more accurate speed measurements:
>
> 37 column x 324 row CSV file, all values in quotes:
>
> Time    Method/script
> 4.6s    Tom's character by character (posted earlier)
> 2.3s    Tom's delimiter parsing (linefeed, comma, then quotes)
> 0.75s   Kai's delimiter parsing and temp ASCII 0, 1, 2 substitution
>
> 19 column x 2323 row CSV file, about 10% of values in quotes:
>
> Time    Method/script
> 60s     Tom's character by character (posted earlier)
> 22s     Tom's delimiter parsing (linefeed, comma, then quotes)
> 0.65s   Kai's delimiter parsing and temp ASCII 0, 1, 2 substitution
>
> So the speed really is phenomenal, and negates the need for me to call an external perl or python or C routine for the CSV parsing, whoohoo!
>
> It verifies my early theory:
>
> > It seems to me that the power of AppleScript's text item delimiters is more than up to the task.
>
> but executes a solution much faster than I had yet managed. Is the speed due to the use of a list property in a script object within the handler? Why is that so much faster?

There's little doubt that the use of a script object, as a way of referencing a list within a handler, is a major factor here - especially since the difference in performance becomes more pronounced as the length of the list increases.

Perhaps a good start to understanding the principle is the "A Reference To Operator" section of the AppleScript Language Guide (about halfway down the following page, under the heading "NOTES"): http://developer.apple.com/documentation/AppleScript/Conceptual/AppleScriptLangGuide/AppleScript.99.html

The article explains that the speed of access to items in a particularly long list can be substantially improved by using a reference to that list - rather than by referring directly to the list itself. The precise reasons for the performance characteristics of references in this context are not generally known. It may have something to do with the short-circuiting of certain checks that AppleScript normally makes for circular references - and possibly with the way in which a list is accessed internally.
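As a rough illustration of the difference between direct access and access through a reference, here is a minimal sketch. The names bigList, bigListRef and numItems are hypothetical, and the list size is arbitrary; the list is declared as a property because "a reference to" cannot point at a local variable. Both loops compute the same sum, and only the way the list is addressed changes. Timings will vary with list length and AppleScript version, so none are claimed here.

```applescript
-- Hypothetical names; a sketch only, not taken from the scripts in this thread.
property bigList : {}

set numItems to 5000
set bigList to {}
repeat with n from 1 to numItems
	set end of bigList to n
end repeat

-- Direct access: the list is addressed by its variable name.
set directTotal to 0
repeat with n from 1 to numItems
	set directTotal to directTotal + (item n of bigList)
end repeat

-- Access through a reference: typically much faster on long lists.
set bigListRef to a reference to bigList
set refTotal to 0
repeat with n from 1 to numItems
	set refTotal to refTotal + (item n of bigListRef)
end repeat

-- Both totals should be identical.
return {directTotal, refTotal}
```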

It was evidently Serge Belleudy-d'Espinose who discovered that using a script object's properties to reference a list was not only more efficient than direct access, but also faster than using "a reference to". (Global variables or script properties could also be used for similar referencing, e.g. "item n of *my* scriptProperty".)
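The script object technique can be sketched as follows. This is an assumed, minimal form of the idiom rather than the script whose timings are quoted above; the handler name sumItems, the script object name listHolder and the property name theItems are all hypothetical. Inside a handler the list is a local variable, so wrapping it in a script object's property gives the loop something to refer to.

```applescript
-- Hypothetical handler; a sketch of the script object idiom only.
on sumItems(theList)
	-- The property is initialised when the handler runs,
	-- so it holds the same list that was passed in.
	script listHolder
		property theItems : theList
	end script
	
	set total to 0
	repeat with n from 1 to (count theList)
		-- Address the list through the script object's property
		-- rather than through the local variable theList.
		set total to total + (item n of listHolder's theItems)
	end repeat
	return total
end sumItems

sumItems({1, 2, 3}) -- result: 6
```

At the top level of a script, where the list can be declared as a script property directly, the same effect should be available without a wrapper by writing "item n of my theItems".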

---
kai



