Re: Best practices for creating and comparing lists of text?


Subject: Re: Best practices for creating and comparing lists of text?
From: has <email@hidden>
Date: Sat, 17 Dec 2005 13:22:54 +0000

CYB wrote:

>tell t starts with "#" -- Which is the objective to do this. This is the first time I see something like this.
>I understand "starts" as a containment operator (boolean) , as part of some question, but here forming a tell block, (?) I didn't catch it

I think Kai is practising for the Obfuscated AppleScript Awards. I had to refactor it into the standard AS idiom before I could read it well enough to figure out how it worked. Apart from the syntactic flummery, it also uses a clever algorithm that's very fast when the source data contains mostly 'modulename' style entries (but much slower when it contains mostly 'modulename=libname' style entries). However, the OP didn't specify how much data they need to crunch, how quickly it needs to be crunched, or the relative frequencies of each type of entry, so this optimisation is somewhat premature.

The ugly script object kludge is a standard workaround for AS's abysmally inefficient list item lookups (the extra referencing tricks the AS interpreter into routing around one dodgy bit of internal code by using another). Unless a list is very small, it's worth using, as AS will crawl otherwise.

Here's a simpler algorithm that should be much easier to understand, modify and troubleshoot. I've used the script object kludge for obvious reasons, and its performance is quite respectable (not much different to the average performance of Kai's more convoluted algorithm, in fact):

to parse_lists(txt)
    script k -- list access speed kludge
        property linesList : paragraphs of txt
        property moduleNames : {}
        property libNames : {}
    end script
    set tid to text item delimiters
    set text item delimiters to "="
    repeat with lineRef in k's linesList
        -- ignore blank lines and comments
        if lineRef's contents is not "" and lineRef does not start with "#" then
            set end of k's moduleNames to text item 1 of lineRef
            -- is library name the same as module name?
            if (count lineRef each text item) = 1 then
                set end of k's libNames to text item 1 of lineRef
            else
                set end of k's libNames to text item 2 of lineRef
            end if
        end if
    end repeat
    set text item delimiters to tid
    return {k's moduleNames, k's libNames}
end parse_lists

And frankly, if the performance of this code isn't good enough then it'd make far more sense to switch to a better language. AS is just plain SLOW, and resorting to hero programming to squeeze marginally better performance out of a slow language instead of just using a faster one is a fool's game. For example, here's a simple solution in Python - no speed demon itself - that's 50x faster than the AS solution above:

#!/usr/bin/python

import re

txt = """# List of module names (and library names if they differ)
Modulename1
Modulename2=Libname2
Modulename3
Modulename4=Libname4
"""

# this pattern matches identifiers, but can be changed if needed
patt = re.compile('^([a-zA-Z0-9_]+)(?:=([a-zA-Z0-9_]*))?$', re.M)
lst = patt.findall(txt)
moduleNames, libNames = zip(*[(s1, s2 or s1) for s1, s2 in lst])

print(moduleNames)
print(libNames)
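
For comparison, the line-by-line approach of the AS handler above can also be written in Python without a regular expression. This is only a rough Python 3 sketch: the function and variable names are chosen here for illustration and are not from the original scripts, and it assumes that an entry with an empty library name ("name=") should fall back to the module name, as the regex version does.

```python
def parse_lists(txt):
    """Return (module_names, lib_names) from 'name' or 'name=lib' lines."""
    module_names = []
    lib_names = []
    for line in txt.splitlines():
        # ignore blank lines and comments
        if not line or line.startswith("#"):
            continue
        # partition splits on the first "=" only; lib is "" when there is none
        module, _, lib = line.partition("=")
        module_names.append(module)
        # is library name the same as module name?
        lib_names.append(lib or module)
    return module_names, lib_names


sample = """# List of module names (and library names if they differ)
Modulename1
Modulename2=Libname2
Modulename3
Modulename4=Libname4
"""

module_names, lib_names = parse_lists(sample)
print(module_names)
print(lib_names)
```

Unlike the zip() one-liner, this returns two empty lists for input with no entries rather than raising an error, which may or may not matter for the OP's data.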

HTH

has
--
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." -- Brian Kernighan