Re: Reading large texts
- Subject: Re: Reading large texts
- From: has <email@hidden>
- Date: Wed, 3 Apr 2002 17:58:50 +0100
Paul Berkowitz wrote:
> I was under the impression that there was a size limit to a string variable
> (I'm reading a lot of data)... If there isn't, your suggestion would
> definitely be helpful. Any additional clarification on string variable size
> limitations?
On large strings, comparison operators (=, <, >, etc.) break. So do the
"word" and "paragraph" keywords. Not sure if some osax commands break; maybe
some other stuff too. This sort of stuff can generally be worked around,
however (e.g. see stringLib on my website). Apart from that, the only real
issues are memory (i.e. have you allocated plenty of it?) and, possibly,
speed (chucking around huge gobs of data may take time).
> Paul wrote:
>
>> Hmm. i usually read the whole text in at once, then use AppleScript's text
>> item delimiters {return} and {tab} as needed to get the separate items.
>> That should be much faster than reading a line at a time since tids don't
>> require a scripting addition (read) call.
>
> I haven't found any problem with that. I do recall scripts here reading text
> in 'chunks' of 30 K to avoid some sort of problem along those lines: I'm
> sure someone will write in explaining why. Perhaps it was for an older
> version of AS (I'm using 1.8.2b3).
Yep: would also be interested to know (I've happily read in huge stonking
text files under AS1.3.4 & AS1.3.7 without any adverse effects).
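
For reference, a minimal sketch of the read-it-all-then-split approach Paul describes. The handler names splitText and parseTabbedText are made up for illustration, and the sketch assumes each split stays under the list-size limit discussed in this thread:

```applescript
-- Hypothetical helper: split theText on theDelimiter using
-- AppleScript's text item delimiters, restoring the old value afterwards.
on splitText(theText, theDelimiter)
	set oldTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theDelimiter
	set theItems to text items of theText
	set AppleScript's text item delimiters to oldTIDs
	return theItems
end splitText

-- Hypothetical helper: turn return-separated lines of tab-separated
-- fields into a list of lists (one inner list per line).
on parseTabbedText(theText)
	set theRows to {}
	repeat with aLine in splitText(theText, return)
		set end of theRows to splitText(contents of aLine, tab)
	end repeat
	return theRows
end parseTabbedText

-- Usage (the file path is an assumption, not from this thread):
-- set theRows to parseTabbedText(read file "Macintosh HD:data.txt")
```

Only the single read call goes through a scripting addition; all the splitting is done by AppleScript itself.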
>
> One thing to bear in mind, however, if you are ever tempted just to use tab
> as delimiter without going through the separate lines (return delimiter)
> first, is that lists formed by 'set ls to text items of largeText' have a
> limit of about 4600 items or you get a "Stack overload" error. Here's a
> handler that can fix that:
[...]
> on error -- more than ~4600, too many for list
> set s to {}
> set a to 1
> set z to 4000
[...]
I'm pretty sure I've seen stack overflows coercing to <4000 items on
occasion - might be wise to play safe and make the value lower, or add some
"self-correction" code like I used in everyItemLib.
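
A rough sketch of what such self-correction might look like: fetch the text items in ranges, and halve the range size whenever a range fails. This is not the everyItemLib code. The handler name everyTextItem and its parameters are hypothetical, and the sketch assumes that counting text items and fetching them by range are not themselves subject to the overflow:

```applescript
-- Hypothetical handler: return every text item of theText, fetching
-- chunkSize items at a time and shrinking chunkSize after any failure.
on everyTextItem(theText, theDelimiter, chunkSize)
	set oldTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theDelimiter
	try
		set itemCount to count text items of theText
		set resultList to {}
		set a to 1
		repeat while a <= itemCount
			set z to a + chunkSize - 1
			if z > itemCount then set z to itemCount
			try
				set resultList to resultList & (text items a thru z of theText)
				set a to z + 1
			on error errMsg number errNum
				-- Assumption: the failure was a stack overflow, so retry
				-- the same range with half the chunk size.
				if chunkSize <= 1 then error errMsg number errNum
				set chunkSize to chunkSize div 2
			end try
		end repeat
	on error errMsg number errNum
		-- Restore the delimiters before passing the error on.
		set AppleScript's text item delimiters to oldTIDs
		error errMsg number errNum
	end try
	set AppleScript's text item delimiters to oldTIDs
	return resultList
end everyTextItem
```

Starting from something like everyTextItem(largeText, tab, 4000) means the chunk size only drops if 4000 turns out to be too many.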
>
> Another is that if you are getting the lines of r first, it's quicker to
> operate on
>
> paragraph i of r
Yep, though with one caveat [IIRC]: the paragraph keyword is limited to
<32KB strings, so you'd need to process >32KB files in chunks if you used
this.
Cheers,
has
--
http://www.barple.connectfree.co.uk/ -- The Little Page of Beta AppleScripts