Re: Line count of a referenced text file?
- Subject: Re: Line count of a referenced text file?
- From: Matthew Broms <email@hidden>
- Date: Tue, 13 Nov 2001 15:22:57 -0400
Interesting, thanks for the thorough comments, Chris. You are correct
that other languages work in a similar fashion, so I agree it's not a
hindrance specific to AppleScript and don't mean to pick on it. But since
Apple IS omnipotent, I thought they could do it anyway :-)
I'm actually doing it exactly the way you suggested. The reason for my
original question is that before performing the primary processing, I need
to gather info on each file for user feedback and progress indicators, which
is why I was hoping I could use a few basic delimiter counts on a referenced
file such as line/paragraph count. Since I don't have that luxury, I'm
repeating through large chunks of the file, doing counts based on "myDelim",
which is, of course, slower and more tedious because I have to track position.
For my primary processing, I was surprised to find that it's roughly the
same speed to set a variable to a line of text whether that line of text is
retrieved by a read from a referenced file or from a variable stored in
memory. I just assumed that it would be much faster reading from a variable
in memory versus performing a read from a referenced file, but the two
versions of my applet execute at roughly the same speed. If anything, the
referenced version may go a little faster, perhaps from not having to drag
around this huge variable. It's hard to measure though. So I'm sticking
with going line-by-line for the core processing, setting "curLine" with a
"read until return", versus reading a large chunk of text into a variable and
setting "curLine" with paragraph i.
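For illustration, the two variants compared above might look roughly like
this. It is only a minimal sketch, not the actual applet: "the_file" is
assumed to be a file reference as in Chris's example, the processing step is
a placeholder, and error -39 is the end-of-file error.

```applescript
-- Variant A: set curLine with a "read until return" on the open file.
set fp to open for access the_file
try
	repeat
		set curLine to read fp until return
		-- process curLine here (placeholder)
	end repeat
on error number -39
	-- end of file: no more lines to read
end try
close access fp

-- Variant B: read the whole file into a variable, then walk its paragraphs.
set allText to read the_file
repeat with i from 1 to (count paragraphs of allText)
	set curLine to paragraph i of allText
	-- process curLine here (placeholder)
end repeat
```

Variant A only ever holds one line in memory, while Variant B keeps the
whole file in "allText" for the duration of the loop.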
Matt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> From: Christopher Nebel <email@hidden>
> Date: Tue, 13 Nov 2001 09:46:36 -0800
> To: Matthew Broms <email@hidden>
> Cc: Applescript Users <email@hidden>
> Subject: Re: Line count of a referenced text file?
>
> On Monday, November 12, 2001, at 01:49 PM, Matthew Broms wrote:
>
> > Using purely AppleScript, can I get the line count of a text file using
> > just a reference to it (NOT reading the contents into a variable)? If I
> > can, can I get the count of any delimiter of my choosing? When I try to
> > do a "count paragraphs", I get an error.
>
> Just to be clear, someone somewhere is going to have to read the entire
> file to find out how many paragraphs there are. The file system doesn't
> store that kind of information like it does the file length, so the only
> way to find it out is to inspect the file. Of course, you don't have to
> have the entire file in memory all at once.
>
> The following will work, and only holds a single line in memory at once:
>
> set line_count to 0
> set fp to open for access the_file
> repeat
>     try
>         read fp until return
>         set line_count to line_count + 1
>     on error
>         -- probably ran out of lines to read.
>         exit repeat
>     end try
> end repeat
> close access fp
>
> The parameter to "until" -- return in this case -- can be any
> single-byte character. (Allowing longer delimiters is already an
> enhancement request.)
>
> Alternatively, you could read the file in fixed-size chunks and then
> count the number of delimiter characters in each chunk. (Again, this
> only works if your delimiter is one byte -- otherwise, it could be split
> between chunks, and recognizing that case makes things much harder.)
>
> (same as above, but replace the try body with this:)
>
>         read fp for 4096
>         set line_count to line_count + count_characters(the result, return)
>
> You can set the chunk size to anything you want; in general, the higher
> you set it, the faster the script will run, but the more memory it will
> need. How you implement count_characters is up to you; here's one
> solution:
>
> to count_characters(s, c)
>     set {tid, AppleScript's text item delimiters} to {AppleScript's text item delimiters, c}
>     set n to (count (text items of s)) - 1
>     set AppleScript's text item delimiters to tid
>     return n
> end count_characters
>
> As Paul Berkowitz pointed out, if you know you're counting lines, it's
> faster to just say "count paragraphs of s".
>
> --Chris Nebel
> AppleScript Engineering
>
> P.S.: To be fair to AppleScript, every language that I've ever heard of
> requires you to do it like this: you have to open the file and read it
> yourself. It's not an unusual lack.