Re: Long text file manipulation--BIGGER QUESTION
- Subject: Re: Long text file manipulation--BIGGER QUESTION
- From: Chris Nebel <email@hidden>
- Date: Thu, 05 Apr 2001 14:57:13 -0700
- Organization: Apple Computer, Inc.
bRaD Weston wrote:
> Perhaps the question I should be asking then is a bit more complex. ...
It sounds like the question boils down to "how do I get all the lines of a file in random order without repeating any?" Depending on what the lines look like, this can simplify the problem tremendously. From your initial question, it sounds like you're currently doing it by moving each selected PIN in turn to the end of the file. You can then get an unused one by getting a random line from 1 to n, where n decreases by one each time.
If all the lines are the same length, you can get the same effect by swapping the data at the selected position with the last unused PIN. If the PINs were in a list, it would look something like this:
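repeat with i from (length of pin_list) to 1 by -1
    -- items 1 through i are still unused; pick one of them at random
    set current_item to (random number from 1 to i)
    do_stuff(item current_item of pin_list)
    -- exchange item current_item with item i, so the used PIN ends up past the unused range
    set temp to item i of pin_list
    set item i of pin_list to item current_item of pin_list
    set item current_item of pin_list to temp
end repeat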
Because you're actually working with a file, you would do the swap by reading from and writing to specific offsets in the file. Since the lines are all the same length, you can calculate the offset of line n as (n-1)*length_of_line + 1, where length_of_line counts the line-break character too, instead of scanning for line breaks. Bear in mind that if you expect to consume the entire PIN file in one run, this sort of thing is *not* faster than sucking in the entire file, because each swap reads two lines, so you'll actually have to read the entire file twice (in semi-random order, to boot!) for the whole process to finish. However, if you just want ten new numbers every day, then it's a win.
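Here is a rough sketch of what the file-based version might look like. It is not tested against your file, and it rests on some assumptions: every line is exactly line_length bytes including a one-character line break, the text is plain single-byte characters, and the file has already been opened with "open for access ... with write permission". The handler names (line_offset, swap_lines, take_pin) are made up for illustration. As a check on the arithmetic: with 5-byte lines (a 4-digit PIN plus a return), line 3 starts at (3-1)*5 + 1 = 11.

```applescript
-- Hypothetical helper: 1-based byte offset of line n in a file of fixed-length lines.
-- line_length must count the line-break character too.
on line_offset(n, line_length)
	return (n - 1) * line_length + 1
end line_offset

-- Hypothetical helper: exchange lines a and b in an already-open file.
-- file_ref is the result of "open for access ... with write permission".
on swap_lines(file_ref, a, b, line_length)
	if a = b then return
	set offset_a to line_offset(a, line_length)
	set offset_b to line_offset(b, line_length)
	set line_a to read file_ref from offset_a for line_length
	set line_b to read file_ref from offset_b for line_length
	write line_b to file_ref starting at offset_a
	write line_a to file_ref starting at offset_b
end swap_lines

-- Hypothetical helper: take one unused PIN, given that lines 1 through n are still unused.
-- Assumes the line break is a single character, so the PIN itself is line_length - 1 bytes.
-- The caller is expected to use n - 1 as the unused count next time.
on take_pin(file_ref, n, line_length)
	set k to random number from 1 to n
	set pin to read file_ref from line_offset(k, line_length) for (line_length - 1)
	swap_lines(file_ref, k, n, line_length)
	return pin
end take_pin
```

Each call to take_pin touches only two lines of the file, which is why this pays off when you want a handful of PINs per run rather than all of them.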
If the lines are all different lengths, then it's harder. If you don't mind destroying the PIN file (you could make a copy first), then you could get no repeats by erasing lines as you use them, i.e. replace all the characters with spaces, or something else that can't be in a PIN. That makes it much harder to select a new line, though, since you have to start at a random point and then scan until you find an unerased line, wrapping around to the beginning if necessary. Depending on how empty the file is, that could be very expensive. A better strategy might be to pad each line of the file out to a known length, and then use the fixed-length technique.
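The padding step could be sketched along these lines. This is only an illustration: pad_line and pad_all are hypothetical handler names, the lines are assumed to be in a list already (for instance from a one-time read of the file), and a space is assumed never to occur inside a PIN, so trailing spaces can be stripped off again when a PIN is used.

```applescript
-- Hypothetical helper: pad a line with trailing spaces up to target_length characters.
-- Lines already at or beyond target_length are returned unchanged.
on pad_line(the_line, target_length)
	set padded to the_line
	repeat (target_length - (length of the_line)) times
		set padded to padded & space
	end repeat
	return padded
end pad_line

-- Hypothetical helper: pad every line in a list out to the length of the longest one.
on pad_all(line_list)
	set target_length to 0
	repeat with a_line in line_list
		set this_length to length of (contents of a_line)
		if this_length > target_length then set target_length to this_length
	end repeat
	set padded_list to {}
	repeat with a_line in line_list
		set end of padded_list to pad_line(contents of a_line, target_length)
	end repeat
	return padded_list
end pad_all
```

For example, pad_all({"12", "12345", "1234"}) should give three 5-character strings, with "12" followed by three spaces and "1234" by one. Writing the padded lines back out, each followed by a line break, would give a file the fixed-length technique can work on.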
--Chris Nebel
AppleScript Engineering