Re: How best to extract digits from a string?
- Subject: Re: How best to extract digits from a string?
- From: "Arthur J Knapp" <email@hidden>
- Date: Sat, 28 Apr 2001 19:22:49 -0400

> Date: Sat, 28 Apr 2001 10:21:42 -0400
> Subject: Re: How best to extract digits from a string?
> From: Bill Cheeseman <email@hidden>
>
> Paul recognized that the typical telephone number input string contains
> fewer non-digits than digits, so in theory it should typically be faster to
> process the non-digits than to process the digits. (However, this is true
> only if you can handle error conditions and the need for more elaborate
> processing fast enough to avoid squandering the savings.)

Good points.

> Arthur recognized that sentinel characters can sometimes be useful.
> (However, this is only true if the overhead from handling the sentinel
> characters does not eat up the benefits.)

Yeah, a good rule of thumb for many tid-based (text item delimiter)
algorithms is to require the input string to be at least 30 to 40
characters long, though it obviously depends on the situation.
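
For anyone unfamiliar with the term, a tid-based approach removes characters by splitting the string on a delimiter and joining the pieces back together. The following is only a minimal sketch of that idea, not code from this thread: the handler name tid_StripChars and the kUnwanted list are made up for illustration, and it only removes the characters it is told about.

```applescript
-- Hypothetical list of characters to strip; an assumption, not from the thread.
property kUnwanted : {"(", ")", "-", " ", "."}

to tid_StripChars from input
    -- Remember the caller's delimiters so they can be restored afterwards.
    set oldTids to AppleScript's text item delimiters
    repeat with c in kUnwanted
        -- Split on one unwanted character...
        set AppleScript's text item delimiters to contents of c
        set parts to text items of input
        -- ...and join the pieces back together with nothing in between.
        set AppleScript's text item delimiters to ""
        set input to parts as string
    end repeat
    set AppleScript's text item delimiters to oldTids
    return input
end tid_StripChars
```

Every pass splits and rejoins the whole string, so the fixed cost per call is noticeable, which is presumably why this kind of approach tends to pay off only on longer inputs.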

> My new script is in the GetDigits1 handler in the timing script shown below.
> It processes the input string in place, so to speak. When it encounters a
> digit in the repeat loop, it does nothing, which is pretty fast. When it
> encounters a non-digit in the repeat loop, it modifies the input string by,
> in effect, deleting the non-digit on the fly (actually, it uses a temporary
> variable to concatenate the front end and the back end of the input string).
> Because the input string gets shorter, the script has to traverse the string
> backwards to avoid index errors. Placing a sentinel character at the
> beginning and the end of the input string before entering the repeat loop
> takes little time and avoids the need to trap for out-of-range errors in the
> repeat loop if the input string contains a non-digit in the first and/or
> last place.

Good stuff, Bill. :)
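
For readers who do not have the original handler at hand, the approach described above might look roughly like the following. This is a sketch reconstructed from the description only, not Bill's actual GetDigits1; the handler name sketch_GetDigitsInPlace and the choice of "x" as the sentinel are assumptions.

```applescript
to sketch_GetDigitsInPlace from input
    -- Sentinels at both ends (assumed: any non-digit will do), so the
    -- "text ... thru ..." ranges below never run off either end.
    set s to "x" & input & "x"
    -- Walk backwards: deleting a character only shifts positions to the
    -- right of i, which have already been visited.
    repeat with i from ((length of s) - 1) to 2 by -1
        if character i of s is not in "0123456789" then
            -- "Delete" the non-digit by joining the front and back ends.
            set s to (text 1 thru (i - 1) of s) & (text (i + 1) thru -1 of s)
        end if
    end repeat
    -- Only the two sentinels are left, so there were no digits at all.
    if length of s = 2 then return ""
    -- Drop the sentinels.
    return text 2 thru -2 of s
end sketch_GetDigitsInPlace
```

The explicit check for a two-character result is there because a range like text 2 thru -2 of a two-character string would have its end before its start.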

> to GetDigits1 from input
> to Getdigits2 from input

You've really had me thinking about this issue now, so I tried to
pull out my whole bag of old tricks, though none seemed to provide
any significant improvement:

to ajk_GetDigits01 from input
    set digits to "1234567890"
    set input to every character of input -- set up for class-delete
    set notstring to 0 -- remove literal from repeat loop
    repeat with char in input
        if char is not in digits then
            set contents of char to notstring -- any class other than string
        end if
    end repeat
    return "" & every string of input -- class-delete
end ajk_GetDigits01
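
One caveat with the class-delete step: joining a list into a string uses whatever AppleScript's text item delimiters happen to be at the time, so the handler above quietly assumes they are still empty. A more defensive join might look like the following sketch; the handler name sketch_JoinStrings is made up for illustration, and the extra delimiter handling will cost a little time.

```applescript
to sketch_JoinStrings from lst
    -- Save the caller's delimiters, then force an empty delimiter for the join.
    set oldTids to AppleScript's text item delimiters
    set AppleScript's text item delimiters to ""
    -- "every string of" keeps only the string items (the class-delete step);
    -- the coercion then joins them using the empty delimiter.
    set joined to (every string of lst) as string
    -- Put the caller's delimiters back before returning.
    set AppleScript's text item delimiters to oldTids
    return joined
end sketch_JoinStrings
```

It would be called on the list after the repeat loop has replaced the non-digits, for example with sketch_JoinStrings from input.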

property kDigits : "1234567890"
property kNotString : 0

to ajk_GetDigits02 from input
    set input to every character of input
    repeat with char in input
        if char is not in kDigits then set contents of char to kNotString
    end repeat
    return "" & every string of input
end ajk_GetDigits02

to ajk_GetDigits03 from input
    set input to every character of input
    repeat with x from 1 to length of input
        if item x of input is not in kDigits then
            set item x of input to kNotString
        end if
    end repeat
    return "" & every string of input
end ajk_GetDigits03

to ajk_GetDigits04 from input -- Recursive attempt
    if item 1 of input is in kDigits then
        if length of input = 1 then
            return input
        else
            return item 1 of input & ¬
                (ajk_GetDigits04 from (text 2 thru -1 of input))
        end if
    else if length of input = 1 then
        return ""
    else
        return (ajk_GetDigits04 from (text 2 thru -1 of input))
    end if
end ajk_GetDigits04

to ajk_GetDigits05 from input -- Result var, no "set" operations
    ""
    repeat with i in input
        if i is in kDigits then
            result & i
        else
            result
        end if
    end repeat
end ajk_GetDigits05

to ajk_GetDigits06 from input
    ""
    repeat with i in ("" & every word of input) -- deletes non-word chars
        if i is in kDigits then
            result & i
        else
            result
        end if
    end repeat
end ajk_GetDigits06

to ajk_GetDigits07 from input -- My solution to out-of-bounds errors:
    set {x, lst} to {1, {}}
    try
        repeat
            repeat while item x of input is in kDigits
                set {x, end of lst} to {x + 1, item x of input}
            end repeat
            set x to x + 1
        end repeat
    end try
    return "" & lst
end ajk_GetDigits07

Arthur J. Knapp
http://www.stellarvisions.com
mailto:email@hidden
Hey, check out:
http://www.eremita.demon.co.uk/scripting/applescript/