On 25 Nov 2013, at 11:06 AM, Christopher Stone <email@hidden> wrote:
> You see something wrong with my logic?
No, I see something wrong with my eyesight :-( I could have sworn _content started out as a string and you were just concatenating.
Anyway, if we can ever pry sed from your hands ;-), Mavericks gives you access to data detectors, so you could put something like this in your library instead:
on findURLsIn:theString
    set theDD to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
    set theFinds to theDD's matchesInString:theString options:0 |range|:{location:0, |length|:length of theString}
    set theFinds to theFinds as list -- so we can loop through
    set theResult to {} -- we will add to this
    repeat with i from 1 to count of items of theFinds
        set end of theResult to (item i of theFinds)'s |URL|()'s absoluteString() as string
    end repeat
    return theResult
end findURLsIn:
Or perhaps in simpler form, returning a single string:
on findURLsIn:theString
    set theDD to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
    set theURLs to theDD's matchesInString:theString options:0 |range|:{location:0, |length|:length of theString}
    return ((theURLs's valueForKeyPath:"URL.absoluteString")'s componentsJoinedByString:return) as text
end findURLsIn:
> Finally: I think do shell script may be a little bit faster in Mavericks. The above script is damn near as fast as my Satimage.osax version.
So here's my script, using do shell script and sed, and the latter of the above methods (which is simpler but probably a smidge slower than the first, but hey):
use theLib : script "<name of lib>"
use scripting additions
set theContent to "blah blah blah ho ho ho he he he "
log 1
set x to do shell script "<<< " & quoted form of theContent & ¬
    " tr '\\r' '\\n' | sed -En '/^https?:\\/\\/.+/p' | tr '\\n' '\\r'" without altering line endings
log 2
theLib's findURLsIn:theContent
log 3
And the result:
0000.001 (*1*)
0000.008 (*2*)
0000.010 (*3*)
Those values are in seconds. But don't let me dissuade you from using the slower methods ;-)
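For anyone who wants to poke at the shell half of that test outside AppleScript, here is a minimal sketch that should run in Terminal. It assumes the sed pattern is meant to keep lines beginning with http:// or https://. The sample text and the example.com URLs are made up for illustration, and printf with a pipe stands in for the here-string so it runs under a plain sh:

```shell
#!/bin/sh
# Hypothetical sample text (not from the original script): five "lines"
# separated by carriage returns, two of which are placeholder URLs.
content=$(printf 'blah blah blah\rhttp://example.com/one\rho ho ho\rhttps://example.com/two\rhe he he')

# Convert carriage returns to linefeeds so sed sees one line at a time,
# then print only the lines that start with http:// or https://.
urls=$(printf '%s' "$content" | tr '\r' '\n' | sed -En '/^https?:\/\/.+/p')

printf '%s\n' "$urls"
```

The final tr '\n' '\r' in the script above only converts the result back to carriage returns for AppleScript's benefit, so it is left out here.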
> Eventually I'd like to make the script robust enough to handle encoded email and html email, but I think I'm going to need modules in Perl and haven't figured out how to load them via MacPorts yet.
Why bother? You already have all the tools you need.