Re: [OT] URL terminators
- Subject: Re: [OT] URL terminators
- From: "Arthur J. Knapp" <email@hidden>
- Date: Wed, 10 Jul 2002 16:22:58 -0400
> Date: Wed, 10 Jul 2002 07:47:19 -0700
> Subject: Re: [OT] URL terminators
> From: Paul Berkowitz <email@hidden>
>
> > If you are making use of regular expressions, there are lots of perl
> > examples out there of how to extract a legitimate URL out of any kind
> > of text, but I don't seem able to find any links at the moment.
> >
> > I'll try to post a URL extractor of my own sometime later today...
>
> I'll look forward to that, however long "today" may take to arrive. Thanks,
> Arthur.
Just because we can't all answer postings 2 seconds after they've been
sent... ;-)
The following grep expression works well for me in BBEdit's search
dialog (version 6 or later is required):
((http://|mailto:|ftp:|news:)|www\.)[^])>}"\x01-\x20\x7F-\xFF]+
Escaped in an AppleScript string:
"((http://|mailto:|ftp:|news:)|www\\.)[^])>}\"\\x01-\\x20\\x7F-\\xFF]+"
In any case, having said that I would, here is something I threw together:
property ksControlChars : run script "
    set s to \"\"
    repeat with i from 0 to 31
        set s to s & ascii character i
    end repeat
    return s"
property ksHighBitChars : run script "
    set s to \"\"
    repeat with i from 128 to 255
        set s to s & ascii character i
    end repeat
    return s"
on FindURL(str)
    -- Returns a record describing the first "http://" URL in str.
    set r to {searchString:str, isMatch:false}
    set schem to "http://"
    set x to FindOffset(str, schem)
    if (x = 0) then return r
    set r's isMatch to true
    set r to r & {beginOffset:x}
    set y to x + (schem's length) - 1
    set z to str's length
    -- Characters that terminate a URL.
    set endURL to ksControlChars & ksHighBitChars & space & tab & return & "\"()[]{}<>"
    considering case, diacriticals and expansion
        considering hyphens, punctuation and white space
            -- Advance y to the first terminator, or just past the end of str.
            repeat until (y > z) or (str's item y is in endURL)
                set y to y + 1
            end repeat
            set y to y - 1
        end considering
    end considering
    return r & {endOffset:y, matchString:str's text x thru y}
end FindURL
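on FindAllURLs(str)
    set a to {}
    set aMatch to FindURL(str)
    repeat while aMatch's isMatch = true
        set a's end to aMatch's matchString
        -- Stop when the match runs to the end of str; otherwise
        -- "text (n + 1) thru -1" would raise an error.
        if (aMatch's endOffset) >= (str's length) then exit repeat
        set str to str's text ((aMatch's endOffset) + 1) thru -1
        set aMatch to FindURL(str)
    end repeat
    return a
end FindAllURLs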
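on FindOffset(str, sub)
    -- Returns the offset of the first occurrence of sub in str, or 0 if none.
    if (str contains sub) and (sub's length is not 0) then
        set l to 1
        set r to str's length
        set delta to (sub's length) - 1
        set omega to (sub's length) * 2
        -- Binary search: narrow l thru r to a small range that still contains sub.
        repeat while (r - l > omega)
            set m to (l + r) div 2
            if (str's text l thru m contains sub) then
                set r to m
            else
                -- Back up by delta so a match straddling m is not missed.
                set l to m - delta
            end if
        end repeat
        -- Linear scan of the remaining range.
        repeat until str's text l thru (l + delta) = sub
            set l to l + 1
        end repeat
        return l
    else
        return 0
    end if
end FindOffset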
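For anyone who wants to try the handlers, here is a minimal usage sketch. The sample text and the example.com / example.org addresses are placeholders of mine, and the result shown assumes the handlers above are in the same script and that schem is the single-line string "http://".

```applescript
-- Hypothetical sample text; the addresses are placeholders.
set sampleText to "See <http://www.example.com/a.html> and (http://www.example.org/b)."

-- Collect every http:// URL found in the text.
FindAllURLs(sampleText)
-- Expected result, given the terminator set used in FindURL:
-- {"http://www.example.com/a.html", "http://www.example.org/b"}
```

Note that a period or comma is not in the terminator set, so a URL that ends a sentence would keep its trailing punctuation; the brackets in the sample text sidestep that case.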
{ Arthur J. Knapp, of <http://www.STELLARViSIONs.com>
a r t h u r @ s t e l l a r v i s i o n s . c o m
}
_______________________________________________
applescript-users mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/applescript-users
Do not post admin requests to the list. They will be ignored.