Re: Regex pattern to find URLs
- Subject: Re: Regex pattern to find URLs
- From: Kevin Ballard <email@hidden>
- Date: Fri, 5 Nov 2004 22:23:39 -0500
By URL, do you mean one inside an <a href> attribute, or one that appears
raw in body text too? The former isn't too hard, but the latter is
impossible in general. After all, how would a regex tell the difference
between the following:
(http://www.foo.com/bar.html)blah
http://www.foo.com/bar(blah).html
In the first line the URL is obviously enclosed in the parens, but in the
second the parens are part of the URL (this is indeed legal and
occasionally used, although browsers will probably percent-escape them
when clicked). No regex can possibly detect the difference in paren
handling between these two cases, and that doesn't even bring into play
other possible URL delimiters.
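To make the two cases concrete, here is a minimal sketch in Python (not anything from the original thread; the pattern names and helper functions are made up for illustration). It assumes quoted href values and only http/https URLs. The href pattern can rely on the quotes as delimiters, while the two raw-text patterns each get one of the two lines above wrong.

```python
import re

# Former case: the URL is the value of a quoted href attribute, so the
# quotes tell us exactly where it ends. (Assumes quoted attribute values.)
HREF_RE = re.compile(r'<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1',
                     re.IGNORECASE | re.DOTALL)

# Latter case, attempt 1: take every non-whitespace character.
RAW_GREEDY_RE = re.compile(r'https?://\S+')

# Latter case, attempt 2: stop at whitespace or any paren.
RAW_NO_PARENS_RE = re.compile(r'https?://[^\s()]+')


def href_urls(html):
    """Return the href values of <a> tags found in html."""
    return [m.group(2) for m in HREF_RE.finditer(html)]


def raw_urls(pattern, text):
    """Return every match of a raw-URL pattern in text."""
    return pattern.findall(text)


enclosed = '(http://www.foo.com/bar.html)blah'
embedded = 'http://www.foo.com/bar(blah).html'

# Quoted attribute: parens inside the URL are no problem.
print(href_urls('<a href="http://www.foo.com/bar(blah).html">x</a>'))

# Attempt 1 gets the second line right but swallows ")blah" on the first.
print(raw_urls(RAW_GREEDY_RE, enclosed))
print(raw_urls(RAW_GREEDY_RE, embedded))

# Attempt 2 gets the first line right but cuts the second off at "(".
print(raw_urls(RAW_NO_PARENS_RE, enclosed))
print(raw_urls(RAW_NO_PARENS_RE, embedded))
```

Whichever way the raw pattern treats parens, one of the two inputs comes out wrong, and the pattern has no way to know which reading the author intended.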
On Nov 5, 2004, at 10:08 PM, Mike O'Connor wrote:
> I'm in need of a robust regex pattern to locate URLs. The input is a
> typical and random HTML Internet Web page. The task is to accurately
> identify any URL in the HTML. Anyone happen to have one handy?
--
Kevin Ballard
email@hidden
http://www.tildesoft.com
http://kevin.sb.org