Re: Regex pattern to find URLs
- Subject: Re: Regex pattern to find URLs
- From: Kevin Ballard <email@hidden>
- Date: Sat, 6 Nov 2004 16:54:42 -0500
Uhh, that first match won't even work - your regex requires a ( at the
beginning of the string.
I do see what you mean: with an alternation you could pick up a URL
surrounded by ()'s, but then what about <> and []? And then what about
(http://www.foo.com/bar(blah).html)? A human can tell that the inner
parens belong to the URL, but I can't imagine how a regex could.
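To make the nested-paren case concrete, here is a minimal Python sketch that tracks paren depth with a counter instead of relying on a single regex. It is not from the thread: the helper name extract_url, the fixed "http://" prefix, and the rule that an unbalanced ")" or any whitespace ends the URL are all assumptions made for illustration.

```python
def extract_url(text, scheme="http://"):
    """Return the first URL in text, keeping balanced parens inside it.

    Hypothetical helper: the stopping rules are assumptions, not a standard.
    """
    start = text.find(scheme)
    if start == -1:
        return None
    depth = 0  # number of "(" opened inside the URL and not yet closed
    end = start
    while end < len(text):
        ch = text[end]
        if ch.isspace():
            break
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                break  # unbalanced ")": treat it as part of the surrounding text
            depth -= 1
        end += 1
    return text[start:end]


print(extract_url("(http://www.foo.com/bar(blah).html)"))
print(extract_url("(http://foo.com/baz)bar"))
```

The same counter could be extended to <> and [] by tracking each bracket pair, at the cost of more guesswork about what the author of the text intended.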
On Nov 6, 2004, at 12:22 PM, b.bum wrote:
A single regex can handle that situation, it is just a pain to write.
Using Python's regular expressions as an example (because named
subexpressions are a lot nicer than index-based subexpressions):
>>> import re
>>> r = re.compile(r'^\((?P<u1>http://[^)]*)|(?P<u2>http://.*)')
>>> r.match('(http://foo.com/baz)bar').group('u1')
'http://foo.com/baz'
>>> r.match('http://foo.com/baz/bar').group('u2')
'http://foo.com/baz/bar'
The '|' (the or operator) is the key. Ordering the expressions is
equally important, because you must put the most specific matching
expression first.
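As a rough illustration of the ordering point, the sketch below uses two simplified alternatives that can both match at the same position, so the order decides which one wins. The patterns and the sample string are assumptions chosen for the demonstration, not the exact expression above.

```python
import re

text = "http://foo.com/baz)bar"

# Specific alternative first: [^)]* stops at the closing paren.
specific_first = re.compile(r"(?P<u1>http://[^)]*)|(?P<u2>http://.*)")

# General alternative first: the greedy .* matches first and swallows the paren.
general_first = re.compile(r"(?P<u2>http://.*)|(?P<u1>http://[^)]*)")

m1 = specific_first.match(text)
m2 = general_first.match(text)

# lastgroup names the alternative that actually matched.
print(m1.lastgroup, m1.group(0))
print(m2.lastgroup, m2.group(0))
```

Python's re module takes the first alternative that succeeds at a given position rather than the longest one, which is why the specific pattern has to come first.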
--
Kevin Ballard
email@hidden
http://www.tildesoft.com
http://kevin.sb.org
Cocoa-dev mailing list (email@hidden)