Re: NSURL and NSRegularExpression
- Subject: Re: NSURL and NSRegularExpression
- From: Conrad Shultz <email@hidden>
- Date: Sat, 03 Mar 2012 18:09:32 -0800
On 3/3/12 5:06 PM, R wrote:
> Thanks Conrad, much cleaner indeed. But, this does not solve my
> invalid internet URL issues.... still accepts pretty much any URL.
I don't understand what you mean by "invalid." I would expect that
NSDataDetector has been pretty thoroughly tested and will not match
syntactically invalid results. If you find otherwise, you should file a
bug report.
I suspect that the disconnect here is that *your* definition of invalid
is closer to "looks funny" than to "violates RFC 1738." I certainly
hope that Apple is going for the latter.
Let's consider the example from your original post:
http://www.cnn.thePiratesWillWinTheWorldSeries
I see absolutely nothing wrong with that URL. Sure, it doesn't end in a
recognized TLD. But that has nothing to do with being valid. If you
don't believe me (and don't want to mess around with your DNS server)
add the following line to /etc/hosts:
157.166.255.18 www.cnn.thePiratesWillWinTheWorldSeries
Safari will now bring up CNN if you punch in your URL.
Here are a few other more common examples of URLs that would break if
you try to invent your own validation scheme in contravention of standards:
http://localhost
http://127.0.0.1
http://[2001:470:1f0e:a20::2]
(You *are* supporting IPv6, right?)
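To make the point concrete, here is a minimal Python sketch of a purely
syntactic check: it only asks whether the string parses into a scheme and
a host, and never guesses at TLDs. This is an illustration, not how
NSDataDetector works internally, and the function name
is_syntactically_plausible is my own invention.

```python
from urllib.parse import urlsplit


def is_syntactically_plausible(url: str) -> bool:
    """Return True if the URL has a scheme and a host; no TLD guessing."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises ValueError for malformed bracketed IPv6 hosts.
        return False
    return bool(parts.scheme) and bool(parts.hostname)


# The URLs discussed above; each has a scheme and a host.
for candidate in (
    "http://www.cnn.thePiratesWillWinTheWorldSeries",
    "http://localhost",
    "http://127.0.0.1",
    "http://[2001:470:1f0e:a20::2]",
):
    print(candidate, is_syntactically_plausible(candidate))
```

A check that insists on a dotted name ending in a known TLD would reject
most of these, which is exactly the problem.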
Now, this situation is going to be made even more complicated by the
proliferation of new gTLDs (see, e.g.,
http://en.wikipedia.org/wiki/Generic_top-level_domain). You can bet we
are soon going to see www.iphone.apple as a usable, if perhaps not
advertised, URL!
Oh, and http:// is far from the only acceptable scheme. (https://?
ftp://?)
Hopefully I have convinced you that this is a losing proposition. There
is a general rule when developing network software, or perhaps all
software: be liberal in what you accept, conservative in what you emit.
> I'm wondering how Twitter IOS, and TWTweetComposeViewController
> handles URL counting.
I don't know, but I would be upset if a valid URL wasn't recognized
properly.
As for determining whether a URL points at a real host, you could do a
hostname lookup (if the host isn't an IP address itself).
But then you have several issues:
1) A minor point, but (as with all operations that potentially touch the
network) you need to make sure to do this on a background thread. Host
lookups can take an indeterminate amount of time to return. You would
probably want to have a timeout and then fall back to a different mechanism.
2) If you are in, say, a split-horizon DNS environment, your hostname
might resolve in one part of a network and not in another. This would
give inconsistent results as a user moves around.
3) If a domain's authoritative DNS server is down (rare but far from
unheard of), you might temporarily get a failure condition. This would
*really* confuse and irritate the user ("sometimes the URL shortening
works, sometimes it doesn't, for the same site!").
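For what it's worth, here is a rough Python sketch of what point 1 implies:
run the lookup on a worker thread, give up after a timeout, and report
"unknown" rather than "invalid" when that happens. The names host_resolves
and _resolve, the 2-second default, and the injectable resolver parameter
are all assumptions of mine for illustration. Note that it does nothing
about points 2 and 3; the answer still depends on where and when you ask.

```python
import ipaddress
import socket
from concurrent.futures import ThreadPoolExecutor, TimeoutError as LookupTimeout


def _resolve(host: str) -> bool:
    """Blocking lookup; True if the name resolves to at least one address."""
    try:
        return bool(socket.getaddrinfo(host, None))
    except socket.gaierror:
        return False


def host_resolves(host: str, timeout: float = 2.0, resolver=_resolve):
    """Return True/False from the lookup, or None if it timed out.

    None means "unknown": the caller should fall back to some other
    mechanism rather than treat the URL as invalid.
    """
    try:
        ipaddress.ip_address(host)
        return True  # IP literals need no lookup
    except ValueError:
        pass
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(resolver, host)
    try:
        return future.result(timeout=timeout)
    except LookupTimeout:
        return None
    finally:
        # Do not block waiting for a slow lookup to finish.
        pool.shutdown(wait=False)
```

The three-valued result is deliberate: a timeout tells you nothing about
the URL, so it should not be reported to the user as a failure.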
As you can see, this is far from a straightforward problem. I'd
jettison this notion entirely and just accept anything that the
pertinent RFC(s) tell you to.
--
Conrad Shultz
Synthetiq Solutions
www.synthetiqsolutions.com
_______________________________________________
Cocoa-dev mailing list (email@hidden)