Re: URL parsing [was Re: "+" and "-" are numbers.]
- Subject: Re: URL parsing [was Re: "+" and "-" are numbers.]
- From: Nigel Garvey <email@hidden>
- Date: Tue, 6 Aug 2002 23:17:52 +0100
has wrote on Tue, 6 Aug 2002 14:47:59 +0100:
>Nigel Garvey wrote:
>
>>>> What can you do in the area of URL parsing? ;-)
>>>
>>>You mean extracting URLs from a larger string? Well, it ain't easy.
>>
>>Here's something that needs to be developed (and optimised) by someone
>>with more knowledge of URL protocols than myself. It only *extracts*
>>candidate URL's. It doesn't test their validity or try to standardise
>>their cases. One or two of the lines are quite long, but the line wraps
>>should be obvious:
>>
>> on extractURLs from str
>[...]
>> end extractURLs
>>
>> set str to "This is a string containing the URL:
>> <www.fred.com/>.
>> It's nice, isn't it? Also:
>> mailto:email@hidden."
>>
>> extractURLs from str
>> --> {"http://www.fred.com/", "mailto:email@hidden"}
>
>Alas, this makes various assumptions that cannot be relied on in practice:
>the presence of a "www." substring; case; that addresses will be shown
>neatly delimited by "<>".

Well, I said it needed to be developed. ;-) You're right about the case
assumption in the tests for "<www." and " www.". That needs to be sorted
out. Otherwise, case and "<>" enclosures are totally irrelevant in my
script. However, there's a weakness when a URL is preceded or followed by a
carriage return, or followed by some punctuation other than a full stop:

set str to "This is a string containing the URL:
HTTP://WWW.FRED.COM/
It's nice, isn't it? Also:
?
mailto:email@hidden!"
extractURLs from str
--> {"
HTTP://WWW.FRED.COM/
", "
mailto:email@hidden!"}

That looks easy to rectify. But I'm just entering the production period
for a new show, so I won't have time to think about it seriously for a
few days. I may get back to it if no-one else has run with it by then.
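
In the meantime, here is a minimal sketch of one way the stray returns and trailing punctuation might be tidied up after extraction. It is not part of the extractURLs handler (whose body isn't shown above): the handler name trimCandidate and the two lists of characters to strip are assumptions made purely for illustration, and the lists would need adjusting to suit whatever the real text contains.

```applescript
-- Hypothetical helper (an assumption, not part of the original extractURLs handler):
-- strips stray returns, spaces and punctuation from both ends of a candidate URL.
on trimCandidate(candidate)
	-- Assumed sets of characters that should never begin or end a URL.
	set leadingJunk to {return, (ASCII character 10), space, tab, "<", "(", "\""}
	set trailingJunk to {return, (ASCII character 10), space, tab, ">", ")", "\"", ".", ",", ";", ":", "!", "?"}
	
	-- Drop unwanted characters from the front, one at a time.
	repeat while (candidate is not "") and (leadingJunk contains (character 1 of candidate))
		if (length of candidate) is 1 then
			set candidate to ""
		else
			set candidate to text 2 thru -1 of candidate
		end if
	end repeat
	
	-- Drop unwanted characters from the end, one at a time.
	repeat while (candidate is not "") and (trailingJunk contains (character -1 of candidate))
		if (length of candidate) is 1 then
			set candidate to ""
		else
			set candidate to text 1 thru -2 of candidate
		end if
	end repeat
	
	return candidate
end trimCandidate

trimCandidate(return & "HTTP://WWW.FRED.COM/" & return)
--> "HTTP://WWW.FRED.COM/"
trimCandidate(return & "mailto:email@hidden!")
--> "mailto:email@hidden"
```

Each item returned by extractURLs could be passed through a handler like this before being added to the result list. Trimming only at the ends leaves punctuation inside the URL untouched, though it would also remove a trailing "?" or "." that genuinely belonged to the address.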
NG
_______________________________________________
applescript-users mailing list | email@hidden