Re: URL parsing [was Re: "+" and "-" are numbers.]

  • Subject: Re: URL parsing [was Re: "+" and "-" are numbers.]
  • From: Nigel Garvey <email@hidden>
  • Date: Tue, 6 Aug 2002 23:17:52 +0100

has wrote on Tue, 6 Aug 2002 14:47:59 +0100:

>Nigel Garvey wrote:
>
>>>> What can you do in the area of URL parsing? ;-)
>>>
>>>You mean extracting URLs from a larger string? Well, it ain't easy.
>>
>>Here's something that needs to be developed (and optimised) by someone
>>with more knowledge of URL protocols than myself. It only *extracts*
>>candidate URLs. It doesn't test their validity or try to standardise
>>their cases. One or two of the lines are quite long, but the line wraps
>>should be obvious:
>>
>> on extractURLs from str
>[...]
>> end extractURLs
>>
>> set str to "This is a string containing the URL:
>> <www.fred.com/>.
>> It's nice, isn't it? Also:
>> mailto:email@hidden."
>>
>> extractURLs from str
>> --> {"http://www.fred.com/", "mailto:email@hidden"}
>
>Alas, this makes various assumptions that cannot be relied on in practice:
>the presence of a "www." substring; case; that addresses will be shown
>neatly delimited by "<>".

Well, I said it needed to be developed. ;-) You're right about the case
assumption in the tests for "<www." and " www.". That needs to be sorted
out. Otherwise, case and "<>" enclosures are totally irrelevant in my
script. However, there's a weakness when a URL is followed by a carriage
return or by some punctuation other than a full stop:

set str to "This is a string containing the URL:
HTTP://WWW.FRED.COM/
It's nice, isn't it? Also:
?mailto:email@hidden!"

extractURLs from str
--> {"HTTP://WWW.FRED.COM/
", "mailto:email@hidden!"}
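
The trailing-character problem shown above could be handled by trimming each candidate after extraction. What follows is only a rough sketch in Python, not the extractURLs handler itself (its body isn't reproduced in this message). The names trim_candidate and TRAILING_JUNK are hypothetical, and the set of characters treated as "not part of the URL" is an assumption that would need checking against real-world text.

```python
# Hypothetical helper: a post-processing step, not the original handler.
# Assumption: none of these characters legitimately ends a URL in running text.
TRAILING_JUNK = " \t\r\n.,;:!?>)\"'"


def trim_candidate(candidate: str) -> str:
    """Strip trailing whitespace and punctuation from an extracted candidate URL."""
    return candidate.rstrip(TRAILING_JUNK)


# The two results from the failing example above (carriage return, then "!").
results = ["HTTP://WWW.FRED.COM/\r", "mailto:email@hidden!"]
print([trim_candidate(r) for r in results])
# → ['HTTP://WWW.FRED.COM/', 'mailto:email@hidden']
```

The final "/" survives because it isn't in the junk set. A URL that genuinely ends in "?" or ")" would be cut short by this, so the list probably wants tuning.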

That looks easy to rectify. But I'm just entering the production period
for a new show, so I won't have time to think about it seriously for a
few days. I may get back to it if no-one else has run with it by then.

NG