Re: extract URL from general text

  • Subject: Re: extract URL from general text
  • From: Skeeve <email@hidden>
  • Date: Wed, 19 Mar 2008 08:06:10 +0100

Hudson Barton wrote:
> I have no interest in determining whether a string segment is a valid URL. I just want something that is the approximate equivalent of what I see in email programs, [...]

Okay, but "my" e-mail program thinks that these strings, which you say are invalid, contain valid URLs:

> The script should also NOT convert strings that are clearly invalid URLs such as:
>
> http://glimfeather.com/borderless/>
> www.glimfeather.com/borderless
> http://glimfeather/borderless
> http://glimfeather.com/borderless/)

Nevertheless...

> I've muddled with AppleScript for years, and I just can't believe that nobody has ever perfected this with vanilla AppleScript or with BBEdit. Sure, Perl (which I don't know) is good for parsing text, but so is AppleScript in my experience.

AS is NOT good at text parsing. It's good at search and replace, but that's it.

Now tell me what's wrong with my AS-embedded Perl script. Shall I change it so that it replaces the text? No problem...

set text_to_analyze to "....your text here..."

URL_to_anchor(text_to_analyze)

-- Wraps every http:// URL in a_text in an HTML anchor and returns the result.
-- The pattern follows the RFC 1738 grammar for http URLs. A URL is matched
-- only when it has whitespace on both sides.
on URL_to_anchor(a_text)
return do shell script "perl -pe " & (quoted form of "s#(\\s)(http:\\/\\/(?:(?:(?:(?:(?:(?:(?:[a-z]|[A-Z])|[0-9])|(?:(?:(?:[a-z]|[A-Z])|[0-9])(?:(?:(?:[a-z]|[A-Z])|[0-9])|-)*(?:(?:[a-z]|[A-Z])|[0-9])))\\.)*(?:(?:[a-z]|[A-Z])|(?:[a-z]|[A-Z])(?:(?:(?:[a-z]|[A-Z])|[0-9])|-)*(?:(?:[a-z]|[A-Z])|[0-9])))|(?:(?:[0-9]+)\\.(?:[0-9]+)\\.(?:[0-9]+)\\.(?:[0-9]+)))(?::(?:(?:[0-9]+)))?)(?:\\/(?:(?:(?:(?:(?:(?:[a-z]|[A-Z])|[0-9]|[\\$\\-_.+]|[!*'(),])|(?:%(?:[0-9]|[A-Fa-f])(?:[0-9]|[A-Fa-f])))|;|:|@|&|=)*)(?:\\/(?:(?:(?:(?:(?:[a-z]|[A-Z])|[0-9]|[\\$\\-_.+]|[!*'(),])|(?:%(?:[0-9]|[A-Fa-f])(?:[0-9]|[A-Fa-f])))|;|:|@|&|=)*))*)(?:\\?(?:(?:(?:(?:(?:[a-z]|[A-Z])|[0-9]|[\\$\\-_.+]|[!*'(),])|(?:%(?:[0-9]|[A-Fa-f])(?:[0-9]|[A-Fa-f])))|;|:|@|&|=)*))?)?)(\\s)#$1<a href=\"$2\">$2</a>$3#g;") & " <<< " & (quoted form of a_text)
end URL_to_anchor
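
For comparison, here is a minimal sketch of the same substitution in Python rather than Perl. It is not the script above: the pattern is a simplified approximation of the RFC 1738 http grammar (no IPv4 hosts, loose percent-escapes), and the names URL_PATTERN and url_to_anchor are made up for illustration. The one deliberate difference is that it uses lookarounds instead of capturing the surrounding whitespace, so a URL at the very start of the text, or two URLs separated by a single space, should still be wrapped.

```python
import re

# Simplified stand-in for an RFC 1738-style http URL pattern (an assumption
# for illustration, not the regex from the AppleScript handler).
URL_PATTERN = re.compile(
    r"(?<!\S)"                                          # start of text or whitespace before
    r"(http://"
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)*"  # zero or more "label." parts
    r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"            # top label starts with a letter
    r"(?::[0-9]+)?"                                     # optional port
    r"(?:/[A-Za-z0-9$\-_.+!*'(),;:@&=%/]*"              # optional path
    r"(?:\?[A-Za-z0-9$\-_.+!*'(),;:@&=%]*)?)?"          # optional query string
    r")"
    r"(?!\S)"                                           # end of text or whitespace after
)


def url_to_anchor(text: str) -> str:
    """Wrap each whitespace-delimited http URL in an HTML anchor."""
    return URL_PATTERN.sub(r'<a href="\1">\1</a>', text)
```

With this sketch, "www.glimfeather.com/borderless" and "http://glimfeather.com/borderless/>" are left alone, while "http://glimfeather.com/borderless/)" is still wrapped, closing parenthesis included, because RFC 1738 allows ")" in a path.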



_______________________________________________
AppleScript-Users mailing list (email@hidden)
Archives: http://lists.apple.com/archives/applescript-users