Re: Parsing HTML
- Subject: Re: Parsing HTML
- From: Allen Watson <email@hidden>
- Date: Thu, 01 Nov 2001 16:27:46 -0800
On Wed, 31 Oct 2001 18:45:25 +0000 Steve Thompson <email@hidden> wrote:
> Date: Wed, 31 Oct 2001 18:45:25 +0000
> Subject: Parsing HTML
> From: Steve Thompson <email@hidden>
>
> <a href="xyz/xyz/xyz">PSQueue01</a>
>
> There could be anything between zero and 37 menu items. What is the
> easiest way to parse out the "PSQueue01" text?
If you can rely on no other use of the "<" and ">" characters (for example,
some people, like <me>, use them for emphasis in E-mail), then this would
work:
set t to "<a href=\"xyz/xyz/xyz\">PSQueue01</a>" & return & "<a href=\"xyz/xyz/xyz\">PSQueue02</a>"
set savdelim to AppleScript's text item delimiters
set AppleScript's text item delimiters to "<"
set temp to text items of t
set AppleScript's text item delimiters to ">"
set goodstuf to ""
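repeat with anItem in temp
    set temp2 to text items of anItem
    -- item 2 is whatever followed the first ">" in this chunk
    if ((count temp2) > 1) and item 2 of temp2 is not "" then set goodstuf to goodstuf & item 2 of temp2
end repeat
set AppleScript's text item delimiters to savdelim -- restore the saved delimiters
get goodstuf
-- goodstuf = "PSQueue01" & return & "PSQueue02"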
That gives you the items in a string separated by returns (assuming there
were returns separating the items in the original).
If there are no returns in the original, or they are optional, you may want
to strip out any that occur, and put the items into a list instead:
set t to "<a href=\"xyz/xyz/xyz\">PSQueue01</a>" & return & "<a href=\"xyz/xyz/xyz\">PSQueue02</a>"
set savdelim to AppleScript's text item delimiters
set AppleScript's text item delimiters to "<"
set temp to text items of t
set AppleScript's text item delimiters to ">"
set goodstuf to {}
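repeat with anItem in temp
    set temp2 to text items of anItem
    set x to ""
    if ((count temp2) > 1) then set x to item 2 of temp2
    -- skip empty pieces and the bare returns between tags
    if x is not "" and x is not return then set goodstuf to goodstuf & x
end repeat
set AppleScript's text item delimiters to savdelim -- restore the saved delimiters
get goodstuf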
This results in {"PSQueue01", "PSQueue02"}.
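If you need this in more than one place, the list version could be wrapped in a handler along these lines. This is only a sketch: the handler name linkTexts and its parameter html are made up for illustration, and the try block is my own addition so that the text item delimiters are put back even if something goes wrong partway through. It rests on the same assumption as above, namely that "<" and ">" appear only as tag brackets.

```applescript
-- Hypothetical handler: returns a list of the text pieces found between ">" and "<".
on linkTexts(html)
    set savdelim to AppleScript's text item delimiters
    set found to {}
    try
        set AppleScript's text item delimiters to "<"
        set chunks to text items of html
        set AppleScript's text item delimiters to ">"
        repeat with aChunk in chunks
            set parts to text items of (contents of aChunk)
            if (count parts) > 1 then
                set x to item 2 of parts
                -- skip empty pieces and the bare returns between tags
                if x is not "" and x is not return then set end of found to x
            end if
        end repeat
    on error errMsg number errNum
        -- put the delimiters back, then pass the error along
        set AppleScript's text item delimiters to savdelim
        error errMsg number errNum
    end try
    set AppleScript's text item delimiters to savdelim
    return found
end linkTexts

linkTexts("<a href=\"xyz/xyz/xyz\">PSQueue01</a>")
-- {"PSQueue01"}
```

Since there can be anywhere from zero to 37 items, note that text with no tags in it simply gives back an empty list, so the caller can check the count of the result before using it.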