Re: HTML parsing
- Subject: Re: HTML parsing
- From: Philip Aker <email@hidden>
- Date: Wed, 04 Sep 2002 04:43:01 -0700
There seem to be some Perl HTML parsing modules at:
http://theoryx5.uwinnipeg.ca/mod_perl/cpan-search?search=html+parser
Philip Aker
http://www.aker.ca
Roger Howard wrote:
I've begun to build fairly function-specific handlers for extracting
values from discrete HTML tag attributes, and I was wondering if anyone
has or knows of anything a bit more generic and tested. I have two main
tasks:
1) Extract data in between a given start tag and an intelligently
identified end tag. For instance, feed it the position of a <P> and it
will return all the data between the <P> and the next </P>.
2) Extract values from specified tags. For instance, feed it a tag such
as <meta name="FIELDNAME" content="Field data inserted here"> and return
the labels and values in the name and content fields as a hash array
like:
(("name","FIELDNAME"),("content","Field data inserted here"))
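The two tasks above can be sketched with a stock event-driven parser.
This is only an illustration in Python using the standard library's
html.parser module, not something from the original thread. The class
and function names (TagTextExtractor, AttributeExtractor, text_between,
tag_attributes) are hypothetical, and the sketch assumes the target
tags are explicitly closed and returns text only, dropping any inner
markup.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Task 1: collect the text between each <tag> and its closing </tag>."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag.lower()   # the parser reports tag names in lowercase
        self.depth = 0           # greater than 0 while inside the target tag
        self.current = []        # text pieces of the element being read
        self.results = []        # one string per completed element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.current))
                self.current = []

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


class AttributeExtractor(HTMLParser):
    """Task 2: collect (name, value) pairs for every occurrence of a tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag.lower()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            # attrs is already a list of (name, value) tuples
            self.results.append(list(attrs))


def text_between(html, tag):
    """Hypothetical helper: text inside each <tag>...</tag> pair."""
    parser = TagTextExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.results


def tag_attributes(html, tag):
    """Hypothetical helper: attribute pairs for each occurrence of <tag>."""
    parser = AttributeExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.results
```

With these helpers, tag_attributes(page, "meta") gives one list of
(name, value) pairs per <meta> tag, which is close to the hash array
shape asked for above.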
A bonus would be the top-down parsing of an entire HTML document into a
tree of tags, attributes, and values.
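A tree of tags, attributes, and values could be built along the same
lines. The following is a minimal sketch, again in Python with the
standard html.parser module. Node, TreeBuilder, parse_tree and the
"#document" root label are hypothetical names chosen for illustration.
It assumes reasonably well-formed markup: stray end tags are ignored,
and whitespace-only text is dropped.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """One element: tag name, attribute dict, and child nodes or strings."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]   # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; the root at
        # index 0 is never closed, and unmatched end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(data)


def parse_tree(html):
    """Hypothetical helper: parse a document and return the root Node."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

Walking root.children recursively then gives the top-down view of the
document; text appears as plain strings among the child nodes.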
Given AppleScript's ignorance of HTML/XML structures, is there a better,
more tested way of doing this? I'd hate to get into constant revisions
of my handlers to suit additional data sets, so I'm hoping maybe there's
instead either a tried-and-true Scripting Addition or a better way such
as a shell tool I can trigger from AppleScript.