HTML parsing

  • Subject: HTML parsing
  • From: Roger Howard <email@hidden>
  • Date: Mon, 02 Sep 2002 13:24:41 -0700

I've begun to build fairly function-specific handlers for extracting values
from discrete HTML tag attributes, and I was wondering if anyone has or knows
of anything a bit more generic and tested. I have two main tasks:

1) Extract data in between a given start tag and an intelligently identified
end tag. For instance, feed it the position of a <P> and it will return all
the data between the <P> and the next </P>.
2) Extract values from specified tags. For instance, feed it a tag such as
<meta name="FIELDNAME" content="Field data inserted here"> and return the
labels and values in the name and content fields as a hash array like:
(("name","FIELDNAME"),("content","Field data inserted here"))
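
For anyone with the same two tasks, here is a minimal sketch of both, built on the HTMLParser class from Python's standard library. Python is an assumption here (the post names no particular tool), and the names text_between and tag_attributes are hypothetical, chosen only for illustration. The sketch returns the text content of each matching element rather than its raw inner markup, and it does not infer implied end tags, so an unclosed <P> will not be closed automatically.

```python
from html.parser import HTMLParser


class ElementTextCollector(HTMLParser):
    """Collects the text found inside each outermost occurrence of one tag."""

    def __init__(self, tag_name):
        super().__init__(convert_charrefs=True)
        self.tag_name = tag_name.lower()  # HTMLParser reports tags in lowercase
        self.depth = 0        # how many matching tags are currently open
        self.current = []     # text pieces of the element being read
        self.results = []     # one string per completed element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag_name:
            if self.depth == 0:
                self.current = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag_name and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.current))

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


class TagAttributeCollector(HTMLParser):
    """Collects the attribute pairs of every occurrence of one tag."""

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name.lower()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples in source order
        if tag == self.tag_name:
            self.found.append(attrs)


def text_between(html, tag_name):
    """Task 1: text between each <tag> and its matching </tag>."""
    collector = ElementTextCollector(tag_name)
    collector.feed(html)
    collector.close()
    return collector.results


def tag_attributes(html, tag_name):
    """Task 2: attribute name/value pairs for each <tag ...>."""
    collector = TagAttributeCollector(tag_name)
    collector.feed(html)
    collector.close()
    return collector.found


print(tag_attributes('<meta name="FIELDNAME" content="Field data inserted here">', "meta"))
# prints [[('name', 'FIELDNAME'), ('content', 'Field data inserted here')]]
```

To start from a known position, as described in task 1, one option is to pass only the slice of the document beginning at that position and take the first result.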

A bonus would be the top-down parsing of an entire HTML document into a tree
of tags, attributes, and values.
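
The top-down tree could be sketched along the same lines. This is again a Python illustration under assumptions: the Node class, the parse_tree function, and the "#document" root label are hypothetical names, and the handling of void tags such as <meta> and <br> is a simplification rather than a full HTML parsing algorithm.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


@dataclass
class Node:
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Node objects or text strings


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects while the parser walks the document."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: add the node but never open it.
        self.stack[-1].children.append(Node(tag, dict(attrs)))

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text between tags
            self.stack[-1].children.append(data)


def parse_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

Each Node carries its tag name, an attribute dictionary, and a list of children in document order, so the tree can be walked recursively from the returned root.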

Given AppleScript's lack of built-in support for HTML/XML structures, is
there a better, more tested way of doing this? I'd hate to get into constant
revisions of my handlers to suit additional data sets, so I'm hoping there's
either a tried-and-true Scripting Addition or a better approach, such as a
shell tool I can trigger from AppleScript.

Any suggestions?

Best,

Roger Howard
_______________________________________________
applescript-users mailing list | email@hidden
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/applescript-users
Do not post admin requests to the list. They will be ignored.
