Re: Parsing HTML


  • Subject: Re: Parsing HTML
  • From: Allen Watson <email@hidden>
  • Date: Fri, 02 Nov 2001 08:58:41 -0800

On Fri, 2 Nov 2001 12:06:15 +0000 Steve Thompson <email@hidden> wrote:

> Under 9.1, I had an OSAX that would parse the HTML out of reams of HTML
> for me, just returning the bits I requested. This OSAX doesn't work
> under OS X and I just wanted to know if anyone had come across one that
> does.

For some reason, the migration of osaxen to OS X has been extremely slow.
Check out www.osaxen.com and you'll see there are only a handful so far.

The shareware app, TextSoap, is available in beta for OS X, and it comes
with an osax that *does* include a module to *strip* HTML code. You can find
that on Versiontracker.com.

The specific tool in the package has this in the documentation:

> HTML Text
> This cleaner will clean up HTML text. It strips out anything between "<" and
> ">". This can be useful if you have the HTML source, but just want the
> contents (without starting up your browser). It also handles Ampersand escape
> codes (&nbsp; or &#140;). It will remove tab characters and remove multiple
> carriage returns.
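
For anyone who wants to see roughly what a cleaner like that does, here is a minimal sketch in Python. It is not TextSoap's code and makes no claim about how TextSoap works internally; the function name `clean_html_text` is made up for illustration, and I am assuming that "remove multiple carriage returns" means collapsing a run of line breaks into a single one.

```python
import html
import re

# Anything between "<" and ">" is treated as a tag (naive by design).
TAG_RE = re.compile(r"<[^>]*>")
# Two or more consecutive line breaks.
BLANK_RUN_RE = re.compile(r"\n{2,}")


def clean_html_text(source: str) -> str:
    """Hypothetical helper mirroring the documented steps of the cleaner."""
    # Normalize classic Mac (CR) and Windows (CRLF) line endings to LF.
    text = source.replace("\r\n", "\n").replace("\r", "\n")
    # Strip out anything between "<" and ">".
    text = TAG_RE.sub("", text)
    # Decode ampersand escape codes such as &nbsp; or &#140;.
    # Done after tag stripping so an escaped "&lt;b&gt;" survives as text.
    text = html.unescape(text)
    # Remove tab characters.
    text = text.replace("\t", "")
    # Assumption: collapse runs of line breaks into one.
    text = BLANK_RUN_RE.sub("\n", text)
    # Trim leading and trailing line breaks only.
    return text.strip("\n")


if __name__ == "__main__":
    sample = "<p>Hello&nbsp;<b>world</b></p>\r\r\r<p>\tBye &amp; thanks</p>"
    print(clean_html_text(sample))
```

A regular expression like this is only good enough for simple pages. It will trip over comments, script blocks, and attribute values that contain ">", so a real HTML parser is the safer choice for anything messy.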

