Re: Parsing HTML

Subject: Re: Parsing HTML
From: Allen Watson <email@hidden>
Date: Fri, 02 Nov 2001 08:58:41 -0800

On Fri, 2 Nov 2001 12:06:15 +0000 Steve Thompson <email@hidden> wrote:

> Under 9.1, I had an OSAX that would parse the HTML out of reams of HTML
> for me, just returning the bits I requested. This OSAX doesn't work
> under OS X and I just wanted to know if anyone had come across one that
> does.

For some reason, the migration of osaxen to OS X has been extremely slow.
Check out www.osaxen.com and you'll see there are only a handful so far.

The shareware app, TextSoap, is available in beta for OS X, and it comes
with an osax that <does> include a module to <strip> HTML code. You can find
that on Versiontracker.com.

The specific tool in the package has this in the documentation:

> HTML Text
> This cleaner will clean up HTML text. It strips out anything between "<" and
> ">". This can be useful if you have the HTML source, but just want the
> contents (without starting up your browser). It also handles Ampersand escape
> codes (  or ). It will remove tab characters and remove multiple
> carriage returns.
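
For anyone who wants to see roughly what a cleaner like that does, here is a minimal Python sketch of the four steps the documentation lists. It is not TextSoap's code and not an osax; the function name clean_html_text is made up for illustration, and the regex is naive (it will trip over a ">" inside an attribute value or a comment), so treat it as a sketch rather than a real HTML parser.

```python
import html
import re


def clean_html_text(source: str) -> str:
    """Hypothetical helper: approximate the cleaning steps described above."""
    # 1. Strip out anything between "<" and ">" (naive tag removal).
    text = re.sub(r"<[^>]*>", "", source)
    # 2. Decode ampersand escape codes such as &amp; into plain characters.
    text = html.unescape(text)
    # 3. Remove tab characters.
    text = text.replace("\t", "")
    # 4. Normalise carriage returns to newlines, then collapse repeated ones.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"\n{2,}", "\n", text)
    return text.strip()


if __name__ == "__main__":
    sample = "<p>Fish &amp; chips</p>\r\r\r<b>\tDone</b>"
    print(clean_html_text(sample))
```

Decoding the escapes after removing the tags means an escaped "&lt;" in the page text ends up as a literal "<" in the output instead of being mistaken for the start of a tag.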
