Re: HTML Parsing in Objective-C?
- Subject: Re: HTML Parsing in Objective-C?
- From: Agent M <email@hidden>
- Date: Wed, 17 Nov 2004 18:43:26 -0500
Note that HTML found in the wild is rarely well-formed, let alone
valid, XML (XHTML), which makes parsing generic HTML with an XML parser
an exercise in futility. Of course, if the input is known to be
conformant XHTML, this point is irrelevant.
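As a rough illustration of that point, here is a minimal sketch in Python, chosen only because its standard library ships both a strict XML parser and a lenient HTML tokenizer. It is not one of the Cocoa options discussed in this thread. The sample markup and the names `parses_as_xml`, `TagAndTextCollector` and `parse_leniently` are invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Typical "tag soup": unclosed <p> and <b>, and a void <br> without a slash.
TAG_SOUP = "<p>Hello<br>world<p>unclosed <b>bold"


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagAndTextCollector(HTMLParser):
    """Lenient, event-driven HTML tokenizer that records what it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []   # start tags, in document order
        self.text = []   # text chunks, in document order

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def parse_leniently(markup):
    """Feed markup to the lenient parser and return (tags, text chunks)."""
    collector = TagAndTextCollector()
    collector.feed(markup)
    collector.close()  # flush any text still buffered at end of input
    return collector.tags, collector.text
```

Under these assumptions, `parses_as_xml(TAG_SOUP)` reports failure because the strict parser stops at the first well-formedness error, while `parse_leniently(TAG_SOUP)` still yields every start tag and all of the text.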
For HTML parsing, the general consensus is that Perl's HTML::Parser
takes the cake.
http://search.cpan.org/~gaas/HTML-Parser-3.38/Parser.pm
The easiest way to get this module running in a Cocoa app is with the
Perl-ObjC bridge.
A second option would be to hook directly into HTML::Parser's SGML
backend from C.
I have used HTML::Parser with great success, even on very poor,
non-compliant HTML.
On Nov 17, 2004, at 6:20 PM, Mont Rothstein wrote:
SOPE (http://sope.opengroupware.org) has an Objective-C wrapper around
libxml2, which can be used to parse HTML.
The framework has both DOM and SAX support.
The XML processing section is:
http://sope.opengroupware.org/en/sope_xml/index.html
-Mont
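For readers unfamiliar with the DOM versus SAX distinction mentioned in the quoted message, the following is a small sketch using Python's standard library rather than SOPE or libxml2, whose APIs are not shown here. The sample document and the names `hrefs_via_dom`, `HrefHandler` and `hrefs_via_sax` are assumptions made for illustration. Both functions expect well-formed input, so they suit conformant XHTML only.

```python
import xml.sax
from xml.dom import minidom

# A small, well-formed XHTML-like document (hypothetical sample data).
XHTML = (
    b'<html><body><a href="a.html">A</a>'
    b'<p><a href="b.html">B</a></p></body></html>'
)


def hrefs_via_dom(data):
    """DOM style: build the whole tree in memory, then query it."""
    doc = minidom.parseString(data)
    return [a.getAttribute("href") for a in doc.getElementsByTagName("a")]


class HrefHandler(xml.sax.ContentHandler):
    """SAX style: react to parse events as they stream past."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def startElement(self, name, attrs):
        # Called once per start tag; no tree is ever built.
        if name == "a" and "href" in attrs:
            self.hrefs.append(attrs["href"])


def hrefs_via_sax(data):
    """Run the SAX parser over the data and return the collected hrefs."""
    handler = HrefHandler()
    xml.sax.parseString(data, handler)
    return handler.hrefs
```

The DOM version is convenient when the document must be traversed or modified after parsing. The SAX version keeps memory use flat, which matters for large inputs.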
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
AgentM
email@hidden
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬