Re: HTML Parsing in Objective-C?
  • Subject: Re: HTML Parsing in Objective-C?
  • From: Agent M <email@hidden>
  • Date: Wed, 17 Nov 2004 18:43:26 -0500

Note that HTML found in the wild is rarely well-formed, valid XML (XHTML), which makes parsing generic HTML with an XML parser an exercise in futility. Of course, if the input is known to be conformant XHTML, this point is moot.
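
To make the point concrete, here is a minimal sketch using only Python's standard library (not Objective-C, and not HTML::Parser). The sample markup and the names TAG_SOUP, TagCollector, parses_as_xml and collect_start_tags are made up for illustration. It feeds the same sloppy markup to a strict XML parser and to a tolerant, event-based HTML parser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: unclosed <p>, <br> and <b> tags, as is common in real pages.
TAG_SOUP = "<html><body><p>First<br><p>Second <b>bold</body></html>"


class TagCollector(HTMLParser):
    """Records every start tag the tolerant HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parses_as_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def collect_start_tags(markup):
    """Run the tolerant parser and return the start tags it saw, in order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("accepted by XML parser:", parses_as_xml(TAG_SOUP))
    print("start tags seen by HTML parser:", collect_start_tags(TAG_SOUP))
```

The XML parser gives up at the first mismatched tag, while the HTML parser keeps reporting events for whatever it finds. That tolerance is what you want from an HTML-specific parser.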

For HTML parsing, the general consensus is that Perl's HTML::Parser takes the cake.
http://search.cpan.org/~gaas/HTML-Parser-3.38/Parser.pm


The easiest way to get this module running in a Cocoa app is with the Perl-ObjC bridge.

A second option would be to hook directly into HTML::Parser's SGML backend from C.

I have used HTML::Parser with great success, even on really poor, non-compliant HTML.

On Nov 17, 2004, at 6:20 PM, Mont Rothstein wrote:

http://sope.opengroupware.org

Has an Objective-C wrapper around libxml2, which can be used to parse HTML.

The framework has both DOM and SAX support.

The XML processing section is:

http://sope.opengroupware.org/en/sope_xml/index.html

-Mont

AgentM email@hidden
