Re: Issue with NSXMLDocument and bad HTML

  • Subject: Re: Issue with NSXMLDocument and bad HTML
  • From: John Stiles <email@hidden>
  • Date: Tue, 18 Oct 2005 12:20:01 -0700

You could load it yourself and then look for & characters that aren't known entities and change them to "&amp;".
I wouldn't recommend this approach--malformed XML is bad--but sometimes you've gotta do what you gotta do.
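
For what it's worth, here is a rough sketch of that clean-up pass in plain C (it compiles unchanged in an Objective-C file). It makes one big simplifying assumption: anything shaped like "&name;", "&#123;" or "&#x1F;" is treated as a known entity and left alone, and every other "&" becomes "&amp;". The function names escape_bare_ampersands and entity_length are made up for this example and are not part of any Apple API.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of a syntactically well-formed entity reference
   starting at s (s points at '&'), or 0 if there is none.
   Assumption: "&name;", "&#123;" and "&#x1F;" all count as known;
   the name is not checked against a real entity table. */
static size_t entity_length(const char *s)
{
    size_t i = 1;                                   /* skip the '&' */
    size_t start;

    if (s[i] == '#') {
        i++;
        if (s[i] == 'x' || s[i] == 'X') {           /* hexadecimal form */
            i++;
            start = i;
            while (isxdigit((unsigned char)s[i])) i++;
        } else {                                    /* decimal form */
            start = i;
            while (isdigit((unsigned char)s[i])) i++;
        }
        if (i == start) return 0;                   /* no digits at all */
    } else {
        if (!isalpha((unsigned char)s[i])) return 0;
        while (isalnum((unsigned char)s[i])) i++;
    }
    return s[i] == ';' ? i + 1 : 0;
}

/* Returns a newly malloc'd copy of `in` with every bare '&' replaced
   by "&amp;". The caller frees the result. Returns NULL if malloc fails. */
char *escape_bare_ampersands(const char *in)
{
    size_t n = strlen(in);
    char *out = malloc(n * 5 + 1);   /* worst case: every byte is '&' */
    char *p = out;

    if (out == NULL) return NULL;
    while (*in) {
        if (*in == '&' && entity_length(in) == 0) {
            memcpy(p, "&amp;", 5);
            p += 5;
        } else {
            *p++ = *in;
        }
        in++;
    }
    *p = '\0';
    return out;
}
```

You would fetch the page bytes yourself, run them through something like this, and then hand the result to -initWithData:options:error: instead of -initWithContentsOfURL:options:error:. This only addresses the "unescaped &" warnings; it does nothing for whatever else is wrong with the page.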



On Oct 18, 2005, at 12:01 PM, Tito Ciuro wrote:

Hello,

I'm trying to load an HTML page as an XML object like this:


NSError *anError = nil;
NSURL *url = [NSURL URLWithString:@"http://wwwa.accuweather.com/forecast.asp?zipcode=94025&partner=accuweather"];
// Pass the address of the NSError pointer so the initializer can fill it in on failure.
NSXMLDocument *xml = [[[NSXMLDocument alloc] initWithContentsOfURL:url options:NSXMLDocumentTidyXML error:&anError] autorelease];




The page seems to be malformed, so the NSXMLDocument becomes nil and 'anError' contains a lot of errors:


NSError "line 78 column 26 - Warning: unescaped & or unknown entity "&city"
line 78 column 45 - Warning: unescaped & or unknown entity "&state"
line 78 column 66 - Warning: unescaped & or unknown entity "&adc_partner"
line 78 column 91 - Warning: unescaped & or unknown entity "&interest"
line 78 column 115 - Warning: unescaped & or unknown entity "&traveler"
...
<skipping lots of errors>
...
207 warnings, 42 errors were found! Not all warnings/errors were shown.


This document has errors that must be fixed before
using HTML Tidy to generate a tidied up version.



The above code used to work fine, but it looks like 'www.accuweather.com' has changed the way it outputs HTML code. I'm using NSXMLDocument because I need to extract certain elements from the document. Is there a way for NSXML to be more forgiving about these errors, or will I have to parse the document with my own code?

Thanks,

-- Tito
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Cocoa-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:


This email sent to email@hidden

