Mailing Lists: Apple Mailing Lists


Issue with NSXMLDocument and bad HTML



Hello,

I'm trying to load an HTML page as an XML object like this:

NSError *anError = nil;
NSURL *url = [NSURL URLWithString:@"http://wwwa.accuweather.com/forecast.asp?zipcode=94025&partner=accuweather"];
NSXMLDocument *xml = [[[NSXMLDocument alloc] initWithContentsOfURL:url options:NSXMLDocumentTidyXML error:&anError] autorelease];


The page seems to be malformed, so the initializer returns nil and 'anError' contains a long list of warnings and errors:

NSError "line 78 column 26 - Warning: unescaped & or unknown entity "&city"
line 78 column 45 - Warning: unescaped & or unknown entity "&state"
line 78 column 66 - Warning: unescaped & or unknown entity "&adc_partner"
line 78 column 91 - Warning: unescaped & or unknown entity "&interest"
line 78 column 115 - Warning: unescaped & or unknown entity "&traveler"
...
<skipping lots of errors>
...
207 warnings, 42 errors were found! Not all warnings/errors were shown.


This document has errors that must be fixed before
using HTML Tidy to generate a tidied up version.


The above code used to work fine, but it looks like 'www.accuweather.com' has changed the HTML it outputs. I'm using NSXMLDocument because I need to extract certain elements from the document. Is there a way to make NSXML more forgiving about these errors, or will I have to parse the document with my own code?

Thanks,

-- Tito
_______________________________________________
Cocoa-dev mailing list



Copyright © 2007 Apple Inc. All rights reserved.