Re: Issue with NSXMLDocument and bad HTML



You could load it yourself and then look for & characters that aren't part of known entities and change them to "&amp;".
I wouldn't recommend this approach--malformed XML is bad--but sometimes you've gotta do what you gotta do.
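
For what it's worth, here is a minimal sketch of that workaround in plain C, so it can be called from Objective-C as well. The function names (entity_length, escape_bare_ampersands) are made up for this example. It also simplifies "known entities": anything shaped like &name; or &#123; or &#x1F; is left alone, and it is not checked against the real HTML entity list. Treat it as a starting point, not a robust HTML fixer.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: s points at '&'. Returns the length of an
   entity-shaped reference (&name;  &#123;  &#x1F;) or 0 if there is none.
   Assumption: any well-formed-looking name counts as "known". */
static size_t entity_length(const char *s)
{
    size_t i = 1;                                   /* skip the '&' */
    size_t start;

    if (s[i] == '#') {
        i++;
        if (s[i] == 'x' || s[i] == 'X') {           /* hexadecimal reference */
            i++;
            start = i;
            while (isxdigit((unsigned char)s[i])) i++;
        } else {                                    /* decimal reference */
            start = i;
            while (isdigit((unsigned char)s[i])) i++;
        }
        if (i == start) return 0;                   /* no digits at all */
    } else {
        if (!isalpha((unsigned char)s[i])) return 0;
        while (isalnum((unsigned char)s[i])) i++;   /* named reference */
    }
    return s[i] == ';' ? i + 1 : 0;                 /* must end with ';' */
}

/* Hypothetical helper: returns a malloc'd copy of `in` with every bare '&'
   replaced by "&amp;". The caller frees the result. Returns NULL if the
   allocation fails. */
char *escape_bare_ampersands(const char *in)
{
    /* Worst case: every byte is '&' and grows to five bytes. */
    char *out = malloc(strlen(in) * 5 + 1);
    char *dst = out;
    const char *src = in;

    if (out == NULL) return NULL;

    while (*src != '\0') {
        if (*src == '&') {
            size_t n = entity_length(src);
            if (n > 0) {                            /* keep existing entity */
                memcpy(dst, src, n);
                dst += n;
                src += n;
            } else {                                /* escape bare ampersand */
                memcpy(dst, "&amp;", 5);
                dst += 5;
                src++;
            }
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return out;
}
```

From Cocoa you would fetch the page contents yourself, run the text through something like this, and hand the cleaned string to NSXMLDocument's initWithXMLString:options:error: instead of initWithContentsOfURL:options:error:. That only addresses the "unescaped &" warnings. Tidy may still reject the page for other reasons.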



On Oct 18, 2005, at 12:01 PM, Tito Ciuro wrote:

Hello,

I'm trying to load an HTML page as an XML object like this:


NSError *anError = nil;
NSURL *url = [NSURL URLWithString:@"http://wwwa.accuweather.com/forecast.asp?zipcode=94025&partner=accuweather"];
NSXMLDocument *xml = [[[NSXMLDocument alloc] initWithContentsOfURL:url options:NSXMLDocumentTidyXML error:&anError] autorelease];




The page seems to be malformed, so the NSXMLDocument becomes nil and 'anError' contains a lot of errors:


NSError "line 78 column 26 - Warning: unescaped & or unknown entity "&city"
line 78 column 45 - Warning: unescaped & or unknown entity "&state"
line 78 column 66 - Warning: unescaped & or unknown entity "&adc_partner"
line 78 column 91 - Warning: unescaped & or unknown entity "&interest"
line 78 column 115 - Warning: unescaped & or unknown entity "&traveler"
...
<skipping lots of errors>
...
207 warnings, 42 errors were found! Not all warnings/errors were shown.


This document has errors that must be fixed before
using HTML Tidy to generate a tidied up version.



The above code used to work fine, but it looks like 'www.accuweather.com' has changed the way it outputs HTML code. I'm using NSXMLDocument because I need to extract certain elements from the document. Is there a way for NSXML to be more forgiving about these errors, or will I have to parse the document with my own code?

Thanks,

-- Tito
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Cocoa-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/cocoa-dev/jstiles%40blizzard.com


This email sent to email@hidden


References: 
 >Issue with NSXMLDocument and bad HTML (From: Tito Ciuro <email@hidden>)




Copyright © 2007 Apple Inc. All rights reserved.