Re: NSXLMDocument and malformed XML
- Subject: Re: NSXLMDocument and malformed XML
- From: Chris Gregg <email@hidden>
- Date: Thu, 25 May 2006 22:19:44 -0400
- Thread-topic: NSXLMDocument and malformed XML
Thanks for the detailed XML/HTML history, and I now see why fatal errors are
important. It's too bad that Apple's iWeb XML (hidden in a package and
gzip'd, I'll admit...) doesn't seem to adhere to the standard. Well, at
least all my self-written XML parsing code wasn't typed in vain. Now I just
have to deal with the ampersand escape parsing...cheers!
-Chris
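
The reject-on-error rule described in the quoted message below can be sketched in a few lines. This is only an illustration, not the Cocoa API: it uses Python's standard-library xml.etree.ElementTree as a stand-in for NSXMLDocument, and the helper name load_or_reject and the three sample strings are hypothetical, not taken from a real iWeb file. The sketch assumes the parser is namespace-aware, so an undeclared prefix such as xsi is rejected in the same way as an unescaped ampersand.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample inputs for illustration; not taken from a real iWeb file.
WELL_FORMED = (
    '<page xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<color xsi:type="rgb">red &amp; blue</color></page>'
)
UNBOUND_PREFIX = '<page><color xsi:type="rgb">red</color></page>'
BARE_AMPERSAND = '<page><color>red & blue</color></page>'


def load_or_reject(text):
    """Return (root, None) for well-formed XML, or (None, message) otherwise.

    No partial tree is handed back on failure: the document is rejected whole.
    """
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        return None, str(err)


if __name__ == "__main__":
    samples = (
        ("well-formed", WELL_FORMED),
        ("unbound prefix", UNBOUND_PREFIX),
        ("bare ampersand", BARE_AMPERSAND),
    )
    for label, text in samples:
        root, error = load_or_reject(text)
        if root is None:
            print(label, "-> rejected:", error)
        else:
            print(label, "-> loaded root element", root.tag)
```

The caller gets either a complete tree or an error message, never a half-parsed document, which is the same all-or-nothing outcome as getting nil plus an error back.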
On 5/25/06 8:31 PM, "Greg Herlihy" <email@hidden> wrote:
> An XML parser is not allowed to perform any type of error recovery after
> detecting that the XML document it is parsing is malformed. Rather the
> parser must notify the application of the error and must stop parsing the
> document at that point (though the parser is allowed to search the remainder
> of the document for possible, additional errors and report them as well.)
>
> So the rule is: unless an XML document is well-formed, the application
> should simply reject it. And the reason is simple: there are no standards
> for interpreting broken XML. Allowing error recovery for malformed XML would
> therefore lead to behavior both unpredictable and incompatible with
> everyone else's unpredictable and incompatible behavior in the same
> situation. And for XML to live up to its promise as an interchangeable and
> universally-understood data format, all XML documents - in order to be
> XML documents - must be well-formed from the start. Any document that looks
> like XML - but isn't - should be placed in the garbage.
>
> The state of HTML serves as a good example of what can go wrong when
> applications try to accommodate malformed documents. One of the contributing
> factors to the original Netscape browser's early success was that it was
> very forgiving of broken HTML. And since many web sites at the time had
> broken HTML, the Netscape browser - by masking the errors - appeared to be
> the more capable browser. In other words, a webmaster could simply recommend
> that visitors use the Netscape browser because it would show the web site
> more-or-less as intended.
>
> Exactly how Netscape Navigator rendered broken HTML was of course a behavior
> unique to Navigator. And once Navigator had become the dominant browser,
> other web browsers were placed at a significant disadvantage: they
> had to figure out how to display broken HTML just like Navigator
> did - a supremely difficult task of reverse-engineering a nearly
> infinite number of possible HTML errors. The alternate route,
> persuading webmasters that the HTML on their site was broken, was also a
> tough sell: how could the HTML be "broken" if Navigator displays the page
> OK, and why would the webmaster want to change it anyway, since practically
> everyone is using Netscape and is not affected by errors in HTML in any case?
>
> The XML (and XHTML) philosophy avoids this entire debacle. All XML parsers
> adhere only to standard, defined behavior. An application generating or
> parsing XML documents pledges not to go off on its own - and pretend to
> understand a document that it does not. But it is exactly this tough-minded,
> intolerant approach that really serves the user's best interests when the
> entire picture is considered.
>
> Greg
>
>
> On 5/25/06 5:22 AM, "Chris Gregg" <email@hidden> wrote:
>
>> Disclaimer: I'm still a newbie with Cocoa, and I'm slowly stumbling
>> onto new ways to do things more simply.
>>
>> I wrote a little program that reads iWeb index.xml files (after
>> unzipping them from their original index.xml.gz), and I originally wrote
>> a minimalistic XML parser to get at the important bits of code I was
>> looking for, from the NSString I loaded from the file.
>>
>> But then I stumbled onto NSXMLDocument, which, I thought, would make
>> my life much easier by loading the XML for me. Excellent. The
>> problem is that the XML in some iWeb index.xml files is malformed, and
>> I'm getting the following error back from
>> initWithContentsOfURL:options:error:
>>
>> "Line 2: Namespace prefix xsi for type on color is not defined"
>>
>> It would be nice if initWithContentsOfURL at least loaded in what it
>> could and then I could ignore the malformed parts, but I guess that's
>> not the way it works, as it just returns nil and the error.
>>
>> Am I back to square one, where I need to just beef up my own XML
>> parser to take care of this, or can I gracefully recover from the
>> error and load it anyway?
>>
>> Thanks!
>>
>> -Chris Gregg
>> _______________________________________________
>> Do not post admin requests to the list. They will be ignored.
>> Cocoa-dev mailing list (email@hidden)
>> Help/Unsubscribe/Update your Subscription:
>>
>> This email sent to email@hidden
>
>