NSXMLParser and entity resolving (OS X 10.6.6)
- Subject: NSXMLParser and entity resolving (OS X 10.6.6)
- From: Greg Anderson <email@hidden>
- Date: Fri, 04 Mar 2011 15:47:42 -0600
After spending a good deal of time with Uncle Google and some of his friends, I still have unanswered problems using NSXMLParser with an XML file that has entities in it. It appears that most of my questions have been asked over the last few years, but I haven't found any postings about a resolution to them. I'm hoping that someone can enlighten me as to whether I'm doing something wrong, or there's some unresolved bug in the class.
The XML files I'm working with contain HTML-like entities, such as &nbsp; and &quot;. Without any changes to my XML delegate, the parser chokes on them with the expected NSXMLParserUndeclaredEntityError (26). For example's sake, let's say I have this XML file:
<?xml version="1.0" encoding="utf-8"?>
<important_data>
  <faq>
    <question>Why are there so many blank spaces?</question>
    <answer>Because I love to use &nbsp; in addition to regular spaces</answer>
  </faq>
</important_data>
With this, I get error #26 after the first entity. My first attempt to remedy this was to implement -parser:resolveExternalEntityName:systemID: in my delegate object. It is called successfully, and I return (for these testing purposes) an NSData object containing a non-breaking space (built from a string with -dataUsingEncoding: and NSUTF8StringEncoding).
Looking in my -parser:foundCharacters: method I can see that this value gets passed back and the characters are "found" by the parser. But immediately after that I still get the undeclared entity error, and parsing fails. I tried this both with the parser set to resolve external entities and without, with the same results.
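For reference, a minimal sketch of the kind of delegate described above might look like the following. It is not the code from this post: the class name EntityTestDelegate, the one-line sample document, and the choice to build the replacement from U+00A0 are all illustrative assumptions, and it assumes ARC and a recent clang. It only logs what the parser reports, so it can be used to reproduce the behaviour rather than to fix it.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate; the name and structure are illustrative only.
@interface EntityTestDelegate : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation EntityTestDelegate

- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}

// Asked by the parser to supply replacement data for an entity.
- (NSData *)parser:(NSXMLParser *)parser
    resolveExternalEntityName:(NSString *)name
                     systemID:(NSString *)systemID {
    NSLog(@"resolve entity '%@' (systemID: %@)", name, systemID);
    if ([name isEqualToString:@"nbsp"]) {
        // Assumption: answer with U+00A0 NO-BREAK SPACE encoded as UTF-8.
        NSString *nbsp = [NSString stringWithFormat:@"%C", (unichar)0x00A0];
        return [nbsp dataUsingEncoding:NSUTF8StringEncoding];
    }
    return nil; // no replacement known for other names
}

// Accumulate character data so it can be inspected after parsing.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}

// Log the numeric error code (26 is NSXMLParserUndeclaredEntityError).
- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError {
    NSLog(@"parse error %ld", (long)[parseError code]);
}

@end

int main(void) {
    @autoreleasepool {
        // Reduced sample document with one undeclared entity.
        NSString *xml = @"<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                        @"<answer>one&nbsp;two</answer>";
        NSXMLParser *parser =
            [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
        EntityTestDelegate *delegate = [[EntityTestDelegate alloc] init];
        parser.delegate = delegate;
        parser.shouldResolveExternalEntities = YES; // try both YES and NO
        BOOL ok = [parser parse];
        NSLog(@"parse %@, collected text: '%@'", ok ? @"succeeded" : @"failed", delegate.text);
    }
    return 0;
}
```

Toggling shouldResolveExternalEntities between YES and NO in this sketch corresponds to the two configurations tried above.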
I next looked at declaring the entities in a DTD and referencing it directly from the XML file. I borrowed a DTD I found online that defined the HTML entities I needed (basically one that included &nbsp; for this test) and added the necessary lines to the XML to have it included. This had 3 effects:
1. My delegate started receiving -parser:foundInternalEntityDeclarationWithName:value: callbacks.
2. My delegate stopped having its -parser:resolveExternalEntityName:systemID: method called (not surprising, given that the parser was telling me it was finding internal declarations).
3. I no longer got parse errors.
BUT, nothing is being substituted in for the entities. I even went so far as to set up the DTD so that nbsp mapped to "a big test string", and confirmed that no substitutions were happening.
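A reduced test case for this DTD experiment could be sketched as follows. It declares nbsp in an internal DTD subset (an assumption made to keep the example in one file; the setup described above used a separate borrowed DTD), maps it to "a big test string" as in the test above, and logs both the declaration callbacks and the character data the delegate receives. The class name DTDTestDelegate is hypothetical, and the sketch assumes ARC and a recent clang.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that records declarations and character data.
@interface DTDTestDelegate : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation DTDTestDelegate

- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}

// Reported for each internal entity declaration found in the DTD.
- (void)parser:(NSXMLParser *)parser
    foundInternalEntityDeclarationWithName:(NSString *)name
                                     value:(NSString *)value {
    NSLog(@"declared entity '%@' = '%@'", name, value);
}

// Accumulate character data so any substitution can be checked afterwards.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}

- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError {
    NSLog(@"parse error %ld", (long)[parseError code]);
}

@end

int main(void) {
    @autoreleasepool {
        // Same shape as the sample file, with nbsp declared in the internal subset.
        NSString *xml =
            @"<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
            @"<!DOCTYPE important_data [\n"
            @"  <!ENTITY nbsp \"a big test string\">\n"
            @"]>\n"
            @"<important_data><answer>x&nbsp;y</answer></important_data>";
        NSXMLParser *parser =
            [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
        DTDTestDelegate *delegate = [[DTDTestDelegate alloc] init];
        parser.delegate = delegate;
        BOOL ok = [parser parse];
        NSLog(@"parse %@, collected text: '%@'", ok ? @"succeeded" : @"failed", delegate.text);
    }
    return 0;
}
```

If the entity were substituted, the collected text would contain "a big test string" between "x" and "y"; comparing the logged text against that makes the missing substitution easy to see.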
My current stopgap is to do string replacements for the entities that I know are showing up. I would prefer to let the parser handle it, rather than have to look for and replace hundreds of HTML entities that most likely don't exist, but could. I'd greatly appreciate any help on resolving this; it seems like such a basic task that it shouldn't be a bug in NSXMLParser, but I can't see where I'm doing things wrong. Hopefully I've given enough information for someone to chip in and give me a hand.
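The stopgap could be sketched roughly as below. This is an illustrative assumption, not the actual code in use: the function name ReplaceKnownEntities and the two-entry table are hypothetical, and it rewrites each named entity to the equivalent numeric character reference (which an XML parser accepts without any declaration) before the text is handed to the parser. It assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: rewrite a fixed set of named HTML entities as
// numeric character references, which need no DTD declaration.
static NSString *ReplaceKnownEntities(NSString *xml) {
    // Illustrative table only; a real list would need every entity
    // that might appear in the input.
    NSDictionary *table = [NSDictionary dictionaryWithObjectsAndKeys:
        @"&#160;", @"&nbsp;",
        @"&#169;", @"&copy;",
        nil];
    NSMutableString *result = [xml mutableCopy];
    for (NSString *entity in table) {
        [result replaceOccurrencesOfString:entity
                                withString:[table objectForKey:entity]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, [result length])];
    }
    return result;
}

int main(void) {
    @autoreleasepool {
        NSString *raw = @"<answer>Because I love to use &nbsp; in addition</answer>";
        NSLog(@"%@", ReplaceKnownEntities(raw));
    }
    return 0;
}
```

The weakness is the one noted above: the table has to list every entity that could occur, whether or not it ever does.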
Thanks in advance.
Greg