Re: Am I Reinventing the Wheel? (Part I)
- Subject: Re: Am I Reinventing the Wheel? (Part I)
- From: Aandi Inston <email@hidden>
- Date: Thu, 08 Jan 2015 13:27:51 +0000
I am not familiar with the API you are using (I use my own XML
generator/parser), but it may be worth noting something about XML. XML
files are implicitly Unicode and generally UTF-8, so you cannot put an
arbitrary sequence of bytes into XML as a string. A curly quote is not in
the ASCII (<=127) range, so in UTF-8 it must be a multibyte sequence.
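
To make the multibyte point concrete, here is a minimal sketch in Python
(chosen only for brevity; the thread itself is about Cocoa). It assumes the
curly quote in question is U+201C and that the legacy encoding it might
have arrived in is Windows-1252; both are illustrative assumptions, not
something stated in the original message.

```python
# U+201C LEFT DOUBLE QUOTATION MARK, a typical "curly quote" (assumed example).
curly = "\u201c"

# In UTF-8 the character occupies three bytes, none of them below 128.
utf8 = curly.encode("utf-8")
print(utf8.hex(" "), len(utf8))

# In a legacy single-byte encoding (Windows-1252, assumed here) it is one byte.
legacy = curly.encode("cp1252")
print(legacy.hex())

# That single byte on its own is not a valid UTF-8 sequence.
try:
    legacy.decode("utf-8")
    legacy_is_valid_utf8 = True
except UnicodeDecodeError:
    legacy_is_valid_utf8 = False
print("legacy byte is valid UTF-8:", legacy_is_valid_utf8)
```

If a string's bytes are in the legacy form but get written into a file that
declares or implies UTF-8, a conforming parser has every right to reject it.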
Clearly an API can take several different approaches to encoding:
- convert an input encoding to UTF-8
- accept and write UTF-8 with validation, rejecting bad UTF-8 sequences
- accept and write UTF-8 with validation, converting bad UTF-8 sequences
silently to something else
- accept and write UTF-8 without validation, potentially writing malformed
XML
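
The four approaches listed above can be sketched roughly as follows. This
is a Python illustration only; the function names are hypothetical and do
not correspond to any particular XML library, and Windows-1252 is an
assumed example of an input encoding.

```python
def to_utf8_convert(raw: bytes, source_encoding: str) -> bytes:
    """Approach 1: convert from a known input encoding to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def to_utf8_strict(raw: bytes) -> bytes:
    """Approach 2: validate; raises UnicodeDecodeError on bad UTF-8."""
    raw.decode("utf-8", errors="strict")
    return raw


def to_utf8_replace(raw: bytes) -> bytes:
    """Approach 3: validate; silently turn bad sequences into U+FFFD."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


def to_utf8_unchecked(raw: bytes) -> bytes:
    """Approach 4: no validation; may produce malformed XML downstream."""
    return raw


# Curly quotes as Windows-1252 bytes: not valid UTF-8 as they stand.
bad = b"\x93quoted\x94"

print(to_utf8_convert(bad, "cp1252"))   # correct UTF-8 curly quotes
print(to_utf8_replace(bad))             # quotes replaced by U+FFFD
print(to_utf8_unchecked(bad))           # bytes passed through untouched
try:
    to_utf8_strict(bad)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Only the first approach preserves the curly quotes, and only if the caller
knows the real input encoding. The fourth matches the symptom described in
the original question: writing succeeds, reading back fails.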
Parsers have similar choices to make. But anyway, if your data is not valid
UTF-8, it would explain why you get disastrous results.
XML has no standard binary representation for anything other than Unicode
strings, so if you need to store arbitrary bytes, symmetric
encoding/decoding of that data, following your own invention or some
extension to basic XML, is the only way. A low level XML API cannot be
expected to offer this, especially one intended to write XML for
consumption by other software.
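
As one possible shape for such a symmetric scheme, here is a minimal Python
sketch that stores arbitrary bytes as Base64 text. Base64 is a common
convention but is my choice for illustration, not something the XML
specification itself mandates; the element name "blob" and the "encoding"
attribute are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def blob_to_element(tag: str, data: bytes) -> ET.Element:
    """Encode arbitrary bytes as Base64 text inside an element."""
    elem = ET.Element(tag, {"encoding": "base64"})  # hypothetical marker
    elem.text = base64.b64encode(data).decode("ascii")
    return elem


def element_to_blob(elem: ET.Element) -> bytes:
    """Reverse of blob_to_element; refuses elements it did not write."""
    if elem.get("encoding") != "base64":
        raise ValueError("element is not marked as base64")
    return base64.b64decode(elem.text or "")


# Every possible byte value, including ones that are not valid UTF-8.
payload = bytes(range(256))

xml_bytes = ET.tostring(blob_to_element("blob", payload), encoding="utf-8")
restored = element_to_blob(ET.fromstring(xml_bytes))
print(restored == payload)
```

The reader and writer have to agree on the convention, which is why other
software consuming the file will not understand it unless told.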
(This is in addition to the five characters, <, >, &, ' and ", that have
to be escaped in strings because they are XML markup.)
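
A short Python sketch of escaping those five markup characters, using the
standard library's xml.sax.saxutils helpers. The sample string is made up
for illustration. Note that the curly quotes pass through untouched: they
are ordinary Unicode characters, not markup.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters
# are added through the optional entities mapping.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_ENTITIES_REVERSED = {"&quot;": '"', "&apos;": "'"}

text = "a < b && \u201ccurly\u201d 'single' \"double\""

escaped = escape(text, QUOTE_ENTITIES)
print(escaped)

# Unescaping restores the original string exactly.
round_tripped = unescape(escaped, QUOTE_ENTITIES_REVERSED)
print(round_tripped == text)
```

A serializer normally does this for you; the point is only that markup
escaping and character encoding are separate concerns.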
On Thu, Jan 8, 2015 at 12:43 PM, Charles Jenkins <email@hidden> wrote:
>
>
> I'm writing data to XML. When you create a node and set its string
> contents, the node will happily accept whatever string you give and allow
> you to serialize information XML deserialization cannot then recreate. In
> my case, the string in question contained curled quotes. I could serialize
> and save the data—and if I remember correctly* the output looked good when
> I inspected the file on disk—but reading it back and deserializing it led
> to disaster! Right now I'm using NSString stringByAddingPercentEncoding:
> and having no further problems with curled quotes, but I'm sure that's a
> poor long-term solution.
>
>