Re: NSXML and invalid UTF8 characters


  • Subject: Re: NSXML and invalid UTF8 characters
  • From: Sixten Otto <email@hidden>
  • Date: Thu, 28 Jan 2010 18:30:55 -0500

On Thu, Jan 28, 2010 at 6:16 PM, Keith Blount <email@hidden> wrote:
> I am using the NSXML classes to generate and parse my own XML files. Sometimes these files store strings of text that have been brought in from other applications (for instance, there might be a plain text representation of some text the user has pasted in from Word).

For what it's worth, another common cause of problems with stuff
pasted from Word (at least on the web) is text containing characters
from the Windows-1252 character set whose byte values are not valid
UTF-8 sequences. Most commonly these are the bytes 0x80-0x9F, which is
the range where Windows-1252 differs from ISO-Latin-1: Windows-1252
puts printable characters such as curly quotes there, while
ISO-Latin-1 reserves the range for control codes.
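
To make that concrete, here is a small Python sketch (not Cocoa code,
just an illustration). The sample bytes and the helper name
is_valid_utf8 are made up for the example; the codec names are the
ones Python ships with.

```python
# Curly quotes as Windows-1252 bytes, the way pasted Word text often arrives.
raw = b"\x93quoted\x94"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0x93 and 0x94 are bare continuation bytes, so strict UTF-8 rejects them.
raw_is_utf8 = is_valid_utf8(raw)

# Windows-1252 maps them to the curly quotes U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")

# ISO-Latin-1 maps the same bytes to the C1 controls U+0093 and U+0094.
as_latin1 = raw.decode("latin-1")

# Once decoded with the right table, re-encoding gives well-formed UTF-8.
fixed_is_utf8 = is_valid_utf8(as_cp1252.encode("utf-8"))
```

The same two bytes mean different things under each encoding, which is
why the bytes need to be decoded with the right table before they go
anywhere near an XML document.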

So whatever solution you come up with to deal with the characters in
0x00-0x1F that XML specifically doesn't allow (everything in that
range except tab, line feed, and carriage return), you probably want
to also account for bytes in 0x80-0xFF that don't form valid UTF-8
sequences.
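
One possible shape for such a clean-up, sketched in Python rather than
Cocoa and under a couple of assumptions: the function names
decode_pasted_bytes and strip_xml_illegal are invented for this
example, and falling back to Windows-1252 when strict UTF-8 decoding
fails is a guess about where the text came from, not a guarantee. The
character class follows the Char production of XML 1.0.

```python
import re

# Anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_ILLEGAL = re.compile(
    r"[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def decode_pasted_bytes(data: bytes) -> str:
    """Decode as strict UTF-8, falling back to Windows-1252 (an assumption)."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes that Windows-1252 leaves undefined become U+FFFD.
        return data.decode("cp1252", errors="replace")


def strip_xml_illegal(text: str) -> str:
    """Drop characters that XML 1.0 does not allow in document content."""
    return _XML_ILLEGAL.sub("", text)


def clean_for_xml(data: bytes) -> str:
    """Decode pasted bytes, then remove XML-illegal characters."""
    return strip_xml_illegal(decode_pasted_bytes(data))
```

Whether to drop the illegal characters, replace them, or escape them
in some application-specific way is a design choice; dropping them is
just the simplest thing to show here.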

http://en.wikipedia.org/wiki/UTF-8#Invalid_byte_sequences
http://en.wikipedia.org/wiki/Windows-1252

Sixten