Re: iso-8859-1 over UTF8 (was: Re: cString deprecated!)
  • Subject: Re: iso-8859-1 over UTF8 (was: Re: cString deprecated!)
  • From: Malte Tancred <email@hidden>
  • Date: Thu, 5 Sep 2002 09:59:12 +0200

On Wednesday, Sep 4, 2002, at 15:09 Europe/Stockholm, Clark S. Cox III wrote:
> No you wouldn't. There is no way that any byte in a multi-byte UTF-8
> character could be confused for an ASCII character, because they always have
> the high bit set. For instance, there is no way that you can make a
> multi-byte UTF-8 character that looks like "%d".

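I believe there is something called "overlong representation". For example, a slash (/, 0x2F) can be written as a 2-byte UTF-8 sequence (0xC0 0xAF) or a 3-byte one (0xE0 0x80 0xAF). Every byte in those sequences has the high bit set, yet they decode to a plain ASCII character. The encoding algorithm per se allows this.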
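To make the overlong idea concrete, here is a minimal sketch of how such a sequence comes about. The helper name `encode_fixed_width` is made up for this illustration and is not part of any library; it simply packs a code point into the requested number of bytes without checking whether a shorter form exists, which is exactly what a conforming encoder must not do.

```python
def encode_fixed_width(code_point: int, width: int) -> bytes:
    """Pack code_point into a UTF-8-style sequence of exactly `width` bytes.

    Hypothetical helper for illustration only: it deliberately skips the
    shortest-form check, so it can produce overlong sequences.
    """
    if width == 1:
        if code_point > 0x7F:
            raise ValueError("code point does not fit in 1 byte")
        return bytes([code_point])

    lead_prefix = {2: 0xC0, 3: 0xE0, 4: 0xF0}[width]   # 110xxxxx, 1110xxxx, 11110xxx
    payload_bits = {2: 11, 3: 16, 4: 21}[width]
    if code_point >= 1 << payload_bits:
        raise ValueError("code point does not fit in the requested width")

    out = []
    for _ in range(width - 1):
        out.append(0x80 | (code_point & 0x3F))          # continuation byte 10xxxxxx
        code_point >>= 6
    out.append(lead_prefix | code_point)                # lead byte carries the top bits
    return bytes(reversed(out))


slash = ord("/")                                        # 0x2F
print(encode_fixed_width(slash, 1).hex())               # 2f     (the only valid form)
print(encode_fixed_width(slash, 2).hex())               # c0af   (overlong)
print(encode_fixed_width(slash, 3).hex())               # e080af (overlong)
```

Note that none of the bytes in the overlong forms is below 0x80, so a byte-level search for "/" finds nothing in them.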

This is forbidden, though; I think the restriction was published in an amendment to the original spec.

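From what I've read, that's why it's important that any UTF-8 decoder rejects overlong sequences: otherwise a check done on the raw bytes (for a slash, say) can be bypassed by a sequence that only turns into that character after decoding.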
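A rough sketch of the difference between a decoder that tolerates overlong forms and one that rejects them. The function `decode_utf8` and its `allow_overlong` flag are invented for this example; the sample payload is likewise made up. It is a simplified decoder (it does not, for instance, reject surrogates), meant only to show where the shortest-form check sits.

```python
# Smallest code point that legitimately needs each sequence width.
MIN_CODE_POINT = {1: 0x00, 2: 0x80, 3: 0x800, 4: 0x10000}


def decode_utf8(data: bytes, allow_overlong: bool = False) -> str:
    """Decode UTF-8, optionally tolerating overlong forms (unsafe, for demonstration)."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            width, cp = 1, lead
        elif lead & 0xE0 == 0xC0:
            width, cp = 2, lead & 0x1F
        elif lead & 0xF0 == 0xE0:
            width, cp = 3, lead & 0x0F
        elif lead & 0xF8 == 0xF0:
            width, cp = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte at offset {i}")

        tail = data[i + 1 : i + width]
        if len(tail) != width - 1 or any(b & 0xC0 != 0x80 for b in tail):
            raise ValueError(f"truncated or malformed sequence at offset {i}")
        for b in tail:
            cp = (cp << 6) | (b & 0x3F)

        # The shortest-form check: a value that fits in fewer bytes is overlong.
        if not allow_overlong and cp < MIN_CODE_POINT[width]:
            raise ValueError(f"overlong sequence at offset {i}")

        chars.append(chr(cp))
        i += width
    return "".join(chars)


# Made-up payload: every "/" is written as the overlong pair C0 AF.
payload = b"..\xc0\xaf..\xc0\xafetc\xc0\xafpasswd"

print(b"/" in payload)                            # False: a byte-level filter sees no slash
print(decode_utf8(payload, allow_overlong=True))  # ../../etc/passwd
try:
    decode_utf8(payload)
except ValueError as err:
    print("rejected:", err)                       # rejected: overlong sequence at offset 2
```

Python's built-in `bytes.decode("utf-8")` also refuses this payload with a `UnicodeDecodeError`, so in practice the strict path is usually the library default rather than something to hand-roll.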

A document I've found interesting is http://www.cl.cam.ac.uk/~mgk25/unicode.html .

Anyway, I don't know if this applies to the discussion, but it seems to me that if a tool knows nothing about UTF-8, and there is a chance it will handle data that someone else has encoded as UTF-8, the caller should take some precautions.

Perhaps not in all cases, but for Internet server programs that could compromise the security of the server if taken over by a malicious user, it might be a good idea?

Cheerio,
Malte