Re: encoding of file names

  • Subject: Re: encoding of file names
  • From: Quincey Morris <email@hidden>
  • Date: Thu, 26 May 2011 23:35:48 -0700

On May 26, 2011, at 22:56, Andrew Thompson wrote:

> I believe this stems from a period in history when the Unicode group believed that they'd be able to fit all practical scripts into 65536 code points, which meant you could get away with all kinds of assumptions, like 16-bit types and UCS-2.
>
> As it became clear that wasn't going to be enough code points, the additional planes were defined and UCS-2 fell out of favor, being replaced by UTF-16, which can model the higher planes.

That would explain the parting of the ways between "code unit" and "code point", but not really the distinction between "code point" and "[Unicode] character". My memory of the days when Unicode first started to get a foothold (the early 90s IIRC) is very hazy, but I think there were actually two things going on:

-- The belief, exactly as you describe, that 65536 was enough.

-- A vagueness (or perhaps a deliberate lack of definition) about what should be called a "character".

This seems to have been resolved now, and we have this hierarchy, at least in Unicode/Apple terms:

	code unit -> code point -> character -> grapheme -> (whatever the grouping is called upon which transformations like upper and lower case are performed)
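
To make the first few levels of that hierarchy concrete, here is a minimal sketch in Python (chosen only for brevity; nothing here is Cocoa API). The helper names are hypothetical, and the last one is a deliberate approximation: it counts code points with combining class 0, which is not the full Unicode grapheme cluster algorithm.

```python
import unicodedata

def utf16_code_units(s: str) -> int:
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" avoids a BOM prefix.
    return len(s.encode("utf-16-le")) // 2

def code_points(s: str) -> int:
    # A Python str is a sequence of Unicode code points.
    return len(s)

def approx_graphemes(s: str) -> int:
    # Rough approximation only: count code points that are not
    # combining marks (canonical combining class 0).
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

clef = "\U0001D11E"      # MUSICAL SYMBOL G CLEF, outside the first 65536 code points
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9

for label, s in [("clef", clef), ("decomposed", decomposed), ("composed", composed)]:
    print(label, utf16_code_units(s), code_points(s), approx_graphemes(s))
```

The clef needs two code units for one code point (a surrogate pair), while the decomposed "é" is two code points that a reader sees as one unit. Those are the two gaps the unwary tend to fall into.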

It's not ultimately so hard, just a bit perilous for the unwary. That's the reason I've been going on about this ad nauseam. If we shine some light on it, we may help demystify it.
