Re: Bad Characters from Unicode
  • Subject: Re: Bad Characters from Unicode
  • From: Luther Fuller <email@hidden>
  • Date: Tue, 2 Oct 2007 13:32:35 -0500

On Oct 2, 2007, at 12:30 PM, has wrote:
1. If you don't already have some understanding of character sets, encodings, the distinction between 'bytes' and 'characters', etc. then I recommend brushing up your knowledge first as you won't get far otherwise. Here's a good place to start:

http://www.joelonsoftware.com/articles/Unicode.html

I've been looking for good articles for the last couple of days. I had just sat down at my computer to search some more when this message popped up. I've downloaded the article and will read it soon. Thanks.
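
A minimal Python sketch of the bytes-versus-characters distinction mentioned in point 1. The sample string and variable names are illustrative and not from the thread, and MacRoman is only assumed here as a typical legacy Mac encoding:

```python
# One piece of text, viewed as characters and as encoded bytes.
text = "caf\u00e9"  # "café", with é as the single code point U+00E9

# Characters: the string holds 4 code points.
char_count = len(text)

# Bytes: the same 4 characters occupy a different number of bytes
# depending on the encoding chosen.
utf8 = text.encode("utf-8")       # é becomes 2 bytes: C3 A9
utf16 = text.encode("utf-16-be")  # every character here becomes 2 bytes
mac = text.encode("mac_roman")    # é becomes 1 byte: 8E

print(char_count, len(utf8), len(utf16), len(mac))  # 4 5 8 4

# Reading bytes with the wrong encoding is one common source of
# "bad characters": the UTF-8 bytes C3 A9 are two separate
# characters when interpreted as MacRoman.
garbled = utf8.decode("mac_roman")
print(garbled)  # caf√©
```

The text itself never changes; only the byte representation does, which is why a program needs to know the encoding before it can interpret the bytes.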


2. AppleScript's Unicode support is (a) crap and (b) partly broken. In particular, its list-to-Unicode-text coercion is buggy and may produce truncated text. If you must do text processing in AppleScript, use TextCommands or a Unicode-aware scriptable text editor (e.g. TextWrangler, Smile) to do the actual work. For example, TextCommands has commands for splitting and joining text, changing case, and finding and replacing.

Just moments ago, I searched the AppleScript documentation for "UTF" and found nothing. I also can find nothing, absolutely nothing, about exactly how Unicode is coerced to (extended ASCII) text, although I did find a mention that it is done.
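
The thread does not say how AppleScript performs this coercion, so the following is not a description of it. It is only a Python sketch of the problem any Unicode-to-8-bit conversion has to face, assuming MacRoman as the target encoding. The helper names `fits` and `to_legacy` are hypothetical:

```python
def fits(ch, encoding):
    """Return True if a single character can be represented in `encoding`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def to_legacy(text, encoding="mac_roman"):
    """Encode `text` to an 8-bit encoding (assumed MacRoman by default).

    Characters with no equivalent in the target encoding are replaced
    by '?' and also returned in a list so the loss is visible.
    """
    lost = [ch for ch in text if not fits(ch, encoding)]
    data = text.encode(encoding, errors="replace")
    return data, lost


# "naïve" fits in MacRoman; the two Japanese characters do not.
data, lost = to_legacy("na\u00efve \u65e5\u672c")
print(data)  # b'na\x95ve ??'
print(lost)  # ['日', '本']
```

An 8-bit encoding has room for at most 256 characters, so any such conversion must substitute, drop, or reject everything else. Which of those a given tool does is exactly the detail that needs documenting.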


3. If you need to convert text between different encodings, use the appropriate tools - e.g. the command-line textutil utility, TextCommands' 'convert from unicode', 'convert to unicode' and 'stringify' commands, etc.
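
For comparison, converting between encodings in Python is a decode followed by an encode. This is a sketch only: the function name `convert_encoding` and the sample bytes are made up for illustration, and it does not reproduce what textutil or TextCommands do internally:

```python
def convert_encoding(data: bytes, source: str, target: str) -> bytes:
    """Re-encode `data` from the `source` encoding to the `target` encoding.

    Decoding turns bytes into characters; encoding turns the characters
    back into bytes. Both steps raise an error if a byte sequence or a
    character is invalid for the encoding involved.
    """
    return data.decode(source).encode(target)


# "café" as MacRoman bytes (é is the single byte 0x8E) ...
mac_bytes = b"caf\x8e"

# ... converted to UTF-8 (é becomes the two bytes 0xC3 0xA9).
utf8_bytes = convert_encoding(mac_bytes, "mac_roman", "utf-8")
print(utf8_bytes)  # b'caf\xc3\xa9'
```

The conversion is only correct if the source encoding is known; guessing it wrong produces valid-looking but garbled text rather than an error.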

BTW, if you do a lot of text processing or regularly deal with non-ASCII encodings, then unless you're a sadomasochist I'd recommend finding yourself a better language than AppleScript. e.g. Python's Unicode

My text processing is mostly confined to file names and comments.
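
As a rough illustration of what the operations named in point 2 (splitting, joining, changing case, find and replace) look like on a file name in Python, here is a short sketch. The file name and variable names are invented for the example:

```python
# A file name containing non-ASCII characters: "Résumé – Zürich.txt"
name = "R\u00e9sum\u00e9 \u2013 Z\u00fcrich.txt"

# Separate the extension from the rest of the name.
stem, dot, ext = name.rpartition(".")

# Split on spaces, then join the pieces with underscores.
words = stem.split(" ")
joined = "_".join(words)

# Case changes are Unicode-aware: é becomes É and ü becomes Ü.
upper = stem.upper()

# Find and replace: swap the en dash for a plain hyphen.
replaced = name.replace("\u2013", "-")

print(words)     # ['Résumé', '–', 'Zürich']
print(joined)    # Résumé_–_Zürich
print(upper)     # RÉSUMÉ – ZÜRICH
print(replaced)  # Résumé - Zürich.txt
```

All four operations work on characters rather than bytes, so accented letters are handled the same way as ASCII ones.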

References: 
 >Re: Bad Characters from Unicode (From: has <email@hidden>)
