Re: unicode to ascii (not MacRoman)

  • Subject: Re: unicode to ascii (not MacRoman)
  • From: Christopher Nebel <email@hidden>
  • Date: Thu, 3 Mar 2005 23:05:31 -0800

On Mar 3, 2005, at 2:40 PM, Steven D.Majewski wrote:

> I've read all of the archived messages I could find, looking for a way to coerce the Unicode text I'm getting from iTunes into plain ASCII.
>
> After none of the many methods listed worked for me, I finally figured out from Chris Nebel's message:
> http://lists.apple.com/archives/applescript-implementors/2004/Feb/msg00039.html
>
> that when folks in those threads were talking about ASCII plain text, what they *really* meant was MacRoman (or some other extended encoding) 8-bit chars.
>
> Is there any way within AppleScript (other than calling out to a Python or Perl script with 'do shell script') to coerce a Unicode string into ASCII?
>
> I have an iTunes script which gets the artist of the current selection (or playing tune) and does a lookup at our music library, but the search doesn't like (for example) the o-with-circumflex in "Antônio Carlos Jobim".

If your lookup wants "Antonio", not "Antônio", then what you're asking isn't really how to transcode Unicode into ASCII, it's how to remove (or possibly ignore) diacritical marks. Simply transcoding to ASCII would tend to get you "Ant?nio", which probably won't work either. It would help if we knew what this second system is -- it's not iTunes, is it?
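
To illustrate the difference, here is a minimal Python sketch of plain transcoding. Python is used only because the question mentions it as an external option, and the function name `transcode_to_ascii` is made up for this example. Every character with no ASCII equivalent is replaced by "?", which is where "Ant?nio" comes from.

```python
def transcode_to_ascii(text: str) -> str:
    """Transcode to ASCII, substituting '?' for anything ASCII cannot represent."""
    # errors="replace" makes the encoder emit '?' instead of raising
    # UnicodeEncodeError on characters outside the ASCII range.
    return text.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    # "\u00f4" is the precomposed o-with-circumflex.
    print(transcode_to_ascii("Ant\u00f4nio Carlos Jobim"))
```

The diacritic is not dropped on its own here; the whole accented letter is lost, so a lookup for "Antonio" would still fail.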


To answer the immediate question, there's no way to remove diacritics in stock AppleScript, though you can tell AppleScript to ignore them in text comparisons using an "ignoring diacriticals" block. In fact, I'm not sure how you'd do it at all -- I'm pretty sure there's a system API to do it somewhere, but I can't recall. The best I can suggest at the moment is to use Perl's Unicode support to transcode the string to Normalization Form D and then remove all the combining characters.
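
The normalize-then-strip idea can be sketched as follows. This uses Python's standard `unicodedata` module rather than Perl, purely as an illustration of the same technique; the function name `strip_diacritics` is hypothetical, and this is a sketch under those assumptions rather than a tested solution for the iTunes script.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose to Normalization Form D, then drop the combining characters."""
    # NFD splits a precomposed letter such as U+00F4 into a base letter
    # ("o") followed by a combining mark (U+0302, combining circumflex).
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns the canonical combining class,
    # which is 0 for base characters and nonzero for combining marks.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    print(strip_diacritics("Ant\u00f4nio Carlos Jobim"))
```

Note that this only removes marks that Unicode treats as separable. Letters with no canonical decomposition, such as "ø" or "ß", pass through unchanged, so the result is not guaranteed to be pure ASCII.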



--Chris Nebel
AppleScript Engineering
