Re: unicode to ascii (not MacRoman)

  • Subject: Re: unicode to ascii (not MacRoman)
  • From: "Steven D.Majewski" <email@hidden>
  • Date: Fri, 4 Mar 2005 11:12:05 -0500


On Mar 3, 2005, at 7:29 PM, Emmanuel wrote:

> A reduced question could be: is there a way to coerce MacRoman to ASCII? Even with this more limited scope, I don't know an answer.



It looks like 'ascii number of character' gives me the MacRoman encoding.
I could probably make a translation table for 'do shell script "tr ..." '
that would map MacRoman or Latin-1 to ASCII.
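
As an alternative to a hand-built 'tr' table, here is a minimal sketch of the same mapping in Python. It assumes the input bytes really are MacRoman; the function names are made up for illustration, and this is not something the script currently does.

```python
import unicodedata

def fold_to_ascii(text: str) -> str:
    """Strip accents: decompose each character (NFKD), then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Characters with no decomposition (e.g. "ø") are still non-ASCII here;
    # they are replaced with "?" rather than guessed at.
    return stripped.encode("ascii", "replace").decode("ascii")

def macroman_to_ascii(raw: bytes) -> str:
    """Decode MacRoman bytes (assumed encoding), then fold to plain ASCII."""
    return fold_to_ascii(raw.decode("mac_roman"))

# 0x99 is o-circumflex in MacRoman.
print(macroman_to_ascii(b"Ant\x99nio Carlos Jobim"))  # Antonio Carlos Jobim
```

Decomposition covers the accented Latin letters without listing each one, which is the main advantage over a fixed table.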


I gather from other notes that whether it's MacRoman or some other encoding
depends on some other settings -- but I don't see any default encoding in
the International System Prefs.


It looks like whatever goes out through 'do shell script' is sent as UTF-8.
Is this a consistent rule ( and thus independent of Terminal.app's encoding
preference settings ) ? [ That makes plan #1 above awkward: I may have to
write MacRoman out to a file first. ]
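
To make the mismatch concrete, here is a small Python sketch showing the bytes for the same character in each encoding, and what happens if UTF-8 bytes are read by something expecting MacRoman. The helper name is hypothetical; whether 'do shell script' always emits UTF-8 is still the open question above.

```python
def encodings_of(ch: str) -> dict:
    """Return the hex bytes of one character in three encodings."""
    return {name: ch.encode(name).hex() for name in ("mac_roman", "latin-1", "utf-8")}

print(encodings_of("ô"))
# {'mac_roman': '99', 'latin-1': 'f4', 'utf-8': 'c3b4'}

# A byte-for-byte 'tr' table keyed on MacRoman would not match UTF-8 input:
# the two UTF-8 bytes decode as two unrelated MacRoman characters.
print("ô".encode("utf-8").decode("mac_roman"))  # √¥
```

A single-byte 'tr' table only works if the text reaching the shell is in a single-byte encoding, hence the idea of writing MacRoman to a file first.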



> For me, it's your search that requires fixing, not the artist's name. I'm not an iTunes specialist, but how come the search doesn't like "ô"? I scripted iTunes long ago, and I can't remember such problems.
> I mean, anyway, we are searching Unicode in XML; it's supposed to work flawlessly.


I'm not doing a search within iTunes -- I'm getting the artist from iTunes and
constructing a URL query string in AppleScript to query a library database.


While trying some more experiments, I discovered that some URLs with
accented characters actually did work. However, some don't, so it would
seem to be more reliable to map all the queries to unaccented characters.
( In this case, it's a lot better for the end user to get some false matches
they can ignore than to wrongly get no matches -- they're not likely to get
more than a dozen hits even with false positives included. )
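
A sketch of that query construction in Python, folding the artist name to unaccented ASCII before percent-encoding it. The base URL and the 'author' parameter name are placeholders, not the real library catalog's interface.

```python
import unicodedata
from urllib.parse import urlencode

BASE_URL = "http://catalog.example.edu/search"  # hypothetical endpoint

def fold_to_ascii(text: str) -> str:
    """Drop accents via NFKD decomposition; unmappable characters become '?'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", "replace").decode("ascii")

def build_query(artist: str) -> str:
    """Build a search URL from an unaccented artist name ('author' is assumed)."""
    return BASE_URL + "?" + urlencode({"author": fold_to_ascii(artist)})

print(build_query("Antônio Carlos Jobim"))
# http://catalog.example.edu/search?author=Antonio+Carlos+Jobim

# Without folding, the same name percent-encodes differently per encoding:
print(urlencode({"author": "Antônio"}))                      # author=Ant%C3%B4nio
print(urlencode({"author": "Antônio"}, encoding="latin-1"))  # author=Ant%F4nio
```

The last two lines show why the accented form is fragile: the query only matches if the server decodes it with the encoding the client used, whereas the folded form is the same in every encoding.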


Although, I could be tripping over CDDB cataloging errors: Does anyone happen
to know the canonical spelling/encoding for Antonio Carlos Jobim ?


Looking his name up on gracenote.com, there is usually an o-circumflex:
Antônio Carlos Jobim
but when I google for "Jobim", none of the top hits use that spelling,
though some of those pages use "João Gilberto". ( That name, Maria Bethânia
and a couple of other Joãos are some of the searches that work. )


Is gracenote/CDDB wrong perhaps ?
( And I just grabbed a bad test case because it was the first name in my
  playlist with international chars! )


-- Steve Majewski - University of Virginia Alderman Library
