I can't really answer your initial question; I suppose it depends on how character equality is defined, and "ª" does look like "a".
But if you want to reduce some text to its ASCII roots, I would suggest a slightly different approach.
Since the labels of an AppleScript record are hard to reference by a name built at run time, I have to do this in a cumbersome way (perhaps somebody can point me to an easier and more efficient method).
You might need to extend the table of characters to handle the missing ones (I should have a table for Windows Latin somewhere, and probably one for Mac Roman as well).
There might also be an OSAX somewhere that does this much faster, and Perl could do this very efficiently.
property myCharRecord : "{x170:\"a\", x186:\"o\", x192:\"A\", x193:\"A\", x194:\"A\", x195:\"A\", x196:\"A\", x199:\"C\", x200:\"E\", x201:\"E\", x202:\"E\", x203:\"E\", x205:\"I\", x210:\"O\", x211:\"O\", x212:\"O\", x213:\"O\", x214:\"O\", x217:\"U\", x218:\"U\", x219:\"U\", x220:\"U\", x224:\"a\", x225:\"a\", x226:\"a\", x227:\"a\", x228:\"a\", x231:\"c\", x232:\"e\", x233:\"e\", x234:\"e\", x235:\"e\", x236:\"i\", x237:\"i\", x238:\"i\", x239:\"i\", x242:\"o\", x243:\"o\", x244:\"o\", x245:\"o\", x246:\"o\", x249:\"u\", x250:\"u\", x251:\"u\", x252:\"u\"}"
set myText to "mÀbªÁz" # or anything else
set newText to text2Ascii(myText)
return newText
on text2Ascii(myText)
	set myLetters to every character in myText
	set newLetters to {}
	repeat with myLetter in myLetters
		set myNumber to (id of myLetter)
		if myNumber > 127 then
			-- Build a small script that looks up the label x<code point> in the record.
			-- If the record has no such label, fall back to the text "x<code point>".
			set myScript to "set myCharRecord to " & myCharRecord & return & ¬
				"try" & return & ¬
				"set myLetter to x" & myNumber & " of myCharRecord" & return & ¬
				"on error" & return & ¬
				"set myLetter to \"x" & myNumber & quote & return & ¬
				"end try" & return
			copy (run script myScript) to the end of newLetters
		else
			-- Plain ASCII characters are copied unchanged.
			copy contents of myLetter to the end of newLetters
		end if
	end repeat
	return newLetters as string
end text2Ascii
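For comparison, here is a rough sketch of the same idea outside AppleScript, written in Python rather than the Perl mentioned above. It is not a port of the handler: instead of a hand-made table it relies on Unicode compatibility decomposition (NFKD) from the standard unicodedata module, which splits "À" into "A" plus a combining accent and maps "ª" to "a". The function name text_to_ascii is made up for this example, and the "x" plus code point fallback simply imitates what the handler above does for characters it cannot map.

```python
import unicodedata


def text_to_ascii(text: str) -> str:
    """Reduce text to ASCII base letters (hypothetical helper, sketch only)."""
    result = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII characters are copied unchanged.
            result.append(ch)
            continue
        # NFKD turns "À" into "A" + combining grave accent and "ª" into "a".
        decomposed = unicodedata.normalize("NFKD", ch)
        # Keep only the ASCII part of the decomposition, dropping the accents.
        base = "".join(c for c in decomposed if ord(c) < 128)
        # Assumption: imitate the AppleScript fallback, "x" + decimal code point.
        result.append(base if base else "x" + str(ord(ch)))
    return "".join(result)


print(text_to_ascii("mÀbªÁz"))  # prints mAbaAz
```

Characters without a decomposition, such as "ß" or "ø", end up as "x223" and "x248", so a small extra table would still be needed for those. Assuming a Python 3 interpreter is installed, something like this could presumably be called from AppleScript with do shell script.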
On 04.12.2011 at 11:49, Bernardo Hoehl wrote:
Hi friends,
I have found a problem with this code:
considering case
"ª" = "a"
end considering
As I understand it, this should return "false".
But it returns "true".
What is the best way to detect characters like these?
I have been running a script for years, using the following code:
property forbidenList : every character of "ÁÂÄÀÃáâäàãÉÊËÈéêëèÍíîìïÓÔÖÒÕóôõöòÚÜÛÙúüûùçÇº"
property replaceList : every character of "AAAAAaaaaaEEEEeeeeIiiiiOOOOOoooooUUUUuuuucCo"
But this does not work for me when I have this special char.
Any shell script suggestions?
Thanks!
Bernardo Höhl
Rio de Janeiro