Re: Transliteration, TextCommands osax
  • Subject: Re: Transliteration, TextCommands osax
  • From: has <email@hidden>
  • Date: Sat, 22 Nov 2008 00:19:18 +0000

Sander Tekelenburg wrote:

I still don't understand something essential though: after TextCommands'
conversion, how do you write the result to a file without undoing the
transliteration? You still need to write to file as macroman, as «class
utf8», etc. So out goes the windows encoding...


-- test:
set theFile to choose file
set input to read theFile
tell application "TextCommands"
	set throughput to convert to unicode input from "mac_roman"
	set output to convert from unicode throughput to "WINDOWS-1252"
end tell
set fileRef to open for access theFile with write permission
write output to fileRef
close access fileRef

Opening the resulting file in TextWrangler, it claims it is Mac Roman. (Same
when I ask TextCommands to convert to "latin_1", as per its documentation.)

I'm guessing you're on 10.5, yes? The problem is AS 2.0's move to Unicode-only text support; TextCommands was written pre-10.5 and uses Apple event descriptors of typeChar to represent raw text data. This works in AS 1.x because it maps typeChar descriptors directly to its 'string' type, preserving the byte data as-is. Bit kludgy, but given the limitations of AppleScript at the time it was the least awful way of doing things, and it allows you to use AS's built-in text operators, reference forms and commands on that data.


AS 2.0 completely screws up this arrangement. Instead of mapping typeChar descriptors directly to 8-bit strings, it assumes the data they contain is encoded according to the host system's primary encoding (e.g. MacRoman) and converts them from your primary encoding to Unicode, which is what its new 'text' type uses. Which is fine if that's what you wanted, and with other applications it almost always is, but you're pretty much crocked if it isn't.
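
To make the failure mode concrete, here is a rough sketch in Python rather than AppleScript. It only imitates the behaviour described above: Python's codecs stand in for the Apple event machinery, and the function names are made up for illustration. The same bytes are read under two different encoding assumptions, and the wrong assumption produces different characters.

```python
# Illustrative only: Python codecs stand in for the typeChar handling
# described above. The function names are hypothetical.

def transliterate(raw: bytes, source: str, target: str) -> bytes:
    """Decode raw bytes from one encoding and re-encode them in another."""
    return raw.decode(source).encode(target)


def misread(raw: bytes, assumed: str) -> str:
    """Decode bytes under an assumed encoding, whatever they really are."""
    return raw.decode(assumed)


original = "café"
mac_bytes = original.encode("mac_roman")  # b'caf\x8e'
win_bytes = transliterate(mac_bytes, "mac_roman", "cp1252")  # b'caf\xe9'

# Treating the Windows-1252 bytes as MacRoman text gives the wrong characters.
garbled = misread(win_bytes, "mac_roman")
print(mac_bytes, win_bytes, garbled == original)
```

The point is that the bytes carry no record of their own encoding, so whichever side makes the assumption decides what characters come out.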

You can argue, of course, that the problem is really AppleScript's (which it is), and it's really for Apple to improve its built-in text handling facilities (or at least the standard 'read' and 'write' commands), allowing users precise control over the encoding and decoding of raw text data. Go file a feature request on this if it's important to you.

Obviously, TextCommands needs to be updated to represent raw text data in a format that AS 2.0 will leave intact. Two problems I have:

1. not enough free time to give all my free projects the attention they deserve

2. no formal guidelines from Apple on how to represent raw byte data.

#2 isn't a big deal - I can always make something up, though it would be nice if there were a documented standard so that all APIs that need to deal with raw data are on the same wavelength (for example, any osax or scriptable application that passes XML/HTML markup as AppleScript 'text' is doing it wrong). #1's a kicker at the moment though, unless anyone else fancies rolling up their sleeves and writing a patch themselves.

FWIW, if I were you I'd probably just write a one-line shell script to chuck your files through textutil (which has options for specifying input and/or output encodings for plain text files) and bypass AppleScript altogether.
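
The same bypass-AppleScript idea can also be sketched with Python's standard library instead of textutil. This is a substitute technique, not the textutil command itself, and it assumes the source file really is MacRoman and that a separate output file is acceptable; the function name and its defaults are made up for illustration.

```python
from pathlib import Path


def convert_file(src, dst, source_encoding="mac_roman", target_encoding="cp1252"):
    """Re-encode a plain text file, working on raw bytes at both ends.

    Hypothetical helper: the encodings default to the MacRoman to
    Windows-1252 case discussed in this thread.
    """
    raw = Path(src).read_bytes()          # no implicit decoding on read
    text = raw.decode(source_encoding)    # bytes -> Unicode
    # Raises UnicodeEncodeError if a character has no Windows-1252 equivalent.
    Path(dst).write_bytes(text.encode(target_encoding))
    return dst
```

Calling convert_file("in.txt", "out.txt") would write the re-encoded copy. Reading and writing bytes rather than text keeps the encoding and decoding steps explicit, which is exactly the control the standard 'read' and 'write' commands don't give you here.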

HTH

has
--
Control AppleScriptable applications from Python, Ruby and ObjC:
http://appscript.sourceforge.net
