Re: en-dash and em-dash

  • Subject: Re: en-dash and em-dash
  • From: Deivy Petrescu <email@hidden>
  • Date: Mon, 20 Jun 2016 22:16:30 -0400

Well, I think I know what is going on, but I don’t understand why it happens.
I have the same kind of issue with pages in Portuguese (Spanish, French, Romanian, etc. will have the same problem).
I checked whether my clipboard was correct, that is, whether it contained the accented characters, and it did.
But writing it to a file, no matter how, garbles the characters. It does not matter which encoding I use.
When I open the text in BBEdit, these characters show up as gremlins.
I changed all the possible Portuguese cases to HTML entities, e.g. á to &aacute;, and now it is fine.

So the problem happens as we write the text to a file.
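
A guess at the mechanism, which nobody in this thread has confirmed: the "Ð" in the quoted output below is what you get if the en dash is written as a single MacRoman byte and the file is later read as Latin-1. The Python sketch below only reproduces that byte-level effect. The title string is copied from the quoted message; the choice of MacRoman for writing and Latin-1 for reading is an assumption, not something measured on the actual files.

```python
# Illustration only: reproduces the garbling at the byte level.
# Assumption: the file is written as MacRoman and read back as Latin-1.

title = "PEG.js \u2013 Parser Generator for JavaScript"  # U+2013 is the en dash

# In MacRoman the en dash is the single byte 0xD0.
mac_bytes = title.encode("mac_roman")

# Latin-1 maps byte 0xD0 to the letter Ð (capital eth).
garbled = mac_bytes.decode("latin-1")
print(garbled)  # PEG.js Ð Parser Generator for JavaScript

# An accented letter such as á becomes byte 0x87 in MacRoman, which is a
# control code in Latin-1, so an editor may flag it as an invisible character.
accent_byte = "\u00e1".encode("mac_roman")
print(accent_byte)  # b'\x87'

# For comparison, UTF-8 written and read as UTF-8 survives unchanged.
round_trip = title.encode("utf-8").decode("utf-8")
print(round_trip == title)  # True
```

If this guess is right, the text in the script is fine and only the bytes on disk, or the encoding the reader assumes, are wrong.
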
> On Jun 20, 2016, at 19:32 , Mitchell L Model <email@hidden> wrote:
>
> I let the dashes distract me. They are just the first strange characters I ran into. The problem is general.
>
> Consider this page. Its tab reads
>
> 	PEG.js – Parser Generator for JavaScript
>
> both in Safari and in Script Debugger’s inspection of that tab’s name.
>
> With each of the attempts described below the output file contains
>
> 	PEG.js Ð Parser Generator for JavaScript
>
> Or for even more fun this page, whose tab reads
>
> 	Examples overview — PyObjC — the Python ⟷ Objective-C bridge
>
> but gets written out as
>
> 		Examples overview — PyObjC — the Python ⟷ Objective-C bridge
>
>
> (1)	I append the name of the tab by concatenating to an existing string: “the name of theTab”
> (2)	I append the name of the tab by concatenating to an existing string: “the name of theTab
> 	as «class utf8»”
> (3)	I write the file with “write theText to theFile”
> (4)	I write the file with “write theText to theFile as «class utf8»”
> (5)	I write the file with this handler extracted from Christopher’s code
> (6)	In desperation I thought it might be the initial string to which other text was appended
> 	that had to be UTF8, rather than the way the file was written, so I initialized the string
> 	with ‘set string to “This is the title of the file” as «class utf8»’. (I never encountered «class utf8»
> 	before, and I don’t know how I was supposed to know about it, so I don’t know whether
> 	there is a UTF8 string in AppleScript.)
>
> I can’t make sense of this. I know there are encoding issues. I know how to manage them in other languages, such as Python. In all my years of AppleScript coding I never noticed characters with codes higher than MacRoman’s 256 in files written with “write theString to theFile”, but then again I have only very occasionally used AppleScript to write files, so the weird characters might have been there without my noticing.
>
> Here is Christopher’s handler I referred to:
>
>> On Jun 19, 2016, at 8:17 PM, Christopher Stone <email@hidden> wrote:
>>
>> on writeUTF8(_text, _file)
>>   try
>>      -- Expand a leading "~/" to the user's home folder
>>      if _file starts with "~/" then
>>         set _file to POSIX path of (path to home folder as text) & text 3 thru -1 of _file
>>      end if
>>      set fRef to open for access _file with write permission
>>      set eof of fRef to 0 -- truncate any existing contents
>>      write _text to fRef as «class utf8»
>>      close access fRef
>>   on error e number n
>>      -- Make sure the file is closed, then report the original error
>>      try
>>         close access fRef
>>      end try
>>      error "Error in writeUTF8() handler!" & return & return & e number n
>>   end try
>> end writeUTF8
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> AppleScript-Users mailing list      (email@hidden)
> Help/Unsubscribe/Update your Subscription:
> Archives: http://lists.apple.com/archives/applescript-users
>
> This email sent to email@hidden

Deivy Petrescu
email@hidden



