Re: linebreak in a shell script; was Re: What's wrong with this call to zip?

  • Subject: Re: linebreak in a shell script; was Re: What's wrong with this call to zip?
  • From: Christopher Nebel <email@hidden>
  • Date: Fri, 29 Feb 2008 00:01:12 -0800

Hmm, lots of different questions.  In no particular order:

- As explained in the Leopard release notes, "ASCII number" and "ASCII character" use your primary encoding, as determined by your primary language, which may or may not be MacRoman. It's this sort of unpredictability that led us to deprecate them and replace them with "id" addressing.

- As for the "ASCII" commands sticking around: Paul, since you asked so nice, we'll rip them out for you special. Just kidding. =) They're being left in for backward compatibility; there are no immediate plans to remove them. Move to the "id" form as time and your deployment targets allow.

- The Unicode line- and paragraph-separator characters (U+2028 and U+2029) do not currently count as "paragraph" element separators. There's a bug on that. Someone certainly *could* type them in wherever, but they're not in common use. (Yet.)

- "\r\n" is only a "character" in the AppleScript sense -- that is, it's treated as a single "character" element in text. In terms of the Unicode Standard, it's still two characters, U+000D and U+000A, but it is one "grapheme cluster", which is what AppleScript now counts as a "character". (This is mentioned in the release notes, but incorrectly: they claim it counts "combining character sequences", which are a subset of grapheme clusters.) You can read Unicode Standard Annex 29 for the complete definition, but the short version is that a grapheme cluster is something a user would consider a single logical character: this includes combining sequences, Korean syllables, and yes, a CR-LF pair.
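
To make the counting distinction concrete, here is a minimal sketch in Python (not AppleScript, and not part of the original post). The helper `approx_clusters` is hypothetical and models only two of the rules in Unicode Standard Annex #29: a CR-LF pair, and a base character followed by combining marks. Korean syllables built from jamo and other cluster types are not handled, so treat it as an approximation rather than a real segmenter.

```python
import unicodedata


def approx_clusters(text):
    """Split text into approximate grapheme clusters.

    Assumption: only two UAX #29 rules are modeled here, CR followed
    by LF, and a base character followed by combining marks.
    """
    clusters = []
    i = 0
    while i < len(text):
        # Rule 1: a CR immediately followed by LF forms one cluster.
        if text[i] == "\r" and i + 1 < len(text) and text[i + 1] == "\n":
            clusters.append("\r\n")
            i += 2
            continue
        j = i + 1
        # Rule 2: combining marks (categories Mn, Mc, Me) attach to the
        # preceding character, except after a bare CR or LF.
        if text[i] not in "\r\n":
            while j < len(text) and unicodedata.category(text[j]).startswith("M"):
                j += 1
        clusters.append(text[i:j])
        i = j
    return clusters


print(len("\r\n"))                   # 2 code points: U+000D and U+000A
print(len(approx_clusters("\r\n")))  # 1 approximate cluster
```

The same string is two units by one measure and one by the other, which is the difference between the Unicode Standard's "character" and the "character" element described above.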

"text item delimiters" ignore cluster boundaries, and will happily match part of a cluster, such as the "\r" in "\r\n". I'm not immediately sure if that should be considered a bug, but in the interests of backward compatibility, I suspect the answer is "no".

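As a rough analogy only, Python's `str.split` also matches at the code-point level and knows nothing about cluster boundaries, so it shows the same kind of result as a delimiter that matches part of a CR-LF pair. This is a sketch of the general effect, not a reproduction of AppleScript's implementation; the sample string is made up for illustration.

```python
text = "one\r\ntwo"

# Splitting on CR alone cuts the CR-LF pair in half and leaves a
# stray LF at the start of the second piece.
parts = text.split("\r")
print(parts)  # ['one', '\ntwo']

# Splitting on the whole pair keeps the pieces clean.
whole = text.split("\r\n")
print(whole)  # ['one', 'two']
```

Anyone splitting on "\r" alone should expect a leftover "\n" on the following item.
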
By the way, there is a known bug in "offset" where it still reports the offset in terms of UTF-16 code units, not clusters, so the answers you get are not necessarily useful.
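
The mismatch between the two counting schemes can be sketched in Python as follows. Both helpers are hypothetical and exist only for this illustration: `utf16_offset` counts UTF-16 code units, and `crlf_cluster_offset` counts each CR-LF pair as one unit (the only multi-code-point cluster it models). Offsets are 1-based, with 0 meaning "not found", on the assumption that this mirrors the convention "offset" uses.

```python
def utf16_offset(text, needle):
    """1-based offset of needle measured in UTF-16 code units, 0 if absent."""
    idx = text.find(needle)
    if idx < 0:
        return 0
    # Each UTF-16 code unit is two bytes in the little-endian encoding.
    return len(text[:idx].encode("utf-16-le")) // 2 + 1


def crlf_cluster_offset(text, needle):
    """1-based offset counting each CR-LF pair as one unit, 0 if absent.

    Assumption: CR-LF is the only multi-code-point cluster considered.
    """
    idx = text.find(needle)
    if idx < 0:
        return 0
    return len(text[:idx].replace("\r\n", "\n")) + 1


text = "a\r\nb"
print(utf16_offset(text, "b"))         # 4
print(crlf_cluster_offset(text, "b"))  # 3
```

An offset of 4 used as a "character" index into text whose elements are clusters would point past the intended position, which is why the code-unit answer is not necessarily useful.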


--Chris Nebel
AppleScript Engineering
