Re: linebreak in a shell script; was Re: What's wrong with this call to zip?
- Subject: Re: linebreak in a shell script; was Re: What's wrong with this call to zip?
- From: Christopher Nebel <email@hidden>
- Date: Fri, 29 Feb 2008 00:01:12 -0800
Hmm, lots of different questions. In no particular order:
- As explained in the Leopard release notes, "ASCII number" and "ASCII
character" use your primary encoding, as determined by your primary
language, which may or may not be MacRoman. It's this sort of
unpredictability that led us to deprecate them and replace them with
"id" addressing.
- As for the "ASCII" commands sticking around: Paul, since you asked
so nice, we'll rip them out for you special. Just kidding. =)
They're being left in for backward compatibility; there are no
immediate plans to remove them. Move to the "id" form as time and
your deployment targets allow.
- The Unicode line- and paragraph-separator characters (U+2028 and
U+2029) do not currently count as "paragraph" element separators.
There's a bug on that. Someone certainly *could* type them in
wherever, but they're not in common use. (Yet.)
- "\r\n" is only a "character" in the AppleScript sense -- that is,
it's treated as a single "character" element in text. In terms of the
Unicode Standard, it's still two characters, U+000D and U+000A, but it
is one "grapheme cluster", which is what AppleScript now counts as a
"character". (This is mentioned in the release notes, but
incorrectly: they claim it counts "combining character sequences",
which are a subset of grapheme clusters.) You can read Unicode
Standard Annex #29 for the complete definition, but the short version
is that a grapheme cluster is something a user would consider a single
logical character: this includes combining sequences, Korean
syllables, and yes, a CR-LF pair.
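
To make the counting difference concrete, here is a rough sketch in Python
(not AppleScript; it only illustrates the Unicode side). The helper
rough_clusters is hypothetical and deliberately simplified: it groups only
CR-LF pairs and base characters followed by combining marks, and does not
implement the full Annex #29 rules (conjoining Korean jamo, for example,
are not handled).

```python
import unicodedata

def rough_clusters(text):
    """Group code points into approximate grapheme clusters.

    Simplified assumption: only CR-LF pairs and combining marks are
    merged. This is NOT a complete UAX #29 implementation.
    """
    clusters = []
    for ch in text:
        if clusters and clusters[-1] == "\r" and ch == "\n":
            # CR followed by LF forms a single cluster.
            clusters[-1] += ch
        elif (clusters and unicodedata.combining(ch)
              and clusters[-1][-1] not in "\r\n"):
            # A combining mark attaches to the preceding base character.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

sample = "a\r\ne\u0301"             # "a", CR, LF, "e", combining acute accent
print(len(sample))                  # 5 code points
print(len(rough_clusters(sample)))  # 3 approximate clusters
```

Counted as code points, the sample has five elements; counted as clusters,
it has three, which matches the "character" count described above.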
"text item delimiters" ignore cluster boundaries, and will happily
match part of a cluster, such as the "\r" in "\r\n". I'm not
immediately sure if that should be considered a bug, but in the
interests of backward compatibility, I suspect the answer is "no".
By the way, there is a known bug in "offset" where it still reports
the offset in terms of UTF-16 code units, not clusters, so the answer
will not match the "character" index whenever a multi-unit cluster
precedes the match.
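
Both effects can be mimicked outside AppleScript. The sketch below is
Python, offered only as an analogy: Python's str.split matches code
points and knows nothing about clusters, and utf16_offset is a
hypothetical helper that counts a 1-based position in UTF-16 code units
(returning 0 when there is no match). Neither is an AppleScript API.

```python
def utf16_offset(text, needle):
    """1-based position of needle in text, counted in UTF-16 code units.

    Returns 0 if needle is absent. Hypothetical helper for illustration.
    """
    index = text.find(needle)
    if index < 0:
        return 0
    # Each UTF-16 code unit occupies two bytes in the utf-16-le encoding.
    return len(text[:index].encode("utf-16-le")) // 2 + 1

text = "a\r\nb"

# Splitting on "\r" cuts the CR-LF pair in half, leaving the "\n" behind.
pieces = text.split("\r")
print(pieces)    # ['a', '\nb']

# "b" is the fourth UTF-16 code unit, although it would be the third
# "character" if the CR-LF pair is counted as one cluster.
position = utf16_offset(text, "b")
print(position)  # 4
```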
--Chris Nebel
AppleScript Engineering