re. paragraphs in unicode (was: Re: Sorting? [sorted])
- Subject: re. paragraphs in unicode (was: Re: Sorting? [sorted])
- From: has <email@hidden>
- Date: Sat, 14 Sep 2002 18:54:34 +0100
Paul Berkowitz wrote:

> You don't need the text item delimiters line.
>
>     paragraphs of (whatever as string)
>
> will work with CR, LF and CRLF line-endings.
[...]
>     paragraphs of whatever
>
> is _supposed_ to work, but in fact doesn't work for Unicode text with LF or
> CRLF line-endings (the latter gets you lines prefaced by LF, the former gets
> you the whole text as one paragraph) as discussed here recently.

Only paying attention to CRs, obviously. <sigh> Not that I really
understand Unicode or how AS interacts with it, but I'm sure there's
supposed to be something about how things should just work... :(

> We were assured it would be fixed. So you do need 'as string' for now.

But won't converting everything to ASCII strings defeat the point of
working in Unicode? Hum.
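
For anyone who wants to check how their own setup behaves, here is a minimal sketch of the comparison being discussed. The sample strings and variable names are made up for illustration and are not from the earlier messages; three paragraphs is the right answer in every case.

```applescript
-- Hypothetical sample data: three lines, first with LF then with CRLF endings.
set lfChar to ASCII character 10
set crChar to ASCII character 13
set lfText to ("foo" & lfChar & "bar" & lfChar & "baz") as Unicode text
set crlfText to ("foo" & crChar & lfChar & "bar" & crChar & lfChar & "baz") as Unicode text

-- Split the Unicode text directly, then again via the 'as string' workaround.
set lfDirect to count paragraphs of lfText
set lfViaString to count paragraphs of (lfText as string)
set crlfDirect to count paragraphs of crlfText
set crlfViaString to count paragraphs of (crlfText as string)

-- Per the report quoted above, an affected system returns the LF text as a
-- single paragraph when split directly; all four counts should be 3.
{lfDirect, lfViaString, crlfDirect, crlfViaString}
```

If the direct counts come out wrong, it is also worth looking at `paragraphs of crlfText` itself to see whether the second and third items start with a stray LF.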
Workaround time: I've pulled some of the guts from everyItemLib (itself a
workaround for the much-loved string tokenisation bug) and created a
variant of the everyParagraph() handler specifically for unicode text. I
don't really know if this'll do it or not, but please give it a spin and
see. (If it works I can add it to the library proper and release it.)
======================================================================
--mark PRIVATE STUFF<B
property _LF : ASCII character 10
property _CR : ASCII character 13
property _CRLF : _CR & _LF
property _maxBlockLength : 3600
on _adjustMaxBlockLength()
    set _maxBlockLength to ((_maxBlockLength) div 4) * 3
end _adjustMaxBlockLength
-------
on _tidConvert(unicodeText, fromList, toList)
    set oldTID to AppleScript's text item delimiters
    repeat with x from 1 to fromList's length
        set AppleScript's text item delimiters to (get fromList's item x)
        try
            set tempList to unicodeText's text items
        on error number -2706
            set tempList to specialTextItems(unicodeText)
        end try
        set AppleScript's text item delimiters to (get toList's item x)
        set unicodeText to tempList as Unicode text
    end repeat
    set AppleScript's text item delimiters to oldTID
    return unicodeText
end _tidConvert
--
on specialTextItems(unicodeText)
    try
        set textItemCount to count unicodeText's text items
        if textItemCount is less than _maxBlockLength then return unicodeText's text items
        set theList to {}
        set endLen to textItemCount mod _maxBlockLength
        repeat with eachBlock from 1 to (textItemCount - endLen) by _maxBlockLength
            set theList's end to unicodeText's text items eachBlock thru (eachBlock + _maxBlockLength - 1)
        end repeat
        if endLen is not 0 then set theList's end to unicodeText's text items -endLen thru -1
        return theList
    on error number -2706
        _adjustMaxBlockLength()
        return specialTextItems(unicodeText)
    end try
end specialTextItems
--mark -
----------------------------------------------------------------------
----------------------------------------------------------------------
--mark PUBLIC HANDLERS<B
on everyUnicodeParagraph(unicodeText)
    --convert CRLF and LF line-endings to CR before asking for paragraphs
    set unicodeText to _tidConvert(unicodeText, {_CRLF, _LF}, {_CR, _CR})
    try
        set textItemCount to count unicodeText's paragraphs
        if textItemCount is less than _maxBlockLength then return unicodeText's paragraphs
        set theList to {}
        set endLen to textItemCount mod _maxBlockLength
        repeat with eachBlock from 1 to (textItemCount - endLen) by _maxBlockLength
            set theList to theList & unicodeText's paragraphs eachBlock thru (eachBlock + _maxBlockLength - 1)
        end repeat
        if endLen is 0 then return theList
        return theList & unicodeText's paragraphs -endLen thru -1
    on error number -2706
        --shrink the block size and retry with this handler
        --(was everyParagraph, which isn't defined in this listing)
        _adjustMaxBlockLength()
        return everyUnicodeParagraph(unicodeText)
    end try
end everyUnicodeParagraph
-------
--TEST
set unicodeText to "foo" & _CRLF & "bar" & _CRLF & "baz" as Unicode text --Windows style linebreaks
everyUnicodeParagraph(unicodeText)
======================================================================
[Not that any of this nonsense should ever be necessary to begin with...]
HTH
has
--
http://www.barple.pwp.blueyonder.co.uk -- The Little Page of AppleScripts