Re: getting chunks of text
- Subject: Re: getting chunks of text
- From: Barry Wainwright <email@hidden>
- Date: Thu, 17 Mar 2005 11:58:05 +0000
On 17/3/05 9:44 am, "Andrew Oliver" <email@hidden> wrote:
> On 3/16/05 7:40 PM, "Scott Haneda" <email@hidden> wrote:
>
>> Given a string of length unknown, I need to chop it up to less than 256
>> characters, however, I do not want to chop a word in half, so scan backwards
>> to the first space before 256 chars. Not too worried about what would happen
>> when running into punctuation and such.
>>
>> I need a result set that I can repeat through, thanks.
>
> Here's one approach.
>
> It works by progressing through the text items of the source text until word
> n+1 trips the 256 character limit at which point it saves state and starts a
> new 'chunk'.
>
> It's not fast, and there are several optimization techniques I can think of
> trying, but if you're dealing with SMS messages I'm guessing you're not going
> to be processing novels through this thing.
>
> Andrew
> :)
>
> set theText to "your text here"
>
> set {oldDelims, my text item delimiters} to {my text item delimiters, " "}
> set _words to text items of theText
> set chunks to {}
> set currentChunk to ""
> repeat with eachword in _words
> if (length of (currentChunk & eachword)) > 256 then
> copy currentChunk to end of chunks
> set currentChunk to eachword
> else
> set currentChunk to currentChunk & " " & eachword
> end if
> end repeat
> if currentChunk is not "" then copy currentChunk to end of chunks
> set my text item delimiters to oldDelims
> return chunks
>
Using 'words' will not work well with many non-alphabetic characters.
Here's a routine that works a little faster. The code will pass you back a
list of strings, all less than the length declared, and broken at spaces in
the text. The space at the breakpoint is not passed in either the chunk
before the break or the one after - it's deliberately lost.
This URL should load the script into Script Editor. If it doesn't,
copy/paste from below, but watch for line wraps:
http://tinyurl.com/59apa
-- Set up some variables
set sliceLength to 20 -- used for testing, adjust to 255 for SMS
set messageText to "this is some text that needs to be chopped up into smaller chunks" -- get the text any way you like - this used for testing
-- This block is the main code that collates the text chunks into a list of
-- strings broken at spaces, all less than 'sliceLength'
set outputStrings to {}
repeat until (count messageText) < sliceLength
    set {firstChunk, messageText} to sliceAndDice(messageText, sliceLength)
    copy firstChunk to end of outputStrings
end repeat
copy messageText to end of outputStrings
-- and this is the main routine that does all the work
on sliceAndDice(theText, cutLength)
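    set oldDelims to AppleScript's text item delimiters -- remember the caller's delimiters
    set chunk1 to text 1 thru cutLength of theText
    set AppleScript's text item delimiters to {" "}
    set fragmentSize to count last text item of chunk1 -- length of the partial word at the end
    set AppleScript's text item delimiters to oldDelims -- restore before returning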
    return {text 1 thru (cutLength - fragmentSize - 1) of theText, text (cutLength - fragmentSize + 1) thru -1 of theText}
end sliceAndDice
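
One caveat with the routine above: as far as I can tell, if the first 'cutLength' characters contain no space at all (a very long word or a URL, say), fragmentSize equals cutLength, both returned ranges come out as 'text 1 thru -1', and the calling loop never finishes. Below is a minimal sketch of a guarded variant. The handler names chunkText and safeSlice are hypothetical (they are not in the script above), the sketch assumes the slice length is at least 2, and the hard mid-word cut is just one possible fallback.

```applescript
-- Hypothetical wrapper: returns a list of chunks, each shorter than sliceLength.
-- Assumes sliceLength is at least 2.
on chunkText(messageText, sliceLength)
    set outputStrings to {}
    repeat until (count messageText) < sliceLength
        set {firstChunk, messageText} to safeSlice(messageText, sliceLength)
        copy firstChunk to end of outputStrings
    end repeat
    -- skip an empty remainder (happens when the text ended exactly at a space)
    if messageText is not "" then copy messageText to end of outputStrings
    return outputStrings
end chunkText

-- Hypothetical variant of sliceAndDice with a fallback for slices
-- that have no space after their first character.
on safeSlice(theText, cutLength)
    set oldDelims to AppleScript's text item delimiters
    set chunk1 to text 1 thru cutLength of theText
    set AppleScript's text item delimiters to {" "}
    set fragmentSize to count last text item of chunk1
    set AppleScript's text item delimiters to oldDelims
    if fragmentSize >= cutLength - 1 then
        -- nowhere sensible to break: cut mid-word so the loop still advances
        return {text 1 thru (cutLength - 1) of theText, text cutLength thru -1 of theText}
    end if
    set restStart to cutLength - fragmentSize + 1
    if restStart > (count theText) then
        set theRest to "" -- the break space was the last character of the text
    else
        set theRest to text restStart thru -1 of theText
    end if
    return {text 1 thru (cutLength - fragmentSize - 1) of theText, theRest}
end safeSlice

chunkText("this is some text that needs to be chopped up into smaller chunks", 20)
-- result: {"this is some text", "that needs to be", "chopped up into", "smaller chunks"}
```

For ordinary text the result is the same as the original routine; only the no-space and trailing-space cases take the extra branches.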
--
Barry