------------------------------------------------------------------------------------------------
# Change Handler -- Satimage.osax MODIFIED 2010-09-25 : 02:44
------------------------------------------------------------------------------------------------
on cng(findText, changeText, textSource)
change findText into changeText in textSource ¬
with regexp without case sensitive
end cng
------------------------------------------------------------------------------------------------
set theFile to alias "Thor:Users:chris:Downloads:test_strip.txt"
set theText to paragraphs of (read theFile)
set newText to {}
repeat with i in theText
set end of newText to cng("^[\\t ]+", "", i)
end repeat
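set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set newText to newText as text
set AppleScript's text item delimiters to oldDelims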
------------------------------------------------------------------------------------------------
Note that I'd never loop through 5000 paragraphs unless there was a very good reason.
The Satimage.osax will operate on literal text, and it will also take a file as input to edit in place.
------------------------------------------------------------------------------------------------
# Change Handler -- Satimage.osax MODIFIED 2010-09-25 : 12:44
------------------------------------------------------------------------------------------------
on cng(findText, changeText, textSource)
change findText into changeText in textSource ¬
with regexp without case sensitive
end cng
------------------------------------------------------------------------------------------------
This edits the 5000-line file in place in 0.025 seconds:
set theFile to alias "Thor:Users:chris:Downloads:test_strip.txt"
cng("^[\\t ]+", "", theFile)
------------------------------------------------------------------------------------------------
This runs in just about the same time:
set theFile to alias "Thor:Users:chris:Downloads:test_strip.txt"
set theFileText to read theFile
set theFileText to cng("^[\\t ]+", "", theFileText)
------------------------------------------------------------------------------------------------
Here's your basic handler, building on the 'cng' handler.
on trimLeadingWhiteSpace(inputSource)
cng("^[\\t ]+", "", inputSource)
return result
end trimLeadingWhiteSpace
------------------------------------------------------------------------------------------------
Mark's shell scripts work just fine on the whole literal text, although I had to adjust his sed syntax a little bit. (Again using the 5000-line test file.)
on stripLeadingWhiteSpaceRuby(str)
do shell script "ruby -pe '$_.lstrip!' <<<" & (quoted form of str)
end stripLeadingWhiteSpaceRuby
on stripLeadingWhiteSpacePerl(str)
do shell script "perl -pe 's/^[ \\t]+//' <<<" & (quoted form of str)
end stripLeadingWhiteSpacePerl
on stripLeadingWhiteSpaceSed(str)
do shell script "sed -E 's/^[[:blank:]]+//' <<<" & (quoted form of str)
end stripLeadingWhiteSpaceSed
set str to read alias "Thor:Users:chris:Downloads:test_strip.txt"
stripLeadingWhiteSpaceRuby(str) --> 0.045
stripLeadingWhiteSpacePerl(str) --> 0.045
stripLeadingWhiteSpaceSed(str) --> 0.043
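A caveat worth checking when adapting the sed version, sketched in plain shell so it can be tried outside AppleScript. With sed, `-n` combined with the `p` flag prints only the lines where a substitution happened, so lines that had no leading blanks are dropped; leaving both off prints every line. Similarly, Perl's `\s` matches the newline itself, so `s/^\s+//` removes blank lines entirely. The function names and sample text below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper names; the sample text is invented for illustration.

# Prints only the lines where a substitution happened (-n plus the p flag),
# so lines that had no leading blanks are dropped from the output.
strip_only_changed() {
    sed -nE 's/^[[:blank:]]+//p'
}

# Prints every line, with leading spaces and tabs removed where present.
strip_keep_all() {
    sed -E 's/^[[:blank:]]+//'
}

# Three lines: indented with spaces, indented with a tab, not indented.
sample() {
    printf '  two spaces\n\tone tab\nno indent\n'
}

sample | strip_only_changed   # "no indent" is missing from this output
sample | strip_keep_all       # all three lines survive
```

If every line of the test file starts with whitespace, the two variants give the same result, which is why the difference is easy to miss.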
------------------------------------------------------------------------------------------------
Note that I'm considering whitespace to be spaces and tabs and ignoring linefeeds and carriage returns.
Easy enough to strip out empty lines as well.
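
One possible way to strip the empty lines too, as a minimal shell sketch rather than a tested handler. It assumes LF line endings and treats a line as empty if nothing is left after the leading spaces and tabs are removed; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; assumes LF line endings and that "empty" means
# nothing is left on the line after trimming leading spaces and tabs.
strip_blanks_and_empty_lines() {
    # First expression trims leading blanks; second deletes lines that
    # are empty after trimming.
    sed -E -e 's/^[[:blank:]]+//' -e '/^$/d'
}

# Sample input: an indented line, an empty line, a tab-only line, a plain line.
printf '  alpha\n\n\t\nbeta\n' | strip_blanks_and_empty_lines
```

The same pipeline could be dropped into a `do shell script` call in the style of the handlers above.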
--
Best Regards,
Chris