Re: Search and replace with Unix commands
- Subject: Re: Search and replace with Unix commands
- From: Bruce Brown <email@hidden>
- Date: Wed, 04 Aug 2010 20:32:42 -0700
Hi Mark,
I'd first manually inspect the input file to determine what the line
endings are now. These are the three most-likely possibilities:
o LF only (Unix and some Mac OS X files)
o CR only (Traditional Mac OS; i.e., pre-OSX Mac OS files, and
some Mac OS X files)
o CR LF (MS-DOS files and MS Windows files)
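If you would rather not inspect the file by eye, one rough way to check is to count the CR and LF bytes. The sketch below is only an illustration; the function name count_eol is made up for this example, and it assumes the file's path is passed as the first argument.

```shell
# Hypothetical helper: print how many CR and LF bytes a file contains.
# Usage: count_eol /path/to/file
count_eol() {
    # 'tr -cd X' deletes every byte that is NOT X, leaving only X to count.
    cr=$(LC_ALL=C tr -cd '\015' < "$1" | wc -c | tr -d ' ')
    lf=$(LC_ALL=C tr -cd '\012' < "$1" | wc -c | tr -d ' ')
    echo "CR=$cr LF=$lf"
}
```

Roughly: CR=0 with some LFs suggests LF-only endings; LF=0 with some CRs suggests CR-only endings; equal, non-zero counts suggest CR LF. This is a heuristic, so a look at the raw bytes with 'od -c' is still worthwhile for a file with mixed endings.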
Then, at the overview level, your script would have to do the
following:
1. Normalize the line endings, to put them into standard (Unix) form.
2. Process the text. (That is, perform your search and replace
operations.)
3. Restore the line endings as they were originally. (This step might
not be needed, depending on the ultimate use of your output file.)
Step 1 above would be as follows, depending on what kind of line
endings the input file has:
o If all of the input file lines end in LF only, do nothing. (In
other words, skip step 1.)
o If all of the input file lines end in CR only, replace each CR
character with an LF character.
o If all of the input file lines end in CR followed by LF, delete
each CR character.
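As a rough sketch, the two non-trivial cases of Step 1 could be written with the standard 'tr' command as below. The function names are made up for this example; each one reads from standard input and writes to standard output.

```shell
# CR only -> LF: translate each CR (octal 015) into an LF (octal 012).
cr_to_lf() {
    tr '\015' '\012'
}

# CR LF -> LF: delete each CR, leaving the LF that follows it.
crlf_to_lf() {
    tr -d '\015'
}

# Example use (file names are placeholders):
#   crlf_to_lf < inputfile > normalizedfile
```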
Step 3 above would simply reverse the actions performed by step 1:
o If all of the original (input file) lines ended in LF only, do
nothing. (In other words, skip step 3.)
o If all of the original lines ended in CR only, replace each LF
character with a CR character.
o If all of the original lines ended in CR followed by LF, insert
a CR character immediately before each LF character.
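The reverse direction might look something like the sketch below. Again the function names are made up for this example. The CR LF case uses awk rather than 'tr', because 'tr' can only translate or delete characters and has no way to insert one.

```shell
# LF -> CR only: translate each LF (octal 012) into a CR (octal 015).
lf_to_cr() {
    tr '\012' '\015'
}

# LF -> CR LF: print each line followed by CR LF.
# Note: awk will also terminate a final line that had no trailing LF.
lf_to_crlf() {
    awk '{ printf "%s\r\n", $0 }'
}

# Example use (file names are placeholders):
#   lf_to_crlf < processedfile > outputfile
```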
For Step 2 above, the text-processing step, with the input file now
in standard Unix line-ending form, you would then be able to use
standard Unix commands (grep for searching; sed or awk for search and
replace) and standard Unix text-processing techniques to do what you
need to do.
Note that saying, "The string 'John' immediately follows an LF
character," as you wrote below, is just another way of saying, "The
string 'John' starts at the beginning of the line it is on." This
means that you can use the '^' meta-character in a regular expression
to mean, "Match the string 'John' only if it is found at the
beginning of a line."
Unix commands like grep know how to interpret the ^ character to
mean, "Match, starting at the beginning of a line." (The
corresponding character for matching a string at the end of a line is
'$'.)
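To make the '^' anchor concrete, here is a small sketch using sed and grep on input that already has LF line endings. The replacement string 'Jane' and the function names are placeholders invented for this example; substitute whatever you actually need.

```shell
# Replace 'John' with 'Jane', but only where 'John' starts a line.
replace_leading_john() {
    sed 's/^John/Jane/'
}

# Count the lines that start with 'John'.
count_leading_john() {
    grep -c '^John'
}

# Example use (file names are placeholders):
#   replace_leading_john < normalizedfile > processedfile
```

A 'John' that appears in the middle of a line is left alone, because '^' only matches at the start of a line.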
For changing line endings in Steps 1 and 3 above, you can use the
'tr' Unix command to translate line endings. For example:
tr -d '\015' < inputfile > outputfile
This command deletes ('-d') each byte that is Octal 15 (which is a CR
character) from the inputfile, writing its output to outputfile. This
command would, therefore, change all lines ending in CR LF so that
they end in LF only. Similar 'tr' commands can be used to change LF
characters into CR characters, or to change CR characters into LF
characters. Note that 'tr' can only translate or delete characters; it
cannot insert them, so putting a CR before each LF requires another
tool, such as sed or awk.
Finally, note that:
CR = Decimal 13 = Octal 015 = Hexadecimal 0d
LF = Decimal 10 = Octal 012 = Hexadecimal 0a
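If you want to confirm those codes on your own machine, a quick sketch is to have printf emit the bytes and od dump them in hexadecimal. The function name show_hex is made up for this example.

```shell
# Dump standard input as hexadecimal bytes, without the offset column.
show_hex() {
    od -An -tx1
}

# Both of these emit a CR followed by an LF, so both should show
# the bytes 0d and 0a (the exact spacing depends on your od):
#   printf '\015\012' | show_hex
#   printf '\r\n' | show_hex
```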
Other things to watch out for:
o High-order ASCII bit turned on, on some or all of the characters
in the file (e.g., CR is 8d instead of 0d, and so on).
o The file uses a character code that is something other than
ASCII (e.g., Unicode, or something else).
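One rough way to spot either problem is to count the bytes that fall outside the 7-bit ASCII range. The sketch below is only a first check; the function name count_high_bytes is made up for this example, and a non-zero count tells you something is there, not which encoding it is.

```shell
# Count bytes with the high-order bit set (octal 200 through 377).
# It deletes every 7-bit byte (octal 000-177) and counts what is left.
count_high_bytes() {
    LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}

# Example use (file name is a placeholder):
#   count_high_bytes < inputfile
```

A result of 0 means the file is plain 7-bit ASCII as far as this check can tell. Anything else means the line-ending commands above may not see the bytes you expect.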
Hope this helps,
-B.
On Aug 4, 2010, at 5:08 PM, Hagimeno wrote:
Hi,
I need to search for a string like '\rJohn' or '\nJohn' and replace it
with another string.
All the Unix commands (sed, awk...) work one line at a time, and the
separator is usually LF.
How can I search and replace without using pure AS, because some text
files are really big?
I would like to use do shell script, using cat to read the file and
some Unix command to search for an empty line followed by my string and
replace them with something else.
Mark
_______________________________________________
Do not post admin requests to the list. They will be ignored.
AppleScript-Users mailing list (applescript-
email@hidden)
Help/Unsubscribe/Update your Subscription:
40mac.com
Archives: http://lists.apple.com/archives/applescript-users
This email sent to email@hidden