Re: Removing html tags




On Feb 28, 2005, at 18:22, Christian Vinaa wrote:
set theText to "<tag1>this is some text</tag1> and then there's this text followed by <tag2>and <tag3>its</tag3> contents</tag2>"
set {od, AppleScript's text item delimiters} to ¬
	{AppleScript's text item delimiters, "<"}
set theText to text items of theText
set newText to ""
set AppleScript's text item delimiters to ">"
repeat with anItem in theText
	set newList to text items of anItem
	if (count newList) > 1 then
		set newText to newText & text item 2 of newList
	end if
end repeat
set AppleScript's text item delimiters to od
newText


-->"this is some text and then there's this text followed by and its contents"


I haven't tried it out, but at a quick glance it doesn't seem to take into account tags such as

<TD class="tabelka01" rowspan="2" align="center">

only tags like </tag1>.

But PageSpinner has a script that does in fact remove all tags,
large and small :-))


The script above does indeed handle tags of any size, attributes included.
The only problem is that if you are not working with a full HTML file, that is, if the text does not start with a tag, it silently drops whatever comes before the first tag.
For example, the script above loses the leading "this " in the following case:


set theText to "this <tag1>is some text</tag1> and then there's this text followed by <tag2>and <tag3>its</tag3> contents</tag2>"
--> "is some text and then there's this text followed by and its contents"


This script is fast and takes care of the problem above:
<script>
-- Sample input: plain text before the first tag, plus a tag with attributes
set theText to "this is <tag1>some text</tag1> and then there's this text followed by <TD class=\"tabelka01\" rowspan=\"2\" align=\"center\">and <tag3>its</tag3> contents</tag2>"
-- Split on "<": item 1 is the text before the first tag
set {od, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "<"}
-- The values on the right are computed before any assignment, so the split still uses "<"
set {newText, theText, j, AppleScript's text item delimiters} to {text item 1 of theText, rest of text items of theText, length of rest of text items of theText, ">"}
repeat with l from 1 to j
	-- Text item 2 is whatever follows the tag's closing ">"
	set newText to newText & text item 2 of (item l of theText)
end repeat
-- Restore the original delimiters
set AppleScript's text item delimiters to od
newText
-->"this is some text and then there's this text followed by and its contents"
</script>
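
For reuse, the same idea can be wrapped in a handler. This is only a sketch under a few assumptions: the name stripTags is made up for illustration, a chunk with no ">" in it (a stray "<" in the text) is skipped instead of raising the "Can't get text item 2" error, and the delimiters are restored even if something goes wrong along the way.

```applescript
-- stripTags is a hypothetical handler name, not something from the scripts above
on stripTags(theText)
	set od to AppleScript's text item delimiters
	try
		-- Split on "<": item 1 is the text before the first tag
		set AppleScript's text item delimiters to "<"
		set chunks to text items of theText
		set newText to item 1 of chunks
		set tagChunks to rest of chunks
		-- In each remaining chunk, keep what follows the tag's closing ">"
		set AppleScript's text item delimiters to ">"
		repeat with aChunk in tagChunks
			set parts to text items of aChunk
			-- Assumption: a chunk without ">" is an unclosed "<", so skip it
			if (count parts) > 1 then
				set newText to newText & item 2 of parts
			end if
		end repeat
	on error errMsg number errNum
		-- Put the delimiters back before passing the error along
		set AppleScript's text item delimiters to od
		error errMsg number errNum
	end try
	set AppleScript's text item delimiters to od
	return newText
end stripTags

stripTags("this <tag1>is some text</tag1> and <TD class=\"x\">more</TD>")
-->"this is some text and more"
```

Like the scripts above, this takes only the text up to the next ">" after each tag, so a literal ">" in the body text would still cut that piece short.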


deivy
--------------------------------------------
Now the one calling the shots is Santos,
Hail the new champion!
