Re: Parsing comments from HTML...

  • Subject: Re: Parsing comments from HTML...
  • From: "Arthur J. Knapp" <email@hidden>
  • Date: Fri, 01 Nov 2002 15:41:43 -0500

> Subject: Parsing comments from HTML...
> Date: Fri, 1 Nov 2002 10:28:07 -0600
> From: Peter Bunn <email@hidden>

> I'm writing a script that tries to retrieve text which has been commented
> out in HTML.

> I wonder if there's a way to speed up the process?

> (I've left the HTML comment symbols out in case the list server wouldn't
> handle them properly...)

> set the_read to "
> *Cow
> Chicken
> Pig
> $
>
> *Duck
> Goose
> $"

I apologize for being a little thick, but when you say you have left
out the HTML comment symbols, where are they supposed to go? Are you
using * and $ to represent them, or are they something different?


"<!-- Hello World -->" & return
result & "anything" & return
result & "<!-- Mello Yellow -->" & return
result & "anything" & return
result & "<!-- Foo Bar -->" & return
--
set htmlSource to result

set extractedComments to ExtractHtmlComments(htmlSource)
--
--> {" Hello World ", " Mello Yellow ", " Foo Bar "}

on ExtractHtmlComments(s)
    (*
     * NOTE: JavaScript code might contain "<!--" strings
     * for its own on-the-fly code-generating reasons,
     * so this technique is not as robust as it could be.
     *)
    set o to AppleScript's text item delimiters --> save

    set sentinel to ASCII character 1 --> unlikely to occur in s

    -- replace every "<!--" with the sentinel
    set AppleScript's text item delimiters to "<!--"
    set s to s's text items --> beware of approx. 4060 limit

    set AppleScript's text item delimiters to sentinel
    set s to s as string

    -- replace every "-->" with the sentinel
    set AppleScript's text item delimiters to "-->"
    set s to s's text items --> beware of approx. 4060 limit

    set AppleScript's text item delimiters to sentinel

    -- rejoin, then split on the sentinel
    set s to (s as string)'s text items --> beware of approx. 4060 limit

    set AppleScript's text item delimiters to o --> restore

    (* s's even items are the comments
     *)
    set a to {}
    repeat with i from 2 to s's length by 2
        set a's end to s's item i
    end repeat

    return a

end ExtractHtmlComments


> -->{"Cow", "Chicken", "Pig", "Duck", "Goose"}
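To get from the comment strings that ExtractHtmlComments returns to a flat, one-item-per-line list like the one quoted above, each comment could be broken into its paragraphs and the empty ones dropped. This is only a rough sketch: the handler name CommentLines is made up for illustration, and it assumes each comment holds one item per line with no leading or trailing spaces to trim.

```applescript
-- Hypothetical helper, not part of the original script:
-- flatten a list of comment strings into their non-empty lines.
on CommentLines(commentList)
    set theLines to {}
    repeat with c in commentList
        -- c is a reference to a list item; "contents of" yields the string
        repeat with p in paragraphs of (contents of c)
            set lineText to contents of p
            if lineText is not "" then set end of theLines to lineText
        end repeat
    end repeat
    return theLines
end CommentLines

set sampleComments to {"Cow" & return & "Chicken" & return & "Pig", "Duck" & return & "Goose"}
CommentLines(sampleComments)
--> {"Cow", "Chicken", "Pig", "Duck", "Goose"}
```

Blank lines at the start or end of a comment simply disappear, since empty paragraphs are skipped.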

> As an added bonus, if there's a way to sort the final list
> alphabetically, that would be of great interest also.

I don't think anyone has plugged Serge's sorting script in a while:

<http://www.applemods.com/getMod.php?script_ID=33>

:)
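For a short list like the one in the question, a plain insertion sort in vanilla AppleScript may be enough. To be clear, this is not Serge's script, just a minimal sketch: the handler name SortList is invented here, and it assumes every item in the list is a string (comparisons use AppleScript's default string rules, which ignore case).

```applescript
-- Hypothetical helper, not Serge's script: a simple insertion sort.
on SortList(theList)
    copy theList to sortedList --> work on a copy, leave the caller's list alone
    repeat with i from 2 to (count sortedList)
        set v to item i of sortedList
        set j to i - 1
        -- shift larger items one slot to the right
        repeat while j > 0
            if (item j of sortedList) <= v then exit repeat
            set item (j + 1) of sortedList to item j of sortedList
            set j to j - 1
        end repeat
        set item (j + 1) of sortedList to v
    end repeat
    return sortedList
end SortList

SortList({"Cow", "Chicken", "Pig", "Duck", "Goose"})
--> {"Chicken", "Cow", "Duck", "Goose", "Pig"}
```

Insertion sort is quadratic in the number of items, so for long lists a faster routine would be the better choice.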

{ Arthur J. Knapp, of <http://www.STELLARViSIONs.com>
a r t h u r @ s t e l l a r v i s i o n s . c o m
}