Re: HTML parsing, Safari, styles, etc [was Re; Keynote, is it scriptable (XML?) -addendum]

  • Subject: Re: HTML parsing, Safari, styles, etc [was Re; Keynote, is it scriptable (XML?) -addendum]
  • From: has <email@hidden>
  • Date: Fri, 10 Jan 2003 14:07:52 +0000

Jean-Baptiste LE STANG wrote:

> If you look at the dictionary of 'Safari' you'll find a property (I don't remember its name) that returns an HTML page without all the '<>' stuff. It might help you.

tell app "Safari" to get text of document 1

Very useful for folks who just want to strip out tags while getting proper white spacing (as defined by <p>, etc. tags) and decoded character entities. Not an infrequent request, as I recall.

You can't do much else with it though. For example, you can't use it to extract tag attributes, clean up invalid markup, or dump everything into a nice AS-based object model. You need fairly direct access to an [XML/HTML] parser for that sort of stuff.
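To illustrate what "fairly direct access to a parser" buys you, here is a minimal sketch in Python (not AppleScript, and not something Safari's dictionary offers), using the standard library's html.parser module. The class name AttributeCollector, the helper parse_html, and the sample markup are all made up for this example. It pulls out tag attributes and the entity-decoded text in one pass:

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Hypothetical helper: records each start tag with its attributes,
    plus the character data between tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; in text data
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        self.text_parts.append(data)


def parse_html(source):
    """Return (list of (tag, attributes) pairs, concatenated text)."""
    collector = AttributeCollector()
    collector.feed(source)
    collector.close()  # flush any buffered text
    return collector.tags, "".join(collector.text_parts)


# Sample markup invented for illustration
sample = '<p class="intro">Fish &amp; chips at <a href="http://example.com/">this link</a></p>'
tags, text = parse_html(sample)
```

After this runs, tags holds the attribute dictionaries for the p and a elements, and text holds the tag-free string with the entity decoded. Unlike Safari's rendered text, this sketch makes no attempt to reproduce the white spacing implied by <p> and similar tags.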

--

BTW, is it just me, or do Safari, TextEdit, etc. lose style information when you get the text of a document? I never paid much attention to styled text before, but I wanted to get text from Safari into TextEdit with styles intact. The original document's text may be styled-n-coloured pretty as a rainbow, but all this evaporates as soon as you try to do anything with it:

tell application "Safari" to get text of document 1
tell application "TextEdit" to set text of document 1 to result
--> text appears, but styling is gone


Aside:
Peeking inside the styled text with a naughty 'as record' coercion, I find the ksty property always looks the same, no matter what the document's styles are:

tell application "TextEdit" to get text of document 1 as record
--> {<<class ktxt>>:"blah blah", <<class ksty>>:<<data styl0001000000000010000E00030000000C000000000000>>}


I think styled text has always been a bit cranky in AS, but this doesn't seem right to me. Bug? Feature? Or have I stupidly missed something as usual?

--

Ta,

has
--
http://www.barple.pwp.blueyonder.co.uk -- The Little Page of AppleScripts