curl equivalence

  • Subject: curl equivalence
  • From: "John R." <email@hidden>
  • Date: Tue, 25 Oct 2005 15:55:28 -0400

I am glad that someone brought up the subject of curl...

I am (still) looking for the easiest and most reliable way to simply download the HTML from a website. Sounded simple enough to me, at first.

Curl would be wonderful, if I could only make it work! This is easy -- do shell script "curl http://www.apple.com". However, I subscribe to secure databases that hate curl and automation, and I could never figure out how to get the password, user-agent, cookies, etc. to work right. Besides, all that stuff is already set up perfectly well in my Safari browser. Also, curl is blind, and it is nice to get visual feedback while visiting the websites.
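For reference, here is a minimal sketch of how a password, user-agent, and cookies can be handed to curl. It assumes the site accepts HTTP basic authentication; sites that log in through a web form or JavaScript need more than this. The variable names, the function name fetch_html, and all placeholder values are hypothetical and do not come from any real setup.

```shell
#!/bin/sh
# Hypothetical settings: replace with values that match the site in question.
USER_AGENT="Mozilla/5.0 (Macintosh)"   # string the site expects from a browser
COOKIE_JAR="$HOME/.curl-cookies.txt"   # file where curl keeps session cookies
LOGIN="username:password"              # HTTP basic-auth credentials (assumed)

# Print the HTML of the page at $1 on standard output.
fetch_html() {
    # --silent --show-error : hide the progress meter but still report errors
    # --location            : follow redirects
    # --cookie/--cookie-jar : read cookies from the jar and save them back to it
    # --user                : send the basic-auth user name and password
    curl --silent --show-error --location \
        --user-agent "$USER_AGENT" \
        --cookie "$COOKIE_JAR" --cookie-jar "$COOKIE_JAR" \
        --user "$LOGIN" \
        "$1"
}
```

Calling fetch_html with a URL writes the page source to standard output. The same curl command line could also be passed to do shell script from AppleScript, though it still would not reuse the login state already held by Safari.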

Therefore, AppleScript is ideal for automating Safari to do it right. However, the problem is knowing when the HTML is ready for capture, without waiting forever.

I offer you guys the following code, which works MOST of the time (one known exception is a frames problem that Gary clarified in another posting).

Questions:
     Why should it have to be this damn complicated?
     Any ideas for doing it better?

Thanks,

- John

--------------------------------------------------------------------------------
on run
    GetHTMLfromWebsite("http://www.apple.com")
end run
--------------------------------------------------------------------------------
-- Loads myURL in the front Safari document and returns its HTML source.
on GetHTMLfromWebsite(myURL)
    tell application "Safari"
        set URL of document 1 to myURL
        if my WindowLoadProblem(id of window 1) then error "Oops - problem accessing internet..."
        return source of document 1
    end tell
end GetHTMLfromWebsite
--------------------------------------------------------------------------------
-- Returns true if the page failed to finish loading, false if it loaded.
on WindowLoadProblem(winID)
    --------------------------------------------
    -- Step #1: Window name change signals the start of a load.
    -- Note: WINDOW name is testable, but DOCUMENT name is not!
    -- Note: this loop has no timeout; it waits until the name changes.
    --------------------------------------------
    tell application "Safari"
        set the name of window id winID to "Waiting to Start!!!"
        set thisName to "Untitled"
        repeat until thisName does not contain "Waiting to Start!!!"
            delay 0.5
            set thisName to name of window id winID
        end repeat
    end tell
    --------------------------------------------
    -- Step #2: JavaScript "readyState" signals the completion.
    -- Problem: apparently, use of HTML frames may set readyState prematurely!
    --------------------------------------------
    set pagefailed to true
    set n to 0
    repeat until (not pagefailed) or (n = 40)
        delay 0.5 -- seconds delay between tries, with a timeout after 40 tries
        set n to n + 1
        tell application "Safari" to set r to ¬
            (do JavaScript "document.readyState" in document 1 of window id winID)
        if r = "complete" then
            set pagefailed to false
        end if
    end repeat
    return pagefailed
end WindowLoadProblem
--------------------------------------------------------------------------------



