Re: How Do I Get the html Source for a Web Page into a String?
- Subject: Re: How Do I Get the html Source for a Web Page into a String?
- From: "S. J. Cunningham" <email@hidden>
- Date: Wed, 3 Nov 2010 16:57:17 -0400
I think I figured it out. I have to set/save the proper cookies.
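
For anyone who finds this in the archives, here is a rough sketch of what "set/save the proper cookies" can look like with curl. It is not the exact command used in this thread: the function name, the URL, and the cookie-jar path are placeholders, and a site that needs a login may require a separate request first to obtain the cookies. The sketch relies on curl's -c option (write received cookies to a jar file) and -b option (send cookies read from that file).

```shell
#!/bin/sh
# Sketch only: the function name, cookie-jar path and URL are placeholders,
# not values taken from this thread.

# Fetch a page, sending cookies saved in the jar and storing any new ones.
#   $1 = URL to fetch
#   $2 = cookie-jar file (curl writes it when the transfer finishes)
fetch_with_cookies() {
    # -s  silent (no progress meter)
    # -L  follow redirects
    # -b  read the cookies to send from the jar file
    # -c  write the cookies received back to the jar file
    curl -s -L -b "$2" -c "$2" "$1"
}

# Example call (not run here):
#   fetch_with_cookies "http://example.com/page.php" "$HOME/cookies.txt"
```

From AppleScript the same command line can be handed to do shell script, for example: set thisHTML to do shell script "curl -s -L -b ~/cookies.txt -c ~/cookies.txt " & quoted form of theURL (theURL being a hypothetical variable that holds the address).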
On Nov 2, 2010, at 4:52 PM, Roger Howard wrote:
>
> On Nov 2, 2010, at 1:23 PM, S. J. Cunningham wrote:
>
>> Thanks, but I am looking for the HTML source. (inb4 "use Python, Perl, etc. instead of AppleScript for this".) I know I can get it via Safari, but I use Firefox. Besides, I would prefer not to launch another application just to get this. Right now, Steve Thompson's "do script curl" looks like the best bet if I can figure out how to get it to work with PHP requests.
>
>
> A few thoughts:
>
> 1. Are you looking to get the generated source that's sent to the client? If so, do you know for sure that the PHP isn't doing a client check and tailoring its source depending on which browser is being used? If that's the case, then you'll need to set the user agent in curl.
>
> 2. Is there any authentication required to hit the URLs you're testing? In other words, if you fire up Firefox, clear all caches, can you hit the URL and get the results you expect?
>
> 3. What response do you get if you run curl manually from the command line? Can you post the command you're using? A simple "curl <url>" will generally be enough to grab the page and send it to stdout, so the statement set thisHTML to do shell script "curl <url>" (AppleScript string literals take double quotes, not single quotes) is all you should need to populate an AppleScript variable with the same.
>
> 4. Is there DOM manipulation going on, for instance using jQuery, and if so, do you care about getting the HTML before or after the DOM manipulation happens?
>
> In general, an HTTP request is an HTTP request, regardless of whether it's for a static HTML page or something generated dynamically. If this isn't working, then you likely have a problem with authentication or some dynamic variables not available to PHP when hitting it directly from curl. It'd help to see the results you are getting.
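
The checks suggested in the quoted message (setting the user agent, and looking at what the server actually sends back) could be sketched with curl roughly as follows. The function names, the URL, and the user-agent string are placeholders rather than anything from this thread; -A sets the User-Agent request header, and -D - prints the response headers to stdout, which is where a Set-Cookie header or a redirect to a login page would show up.

```shell
#!/bin/sh
# Sketch only: function names, URL and user-agent string are placeholders.

# Fetch a page while presenting a chosen User-Agent header.
#   $1 = URL to fetch
#   $2 = User-Agent string to send
fetch_as_browser() {
    # -s  silent (no progress meter)
    # -A  set the User-Agent request header
    curl -s -A "$2" "$1"
}

# Print only the response headers, to check for cookies or redirects.
#   $1 = URL to inspect
show_response_headers() {
    # -o /dev/null  discard the response body
    # -D -          dump the response headers to stdout
    curl -s -o /dev/null -D - "$1"
}

# Example calls (not run here):
#   fetch_as_browser "http://example.com/page.php" "Mozilla/5.0"
#   show_response_headers "http://example.com/page.php"
```

If the headers show a Set-Cookie line or a redirect, the page most likely depends on session state that a bare request does not carry.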
_______________________________________________
AppleScript-Users mailing list (email@hidden)
Archives: http://lists.apple.com/archives/applescript-users