Re: HTML parsing
- Subject: Re: HTML parsing
- From: William Dockery <email@hidden>
- Date: Thu, 09 Jun 2016 06:14:41 -0500
Hello, I am an intermediate AppleScripter, and I am starting to use JavaScript to scrape websites for my personal use. I have learned how to select elements by element ID. But I have not yet learned how to select a list of HTML snippets that share a certain pattern of tags or tag values.
Right now, I have successfully navigated several pages deep into a particular website, and I now want to select the first element (which is the most recent) in a list of dated documents on the web page. There are 20 elements in the list (20 downloadable documents). I really just want to click on the first (most recent) document. But it would be nice to get a more general solution—i.e., learn how to select the whole list and cycle through the list with a test condition. So any solution along those lines would be appreciated.
I have read about how to accomplish such a task with plain JavaScript or jQuery, but my initial attempts have resulted in “missing value” in my AppleScript; i.e., I am not successfully selecting the elements I want to target.
I’m using Script Editor 2.8.1, AppleScript 2.5, and the latest version of El Capitan on a recent MacBook Pro.
Any sample JavaScript code would be helpful. A representative sample of the source HTML is pasted below. Within this HTML, the phrase:
<A href="#"
is not unique on this web page, but the phrase:
onclick="refreshSession
(the start of each link's onclick attribute) is unique to the list I am trying to capture.
The inner HTML of these elements is also distinctive: every element I want has a date as its inner HTML, such as 05/12/2016, and no other element on this web page does.
Thanks for any help.
William
<div style="clear: both; padding: 0px;"></div>
<BR>
<p>
<A href="#"onclick="refreshSessionAndPopupWindow('viewOnlineStatements.do?phase=display&selectedStatementDate=Thu+May+12+00:00:00+CDT+2016&selectedStatementId=OD_djcxMjYtNzA0OS03MDUwLTcwMzYtTFBBMTEtMTMzNzNGQUFRQS01NDk3MDctMzkyMzktMjM1NzI2MC0xMzA4NjktNzgtNzktOTYxNC0xMjMwLTAtXgE0MDM3NjYwMDE3ODc4OTcBNDAzNzY2MDAxNzg3ODk3NgFXSUxMSUFNIEQgRE9DS0VSAVkgSUlJATE2OTM0AUNHU1RNVFAxAQIBAgECAQIBAgECATAwMDA2MTAwAQIBAgECAQIBQ1JFRElUIExJTkUgSU5DUkVBU0UgT0ZGRVIBQ09OVkVOSUVOVCBDUkVESVQgQ0FSRCBQQVlNRU5UIE9QVElPTlMBVFJVU1RFRCBFWFBFUklFTkNFIEFORCBTVVBQT1JUIEZST00gRUxBTiBGSU5BTkNJQUwgU0VSVklDRVMBR0VUIFNNQVJUIEFCT1VUIENSRURJVC1DSEVDSyBPVVQgV1dXLlNNQVJUQ1JFRElUTUFUVEVSUy5DT00BAgEzOTIzOQE2ATAwMDAyMDM1MTY1OTE5ODkwOTA1_Disk_A_CGSTMTP1_CCARD_DATE|||05/12/2016')">05/12/2016</a>
<br>
<br>
<A href="#"onclick="refreshSessionAndPopupWindow('viewOnlineStatements.do?phase=display&selectedStatementDate=Tue+Apr+12+00:00:00+CDT+2016&selectedStatementId=OD_djcxMjYtNzA0OS03MDUwLTcwMzYtTFBBMTEtMTMxODJGQUFQQS02MjczMDctNDc5OTMtMTMwMzgzLTEzMjIzOS03OC03OS05NDg0LTEyMzAtMC1eATQwMzc2NjAwMTc4Nzg5NwE0MDM3NjYwMDE3ODc4OTc2AVdJTExJQU0gRCBET0NLRVIBWSBJSUkBMTY5MDQBQ0dTVE1UUDEBAgECAQIBAgECAQIBMDAwMDUxMjEBAgECAUNSRURJVCBMSU5FIElOQ1JFQVNFIE9GRkVSAUNPTlZFTklFTlQgQ1JFRElUIENBUkQgUEFZTUVOVCBPUFRJT05TAVRSVVNURUQgRVhQRVJJRU5DRSBBTkQgU1VQUE9SVCBGUk9NIEVMQU4gRklOQU5DSUFMIFNFUlZJQ0VTAUZMQUdTVEFSIEJBTksgSE9NRSBFUVVJVFkgTElORSBPRiBDUkVESVQBQ0hFQ0sgWU9VUiBDUkVESVQgU0NPUkUBQ0hFQ0sgWU9VUiBDUkVESVQgU0NPUkUBAgE0Nzk5MwE2ATAwMDAyMDM1MTY1OTE5ODkwOTA1_Disk_A_CGSTMTP1_CCARD_DATE|||04/12/2016')">04/12/2016</a>
<br>
<br>
<A href="#"onclick="refreshSessionAndPopupWindow('viewOnlineStatements.do?phase=display&selectedStatementDate=Fri+Mar+11+00:00:00+CST+2016&selectedStatementId=OD_djcxMjYtNzA0OS03MDUwLTcwMzYtTFBBMTEtMTMwMDFGQUFQQS01NjUxMjEtNDI0ODktMC0xMzI0NjktNzgtNzktOTM2MC0xMjMwLTAtXgE0MDM3NjYwMDE3ODc4OTcBNDAzNzY2MDAxNzg3ODk3NgFXSUxMSUFNIEQgRE9DS0VSAVkgSUlJATE2ODcyAUNHU1RNVFAxAQIBAgECAQIBAgECATAwMDA1MTQ5AQIBAgECAUNPTlZFTklFTlQgQ1JFRElUIENBUkQgUEFZTUVOVCBPUFRJT05TAVRSVVNURUQgRVhQRVJJRU5DRSBBTkQgU1VQUE9SVCBGUk9NIEVMQU4gRklOQU5DSUFMIFNFUlZJQ0VTAUNSRURJVCBMSU5FIElOQ1JFQVNFIE9GRkVSAVBBWSBPTkxJTkUgVEhFIEVBU1kgV0FZIFdJVEggVklTQSBDSEVDS09VVAFQQVkgQUxMIFlPVVIgTU9OVEhMWSBCSUxMUyBUSEUgRUFTWSBXQVkBAgE0MjQ4OQE2ATAwMDAyMDM1MTY1OTE5ODkwOTA1_Disk_A_CGSTMTP1_CCARD_DATE|||03/11/2016')">03/11/2016</a>
<br>
<br>
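
One possible approach to the selection described above, offered as a minimal sketch and not tested against this particular site: select every link whose onclick attribute starts with "refreshSession", then keep only those whose text is a date. The names DATE_TEXT, findStatementLinks, and clickMostRecent are hypothetical, chosen for illustration; querySelectorAll and the [attr^="value"] selector are standard DOM/CSS features.

```javascript
// Matches link text that is exactly a date such as 05/12/2016.
// (Hypothetical name; the pattern is an assumption based on the sample HTML.)
const DATE_TEXT = /^\d{2}\/\d{2}\/\d{4}$/;

// Returns the dated links under `root`, in page order.
// `root` is normally `document`; it only needs a querySelectorAll method.
function findStatementLinks(root) {
  // a[onclick^="..."] selects links whose onclick attribute starts with the text.
  const candidates = root.querySelectorAll('a[onclick^="refreshSession"]');
  return Array.from(candidates).filter(
    (link) => DATE_TEXT.test(link.textContent.trim())
  );
}

// Clicks the first link in page order (the most recent document in the sample).
// Returns its date text as a plain string, or null when nothing matched.
function clickMostRecent(root) {
  const links = findStatementLinks(root);
  if (links.length === 0) return null;
  links[0].click();
  return links[0].textContent.trim();
}
```

In the page, clickMostRecent(document) would click the first dated link. To cycle through the whole list with a test condition, the array from findStatementLinks(document) can be looped over or filtered, for example with .filter((link) => link.textContent.trim().endsWith("/2016")). If the code is run through Safari's do JavaScript command (an assumption, since the post does not say how the JavaScript is invoked), returning a string such as the date text may help with the “missing value” result, because a DOM element generally cannot be converted to an AppleScript value.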