Re: Convert HTML Text to Plain Text


  • Subject: Re: Convert HTML Text to Plain Text
  • From: Jim Underwood <email@hidden>
  • Date: Thu, 30 Jun 2016 00:56:49 +0000
  • Thread-topic: Convert HTML Text to Plain Text

Shane,

Thank you very much for this script.
Your ASObjC handler is 2.03 times as fast as the shell handler, as tested using real-world data from Evernote.  Fantastic!

I do have one question.  You posted a very similar script at MacScripter.net, which I just happened to run across.
What is the purpose of the differences between that script and the one you posted here?

From http://www.macscripter.net/viewtopic.php?pid=185140#p185140


use framework "Foundation"
use framework "AppKit"

-- classes, constants, and enums used
property NSUTF8StringEncoding : a reference to 4
property NSAttributedString : a reference to current application's NSAttributedString
property NSCharacterEncodingDocumentOption : a reference to current application's NSCharacterEncodingDocumentOption
property NSDictionary : a reference to current application's NSDictionary
property NSString : a reference to current application's NSString

set HTMLString to "Power #2 &" & "#8211; Lawyerin&" & "#8217;" -- HTML split for posting.
set theString to NSString's stringWithString:HTMLString
set dataStr to theString's dataUsingEncoding:NSUTF8StringEncoding
set options to NSDictionary's dictionaryWithObject:NSUTF8StringEncoding forKey:(NSCharacterEncodingDocumentOption)
set attStr to NSAttributedString's alloc()'s initWithHTML:dataStr options:options documentAttributes:(missing value)
set outputStr to attStr's |string|()
return outputStr as text


The key difference seems to be the use of options:
set options to NSDictionary's dictionaryWithObject:NSUTF8StringEncoding forKey:(NSCharacterEncodingDocumentOption)
set attStr to NSAttributedString's alloc()'s initWithHTML:dataStr options:options documentAttributes:(missing value)


May I ask, for my (and others') edification, what effect the options parameter has on the extraction of plain text from HTML text?
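
One way to probe this empirically is to run the same conversion with and without the encoding option and compare the results. The following is only a minimal sketch under stated assumptions: the handler name plainTextFromHTML and its useEncodingOption parameter are hypothetical (they are not from either posted script), and the sample string with a raw non-ASCII character is my own test input. My understanding is that without NSCharacterEncodingDocumentOption the HTML importer has to guess the encoding of the data, so raw non-ASCII characters may come out garbled, whereas numeric entities such as the ones in the sample above should be unaffected; that is an expectation to verify, not a confirmed result.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use framework "AppKit"
use scripting additions

-- NSUTF8StringEncoding has the value 4, as in the quoted script
property NSUTF8StringEncoding : 4

-- Hypothetical helper: converts HTML text to plain text, optionally
-- telling the importer that the data is UTF-8 encoded.
on plainTextFromHTML(htmlText, useEncodingOption)
	set theString to current application's NSString's stringWithString:htmlText
	set theData to theString's dataUsingEncoding:NSUTF8StringEncoding
	if useEncodingOption then
		-- Same options dictionary as in the MacScripter version
		set theOptions to current application's NSDictionary's dictionaryWithObject:NSUTF8StringEncoding forKey:(current application's NSCharacterEncodingDocumentOption)
	else
		-- Empty dictionary: the importer must work out the encoding itself
		set theOptions to current application's NSDictionary's dictionary()
	end if
	set attStr to current application's NSAttributedString's alloc()'s initWithHTML:theData options:theOptions documentAttributes:(missing value)
	return (attStr's |string|()) as text
end plainTextFromHTML

-- Sample input containing a raw non-ASCII character (assumed test data)
set sampleHTML to "<p>café</p>"
set withOption to plainTextFromHTML(sampleHTML, true)
set withoutOption to plainTextFromHTML(sampleHTML, false)
return {withOption:withOption, withoutOption:withoutOption}
```

Comparing the two values in the returned record on real Evernote content should show whether the option matters for a given data set.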

Also, what is the benefit of using property statements to refer to the ASObjC classes and constants?

Thanks.

Best Regards,

Jim Underwood
aka JMichaelTX


From: <applescript-users-bounces+jmichael=email@hidden> on behalf of Shane Stanley <email@hidden>
Date: Thu, Jun 16, 2016 at 8:56 PM
To: "ASUL (AppleScript)" <email@hidden>
Subject: Re: Convert HTML Text to Plain Text

On 17 Jun 2016, at 11:12 AM, Jim Underwood <email@hidden> wrote:

I'm looking for a faster, better method of converting HTML text to plain text.
I'm hoping ASObjC can come to the rescue.  😄

I have an Evernote script that might process thousands of Notes, and for each Note I need the plain text.

Any ideas/suggestions?

I don't know how much faster this will be; it uses the same process in the end: