Problems choosing an encoding for Word generated html

  • Subject: Problems choosing an encoding for Word generated html
  • From: Ken Tozier <email@hidden>
  • Date: Sun, 31 May 2009 19:08:19 -0400

Hi

I wrote an app that converts Word files into a simpler format. It first converts from .doc to HTML, using scripting and Word's "Save as Web Page" command, and then uses NSXMLDocument to extract the parts I need. I'm finding that there are no good options when it comes to choosing a character encoding for the saved HTML (this is set in Word), because Word uses some custom tags to embed special characters like bullets, and reading the result as UTF-8 chokes on them.

My basic process is to:
- Use AppleScript to tell Word to convert from .doc to HTML and save as UTF-8
- Read the resulting file into an NSString with NSUTF8StringEncoding
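
The read step above can be sketched as a strict UTF-8 decode with a single legacy fallback. This is a minimal illustration in Python rather than Cocoa, so the codec names ("utf-8", "mac_roman") are Python's and only stand in for the NSString encodings mentioned here. The function name decode_with_fallback and the choice of Mac OS Roman as the fallback are assumptions, not something stated in this post.

```python
from typing import Tuple


def decode_with_fallback(data: bytes, fallback: str = "mac_roman") -> Tuple[str, str]:
    """Decode bytes as strict UTF-8, else with one legacy single-byte codec.

    Returns the decoded text and the name of the codec that was used.
    The fallback codec is a guess (an assumption); pick whichever legacy
    encoding the producing application is believed to have written.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8,
        # which makes a wrong guess visible instead of silently corrupting text.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode(fallback), fallback


if __name__ == "__main__":
    sample = "\u201cquoted\u201d \u2022 bullet"
    for raw in (sample.encode("utf-8"), sample.encode("mac_roman")):
        text, used = decode_with_fallback(raw)
        print(used, text == sample)
```

Trying strict UTF-8 first is reasonable because text in a legacy single-byte encoding that contains curly quotes or bullets is very unlikely to also be valid UTF-8, so a successful UTF-8 decode is a strong signal that the guess was right.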


I've tried saving the HTML from Word as Latin-1 (NSISOLatin1StringEncoding on the Cocoa side), but many important characters such as double quotes, apostrophes, and dashes are translated to capital "O"s with various diacritical marks.
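
The post does not say which bytes Word actually wrote, so the following is only one possible reading of that symptom, offered as an assumption. In Mac OS Roman, curly quotes occupy bytes 0xD2 to 0xD5, and those same bytes are accented capital "O"s in Latin-1. The Python sketch below (the helper name misread is hypothetical) shows what happens when text written in one single-byte encoding is read back in another.

```python
def misread(text: str, written_as: str = "mac_roman", read_as: str = "latin-1") -> str:
    """Encode text with one codec and decode the bytes with another.

    This reproduces the garbling that appears when the writer and the
    reader disagree about a single-byte encoding. The default pairing
    (Mac OS Roman written, Latin-1 read) is an assumption for illustration.
    """
    return text.encode(written_as).decode(read_as)


# Typographic punctuation that Word commonly substitutes for plain ASCII.
PUNCTUATION = {
    "left double quote": "\u201c",
    "right double quote": "\u201d",
    "left single quote": "\u2018",
    "apostrophe": "\u2019",
    "en dash": "\u2013",
    "em dash": "\u2014",
}

if __name__ == "__main__":
    for name, char in PUNCTUATION.items():
        garbled = misread(char)
        print(f"{name}: U+{ord(char):04X} -> U+{ord(garbled):04X} {garbled}")
```

If the garbled characters in a real file match this pattern, decoding the same bytes as Mac OS Roman instead of Latin-1 would be worth trying; if they do not, some other pairing of encodings is involved.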

I'm not really sure how to proceed, as there doesn't seem to be a single encoding usable by NSString that will both translate the quotes and allow me to access Word's "special" characters. Does anyone have any ideas how I can read the HTML and treat it as a mostly normal character string without resorting to a custom binary character translation class?

Thanks for any help