Re: NSCharacterSet Parsing MS Word 97 files? <Newbie>
- Subject: Re: NSCharacterSet Parsing MS Word 97 files? <Newbie>
- From: Ken Tozier <email@hidden>
- Date: Mon, 4 Jul 2005 12:58:11 -0400
On Jul 4, 2005, at 12:37 PM, Vince Ackerman wrote:
> I need to parse lengthy MS Word 97 documents. I have read the file
> in as an NSString but need to filter out all the extraneous
> characters except the actual text. I need to see all the standard
> keyboard characters and (hopefully/eventually) parse out the
> information into a Core Data database. I can achieve this using
> Word to save the file as straight Text, but don't want the end user
> to have to do this to each document every time. I was hoping
> there was a way to do it programmatically without a lot of work.
> Is there a way to copy this string to another string with
> NSCharacterSet or NSScanner? Or perhaps a better way to open and
> read the file into an NSString without the MSWord encoding? I don't
> fully understand what NSCharacterSet will filter, and I don't want
> to alter the actual visible text in the file.
> Any ideas would be greatly appreciated! Perhaps someone's got a
> snippet of code so I don't have to re-invent this wheel?
I looked into parsing Word files a year or so ago and the file
format is a beast. I wouldn't recommend it. Your best bet might be to
define an Automator action that opens Word files in Word and saves
them as rich text (or plain text, depending on what you need) and then
run the action from inside your code using NSAppleScript. Haven't tried
it myself, but it seems like it would be much easier than parsing the
Word file yourself.
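As a rough, untested sketch of the NSAppleScript side of that idea: the
code below skips the Automator action and sends the script to Word
directly. The helper name ConvertWordFileToText and the file paths are
made up for illustration, and the Word scripting terms ("save as",
"file format format text") are assumptions to verify against Word's
dictionary in Script Editor for the installed version. It also assumes
ARC (add a release under manual reference counting) and paths that
contain no double quotes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: asks Word to open a document and save a
// plain-text copy. Returns NO and fills errorInfo if the script fails.
// The AppleScript terms used here are assumed, not verified.
static BOOL ConvertWordFileToText(NSString *inPath,
                                  NSString *outPath,
                                  NSDictionary **errorInfo)
{
    NSString *source = [NSString stringWithFormat:
        @"tell application \"Microsoft Word\"\n"
        @"    open POSIX file \"%@\"\n"
        @"    save as active document file name \"%@\" "
        @"file format format text\n"
        @"    close active document saving no\n"
        @"end tell", inPath, outPath];

    NSAppleScript *script = [[NSAppleScript alloc] initWithSource:source];

    // executeAndReturnError: returns nil on failure and describes the
    // problem in the error dictionary.
    NSAppleEventDescriptor *result = [script executeAndReturnError:errorInfo];
    return result != nil;
}

int main(void)
{
    @autoreleasepool {
        NSDictionary *errorInfo = nil;
        // Placeholder paths; substitute real ones.
        BOOL ok = ConvertWordFileToText(@"/tmp/input.doc",
                                        @"/tmp/output.txt",
                                        &errorInfo);
        if (!ok) {
            NSLog(@"Conversion failed: %@", errorInfo);
        }
    }
    return 0;
}
```

Once the text file exists, it can be read into an NSString in the usual
way, with no Word-specific bytes left to filter out. The obvious
drawback is that Word has to be installed on the end user's machine.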
Ken