Reading .doc files
- Subject: Reading .doc files
- From: Lorenzo <email@hidden>
- Date: Mon, 29 Sep 2003 11:40:37 +0200
Hi list,
I am trying to read Microsoft Word .doc files as plain text.
I know there are utilities that already do this, but I want to do it
myself, programmatically.
When the header has a fixed size I can easily find the start of the plain text.
But when the offset of the first character of the text varies, I cannot
locate it, even using the "File Information Block" (FIB).
I tried reading "fib.fcMin" from the FIB (according to the
documentation, it should hold the file offset of the first character of
text), but this UInt32 field holds a value that is not the offset of
the text. Sometimes it is even greater than the size of the whole file.
Does anyone know what I am doing wrong?
This is what I do:
--------------------
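NSData *stringData = [NSData dataWithContentsOfFile:filePath];
unsigned char headerBuffer[2048];
// getBytes:range: raises NSRangeException if the file is shorter than the range
if ([stringData length] >= sizeof(headerBuffer)) {
    [stringData getBytes:headerBuffer range:NSMakeRange(0, sizeof(headerBuffer))];
    FIB fib = *(FIB *)headerBuffer;
    NSLog(@"fib.fcMin %u", (unsigned int)fib.fcMin);
}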
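One thing that may be worth ruling out is byte order and struct padding. Word binary files store multi-byte integers in little-endian order, so casting the raw buffer to a struct on a big-endian machine yields byte-swapped values, and any padding the compiler inserts shifts the fields. A minimal sketch in plain C that decodes a field byte by byte instead of casting follows. The helper names read_le32 and offset_in_file are hypothetical, not part of any Word or Cocoa API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a little-endian 32-bit value at `offset`
 * in `buf`, byte by byte, so the result does not depend on host byte
 * order or on struct padding.
 * Returns 1 on success, 0 if the four bytes would fall outside the buffer. */
static int read_le32(const unsigned char *buf, size_t len,
                     size_t offset, uint32_t *out)
{
    if (len < 4 || offset > len - 4)
        return 0;
    *out = (uint32_t)buf[offset]
         | ((uint32_t)buf[offset + 1] << 8)
         | ((uint32_t)buf[offset + 2] << 16)
         | ((uint32_t)buf[offset + 3] << 24);
    return 1;
}

/* Plausibility check: a text offset must lie inside the file. */
static int offset_in_file(uint32_t fc, size_t file_size)
{
    return (size_t)fc < file_size;
}
```

With the buffer from the snippet above, read_le32(headerBuffer, 2048, fcMinOffset, &fcMin) would be called with whatever byte offset the documentation gives for fcMin. The name fcMinOffset is a placeholder here.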
The FIB struct, as I copied it from the documentation, starts with:
--------------------
typedef struct {
    UInt16 wIdent;
    UInt16 nFib;
    UInt16 nProduct;
    UInt16 lid;
    UInt16 pnNext;
    UInt16 fDot;
    UInt16 fGlsy;
    UInt16 fComplex;
    UInt16 fHasPic;
    UInt16 cQuickSaves;
    UInt16 fEncrypted;
    UInt16 unused10_9;
    UInt16 fReadOnlyRecommended;
    UInt16 fWriteReservation;
    UInt16 fExtChar;
    UInt16 unused10_13;
    UInt16 nFibBack;
    UInt32 lKey;
    UInt8  envr;
    UInt8  unused19;
    UInt16 chse;
    UInt16 chseTables;
    UInt32 fcMin;   // File offset of the first character of text.
                    // In non-complex files a CP can be transformed into an
                    // FC by the following transformation:
                    // fc = cp + fib.fcMin.
    UInt32 fcMac;   // File offset of the last character of text
                    // in the document text stream + 1.
    // ... continues, it's very long
} FIB;
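A possible explanation, offered as an assumption rather than a confirmed diagnosis: names such as unused10_9 and unused19 look like "byte offset 10, bit 9" and "byte offset 19". If so, fDot through unused10_13 would be bits of one 16-bit word at offset 10, not separate 16-bit fields. With seventeen 16-bit fields ahead of lKey, fcMin lands at byte 44 or later depending on padding, while the layout sketched below puts it at byte 24. The type FibPrefix, the field name flags, and the FIB_FLAG_COMPLEX mask are hypothetical names chosen for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed view of the first 32 bytes of the FIB, assuming the
 * one-bit flags (fDot, fGlsy, fComplex, ...) all live in a single 16-bit
 * word at byte offset 10. Multi-byte fields are little-endian on disk. */
#pragma pack(push, 1)
typedef struct {
    uint16_t wIdent;      /* offset  0 */
    uint16_t nFib;        /* offset  2 */
    uint16_t nProduct;    /* offset  4 */
    uint16_t lid;         /* offset  6 */
    uint16_t pnNext;      /* offset  8 */
    uint16_t flags;       /* offset 10: fDot, fGlsy, fComplex, ... as bits */
    uint16_t nFibBack;    /* offset 12 */
    uint32_t lKey;        /* offset 14 */
    uint8_t  envr;        /* offset 18 */
    uint8_t  unused19;    /* offset 19 */
    uint16_t chse;        /* offset 20 */
    uint16_t chseTables;  /* offset 22 */
    uint32_t fcMin;       /* offset 24 */
    uint32_t fcMac;       /* offset 28 */
} FibPrefix;
#pragma pack(pop)

/* Assumed bit position: fComplex as the third bit of the flags word. */
#define FIB_FLAG_COMPLEX 0x0004u

/* Non-complex files only: fc = cp + fib.fcMin, as the struct comment says. */
static uint32_t fc_from_cp(uint32_t cp, uint32_t fcMin)
{
    return cp + fcMin;
}
```

The pack pragma matters here: without it most compilers would align lKey to a 4-byte boundary and shift every later field by two bytes. Checking offsetof(FibPrefix, fcMin) against the documented offset is a cheap way to confirm the layout before trusting any value read through it.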
Best Regards
--
Lorenzo
email: email@hidden
_______________________________________________
cocoa-dev mailing list | email@hidden