Re: Reading .doc files

  • Subject: Re: Reading .doc files
  • From: "Alastair J.Houghton" <email@hidden>
  • Date: Mon, 29 Sep 2003 10:58:50 +0100

On Monday, September 29, 2003, at 10:40 am, Lorenzo wrote:

Hi list,
I am trying to read Microsoft Word .doc files as pure text.
I already know there are utilities that do that, but I want to do it
myself, programmatically.
When the header has a fixed size I can easily find the start of the pure text.
But when the offset of the first char of the pure text is variable, I cannot
locate that point even using the "File Information Block" (FIB).
I tried to read "fib.fcMin" from the FIB (according to the
documentation it should give the offset of the first char in the
file), but this UInt32 variable holds a value that is not the offset of
the pure text. Sometimes it's even greater than the whole file size.

Does anyone know what I am doing wrong?

Yep. Microsoft Word was developed on PCs, which use x86 microprocessors. The x86 is little-endian, so Microsoft Word's file format stores its numbers in little-endian byte order, whereas the PowerPC is big-endian. You need to byte swap the number you get back; there is a set of functions in the Foundation framework that will do this for you, e.g. NSSwapLittleIntToHost().
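
For illustration, here is a minimal sketch in plain C of reading a little-endian 32-bit field from a buffer of file bytes. It assembles the value byte by byte, so it gives the same result on big- and little-endian hosts without an explicit swap. The names read_le32 and offset_in_file are hypothetical helpers, not part of Foundation or of any Word documentation, and the position of the field within the buffer is left to the caller rather than assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit little-endian value that starts
   at buf[offset]. The caller must ensure offset + 4 bytes are available.
   Because the value is built from individual bytes, the result does not
   depend on the byte order of the host CPU. */
uint32_t read_le32(const unsigned char *buf, size_t offset)
{
    return (uint32_t)buf[offset]
         | ((uint32_t)buf[offset + 1] << 8)
         | ((uint32_t)buf[offset + 2] << 16)
         | ((uint32_t)buf[offset + 3] << 24);
}

/* Hypothetical sanity check: a decoded file offset should fall inside
   the file. A value beyond the file size is a hint that the bytes were
   interpreted in the wrong order. */
int offset_in_file(uint32_t offset, size_t file_size)
{
    return offset < file_size;
}
```

With the bytes 00 06 00 00, read_le32 yields 0x00000600, whereas reading the same four bytes as a big-endian integer gives 0x00060000, which could easily exceed the size of a small file. If you already have the raw value in an unsigned int, NSSwapLittleIntToHost() performs the equivalent conversion.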

By the way, the Word binary file format is quite complicated, especially if you plan on supporting all of the different (and incompatible) versions.

Kind regards,

Alastair.