Re: Reading a binary file format?

  • Subject: Re: Reading a binary file format?
  • From: Stephen Hoffman <email@hidden>
  • Date: Fri, 12 Oct 2007 09:50:00 -0400
  • Organization: HoffmanLabs LLC


Keith Blount wrote:
I need to read what I assume is a binary file into my
program. I know where to expect the various parts of
the file, I'm just not sure how to read them in -
probably because I'm self-taught at Cocoa (Kochan &
Hillegass) and it requires a lower level of C, I'm not
sure.

So, suppose you have a file that contains some text
but also some (presumably binary) integer information.
If you open it up in TextEdit, you can see the text,
but there are some invisible characters that must be
the binary information that is unrecognised by
TextEdit. How would you go about reading it into a
Cocoa program? For instance, say the file has the
integers 'SCLT' and 0 as part of its header, and some
integers after it, and then 'PLST', and then some
integers and then bytes representing text data. In
TextEdit, you would see "SCLT PLST" followed by the
text. How would I go about reading all of the
information and accessing the integers hidden away in
there etc? Can I do this with NSData or NSString
methods, or do I need to delve deeper into C?

Do you have the original C code, or some idea of the record structure?


If you do, the file is probably going to be fairly trivial to read in C. The original programmer has probably issued an fopen() and a series of fwrite() calls of C record structures, then fclose() -- or a similar sequence from the other I/O calls available in the C standard library. You'd use fopen(), and a series of fread() calls processing each as a C record structure into your application until fread() returns fewer items than you asked for (feof() and ferror() then tell you whether that was end-of-file or an error), followed by an fclose().
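A minimal sketch of that read loop follows. The Record layout and the name read_records() are made up for illustration; the real layout has to come from the original source or from a dump of the file. It also assumes the file was written with raw fwrite() calls on a machine with the same byte order and structure padding as the one reading it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record layout, for illustration only. */
typedef struct {
    uint32_t id;
    uint32_t value;
} Record;

/*
 * Reads up to max records from path into out.
 * Returns the number of whole records read, or -1 if the file
 * could not be opened or a read error occurred.
 */
long read_records(const char *path, Record *out, size_t max)
{
    FILE *fp;
    size_t n = 0;
    int failed;

    assert(path != NULL && out != NULL);

    fp = fopen(path, "rb");          /* "rb": binary mode, no newline translation */
    if (fp == NULL)
        return -1;

    /* Let fread()'s return value end the loop; a short count means
       end-of-file or an error, and ferror() tells the two apart. */
    while (n < max && fread(&out[n], sizeof(Record), 1, fp) == 1)
        n++;

    failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : (long)n;
}
```

A caller would pass an array and its capacity, for example read_records("data.bin", records, 100), and then copy each field into whatever Cocoa objects the application uses.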

Depending on the requirements and the file structures, you might end up using fscanf() or such to process any text-formatted parts of the input as you read it in.

If the structures are not fixed, things get a bit more interesting. Pun intended. I've dealt with C programs that process the bytes in smaller units; where the records are variable-length, and are dependent on the run-time context. This processing is made involved and more difficult if you don't have the source code and/or the data definitions; if you're reverse-engineering the files.

It's also possible to see a combination of fixed and variable data, where the file contains a linefeed termination, or where there's a fixed header for each record and a count of bytes.
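The fixed-header-plus-byte-count case can be sketched as below. Everything about the layout here is an assumption made for illustration: a four-character tag such as 'PLST', followed by a 32-bit big-endian byte count, followed by that many payload bytes. The names Chunk, read_chunk() and MAX_PAYLOAD are hypothetical, and the real file in question may be laid out quite differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound so a corrupt count cannot trigger a huge allocation. */
#define MAX_PAYLOAD (16u * 1024u * 1024u)

typedef struct {
    char tag[5];           /* four-character tag plus a terminating NUL */
    uint32_t length;       /* number of payload bytes that follow the header */
    unsigned char *data;   /* malloc'd payload; the caller frees it */
} Chunk;

/*
 * Reads one chunk from fp.
 * Returns 1 on success, 0 at a clean end of file, -1 on error or truncation.
 */
int read_chunk(FILE *fp, Chunk *c)
{
    unsigned char hdr[8];
    size_t got;

    assert(fp != NULL && c != NULL);
    c->data = NULL;

    got = fread(hdr, 1, sizeof hdr, fp);
    if (got == 0 && feof(fp))
        return 0;                      /* no more chunks */
    if (got != sizeof hdr)
        return -1;                     /* partial header */

    memcpy(c->tag, hdr, 4);
    c->tag[4] = '\0';

    /* Assumed big-endian count, assembled byte by byte. */
    c->length = ((uint32_t)hdr[4] << 24) | ((uint32_t)hdr[5] << 16) |
                ((uint32_t)hdr[6] << 8)  |  (uint32_t)hdr[7];
    if (c->length > MAX_PAYLOAD)
        return -1;

    c->data = malloc((size_t)c->length + 1);
    if (c->data == NULL)
        return -1;
    if (fread(c->data, 1, c->length, fp) != c->length) {
        free(c->data);
        c->data = NULL;
        return -1;                     /* payload shorter than the count claims */
    }
    c->data[c->length] = '\0';         /* convenience terminator for text payloads */
    return 1;
}
```

Calling read_chunk() in a loop until it returns 0 walks the whole file; the tag tells the caller how to interpret each payload.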

One wrinkle here: you'll also need to know whether the data is little-endian, or big-endian; what the byte order is. If you know the host system that generated the file, you can usually figure that out. About half the systems around are big-endian, and the others are little-endian. (And then there's the PDP-11, but I digress.)
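One way to sidestep the host's byte order is to assemble each integer from its individual bytes, as in the sketch below. It assumes the integers are 32 bits wide, which the question does not state; the helper names read_be32() and read_le32() are made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes four bytes stored most-significant byte first (big-endian). */
uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decodes four bytes stored least-significant byte first (little-endian). */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

If the bytes 'S' 'C' 'L' 'T' appear in that order in the dump, read_be32() yields 0x53434C54 and read_le32() yields 0x544C4353. Comparing a known value against both decodings is a quick way to work out which order the file uses.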

And another wrinkle: compilers can insert pad bytes into the data. The compiler typically aligns a longword within a structure at a longword boundary; at an address that ends in %x0, %x4, %x8 or %xc. This can mean that padding bytes -- unused bytes -- are inserted into the data. Most compilers do this by default, though the details vary with the compiler and the target. It's possible for a programmer to disable this alignment, and pack the record structures. It's also possible that the programmer didn't realize the compiler inserted this data, so what you see in source code might not match what you see in the file dump.
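The padding can be inspected with offsetof(), and avoided entirely by decoding the file's bytes one field at a time rather than reading straight into a structure. The sketch below uses a hypothetical Header structure and assumes, purely for illustration, a packed 5-byte on-disk layout with a big-endian count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory structure; a 1-byte field followed by a 4-byte field. */
typedef struct {
    uint8_t  kind;
    uint32_t count;
} Header;

/* Number of pad bytes this compiler placed between kind and count. */
size_t header_padding(void)
{
    return offsetof(Header, count) - sizeof(uint8_t);
}

/*
 * Fills h from five file bytes: one byte of kind, then a big-endian
 * 32-bit count (assumed layout). Works the same however the compiler
 * pads Header, because no structure is read or written as a block.
 */
void header_from_bytes(const unsigned char *bytes, Header *h)
{
    assert(bytes != NULL && h != NULL);
    h->kind  = bytes[0];
    h->count = ((uint32_t)bytes[1] << 24) | ((uint32_t)bytes[2] << 16) |
               ((uint32_t)bytes[3] << 8)  |  (uint32_t)bytes[4];
}
```

On many common targets header_padding() reports 3, which means a raw fwrite() of this structure would put 8 bytes in the file rather than the 5 the source code suggests.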

To poke around inside the file itself, use the shell od (octal dump) or hexdump commands; hexdump -C prints the hex bytes alongside the printable characters. These will show you the bytes and byte counts, from which the underlying structures can often be discerned.

In summary, there are a whole pile of different ways to write a binary file in C. I'd not expect to be able to read a random C data file directly in Cocoa without using some C code; it is usually less effort to read the file using C and convert the fields and (I assume) records over into Cocoa structures as each is read in.

Stephen Hoffman

