Re: Reading a binary file format?
- Subject: Re: Reading a binary file format?
- From: Stephen Hoffman <email@hidden>
- Date: Fri, 12 Oct 2007 09:50:00 -0400
- Organization: HoffmanLabs LLC
Keith Blount wrote:
I need to read what I assume is a binary file into my
program. I know where to expect the various parts of
the file, I'm just not sure how to read them in -
probably because I'm self-taught at Cocoa (Kochan &
Hillegass) and it requires a lower level of C, I'm not
sure.
So, suppose you have a file that contains some text
but also some (presumably binary) integer information.
If you open it up in TextEdit, you can see the text,
but there are some invisible characters that must be
the binary information that is unrecognised by
TextEdit. How would you go about reading it into a
Cocoa program? For instance, say the file has the
integers 'SCLT' and 0 as part of its header, and some
integers after it, and then 'PLST', and then some
integers and then bytes representing text data. In
TextEdit, you would see "SCLT PLST" followed by the
text. How would I go about reading all of the
information and accessing the integers hidden away in
there etc? Can I do this with NSData or NSString
methods, or do I need to delve deeper into C?
Do you have the original C code, or some idea of the record structure?
If you do, the file is probably going to be fairly trivial to read in C.
The original programmer has probably issued an fopen(), a series of
fwrite() calls on C record structures, then an fclose() -- or a similar
sequence from the other I/O calls available in the C standard library.
You'd use fopen(), then a series of fread() calls, processing each
result as a C record structure in your application until fread() comes
up short and feof() lights up, followed by an fclose().
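
A minimal sketch of that fopen()/fread()/fclose() loop follows. The
struct record layout and the read_records() name are made up for
illustration; the real layout has to come from the original source or
from poking at the file. It also assumes the file was written on a
machine with the same byte order and structure layout as the reader.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical fixed-size record; not taken from any real file format. */
struct record {
    uint32_t id;
    uint32_t value;
};

/* Reads up to max records from path into out.
   Returns the number of whole records read, or -1 if the file
   cannot be opened. */
long read_records(const char *path, struct record *out, size_t max)
{
    assert(path != NULL && out != NULL);

    FILE *fp = fopen(path, "rb");   /* "b" matters on some platforms */
    if (fp == NULL)
        return -1;

    size_t n = 0;
    /* fread() returns the number of complete items it read. A short
       count means end-of-file or an error, so the loop tests that
       return value instead of calling feof() up front. */
    while (n < max && fread(&out[n], sizeof out[n], 1, fp) == 1)
        n++;

    fclose(fp);
    return (long)n;
}
```

Once the loop stops, feof() and ferror() can tell you whether it stopped
at a clean end-of-file or on an I/O error.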
Depending on the requirements and the file structures, and particularly
if parts of the file are formatted text rather than raw binary, you
might end up using fscanf() or such to process that input as you read
it in.
If the structures are not fixed, things get a bit more interesting. Pun
intended. I've dealt with C programs that process the bytes in smaller
units; where the records are variable-length, and are dependent on the
run-time context. This processing is made involved and more difficult
if you don't have the source code and/or the data definitions; if you're
reverse-engineering the files.
It's also possible to see a combination of fixed and variable data,
where each record ends with a linefeed terminator, or where there's a
fixed header for each record that includes a count of bytes.
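
As a sketch of the fixed-header-plus-byte-count case: the layout below
(a 4-byte tag, a 4-byte big-endian length, then the payload) is an
assumption chosen to resemble the 'SCLT'/'PLST' markers in the question,
not the actual format of that file. The names struct chunk, read_chunk()
and MAX_CHUNK are likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary sanity limit so a corrupt length can't trigger a huge malloc. */
#define MAX_CHUNK (16u * 1024u * 1024u)

/* Assumed layout: 4-byte tag, 4-byte big-endian length, payload. */
struct chunk {
    char           tag[5];   /* NUL-terminated copy of the tag */
    uint32_t       length;   /* payload size in bytes */
    unsigned char *data;     /* malloc'd payload; the caller frees it */
};

/* Reads one chunk from fp. Returns 1 on success, 0 on end-of-file,
   a truncated chunk, an oversized length, or allocation failure. */
int read_chunk(FILE *fp, struct chunk *c)
{
    unsigned char hdr[8];

    assert(fp != NULL && c != NULL);
    c->data = NULL;

    if (fread(hdr, 1, sizeof hdr, fp) != sizeof hdr)
        return 0;

    memcpy(c->tag, hdr, 4);
    c->tag[4] = '\0';
    c->length = ((uint32_t)hdr[4] << 24) | ((uint32_t)hdr[5] << 16) |
                ((uint32_t)hdr[6] << 8)  |  (uint32_t)hdr[7];
    if (c->length > MAX_CHUNK)
        return 0;

    /* Allocate at least one byte so a zero-length chunk is still valid. */
    c->data = malloc(c->length ? c->length : 1);
    if (c->data == NULL)
        return 0;

    if (fread(c->data, 1, c->length, fp) != c->length) {
        free(c->data);
        c->data = NULL;
        return 0;
    }
    return 1;
}
```

Calling read_chunk() in a loop until it returns 0 walks the whole file
one variable-length record at a time.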
One wrinkle here: you'll also need to know the byte order; whether the
data is little-endian or big-endian. If you know the host system that
generated the file, you can usually figure that out. PowerPC Macs are
big-endian, for instance, and Intel Macs are little-endian, so a file
written on one will not read back correctly on the other without
swapping. (And then there's the PDP-11, but I digress.)
And another wrinkle: compilers can insert pad bytes into the data.
The compiler typically aligns a longword within a structure at a
longword boundary; at an address that ends in hexadecimal 0, 4, 8 or C.
This can mean that padding bytes -- unused bytes -- are inserted into
the data. Many compilers do this by default, though the details vary by
compiler and platform. It's possible for a programmer to disable this
alignment, and pack the record structures. It's also possible that the
programmer didn't realize the compiler inserted this data, so what you
see in source code might not match what you see in the file dump.
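
To see whether padding is in play, compare offsetof() and sizeof()
against the sum of the field sizes. The sketch below uses a made-up
struct header and assumes, purely for illustration, that the on-disk
form is 5 packed bytes with a little-endian count; decoding field by
field means the compiler's in-memory padding no longer matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory header. Many compilers place 3 pad bytes
   after 'kind' so that 'count' starts on a 4-byte boundary. */
struct header {
    uint8_t  kind;
    uint32_t count;
};

/* Number of pad bytes the compiler put between 'kind' and 'count'. */
size_t header_padding(void)
{
    return offsetof(struct header, count) - sizeof(uint8_t);
}

/* Decodes an assumed 5-byte packed on-disk header (1-byte kind,
   4-byte little-endian count) one field at a time. */
void decode_header(const unsigned char *buf, struct header *h)
{
    assert(buf != NULL && h != NULL);
    h->kind  = buf[0];
    h->count =  (uint32_t)buf[1]        | ((uint32_t)buf[2] << 8) |
               ((uint32_t)buf[3] << 16) | ((uint32_t)buf[4] << 24);
}
```

If header_padding() is non-zero and the file was written with a packed
layout (or the reverse), an fread() straight into the structure will
misalign every field after the first.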
To poke around inside the file itself, use the shell od (octal dump) or
hexdump commands; hexdump -C puts the hex bytes and their ASCII
equivalents side by side. These will show you the bytes and their
offsets, from which the underlying structures can often be discerned.
In summary, there are a whole pile of different ways to write a binary
file in C. I'd not expect to be able to read an arbitrary C data file
directly in Cocoa without using some C code, or at least not without
more effort than it takes to read it in C and convert the fields and
(I assume) records over into Cocoa structures as each is read in.
Stephen Hoffman
_______________________________________________
Cocoa-dev mailing list (email@hidden)