Re: mach-o section question
- Subject: Re: mach-o section question
- From: Andrew Myrick <email@hidden>
- Date: Mon, 21 Apr 2008 09:49:18 -0700
Responses inline, with the caveat that my Mach-O experience is
primarily with kext linking and loading, so my comments on user-space
behavior (questions 3-6) are wildly speculative. I hope they are
still of some use.
On Apr 21, 2008, at 5:35 AM, Army Research Lab wrote:
I started reading it over the weekend, and have a few questions
already:
1) I'm slightly confused by the ordering of sections and segments; page
23 of the PDF of the Mac OS X ABI Mach-O File Format Reference (the PDF
complement of the link Andrew sent me above) says that the section
commands for a segment directly follow the segment they are a part of,
but Fig. 1 implies all the segments are grouped together, and then all
the sections follow after that; which is happening?
Each segment command header is followed directly by its section
_headers_. The figure is somewhat confusing (but correct), because it
considers the section headers to be part of the segment command and
does not illustrate them. The sections that are illustrated are the
section _data_, that is, the data that the section headers describe.
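To make the layout concrete, here is a rough Python sketch of walking
one segment command and the section headers packed right behind it. It
assumes the 32-bit, little-endian forms of the segment_command and
section structs; parse_segment and build_demo_segment are hypothetical
helper names, and the input is a synthetic blob rather than a real
binary.

```python
import struct

LC_SEGMENT = 0x1  # 32-bit segment load command

# Assumed little-endian layouts of the 32-bit structs.
# segment_command: cmd, cmdsize, segname[16], vmaddr, vmsize, fileoff,
#                  filesize, maxprot, initprot, nsects, flags (56 bytes)
SEGMENT_CMD = struct.Struct("<II16sIIIIiiII")
# section: sectname[16], segname[16], addr, size, offset, align, reloff,
#          nreloc, flags, reserved1, reserved2 (68 bytes)
SECTION_HDR = struct.Struct("<16s16sIIIIIIIII")


def parse_segment(buf, offset=0):
    """Return (segment name, section names, offset of the next command)."""
    (cmd, cmdsize, segname, _vmaddr, _vmsize, _fileoff, _filesize,
     _maxprot, _initprot, nsects, _flags) = SEGMENT_CMD.unpack_from(buf, offset)
    if cmd != LC_SEGMENT:
        raise ValueError("not an LC_SEGMENT command")
    names = []
    # The section headers start immediately after the fixed-size header.
    pos = offset + SEGMENT_CMD.size
    for _ in range(nsects):
        sectname = SECTION_HDR.unpack_from(buf, pos)[0]
        names.append(sectname.split(b"\0", 1)[0].decode("ascii"))
        pos += SECTION_HDR.size
    # cmdsize covers the segment header plus all of its section headers.
    return segname.split(b"\0", 1)[0].decode("ascii"), names, offset + cmdsize


def build_demo_segment():
    """Build a synthetic __TEXT segment command with two section headers."""
    sections = [b"__text", b"__cstring"]
    cmdsize = SEGMENT_CMD.size + len(sections) * SECTION_HDR.size
    blob = SEGMENT_CMD.pack(LC_SEGMENT, cmdsize, b"__TEXT", 0, 0, 0, 0,
                            7, 5, len(sections), 0)
    for name in sections:
        blob += SECTION_HDR.pack(name, b"__TEXT", 0, 0, 0, 0, 0, 0, 0, 0, 0)
    return blob
```

Calling parse_segment(build_demo_segment()) yields the segment name, its
section names, and the offset at which the next load command would
begin. The section data itself lives elsewhere in the file, at the
offset recorded in each section header.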
2) I know that sections are numbered starting at 1; what happens if I
create a section with a number of 0? If I created my own linker, would
I need to reserve space for a section 0?
Section 0 is reserved for a special purpose. The nlist structure that
describes symbols has an n_sect field which specifies the section in
which a symbol lives. The value 0 is a reserved value called NO_SECT
used for symbols that don't have a section, such as absolute or
undefined symbols.
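A reader might handle that reserved value along these lines. This is
only a sketch: it assumes the 32-bit little-endian nlist layout, and
section_for_symbol is a hypothetical helper that takes the sections
gathered from all segments in load-command order.

```python
import struct

NO_SECT = 0    # n_sect value meaning "symbol is in no section"
N_EXT = 0x01   # external symbol bit in n_type
N_UNDF = 0x0   # type bits: undefined symbol
N_SECT = 0xe   # type bits: defined in section number n_sect

# Assumed 32-bit nlist layout: n_strx, n_type, n_sect, n_desc, n_value.
NLIST = struct.Struct("<IBBhI")


def section_for_symbol(entry, sections):
    """Map a packed nlist entry to its section, or None for NO_SECT."""
    _n_strx, _n_type, n_sect, _n_desc, _n_value = NLIST.unpack(entry)
    if n_sect == NO_SECT:
        return None
    # Section numbers are 1-based, so ordinal 1 is list index 0.
    return sections[n_sect - 1]
```

Because 0 never names a real section, a writer does not need to emit a
placeholder section for it; the numbering simply begins at 1.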
3) According to the note on page 12, tools are allowed to define
additional section names; do they need to be registered with Apple in
some way to prevent namespace problems?
There isn't a registration requirement that I know of, though perhaps
someone else can chime in if I'm wrong. I would recommend some sort
of prefix to avoid namespace collision. Keep it short since you only
get 16 bytes :)
4) There is an __OBJC segment; are we allowed to create additional
segment names, or will that throw off the loader? If we can create new
segments, then question #3 can go away, as anything that needs to be
tossed into its own section can go into the segment instead...
It will probably work just fine (try it and see), but if you don't
need special vm protections or have another compelling need for a new
segment to justify the memory penalty of page-alignment, I would
recommend sticking with a section.
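To put a rough number on that penalty, the following sketch rounds a
payload up to a whole page. The 4096-byte page size is an assumption,
and page_align and segment_overhead are hypothetical helper names.

```python
PAGE_SIZE = 4096  # assumed page size; check the target platform


def page_align(size, page_size=PAGE_SIZE):
    """Round size up to the next multiple of page_size."""
    return (size + page_size - 1) // page_size * page_size


def segment_overhead(payload_size, page_size=PAGE_SIZE):
    """Bytes of padding needed if the payload gets its own segment."""
    return page_align(payload_size, page_size) - payload_size
```

Under that assumption, 100 bytes of data placed in its own segment
would occupy a full page, where the same data in a section of an
existing segment would only need its own 100 bytes plus any alignment
padding.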
5) The uuid_command referenced on page 20 uses a 128 bit UUID; is this
expected to be the output of uuid_generate(), or will any 128 bit UUID
generator work (in short, is there hidden info in the UUID that I need
to keep straight)?
I believe that uuid_generate() simply ensures that the UUID is
sufficiently random, so you can likely put whatever you want in the
uuid_command with the understanding that if you are not actually
unique (as much as one can be in a finite namespace), dire
consequences may result.
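For a pure-Python writer, one plausible approach is to take the 16 raw
bytes from the standard uuid module. This is a sketch that assumes a
little-endian uuid_command of cmd, cmdsize, and uuid[16];
make_uuid_command is a hypothetical helper name.

```python
import struct
import uuid

LC_UUID = 0x1b
# Assumed little-endian layout: cmd, cmdsize, uuid[16] (24 bytes).
UUID_CMD = struct.Struct("<II16s")


def make_uuid_command(value=None):
    """Pack an LC_UUID load command, generating a random UUID if needed."""
    if value is None:
        value = uuid.uuid4()  # random UUID from the standard library
    # value.bytes is the 16-byte form, copied into the command verbatim.
    return UUID_CMD.pack(LC_UUID, UUID_CMD.size, value.bytes)
```

Passing an explicit uuid.UUID keeps the output reproducible for tests;
leaving it out generates a fresh random value per call.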
6) I notice that there are a number of structs (e.g. section, defined
on page 23) that have 16 element char arrays for their section names,
and the docs say ASCII characters; that is a little imprecise, which
set of characters are allowed? Also, does the string have to be NULL
terminated? More importantly, should all characters that aren't part of
the name be set to NULL? Pure guess suggests that you'd memset() to 0,
and then only use characters for which isgraph() returns true.
The link editor is likely using standard C-string functions like
strncmp and strlcpy to process these fields, so anything those
functions can deal with is probably safe. However, I would personally
stick to the set of characters that the documented names seem to make
use of, which is [0-9A-Za-z_]. The string should be null-terminated.
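In Python, a conservative take on that advice might look like the
sketch below. pack_name and unpack_name are hypothetical helpers; the
writer zero-fills the field and leaves room for a terminator, while the
reader also accepts a field with no NUL at all, in case some other tool
filled all 16 bytes.

```python
import re

NAME_FIELD_SIZE = 16
# Conservative character set suggested above.
SAFE_NAME = re.compile(r"[0-9A-Za-z_]+")


def pack_name(name):
    """Encode a section or segment name into a zero-filled 16-byte field."""
    if not SAFE_NAME.fullmatch(name):
        raise ValueError("name must match [0-9A-Za-z_]+")
    raw = name.encode("ascii")
    # Keep at least one trailing NUL so C string functions stay safe.
    if len(raw) > NAME_FIELD_SIZE - 1:
        raise ValueError("name must be at most 15 characters")
    return raw.ljust(NAME_FIELD_SIZE, b"\0")


def unpack_name(field):
    """Decode a 16-byte name field, stopping at the first NUL if present."""
    return field.split(b"\0", 1)[0].decode("ascii")
```

For example, pack_name("__mytool_data") gives the 13 name bytes
followed by three zero bytes, and unpack_name reverses it.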
In case you're all wondering why I'm asking so many questions, it
occurred to me over the weekend that one of the best ways of learning
about a file format is to make a full reader/writer for it, from
scratch. I doubt I'll really complete such a project, but I'd like to
start on it at least. My goal is to write it in pure Python; that
prevents me from cheating, and thinking I know how something works,
when I really don't. It also means that there will be a second
implementation of the written standard out there, which should turn up
any bugs in the docs, and may also turn up bugs in the loader.
It would be quite handy if one could someday "import macho" and profit
from this hard work. Please file Radars against the documentation
whenever it seems incorrect or unclear. Best of luck.
-Andrew