Re: mach-o section question
- Subject: Re: mach-o section question
- From: Army Research Lab <email@hidden>
- Date: Mon, 21 Apr 2008 13:26:21 -0400
- Thread-topic: mach-o section question
On 4/21/08 12:49 PM, "Andrew Myrick" <email@hidden> wrote:
> Responses inline, with the caveat that my Mach-O experience is
> primarily with kext linking and loading, so my comments on user-space
> behavior (questions 3-6) are wildly speculative. I hope they are
> still of some use.
A question: are the people responsible for maintaining "Mac OS X ABI Mach-O
File Format Reference" and "Mach-O Programming Topics" on this list? It
might help, if and when we file bugs, to be able to point to this whole
thread.
> On Apr 21, 2008, at 5:35 AM, Army Research Lab wrote:
>>
>> I started reading it over the weekend, and have a few questions
>> already:
>>
>> 1) I'm slightly confused by the ordering of sections and segments;
>> page 23 of the PDF of the Mac OS X ABI Mach-O File Format Reference
>> (the PDF complement of the link Andrew sent me above) says that the
>> section commands for a segment directly follow the segment they are
>> a part of, but Fig. 1 implies all the segments are grouped together,
>> and then all the sections follow after that; which is happening?
>
> Each segment command header is followed directly by its section
> _headers_. The figure is somewhat confusing (but correct), because it
> considers the section headers to be part of the segment command and
> does not illustrate them. The sections that are illustrated are the
> section _data_, that is, the data that the section headers describe.
OK, that was what I thought was going on. The problem is that when I first
saw the figure, I thought the following was happening:
--- Load commands
    segment
    segment
    segment
    section 1
    section 2
    section 3
--- Data
Hmmm... I guess the first bug I need to file is a request for some fairly
complete example layouts of Mach-O files (sans actual data/code). I'll
do that tonight.
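The layout Andrew describes (each segment command immediately followed by its own section headers) can be sketched in Python. This is only an illustration under stated assumptions: it uses the 32-bit `segment_command` and `section` structures, assumes little-endian byte order, and runs on a synthetic blob rather than a real file. The helper names `build_segment` and `walk` are made up for this example.

```python
import struct

LC_SEGMENT = 0x1                    # 32-bit segment load command
SEGMENT_FMT = "<II16sIIIIiiII"      # struct segment_command (assumed little-endian)
SECTION_FMT = "<16s16sIIIIIIIII"    # struct section
SEGMENT_SIZE = struct.calcsize(SEGMENT_FMT)   # 56 bytes
SECTION_SIZE = struct.calcsize(SECTION_FMT)   # 68 bytes


def _name(raw):
    """Decode a 16-byte name field, stopping at the first zero byte."""
    return raw.split(b"\0", 1)[0].decode("ascii")


def build_segment(segname, sectnames):
    """Hypothetical helper: one segment command plus its section headers."""
    sections = b"".join(
        struct.pack(SECTION_FMT, s.encode("ascii"), segname.encode("ascii"),
                    0, 0, 0, 0, 0, 0, 0, 0, 0)
        for s in sectnames
    )
    cmdsize = SEGMENT_SIZE + len(sections)   # cmdsize covers the section headers too
    header = struct.pack(SEGMENT_FMT, LC_SEGMENT, cmdsize,
                         segname.encode("ascii"), 0, 0, 0, 0, 7, 5,
                         len(sectnames), 0)
    return header + sections


def walk(load_commands):
    """Return the segments and section headers in the order they appear."""
    found = []
    offset = 0
    while offset < len(load_commands):
        cmd, cmdsize = struct.unpack_from("<II", load_commands, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command size")
        if cmd == LC_SEGMENT:
            fields = struct.unpack_from(SEGMENT_FMT, load_commands, offset)
            found.append(("segment", _name(fields[2])))
            sect_off = offset + SEGMENT_SIZE
            for _ in range(fields[9]):       # fields[9] is nsects
                sect = struct.unpack_from(SECTION_FMT, load_commands, sect_off)
                found.append(("section", _name(sect[0])))
                sect_off += SECTION_SIZE
        offset += cmdsize                    # jump to the next load command
    return found


blob = build_segment("__TEXT", ["__text", "__cstring"]) + build_segment("__DATA", ["__data"])
for kind, name in walk(blob):
    print(kind, name)
```

The section headers are interleaved with their segments in the load command area; the section data they describe lives later in the file and is not modelled here.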
>> 2) I know that sections are numbered starting at 1; what happens if I
>> create a section with a number of 0? If I create my own linker,
>> would I need to reserve space for a section 0?
>
> Section 0 is reserved for a special purpose. The nlist structure that
> describes symbols has an n_sect field which specifies the section in
> which a symbol lives. The value 0 is a reserved value called NO_SECT
> used for symbols that don't have a section, such as absolute or
> undefined symbols.
OK, I thought so.
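A minimal sketch of how a reader might treat `n_sect`, assuming the 32-bit `nlist` layout and little-endian byte order. The function name `symbol_section` and the sample section list are hypothetical; the point is only that 0 means "no section" and that real section numbers are 1-based.

```python
import struct

NLIST_FMT = "<IBBhI"   # n_strx, n_type, n_sect, n_desc, n_value (32-bit nlist)
NO_SECT = 0            # reserved: symbol is not in any section
N_UNDF = 0x0           # n_type value for an undefined symbol
N_SECT = 0xE           # n_type value for a symbol defined in section n_sect


def symbol_section(raw, section_names):
    """Return the name of the section holding the symbol, or None for NO_SECT."""
    n_strx, n_type, n_sect, n_desc, n_value = struct.unpack(NLIST_FMT, raw)
    if n_sect == NO_SECT:
        return None
    # Section numbers start at 1, so subtract 1 to index a Python list.
    return section_names[n_sect - 1]


sections = ["__text", "__data"]
defined = struct.pack(NLIST_FMT, 1, N_SECT, 2, 0, 0x1000)
undefined = struct.pack(NLIST_FMT, 7, N_UNDF, NO_SECT, 0, 0)
print(symbol_section(defined, sections))
print(symbol_section(undefined, sections))
```

Under this scheme a writer does not need to emit a placeholder "section 0"; it only needs to avoid numbering a real section 0.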
>> 3) According to the note on page 12, tools are allowed to define
>> additional section names; do they need to be registered with Apple
>> in some way to prevent namespace problems?
>
> There isn't a registration requirement that I know of, though perhaps
> someone else can chime in if I'm wrong. I would recommend some sort
> of prefix to avoid namespace collision. Keep it short since you only
> get 16 bytes :)
>
>>
>>
>> 4) There is an __OBJC segment; are we allowed to create additional
>> segment names, or will that throw off the loader? If we can create
>> new segments, then question #3 can go away, as anything that needs
>> to be tossed into its own section can go into the segment instead...
>
> It will probably work just fine (try it and see), but if you don't
> need special vm protections or have another compelling need for a new
> segment to justify the memory penalty of page-alignment, I would
> recommend sticking with a section.
Well, the only thing I'm thinking about is another long-running mental
exercise I've had; I'm working out a new (hopefully better) programming
language. If I go to the effort of reading/writing the Mach-O file format,
then I may as well take advantage of it and see if I can't write a compiler
to put something into it! :D And, like Objective-C, I'm going to need my
own runtime, so it would be nice to be able to do this.
>> 5) The uuid_command referenced on page 20 uses a 128-bit UUID; is
>> this expected to be the output of uuid_generate(), or will any
>> 128-bit UUID generator work (in short, is there hidden info in the
>> UUID that I need to keep straight)?
>
> I believe that uuid_generate() simply ensures that the UUID is
> sufficiently random, so you can likely put whatever you want in the
> uuid_command with the understanding that if you are not actually
> unique (as much as one can be in a finite namespace), dire
> consequences may result.
Is there any way to detect a collision other than by 'dire
consequences'? I can think of a few ideas on how to protect against
problems, but they require the loader to be notified any time a new Mach-O
file is loaded onto the system so it can check and, if necessary, modify
UUIDs for use on a particular system.
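For a pure-Python writer, a `uuid_command` could be produced along these lines. This is a sketch, not a statement of what the loader requires: it assumes little-endian byte order and uses Python's `uuid.uuid4()` (a random UUID) as a stand-in for `uuid_generate()`. Whether any tool inspects the bits inside the UUID is not settled by this thread. The helper names are made up for the example.

```python
import struct
import uuid

LC_UUID = 0x1B                 # load command type for uuid_command
UUID_CMD_FMT = "<II16s"        # cmd, cmdsize, uuid[16] (assumed little-endian)
UUID_CMD_SIZE = struct.calcsize(UUID_CMD_FMT)   # 24 bytes


def make_uuid_command(value=None):
    """Pack a uuid_command; a random UUID is generated if none is given."""
    if value is None:
        value = uuid.uuid4()   # assumption: any sufficiently random UUID will do
    return struct.pack(UUID_CMD_FMT, LC_UUID, UUID_CMD_SIZE, value.bytes)


def read_uuid_command(raw):
    """Unpack a uuid_command and return its UUID."""
    cmd, cmdsize, raw_uuid = struct.unpack(UUID_CMD_FMT, raw)
    if cmd != LC_UUID or cmdsize != UUID_CMD_SIZE:
        raise ValueError("not a uuid_command")
    return uuid.UUID(bytes=raw_uuid)


command = make_uuid_command()
print(len(command), read_uuid_command(command))
```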
>> 6) I notice that there are a number of structs (e.g. section,
>> defined on page 23) that have 16-element char arrays for their
>> section names, and the docs say ASCII characters; that is a little
>> imprecise: which set of characters is allowed? Also, does the
>> string have to be NUL-terminated? More importantly, should all
>> characters that aren't part of the name be set to NUL? Pure guess
>> suggests that you'd memset() to 0, and then only use characters for
>> which isgraph() returns true.
>
> The link editor is likely using standard C-string functions like
> strncmp and strlcpy to process these fields, so anything those
> functions can deal with is probably safe. However, I would personally
> stick to the set of characters that the documented names seem to make
> use of, which is [0-9A-Za-z_]. The string should be null-terminated.
OK, this is about what I thought I should do.
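The conservative approach discussed above (zero-fill the field, stick to `[0-9A-Za-z_]`, keep a terminator) might look like this in Python. The 15-character limit follows Andrew's advice that the string be null-terminated; it is a deliberately cautious choice for this sketch, not a claim about what the link editor enforces. `pack_name` and `unpack_name` are hypothetical helper names.

```python
import re

NAME_LEN = 16                              # size of sectname/segname fields
_ALLOWED = re.compile(r"[0-9A-Za-z_]+")    # character set suggested in the thread


def pack_name(name):
    """Return a 16-byte, zero-padded field for a section or segment name."""
    if not _ALLOWED.fullmatch(name):
        raise ValueError("name must use only [0-9A-Za-z_]: %r" % name)
    if len(name) > NAME_LEN - 1:
        # Conservative: always leave room for at least one terminating zero byte.
        raise ValueError("name longer than %d characters: %r" % (NAME_LEN - 1, name))
    # ljust pads with zero bytes, the equivalent of memset() to 0 then copying.
    return name.encode("ascii").ljust(NAME_LEN, b"\0")


def unpack_name(raw):
    """Recover the name from a 16-byte field, stopping at the first zero byte."""
    return raw[:NAME_LEN].split(b"\0", 1)[0].decode("ascii")


field = pack_name("__mylang_rt")
print(field, unpack_name(field))
```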
>> In case you're all wondering why I'm asking so many questions, it
>> occurred to me over the weekend that one of the best ways of
>> learning about a file format is to make a full reader/writer for
>> it, from scratch. I doubt I'll really complete such a project, but
>> I'd like to start on it at least. My goal is to write it in pure
>> Python; that prevents me from cheating, and thinking I know how
>> something works, when I really don't. It also means that there
>> will be a second implementation of the written standard out there,
>> which should turn up any bugs in the docs, and may also turn up
>> bugs in the loader.
>>
>>
>
> It would be quite handy if one could someday "import macho" and profit
> from this hard work. Please file Radars against the documentation
> whenever it seems incorrect or unclear. Best of luck.
I agree; I'd like to release it someday (assuming I get anywhere with it!).
Thanks,
Cem Karan