Re: mach-o section question

Subject: Re: mach-o section question
From: Andrew Myrick <email@hidden>
Date: Mon, 21 Apr 2008 09:49:18 -0700

Responses inline, with the caveat that my Mach-O experience is primarily with kext linking and loading, so my comments on user-space behavior (questions 3-6) are wildly speculative. I hope they are still of some use.

On Apr 21, 2008, at 5:35 AM, Army Research Lab wrote:

I started reading it over the weekend, and have a few questions already:

1) I'm slightly confused by the ordering of sections and segments. Page 23 of the PDF of the Mac OS X ABI Mach-O File Format Reference (the PDF complement of the link Andrew sent me above) says that the section commands for a segment directly follow the segment they are a part of, but Fig. 1 implies that all the segments are grouped together and all the sections follow after that. Which is happening?

Each segment command header is followed directly by its section _headers_. The figure is somewhat confusing (but correct), because it considers the section headers to be part of the segment command and does not illustrate them. The sections that are illustrated are the section _data_, that is, the data that the section headers describe.
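As a rough illustration of that layout, here is a minimal Python sketch that walks one segment command and the section headers packed directly behind it. It assumes the 32-bit, little-endian forms of struct segment_command and struct section from <mach-o/loader.h>; the helper names (parse_segment, build_demo, field_name) are hypothetical, and the input is a synthetic buffer rather than a real binary.

```python
import struct

LC_SEGMENT = 0x1
# Assumed layouts: 32-bit, little-endian structs from <mach-o/loader.h>.
SEGMENT_FMT = "<II16sIIIIiiII"    # struct segment_command, 56 bytes
SECTION_FMT = "<16s16sIIIIIIIII"  # struct section, 68 bytes
SEGMENT_SIZE = struct.calcsize(SEGMENT_FMT)
SECTION_SIZE = struct.calcsize(SECTION_FMT)


def field_name(raw):
    """Strip NUL padding from a 16-byte name field."""
    return raw.split(b"\0", 1)[0].decode("ascii")


def parse_segment(buf, offset=0):
    """Return (segment name, [section names]) for one LC_SEGMENT command."""
    fields = struct.unpack_from(SEGMENT_FMT, buf, offset)
    cmd, cmdsize, segname, nsects = fields[0], fields[1], fields[2], fields[9]
    if cmd != LC_SEGMENT:
        raise ValueError("not an LC_SEGMENT command")
    if cmdsize < SEGMENT_SIZE + nsects * SECTION_SIZE:
        raise ValueError("cmdsize does not cover the section headers")
    sections = []
    pos = offset + SEGMENT_SIZE  # section headers begin right after the segment header
    for _ in range(nsects):
        sectname = struct.unpack_from(SECTION_FMT, buf, pos)[0]
        sections.append(field_name(sectname))
        pos += SECTION_SIZE
    return field_name(segname), sections


def build_demo():
    """Build a synthetic __TEXT command with two section headers."""
    sects = [b"__text", b"__cstring"]
    cmdsize = SEGMENT_SIZE + len(sects) * SECTION_SIZE
    buf = struct.pack(SEGMENT_FMT, LC_SEGMENT, cmdsize, b"__TEXT",
                      0, 0, 0, 0, 7, 5, len(sects), 0)
    for s in sects:
        # Only the two name fields are filled in; the nine numeric fields are zero.
        buf += struct.pack(SECTION_FMT, s, b"__TEXT", *([0] * 9))
    return buf


print(parse_segment(build_demo()))
```

The section data itself lives elsewhere in the file, at the offset recorded in each section header, which is why the figure can show it after all of the load commands.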

2) I know that sections are numbered starting at 1; what happens if I create a section with a number of 0? If I created my own linker, would I need to reserve space for a section 0?

Section 0 is reserved for a special purpose. The nlist structure that describes symbols has an n_sect field which specifies the section in which a symbol lives. The value 0 is a reserved value called NO_SECT used for symbols that don't have a section, such as absolute or undefined symbols.
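A small Python sketch of how a reader might treat n_sect, assuming the 32-bit, little-endian struct nlist layout and the N_TYPE, N_UNDF, N_ABS and N_SECT constants from <mach-o/nlist.h>. The function name symbol_home is hypothetical, and the entries are packed by hand for the demonstration.

```python
import struct

NLIST_FMT = "<IBBhI"  # n_strx, n_type, n_sect, n_desc, n_value (12 bytes, 32-bit)
NO_SECT = 0           # reserved: the symbol is not in any section
N_TYPE = 0x0E         # mask for the type bits of n_type
N_UNDF = 0x0          # undefined symbol
N_ABS = 0x2           # absolute symbol
N_SECT = 0xE          # defined in the section numbered n_sect


def symbol_home(entry):
    """Describe where a 12-byte nlist entry says its symbol lives."""
    n_strx, n_type, n_sect, n_desc, n_value = struct.unpack(NLIST_FMT, entry)
    kind = n_type & N_TYPE
    if kind == N_SECT:
        if n_sect == NO_SECT:
            raise ValueError("an N_SECT symbol needs a section number >= 1")
        return "section %d" % n_sect
    if kind == N_UNDF:
        return "undefined"
    if kind == N_ABS:
        return "absolute"
    return "other"


undefined = struct.pack(NLIST_FMT, 1, N_UNDF, NO_SECT, 0, 0)
in_text = struct.pack(NLIST_FMT, 9, N_SECT, 1, 0, 0x1000)
print(symbol_home(undefined), "/", symbol_home(in_text))
```

Because 0 already means "no section", a writer has nothing to reserve in the file: section numbering simply starts at 1 with the first section header of the first segment.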

3) According to the note on page 12, tools are allowed to define additional section names; do they need to be registered with Apple in some way to prevent namespace problems?

There isn't a registration requirement that I know of, though perhaps someone else can chime in if I'm wrong. I would recommend some sort of prefix to avoid namespace collision. Keep it short since you only get 16 bytes :)

4) There is an __OBJC segment; are we allowed to create additional segment names, or will that throw off the loader? If we can create new segments, then question #3 can go away, as anything that needs to be tossed into its own section can go into the segment instead...

It will probably work just fine (try it and see), but if you don't need special vm protections or have another compelling need for a new segment to justify the memory penalty of page-alignment, I would recommend sticking with a section.
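To put a number on the page-alignment penalty, here is a tiny Python sketch. The 4096-byte page size is an assumption (check the target platform), and vm_footprint is a hypothetical helper, not part of any tool.

```python
PAGE_SIZE = 4096  # assumed page size; verify for the target architecture


def vm_footprint(nbytes, page_size=PAGE_SIZE):
    """Round a byte count up to a whole number of pages."""
    return -(-nbytes // page_size) * page_size


# 100 bytes in a segment of their own still occupy a full page.
print(vm_footprint(100), vm_footprint(4097))
```

The same 100 bytes placed in a new section of an existing segment would only grow that segment by roughly the section's size plus its alignment padding.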

5) The uuid_command referenced on page 20 uses a 128 bit UUID; is this expected to be the output of uuid_generate(), or will any 128 bit UUID generator work (in short, is there hidden info in the UUID that I need to keep straight)?

I believe that uuid_generate() simply ensures that the UUID is sufficiently random, so you can likely put whatever you want in the uuid_command with the understanding that if you are not actually unique (as much as one can be in a finite namespace), dire consequences may result.
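For a pure-Python writer, one possible approach (a sketch, not a statement of what Apple's tools do) is to take a random UUID from the standard uuid module and pack it into the load command. The layout assumed here is struct uuid_command from <mach-o/loader.h> (cmd, cmdsize, 16 UUID bytes) with LC_UUID equal to 0x1b, written little-endian; make_uuid_command is a hypothetical name.

```python
import struct
import uuid

LC_UUID = 0x1B
UUID_CMD_FMT = "<II16s"  # cmd, cmdsize, uuid[16] (24 bytes)


def make_uuid_command(value=None):
    """Pack an LC_UUID load command; a random UUID is generated if none is given."""
    if value is None:
        value = uuid.uuid4()  # random (version 4) UUID from the standard library
    return struct.pack(UUID_CMD_FMT, LC_UUID,
                       struct.calcsize(UUID_CMD_FMT), value.bytes)


command = make_uuid_command()
cmd, cmdsize, raw = struct.unpack(UUID_CMD_FMT, command)
print(hex(cmd), cmdsize, uuid.UUID(bytes=raw))
```

Passing an explicit uuid.UUID makes the output reproducible, which can be handy when comparing a writer's output byte for byte against another tool.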

6) I notice that there are a number of structs (e.g. section, defined on page 23) that have 16-element char arrays for their section names, and the docs say ASCII characters. That is a little imprecise: which set of characters is allowed? Also, does the string have to be NUL-terminated? More importantly, should all characters that aren't part of the name be set to NUL? Pure guess suggests that you'd memset() to 0, and then only use characters for which isgraph() returns true.

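The link editor is likely using standard C-string functions like strncmp and strlcpy to process these fields, so anything those functions can deal with is probably safe. However, I would personally stick to the set of characters that the documented names seem to make use of, which is [0-9A-Za-z_]. The string should be null-terminated whenever it is shorter than 16 bytes, with the unused bytes zeroed; a name that fills all 16 bytes (such as __gcc_except_tab) has no room for a terminator, so readers should never scan past the end of the field.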
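A minimal Python sketch of a conservative name encoder along these lines. The character set [0-9A-Za-z_] is the recommendation above, not a rule taken from the specification, and pack_name and unpack_name are hypothetical helpers. Note that a name which is exactly 16 characters long fills the field and carries no terminator, so the reader must stop at the field boundary.

```python
import re
import struct

# Conservative character set suggested above; an assumption, not a documented rule.
NAME_RE = re.compile(r"[0-9A-Za-z_]{1,16}")


def pack_name(name):
    """Encode a segment or section name into its 16-byte, NUL-padded field."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("unsupported name: %r" % name)
    # The "16s" format pads short values with NUL bytes up to 16 bytes.
    return struct.pack("16s", name.encode("ascii"))


def unpack_name(field):
    """Decode a 16-byte name field, stopping at the first NUL if there is one."""
    return field[:16].split(b"\0", 1)[0].decode("ascii")


short = pack_name("__text")
full = pack_name("__gcc_except_tab")  # exactly 16 characters, no terminator
print(short, full, unpack_name(full))
```

Zero-filling the unused bytes, as the question guesses, falls out of the "16s" format for free and keeps the output deterministic.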

In case you're all wondering why I'm asking so many questions, it occurred to me over the weekend that one of the best ways of learning about a file format is to make a full reader/writer for it, from scratch. I doubt I'll really complete such a project, but I'd like to start on it at least. My goal is to write it in pure Python; that prevents me from cheating, and thinking I know how something works, when I really don't. It also means that there will be a second implementation of the written standard out there, which should turn up any bugs in the docs, and may also turn up bugs in the loader.

It would be quite handy if one could someday "import macho" and profit from this hard work. Please file Radars against the documentation whenever it seems incorrect or unclear. Best of luck.

-Andrew
