Thread-topic: mach-o section question

On 4/21/08 12:49 PM, "Andrew Myrick" <amyrick@apple.com> wrote:
Responses inline, with the caveat that my Mach-O experience is primarily with kext linking and loading, so my comments on user-space behavior (questions 3-6) are wildly speculative. I hope they are still of some use.
A question: are the people responsible for maintaining the "Mac OS X ABI Mach-O File Format Reference" and "Mach-O Programming Topics" on this list? It might help, if and when we file bugs, to be able to point to this whole thread.
On Apr 21, 2008, at 5:35 AM, Army Research Lab wrote:
I started reading it over the weekend, and have a few questions already:
1) I'm slightly confused by the ordering of sections and segments. Page 23 of the PDF of the Mac OS X ABI Mach-O File Format Reference (the PDF complement of the link Andrew sent me above) says that the section commands for a segment directly follow the segment they are a part of, but Fig. 1 implies that all the segments are grouped together, and then all the sections follow after that. Which is happening?
Each segment command header is followed directly by its section _headers_. The figure is somewhat confusing (but correct), because it considers the section headers to be part of the segment command and does not illustrate them. The sections that are illustrated are the section _data_, that is, the data that the section headers describe.
OK, that was what I thought was going on. The problem is that when I first saw the figure, I thought the following was happening:

--- Load commands
segment
segment
segment
section 1
section 2
section 3
--- Data

Hmmm... I guess the first bug I need to file is the need for some fairly complete examples of layouts of Mach-O files (sans actual data/code). I'll do that tonight.
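To make the layout Andrew describes concrete, here is a minimal Python sketch that builds two segment commands, each followed directly by its own section headers, and then walks them back. It assumes the 32-bit, little-endian struct layouts (segment_command and section) and the LC_SEGMENT value of 0x1; the helper names pack_segment and parse_segments are made up for illustration, and all addresses and sizes are left at zero.

```python
import struct

LC_SEGMENT = 0x1  # 32-bit segment load command

# Little-endian 32-bit layouts (assumed field order):
# cmd, cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
# maxprot, initprot, nsects, flags
SEGMENT_FMT = "<II16sIIIIiiII"
# sectname, segname, addr, size, offset, align, reloff, nreloc,
# flags, reserved1, reserved2
SECTION_FMT = "<16s16sIIIIIIIII"
SEGMENT_SIZE = struct.calcsize(SEGMENT_FMT)
SECTION_SIZE = struct.calcsize(SECTION_FMT)


def pack_segment(segname, sectnames):
    """Build one segment command immediately followed by its section headers."""
    cmdsize = SEGMENT_SIZE + len(sectnames) * SECTION_SIZE
    out = struct.pack(SEGMENT_FMT, LC_SEGMENT, cmdsize, segname,
                      0, 0, 0, 0, 7, 5, len(sectnames), 0)
    for sectname in sectnames:
        out += struct.pack(SECTION_FMT, sectname, segname,
                           0, 0, 0, 0, 0, 0, 0, 0, 0)
    return out


def parse_segments(buf):
    """Walk consecutive segment commands; return [(segname, [sectnames])]."""
    offset = 0
    result = []
    while offset < len(buf):
        fields = struct.unpack_from(SEGMENT_FMT, buf, offset)
        cmdsize, segname, nsects = fields[1], fields[2], fields[9]
        sects = []
        for i in range(nsects):
            # Section headers sit right after the segment command header.
            pos = offset + SEGMENT_SIZE + i * SECTION_SIZE
            sect = struct.unpack_from(SECTION_FMT, buf, pos)
            sects.append(sect[0].rstrip(b"\0").decode("ascii"))
        result.append((segname.rstrip(b"\0").decode("ascii"), sects))
        offset += cmdsize  # cmdsize covers the header plus its section headers
    return result


commands = (pack_segment(b"__TEXT", [b"__text", b"__cstring"])
            + pack_segment(b"__DATA", [b"__data"]))
layout = parse_segments(commands)
print(layout)
```

The point of the sketch is that cmdsize already includes the section headers, so stepping by cmdsize moves from one segment command to the next load command. The section data itself lives elsewhere in the file and is not modeled here.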
2) I know that sections are numbered starting at 1; what happens if I create a section with a number of 0? If I created my own linker, would I need to reserve space for a section 0?
Section 0 is reserved for a special purpose. The nlist structure that describes symbols has an n_sect field which specifies the section in which a symbol lives. The value 0 is a reserved value called NO_SECT used for symbols that don't have a section, such as absolute or undefined symbols.
OK, I thought so.
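As a rough illustration of the n_sect convention, the Python sketch below unpacks a 32-bit nlist entry and reports whether the symbol lives in a section. It assumes the little-endian 32-bit nlist layout and the usual n_type constants (N_TYPE mask 0x0e, N_UNDF 0x0, N_ABS 0x2, N_SECT 0xe); the describe() helper and the sample entries are invented for the example.

```python
import struct

# 32-bit nlist: n_strx, n_type, n_sect, n_desc, n_value (assumed layout)
NLIST_FMT = "<IBBhI"

NO_SECT = 0    # reserved: the symbol is not in any section
N_TYPE = 0x0e  # mask for the type bits of n_type
N_UNDF = 0x0   # undefined symbol
N_ABS = 0x2    # absolute symbol
N_SECT = 0xe   # symbol defined in section number n_sect


def describe(entry):
    """Return a short description of where a symbol table entry lives."""
    n_strx, n_type, n_sect, n_desc, n_value = struct.unpack(NLIST_FMT, entry)
    kind = n_type & N_TYPE
    if kind == N_SECT:
        # Real sections are numbered from 1, so 0 can never appear here.
        return "defined in section %d" % n_sect
    if kind == N_ABS:
        return "absolute (n_sect is NO_SECT)"
    if kind == N_UNDF:
        return "undefined (n_sect is NO_SECT)"
    return "other type 0x%x" % kind


# Hypothetical entries: one defined in section 1, one undefined.
defined = struct.pack(NLIST_FMT, 1, N_SECT, 1, 0, 0x1000)
undefined = struct.pack(NLIST_FMT, 9, N_UNDF, NO_SECT, 0, 0)
print(describe(defined))
print(describe(undefined))
```

So a hand-written linker would not reserve space for a section 0; it would simply start numbering at 1 and keep 0 free for symbols without a section.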
3) According to the note on page 12, tools are allowed to define additional section names; do they need to be registered with Apple in some way to prevent namespace problems?
There isn't a registration requirement that I know of, though perhaps someone else can chime in if I'm wrong. I would recommend some sort of prefix to avoid namespace collision. Keep it short since you only get 16 bytes :)
4) There is an __OBJC segment; are we allowed to create additional segment names, or will that throw off the loader? If we can create new segments, then question #3 can go away, as anything that needs to be tossed into its own section can go into the segment instead...
It will probably work just fine (try it and see), but if you don't need special vm protections or have another compelling need for a new segment to justify the memory penalty of page-alignment, I would recommend sticking with a section.
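The "memory penalty of page-alignment" can be put into numbers with a small Python sketch. It assumes a 4096-byte page size, which is an assumption about the target system rather than something stated in this thread; round_up_to_page is a hypothetical helper.

```python
PAGE_SIZE = 4096  # assumed page size; check the real value on the target


def round_up_to_page(size, page_size=PAGE_SIZE):
    """Round a byte count up to the next multiple of the page size."""
    return (size + page_size - 1) // page_size * page_size


payload = 100                          # bytes of runtime metadata, say
segment_cost = round_up_to_page(payload)
wasted = segment_cost - payload        # padding paid only for being a segment
print(segment_cost, wasted)
```

Under that assumption, 100 bytes placed in their own segment occupy a whole page, whereas the same bytes placed in a new section of an existing segment share that segment's pages.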
Well, the only thing I'm thinking about is another long running mental exercise I've had; I'm working out a new (hopefully better) programming language. If I go to the effort of reading/writing the mach-O file format, then I may as well take advantage of it and see if I can't write a compiler to put something into it! :D And, like Objective-C, I'm going to need my own runtime, so it would be nice to be able to do this.
5) The uuid_command referenced on page 20 uses a 128 bit UUID; is this expected to be the output of uuid_generate(), or will any 128 bit UUID generator work (in short, is there hidden info in the UUID that I need to keep straight)?
I believe that uuid_generate() simply ensures that the UUID is sufficiently random, so you can likely put whatever you want in the uuid_command with the understanding that if you are not actually unique (as much as one can be in a finite namespace), dire consequences may result.
Is there any way to determine if there is a collision besides 'dire consequences'? I can think of a few ideas on how to protect against problems, but they require the loader to be notified any time a new mach-O file is loaded onto the system so it can check, and, if necessary, modify, UUIDs for use on a particular system.
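For reference, here is a minimal Python sketch of packing the uuid_command from question 5. It assumes the load command is a cmd/cmdsize pair followed by 16 raw UUID bytes, with LC_UUID equal to 0x1b and little-endian byte order, and it uses Python's uuid.uuid4() as a stand-in random generator. Whether Apple's tools derive their UUIDs the same way is not something this thread settles. The function names are made up for the example.

```python
import struct
import uuid

LC_UUID = 0x1b        # assumed load command number for uuid_command
UUID_FMT = "<II16s"   # cmd, cmdsize, uuid[16]


def pack_uuid_command(value=None):
    """Pack a uuid_command; generate a random (version 4) UUID if none given."""
    if value is None:
        value = uuid.uuid4()
    return struct.pack(UUID_FMT, LC_UUID, struct.calcsize(UUID_FMT), value.bytes)


def unpack_uuid_command(buf):
    """Return the uuid.UUID stored in a packed uuid_command."""
    cmd, cmdsize, raw = struct.unpack(UUID_FMT, buf)
    if cmd != LC_UUID or cmdsize != len(buf):
        raise ValueError("not a uuid_command")
    return uuid.UUID(bytes=raw)


fixed = uuid.UUID("12345678-1234-5678-1234-567812345678")
command = pack_uuid_command(fixed)
print(unpack_uuid_command(command))
```

Nothing in this sketch detects collisions; it only shows that the 16 bytes are opaque to the container format.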
6) I notice that there are a number of structs (e.g. section, defined on page 23) that have 16-element char arrays for their section names, and the docs say ASCII characters. That is a little imprecise: which set of characters is allowed? Also, does the string have to be NUL-terminated? More importantly, should all bytes that aren't part of the name be set to NUL? Pure guess suggests that you'd memset() to 0, and then only use characters for which isgraph() returns true.
The link editor is likely using standard C-string functions like strncmp and strlcpy to process these fields, so anything those functions can deal with is probably safe. However, I would personally stick to the set of characters that the documented names seem to make use of, which is [0-9A-Za-z_]. The string should be null-terminated.
OK, this is about what I thought I should do.
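A pure-Python version of the "memset() to 0, then copy the name" approach might look like the sketch below. It restricts names to the conservative [0-9A-Za-z_] set suggested above; encode_name and decode_name are hypothetical helpers. One caveat, stated as my understanding rather than as documented fact: some existing section names appear to use all 16 bytes with no terminator, so this sketch accepts 16-character names. To follow the null-termination advice strictly, cap names at 15 characters instead.

```python
import re

FIELD_SIZE = 16                       # size of sectname/segname char arrays
NAME_RE = re.compile(r"[0-9A-Za-z_]+")  # conservative character set


def encode_name(name):
    """Return a 16-byte field: the name followed by NUL padding."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("name must match [0-9A-Za-z_]+: %r" % name)
    raw = name.encode("ascii")
    if len(raw) > FIELD_SIZE:
        raise ValueError("name longer than %d bytes: %r" % (FIELD_SIZE, name))
    return raw.ljust(FIELD_SIZE, b"\0")  # unused bytes are all zero


def decode_name(field):
    """Read a name back, stopping at the first NUL if there is one."""
    return field.split(b"\0", 1)[0].decode("ascii")


field = encode_name("__mylang_meta")  # hypothetical prefixed section name
print(field, decode_name(field))
```

Zero-filling the tail keeps byte-wise comparisons of whole fields stable, which is the property strncmp-style processing in other tools would rely on.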
In case you're all wondering why I'm asking so many questions, it occurred to me over the weekend that one of the best ways of learning about a file format is to make a full reader/writer for it, from scratch. I doubt I'll really complete such a project, but I'd like to start on it at least. My goal is to write it in pure Python; that prevents me from cheating, and from thinking I know how something works when I really don't. It also means that there will be a second implementation of the written standard out there, which should turn up any bugs in the docs, and may also turn up bugs in the loader.
It would be quite handy if one could someday "import macho" and profit from this hard work. Please file Radars against the documentation whenever it seems incorrect or unclear. Best of luck.
I agree, I'd like to release it someday (assuming I get anywhere with it!)

Thanks,
Cem Karan

Darwin-dev mailing list (Darwin-dev@lists.apple.com)