Re: Getting a read call before open
- Subject: Re: Getting a read call before open
- From: Terry Lambert <email@hidden>
- Date: Thu, 26 Jun 2008 15:33:20 -0700
The first page is loaded by execve only for the purposes of reading
the header to determine the magic number to see if it's an
interpreter, a Mach-o file, a universal binary, or "other" (non-
executable).
Once this happens, if it's Universal, the binary is "graded" and a
slice (Mach-o file encapsulated in the Universal binary container) has
its first 4K read; otherwise, if the "magic number" is "#!", then the
path after the ! is read and reinterpreted as a request to load that
instead, with the script as argv[0], and it goes back to load the
first page of the interpreter. If the magic number indicates it's a
PPC binary, and you are on an Intel machine, then it implies an
interpreter of Rosetta and messes with the recorded p_comm field of
the process so it doesn't look like an interpreter is running.
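
The header check described above can be sketched in user space roughly as follows. This is only an illustration, not the kernel's code: the names classify_header, image_kind and the SK_ constants are made up for this sketch, and the magic values are copied from what <mach-o/loader.h> and <mach-o/fat.h> publish so that it builds on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic numbers as published in <mach-o/loader.h> and <mach-o/fat.h>,
 * redefined locally (hypothetical SK_ names) so the sketch is portable. */
#define SK_MH_MAGIC     0xfeedfaceu  /* 32-bit Mach-O, host byte order    */
#define SK_MH_CIGAM     0xcefaedfeu  /* 32-bit Mach-O, swapped byte order */
#define SK_MH_MAGIC_64  0xfeedfacfu  /* 64-bit Mach-O, host byte order    */
#define SK_MH_CIGAM_64  0xcffaedfeu  /* 64-bit Mach-O, swapped byte order */
#define SK_FAT_MAGIC    0xcafebabeu  /* universal container               */
#define SK_FAT_CIGAM    0xbebafecau  /* universal container, swapped      */

enum image_kind { IMG_OTHER, IMG_SCRIPT, IMG_MACHO, IMG_UNIVERSAL };

/* Classify the first bytes of a file the way the text describes:
 * interpreter script, Mach-O file, universal binary, or "other". */
enum image_kind classify_header(const unsigned char *page, size_t len)
{
    uint32_t magic;

    assert(page != NULL);

    if (len >= 2 && page[0] == '#' && page[1] == '!')
        return IMG_SCRIPT;
    if (len < sizeof magic)
        return IMG_OTHER;

    /* Read the first four bytes in host byte order; both the native and
     * the swapped form of each magic number are accepted below. */
    memcpy(&magic, page, sizeof magic);

    switch (magic) {
    case SK_MH_MAGIC:
    case SK_MH_CIGAM:
    case SK_MH_MAGIC_64:
    case SK_MH_CIGAM_64:
        return IMG_MACHO;
    case SK_FAT_MAGIC:
    case SK_FAT_CIGAM:
        return IMG_UNIVERSAL;
    default:
        return IMG_OTHER;
    }
}
```

A caller would read the first 4K of the file into a buffer and pass it in; for a script, the next step would be to parse the interpreter path that follows the "#!" and classify that file in turn.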
Either way, you might get two or more 4K page reads before it settles
down into mach_loader. That code takes the 4K already read, and
starts going through the "load commands" list, mapping things into the
new process' address space. If one of them is a dynamic linker
segment, it loads the one matching the architecture information from
the final binary into the address space as well, and then the thread
state information is set from the thread state structure (one of the
things loaded).
At that point, the exec returns to user space, which causes the
program counter and other registers to be loaded in the thread state,
and execution starts there (usually in dyld).
None of the other pages end up coming in, until you start using them,
and take a fault, which causes your FS to get contacted again to
supply the requested information.
You probably don't see a lot of this unless you are the boot device,
since you are not where most of this stuff comes from, and so you are
"out of the loop" about the other activity that's happening.
You sound instead like you're getting your first fault, and there is a
problem with your page-in function declaration.
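
As a sanity check on what a page-in request ought to look like, here is a toy user-space model. It is an assumption-laden sketch, not the kernel interface: pagein_req, request_for_fault and SKETCH_PAGE_SIZE are hypothetical names, and it assumes one 4K page per request with no clustering. It maps a faulting byte position in the file to the page-aligned range a pager would ask the filesystem for.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for this model */

/* Hypothetical request shape: where in the file, and how many bytes. */
struct pagein_req {
    uint64_t f_offset;   /* page-aligned offset into the file */
    uint64_t size;       /* bytes requested, one page here    */
};

/* For a fault on byte `pos` of a file of `file_len` bytes, return the
 * page-aligned range that has to be supplied from the backing store. */
struct pagein_req request_for_fault(uint64_t pos, uint64_t file_len)
{
    struct pagein_req r;

    assert(pos < file_len);          /* faults past EOF are not modelled */

    r.f_offset = pos - (pos % SKETCH_PAGE_SIZE);  /* round down to a page */
    r.size = SKETCH_PAGE_SIZE;                    /* never zero           */
    return r;
}
```

In this model the offset is always a multiple of the page size and the size is never zero, so a handler that sees size, offset and vm_offset all equal to zero is more likely picking its arguments apart incorrectly than receiving a genuine request.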
-- Terry
On Jun 26, 2008, at 12:42 PM, shailesh jain wrote:
Hi,
The follow up question, I guess.
Now I have just implemented a prototype VOP_PAGEIN. But the
parameters passed to this function {size, offset, vm_offset} are all
set to
zero. I am clueless as to why they are set to zero.
Ideally, shouldn't offset be 4096, because the first page has already
been loaded by execve?
Note: My filesystem does not support caching. Also, let me know if
this question is more appropriate on filesystem-dev mailing list.
/Shail
On Wed, Jun 25, 2008 at 11:16 PM, shailesh jain <email@hidden> wrote:
Hi,
Thanks. I actually figured out that I had not yet implemented
VOP_PAGEIN.
Thus, execve loaded only the first page (4096 bytes) and then
depended on page faults
to load the remaining bytes. But since I hadn't implemented VOP_PAGEIN,
the application just hung.
Thanks for the information.
/Shail
On Wed, Jun 25, 2008 at 11:03 PM, Terry Lambert <email@hidden>
wrote:
On Jun 25, 2008, at 6:54 PM, shailesh jain wrote:
When I try to run an executable over my filesystem, it just hangs (i.e.
the shell prompt never returns) when I try to do an implicit open in the
read call to my filesystem.
Digging through the source code, I found that execve calls vn_rdwr()
and, subsequently, VOP_READ(). This read is invoked to load
PAGESIZE bytes (4096), which my filesystem delivers properly.
However, I do not get a read call to load the remaining bytes. I can't
seem to decipher that.
/Shail
On Wed, Jun 25, 2008 at 4:12 PM, shailesh jain <email@hidden> wrote:
Is it legitimate for a filesystem to get a read call before an open
call? Also, how should a filesystem
handle such behavior (implicit opens and closes)?
Hi; sounds like you are writing a remote filesystem.
If you vend a vnode, be prepared to get any number of calls upon
it. Once it has been vended, you have agreed to future calls on it
until such time as it has been released back to you for recycling.
Certain filesystems dislike this (SMB, as an example, disallows
renames for open files, and we don't support ETXTBSY unless the FS
client maintains "this has been exec'ed state" and returns it
itself). I understand that this can bother people, but until the
vnode has been invalidated, either by being given back to you as no
longer being needed by who you gave it to, or being deadfs'ed (e.g.
by a forced unmount from your side of things), it is the property of
whoever you vended it to.
Another situation where it's possible for this to happen is open/
mmap/close, where the memory mapping is kept active by a vnode
reference by the paging system (maybe you just did not notice the
open/close, and now you incorrectly think it's closed). Again, as
far as the kernel is concerned, the vnode pointer *is* the file.
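
The open/mmap/close case is easy to reproduce from user space with plain POSIX calls. The sketch below (first_byte_after_close is a made-up helper name) closes the descriptor before touching the mapping, so the read of the first byte happens with no open file descriptor at all; the data still has to come from the file behind the mapping.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a file, close its descriptor, then read through the mapping.
 * Returns the first byte of the file (0..255), or -1 on any error. */
int first_byte_after_close(const char *path)
{
    struct stat st;
    void *map;
    int byte;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (map == MAP_FAILED)
        return -1;

    /* No descriptor is open here, yet this access is still backed by
     * the file that was mapped. */
    byte = ((const unsigned char *)map)[0];

    munmap(map, (size_t)st.st_size);
    return byte;
}
```

From the filesystem's point of view, a call like this can produce a request for file data after the close has already been seen, which is the situation described above.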
For the specific case of exec, yes, there is a vn_rdwr() after you
vended the open vnode to the exec code via a lookup of a name. You
will potentially get VOP_READ() calls if it's a FAT file or you have
code signing, and you will definitely get vm mappings established
for the address space of the new process, so what you are seeing is
both reasonable and expected.
If you choose to rip the vnode out from under us, of course whatever
program you are running will crash the first time it page faults in
a clean page from the backing store you promised it when you gave
out the vp.
Hope this clears things up for you.
- Terry
Darwin-kernel mailing list (email@hidden)