Re: intercepting system calls?
On Sat, Jan 21, 2006 at 01:10:13PM -0800, Kevin Van Vechten wrote:
If you're not trying to corral a hostile environment, a simple way to intercept syscalls is to override them at the Libsystem layer. Define your own functions in a dylib, and then set the DYLD_INSERT_LIBRARIES and DYLD_FORCE_FLAT_NAMESPACE environment variables.
What I'm trying to do is create something like this:
http://www.cse.nd.edu/~ccl/software/parrot/

I want to be able to capture the I/O calls of an application and then ship them back over the network to be handled by another machine. I work on a distributed computing system (similar to XGrid) and we want to be able to run jobs on machines that don't have access to the same shared filesystem, so we trap system calls, send them back to the "submitting machine" and execute them there. The remote job, as far as it can tell, has access to all of the files as it would on the submitting machine. (Yes, obviously the idea is that the compute time to system call ratio is high, because remote system calls are many, many times slower, and yes, we take care of security.)

The problem with replacing at the library level is that the libraries may have private interfaces - finding all of the call points that I need to interpose on can be hard, and if I miss just one the game is over. We've been doing this for years on Linux, and for the longest time we did interpose at the library level - and every time a new Linux distro came out and it used a new libc, we'd have to figure out what new interfaces it was using - besides open, we needed to catch __open, __open2, __open3OnlyUsedInOnePlace, and __somethingCalledFooButReallyOpen. Because those calls are all at user level, there was no way for us to find out if one of them got called by some code path we didn't catch.

The system call interface was much narrower, so there are far fewer things we need to trap. Even better, because it's a trap, we could capture ALL of them, and anything we didn't know what to do with we could detect at runtime. It's much slower, but we're dominated by the network latency anyway so we didn't care.
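To make the remote system call idea above concrete, here is a minimal sketch of the forwarding step only. It is not the code from our system or from Parrot: the JSON wire format, the names `handle_request` and `remote_call`, and the three-entry handler table are all assumptions made up for illustration, and the "network" is just a function call so the example stays self-contained. It does not show how the calls get trapped in the first place.

```python
import errno
import json
import os

# Hypothetical handlers: the calls the submitting machine knows how to run.
def _do_open(path, flags, mode=0o644):
    return os.open(path, flags, mode)

def _do_read(fd, count):
    # latin-1 maps every byte to one character, so the data survives JSON.
    return os.read(fd, count).decode("latin-1")

def _do_close(fd):
    os.close(fd)
    return 0

HANDLERS = {"open": _do_open, "read": _do_read, "close": _do_close}

def handle_request(wire_request):
    """Submitting-machine side: decode one call and execute it locally."""
    request = json.loads(wire_request)
    handler = HANDLERS.get(request["call"])
    if handler is None:
        # A call we don't know about is reported at runtime, not missed.
        return json.dumps({"ok": False, "errno": errno.ENOSYS})
    try:
        result = handler(*request["args"])
    except OSError as exc:
        return json.dumps({"ok": False, "errno": exc.errno})
    return json.dumps({"ok": True, "result": result})

def remote_call(transport, call, *args):
    """Execution-machine side: ship one trapped call and wait for the reply.

    `transport` is any callable that takes the encoded request and returns
    the encoded reply; a real system would put a socket behind it.
    """
    reply = json.loads(transport(json.dumps({"call": call, "args": list(args)})))
    if not reply["ok"]:
        raise OSError(reply["errno"], os.strerror(reply["errno"]))
    return reply["result"]
```

With `handle_request` passed directly as the transport, `remote_call(handle_request, "open", path, os.O_RDONLY)` returns a file descriptor that is valid on the "submitting" side, and a call missing from the table comes back as ENOSYS instead of slipping through unnoticed.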
We want to stay out of the other process's address space as much as possible, and we certainly want to avoid executing code in the other process's address space (I took a look at APE - as an application writer it made my skin crawl :)

Similarly, I want to avoid trapping in with a KEXT - I want to convince other people to use my software at low risk to themselves, and asking them to install a KEXT doesn't jibe with that. It would also mean that I'd see every system call on the machine, which I don't want. The ideal system lets my tracing process capture the system calls of another process running at the same privilege level - that way, if I'm running as user 'batchsystem', the only processes I can capture are 'batchsystem' processes.

It also seems like Apple could keep the system call interface stable between versions - the great thing about system calls is if you need to change the meaning of one, you can just create another system call with a different name. You can leave the old one stable, and just change the framework to call the new one - if someone has found a way to call the actual system call, they can keep using it.

Thanks,

-Erik
participants (1)
-
Erik Paulson