Re: boundary crossing latencies

24 Jun 2004

Jim Magee writes:
> On Jun 22, 2004, at 5:30 PM, Andrew Gallatin wrote:
>> I found that MacOSX 10.3.3's ioctl takes nearly 4x as long as the same
>> code under linux ppc64 on the same exact G5.  And it takes 8x as long
>> as the same ioctl on an AMD64.

> Mac OS X/Darwin-ppc provides a full 32-bit address space to
> applications (and a separate full 32-bit address space to the kernel).
> Each transition from user to kernel (and back) requires an address
> space change.  Linux (and others) take 1GB (or more) away from the
> application address space so they can map the kernel as part of each
> address space to avoid this overhead (they just have a privilege-level
> transition to make instead).  Mac OS X  can't afford to take that much
> address space away from some applications.  So, we pay a higher
> transition cost.

ppc64 linux provides a fully 64-bit kernel with much more than the 4GB
of address space provided by MacOSX.  So it's not losing anything;
rather, it's gaining performance.  Not having to pay the price to swap
address spaces would be just one benefit of moving to a real 64-bit
kernel ;)

> We also use a single kernel for both SMP and UP systems.  The SMP
> locking in the kernel imposes some additional overhead that isn't
> always needed (or seen in those other OSes).

This was an SMP linux kernel.  I believe there is at least one lock in
the ioctl path.

> These issues aside, there are still plenty of things we can (and are)
> doing to improve those latencies.  But there always seem to have been
> bigger performance fish to fry.

One quick way to speed up ioctls would be to hang a driver ioctl right
off of fops, and not force every ioctl to traverse the vnode layer.

> --Jim
>
> PS: When using an "averaging" benchmark like you describe, don't forget
> to take those measurements in single-user mode, network disconnected.
> Background activity plays an important part of the final result in such
> cases.  Mac OS X, in full multi-user mode, tends to have a little bit
> more of this than the typical OS (because of all the niceties like
> Rendezvous, etc...) -  throwing the numbers off slightly (or sometimes
> a lot).

I don't intend to use the system in single-user mode with the network
disconnected, so I don't think benchmarking it like that is realistic.
I ran a huge number of iterations to factor background activity in.
Both systems were sitting at the login prompt (no gui) and this was
done via an ssh session.
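
For anyone who wants to try this kind of averaging measurement, here is
a rough sketch of the idea, not the benchmark behind the numbers above.
It is in Python, so every sample includes interpreter overhead and the
absolute values mean little; only the shape of the distribution is
interesting.  The helper names (time_ioctl, summarize) and the choice of
FIONREAD on a pipe are assumptions made for illustration.  A real test
would issue the driver's own ioctl from C.

```python
import fcntl
import os
import statistics
import termios
import time


def time_ioctl(fd, iterations=100_000):
    """Return per-call latencies (ns) for a cheap ioctl issued on fd."""
    buf = bytearray(4)  # FIONREAD stores a C int into this buffer
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fcntl.ioctl(fd, termios.FIONREAD, buf)
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples):
    """Background activity inflates mean and max far more than min or median."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.fmean(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    read_end, write_end = os.pipe()  # any fd that accepts FIONREAD would do
    try:
        stats = summarize(time_ioctl(read_end))
    finally:
        os.close(read_end)
        os.close(write_end)
    for name, value in stats.items():
        print(f"{name:>6}: {value:.0f} ns")
```

Reporting min and median next to the mean gives some sense of how much
of an average is background noise, without having to drop to single-user
mode to find out.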

BTW, you should see the jitter under MacOSX in our (OS-bypass, non-IP,
non-ethernet) ping-pong latency benchmarks.  Huge jumps.  Linux is
stable run-to-run; MacOSX is all over the map.

I don't want to sound totally negative.  I'm not a linux bigot at all
(more of a BSD one).  There are plenty of good things in MacOSX.  For
example, the MacOSX kernel is excellent at precise sleep times.  It's
totally amazing.  You tell it to wake up in 1ms, and it bloody well
*does*.  You're lucky if you can get 20ms granularity in any other OS.
Kudos to whoever did the MacOSX sleep/wakeup code.
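
A quick way to check the sleep claim on any given box is to ask for a
1ms sleep over and over and record how late each wakeup is.  What
follows is only a sketch under assumptions: the function name
sleep_overshoot_ns is made up for illustration, and Python's time.sleep
adds its own overhead on top of whatever the kernel does, so a C test
around nanosleep would be the more honest measurement.

```python
import time


def sleep_overshoot_ns(requested_s=0.001, iterations=200):
    """Sleep repeatedly; return how far each wakeup overshot the request (ns)."""
    requested_ns = round(requested_s * 1e9)
    overshoots = []
    for _ in range(iterations):
        start = time.monotonic_ns()
        time.sleep(requested_s)
        elapsed = time.monotonic_ns() - start
        overshoots.append(elapsed - requested_ns)
    return overshoots


if __name__ == "__main__":
    over = sorted(sleep_overshoot_ns())
    print(f"median overshoot: {over[len(over) // 2] / 1e6:.3f} ms")
    print(f"worst overshoot:  {over[-1] / 1e6:.3f} ms")
```

On a kernel with coarse timer granularity the overshoot should cluster
near the tick length; on one with precise wakeups it should stay close
to zero.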

Drew

_______________________________________________
darwin-kernel mailing list | darwin-kernel@lists.apple.com
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/darwin-kernel
Do not post admin requests to the list. They will be ignored.