Network stack/ethernet driver issues
- Subject: Network stack/ethernet driver issues
- From: email@hidden
- Date: Wed, 9 Apr 2008 14:09:13 -0700 (PDT)
- Importance: Normal
Greetings,
Here's my situation: I have a custom piece of data logging hardware that
generates about 50 MB/sec of data and sends it out over a gigabit Ethernet
link as a steady stream of UDP packets (one 1 KB packet every ~20
microseconds, very periodic, not bursty at all) broadcast to the local
subnet. I want to log all this data to disk for offline analysis. As a
first step I wrote a program that just listens for the UDP packets and
verifies that they are all being received. This is easy to do because each
packet carries a 16-bit sequence ID. Much to my dismay, I was unable to do
this reliably. I am running Mac OS X 10.5.2 on a 2.33 GHz MacBook Pro with
2 GB of memory. Even after making SO_RCVBUF big enough, I was still
occasionally missing a block of 100+ packets at a time. Looking in the
system.log file, I realized that each dropout was correlated with a
message like:
Apr 9 13:38:51 Macintosh-6 kernel[0]: AppleYukon2: 00000000,00000000 skgehw - cppSkDrvEvent - SK_DRV_RX_OVERFLOW: rcv fifo overflow
Apr 9 13:38:51 Macintosh-6 kernel[0]: AppleYukon2: 00000000,00000000 sky2 - RX ring overflow -- dropped a packet
and occasionally
Apr 9 13:36:13 Macintosh-6 kernel[0]: AppleYukon2: 00000000,00000000 sky2 - RX ring overflow -- dropped a packet
Apr 9 13:36:13 Macintosh-6 kernel[0]: AppleYukon2: 00000000,00000001 sk98osx sky2 - - sk98osx_sky2::replaceOrCopyPacket tried N times
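A minimal sketch of the kind of sequence check described above, not the actual test program. It assumes the 16-bit sequence ID occupies the first two bytes of each packet in network byte order; that layout, the port number, and the buffer size are placeholders chosen for illustration.

```python
import socket
import struct

SEQ_MODULUS = 1 << 16  # a 16-bit sequence ID wraps from 65535 back to 0


def missing_between(prev_seq, seq):
    """Packets skipped between two consecutively received sequence IDs.

    Assumes in-order delivery with no duplicates; a gap of 65536 or more
    packets cannot be detected with a 16-bit counter.
    """
    return (seq - prev_seq - 1) % SEQ_MODULUS


def count_drops(payloads):
    """Total number of missing packets across an iterable of raw payloads."""
    drops = 0
    prev = None
    for payload in payloads:
        # Hypothetical layout: sequence ID in bytes 0-1, big-endian.
        (seq,) = struct.unpack_from("!H", payload, 0)
        if prev is not None:
            drops += missing_between(prev, seq)
        prev = seq
    return drops


def open_listener(port=5000, rcvbuf_bytes=8 * 1024 * 1024):
    """UDP socket with an enlarged receive buffer (port and size are placeholders).

    The kernel may clamp or reject large SO_RCVBUF requests, so the value
    actually in effect should be read back with getsockopt().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.bind(("", port))
    return sock
```

In a real listener the payloads would come from repeated sock.recv() calls on the socket returned by open_listener(); the counting logic itself can be exercised with packets built by struct.pack("!H", seq).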
Something else I should mention: just connecting my data logging hardware
to the MacBook Pro causes a massive spike in CPU use (even when no
program is listening for the packets).
With no listener running, top reports:
0-1% user, 49% system, 50% idle
With the listener program running:
4% user, 85% system, 11% idle
I decided to run Shark to see where the CPU time was going. Here's the
Time Profile of Everything for the first case (no listener):
55.4% 55.4% mach_kernel machine_idle
13.1% 13.1% mach_kernel ml_set_interrupts_enabled
7.9% 7.9% com.apple.iokit.AppleYukon2 yukon2osx::GetInterruptSource(unsigned long*)
3.4% 3.4% com.apple.iokit.AppleYukon2 yukon2osx::SkY2Isr(IOInterruptEventSource*, unsigned)
2.1% 2.1% mach_kernel lck_mtx_lock
1.9% 1.9% mach_kernel lck_mtx_unlock
1.9% 1.9% mach_kernel ifnet_input
1.9% 1.9% com.apple.iokit.AppleYukon2 yukon2osx::PrcocessInterruptSource(IOInterruptEventSource*, int)
1.2% 1.2% mach_kernel udp_input
1.0% 1.0% mach_kernel OSCompareAndSwap
1.0% 1.0% mach_kernel in_pcb_checkstate
0.7% 0.7% mach_kernel _mutex_assert
0.6% 0.6% mach_kernel ip_input
0.4% 0.4% mach_kernel udp_lock
0.4% 0.4% mach_kernel OSAddAtomic
0.4% 0.4% mach_kernel wait_queue_wakeup_all
0.3% 0.3% com.apple.iokit.AppleYukon2 yukon2osx::HandleReceives(int, unsigned short, unsigned, unsigned short, unsigned short, unsigned, unsigned short)
0.3% 0.3% mach_kernel cantrace
0.3% 0.3% mach_kernel IOWorkLoop::threadMain()
0.3% 0.3% mach_kernel udp_unlock
0.2% 0.2% mach_kernel IOWorkLoop::runEventSources()
0.2% 0.2% mach_kernel wait_queue_assert_wait
0.2% 0.2% mach_kernel ether_demux
0.2% 0.2% mach_kernel IORecursiveLockLock
0.2% 0.2% mach_kernel uiomove
0.2% 0.2% mach_kernel proto_input
0.1% 0.1% mach_kernel in_broadcast
And here's the output for the second case (listener running):
22.2% 22.2% mach_kernel ml_set_interrupts_enabled
21.2% 21.2% mach_kernel machine_idle
4.7% 4.7% com.apple.iokit.AppleYukon2 yukon2osx::GetInterruptSource(unsigned long*)
4.6% 4.6% com.apple.iokit.AppleYukon2 yukon2osx::PrcocessInterruptSource(IOInterruptEventSource*, int)
4.6% 4.6% com.apple.iokit.AppleYukon2 yukon2osx::SkY2Isr(IOInterruptEventSource*, unsigned)
3.1% 3.1% mach_kernel lck_mtx_lock
2.6% 2.6% mach_kernel lck_mtx_unlock
1.7% 1.7% mach_kernel OSAddAtomic
1.4% 1.4% mach_kernel ifnet_input
1.3% 1.3% mach_kernel copyout_kern
1.2% 1.2% mach_kernel soreceive
1.1% 1.1% com.apple.iokit.AppleYukon2 yukon2osx::HandleReceives(int, unsigned short, unsigned, unsigned short, unsigned short, unsigned, unsigned short)
1.1% 1.1% mach_kernel _rtc_nanotime_read
1.0% 1.0% libSystem.B.dylib recvfrom$NOCANCEL$UNIX2003
0.9% 0.9% mach_kernel lo_unix_scall
0.8% 0.8% mach_kernel udp_input
0.8% 0.8% mach_kernel _mutex_assert
0.8% 0.8% mach_kernel ip_input
0.6% 0.6% mach_kernel wait_queue_assert_wait
0.6% 0.6% mach_kernel nval_copy_windows
0.5% 0.5% mach_kernel wait_queue_wakeup_all
0.5% 0.5% mach_kernel m_mclget
0.5% 0.5% mach_kernel uiomove
0.5% 0.5% mach_kernel sbappendaddr
0.5% 0.5% mach_kernel lck_rw_lock_shared
0.5% 0.5% mach_kernel cantrace
0.4% 0.4% mach_kernel udp_lock
0.4% 0.4% com.apple.iokit.AppleYukon2 yukon2osx::HandleStatusLEs()
0.4% 0.4% mach_kernel lck_mtx_lock_spinwait
0.4% 0.4% com.apple.iokit.AppleYukon2 yukon2osx::GiveRxBufferToHw(unsigned, int, s_packet*)
0.4% 0.4% mach_kernel blkclr
0.4% 0.4% mach_kernel socketpair
0.4% 0.4% mach_kernel udp_unlock
0.4% 0.4% mach_kernel zalloc_canblock
0.4% 0.4% mach_kernel IOWorkLoop::runEventSources()
0.4% 0.4% mach_kernel uiomove64
0.3% 0.3% mach_kernel unix_syscall
0.3% 0.3% libSystem.B.dylib recvfrom
Ultimately I was unable to reliably receive all the packets the device was
sending over the Ethernet link because the RX ring buffer kept overflowing.
These overflows happen a few times a minute, regardless of whether
an application is actually listening for the packets.
Since I have Linux running on the same machine, I decided to try
recompiling and running the same test program under Linux 2.6.24. Under
Linux my test program receives every packet reliably and doesn't stress
the machine. For comparison, here is what top reports:
With the device attached and no listener:
0.5% user, 0% system, 84% idle, 5% hardware interrupt, 4% software interrupt
With the listener program running:
1% user, 3% system, 82% idle, 5% hardware interrupt, 9% software interrupt
So it seems to me that something is screwy with the Mac OS X networking
stack. It wasn't a big deal for me, as I was able to get my data logging
software running reliably under Linux, but it doesn't seem right that
Mac OS X can't keep up with a 50 MB/sec network load (only about 40% of
the available bandwidth on a gigabit Ethernet link). Is this a known issue?
Should I file a bug report?
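The figures in this message can be cross-checked with a little arithmetic. The sketch below assumes that "1 KB" means a 1024-byte UDP payload and that each packet travels as a single untagged Ethernet frame over IPv4 without options; both are assumptions, not something stated above.

```python
PAYLOAD_BYTES = 1024   # assumption: 1024 bytes of UDP payload per packet
INTERVAL_US = 20       # one packet every ~20 microseconds
LINK_MBIT = 1000       # nominal gigabit Ethernet line rate

# Per-packet overhead on the wire, in bytes (assumption: IPv4, no VLAN tag):
# UDP header 8, IPv4 header 20, Ethernet header 14, FCS 4,
# preamble plus start-of-frame delimiter 8, inter-frame gap 12.
OVERHEAD_BYTES = 8 + 20 + 14 + 4 + 8 + 12

packets_per_sec = 1_000_000 / INTERVAL_US
payload_mbyte_per_sec = PAYLOAD_BYTES * packets_per_sec / 1e6
payload_mbit_per_sec = payload_mbyte_per_sec * 8
wire_mbit_per_sec = (PAYLOAD_BYTES + OVERHEAD_BYTES) * packets_per_sec * 8 / 1e6

# Time span covered by a block of 100 consecutive lost packets.
dropout_ms = 100 * INTERVAL_US / 1000

print(f"packets/sec:        {packets_per_sec:.0f}")
print(f"payload MB/sec:     {payload_mbyte_per_sec:.1f}")
print(f"payload share:      {payload_mbit_per_sec / LINK_MBIT:.1%}")
print(f"on-the-wire share:  {wire_mbit_per_sec / LINK_MBIT:.1%}")
print(f"100-packet dropout: {dropout_ms:.1f} ms")
```

Under these assumptions the stream is 50,000 packets per second and about 51 MB/sec of payload, which is roughly 41% of the link counting payload alone and about 44% counting framing overhead. A block of 100 missing packets corresponds to about 2 ms of traffic.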
Thanks,
-rimas