Re: Efficient UDP processing
- Subject: Re: Efficient UDP processing
- From: Glenn Anderson <email@hidden>
- Date: Sun, 22 Jan 2006 17:05:13 +1300
At 6:03 pm -0800 21/01/2006, Laurence Flath wrote:
Hello,
I'm attempting to write a GigE Vision 'driver'. GEV is a relatively
new standard for transferring digital video data over ethernet using
IP protocols. While it is not the most efficient transport medium,
it is (compared to other machine vision / scientific camera
solutions) significantly lower in both cost and hardware
complexity;
e.g. everyone has a 'frame grabber' built-in, cables are dirt-cheap, etc.
GEV data transport is via UDP; the majority of packets must be
processed thus:
1. Wait for next packet
2. Confirm it's an image data packet
3. Calculate what part of image the data came from (no in-order
guarantee in UDP)
4. Move the data into a memory buffer (user-space malloc'd)
5. Go to 1.
For GigE, MTU = 1500B, the above must be accomplished in ~10 usec per
packet to keep up.
Unfortunately, I'm having a difficult time on my PBG4 (1GHz) even
keeping up with 100Base-T. My program is currently in user space,
using the standard socket APIs.
How exactly are you using the standard socket APIs?
Can anyone give me any guidance as to how to efficiently perform the
above operations?
I believe the fastest way to receive UDP packets is to create a
preemptive thread (with either the pthread or MPThread APIs)
specifically for the socket, and have that thread sit in a loop
calling recv (or recvfrom). If you have other compute intensive
things going on, such as processing of the video, increase the
priority of the UDP thread.
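To make that concrete, here is a rough sketch of what the body of such a thread could look like. The packet layout is purely an assumption for illustration (a 4-byte big-endian packet index followed by a fixed-size payload); it is not the real GigE Vision header, and the names rx_ctx, place_packet and rx_thread are made up.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define HDR_SIZE     4     /* hypothetical header: 32-bit big-endian packet index */
#define MAX_DATAGRAM 1500  /* upper bound on one datagram for a 1500-byte MTU */

struct rx_ctx {
    int sock;              /* bound UDP socket */
    uint8_t *frame;        /* user-space image buffer */
    size_t frame_size;     /* size of frame in bytes */
    size_t payload_size;   /* payload bytes carried by every full packet */
    volatile int stop;     /* set non-zero to ask the thread to exit */
};

/* Copy one packet's payload to the place in the frame given by its index.
   Returns 0 on success, -1 if the packet is malformed or out of range. */
int place_packet(struct rx_ctx *ctx, const uint8_t *pkt, size_t len)
{
    if (len <= HDR_SIZE || ctx->payload_size == 0)
        return -1;
    size_t n = len - HDR_SIZE;
    if (n > ctx->payload_size)
        return -1;

    uint32_t index = ((uint32_t)pkt[0] << 24) | ((uint32_t)pkt[1] << 16) |
                     ((uint32_t)pkt[2] << 8)  |  (uint32_t)pkt[3];
    if (index > ctx->frame_size / ctx->payload_size)
        return -1;

    /* Packets may arrive in any order, so the offset comes from the index. */
    size_t offset = (size_t)index * ctx->payload_size;
    if (n > ctx->frame_size - offset)
        return -1;

    memcpy(ctx->frame + offset, pkt + HDR_SIZE, n);
    return 0;
}

/* Thread body with the signature pthread_create expects: block in recv,
   place the payload, repeat. A real driver would also check the packet
   type here before treating it as image data. */
void *rx_thread(void *arg)
{
    struct rx_ctx *ctx = arg;
    uint8_t pkt[MAX_DATAGRAM];

    while (!ctx->stop) {
        ssize_t got = recv(ctx->sock, pkt, sizeof pkt, 0);
        if (got < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, try again */
            break;          /* any other error ends the loop */
        }
        place_packet(ctx, pkt, (size_t)got);
    }
    return NULL;
}
```

The thread would be started with something like pthread_create(&tid, NULL, rx_thread, &ctx). If the video processing competes for the CPU, one way to raise the thread's priority is pthread_setschedparam, though the right policy and value depend on the system.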
Do I need to packet-filter in a NKE? Are there any ways to DMA the
packet's payload into said buffer? This seems a lot like writing a
VNC client ... except that I can't afford to lose any packets.
Probably overkill, not to mention more complicated. Using sockets
does mean two copies, one for the recv and another to copy into the
memory buffer, but G4 class machines have more than enough memory
bandwidth and CPU horsepower to deal with 100MB/sec worth of data.
If you need to squeeze a bit more performance out, you might be able
to optimize things so that the buffer you use for recv is aligned
such that the data you are copying out into the memory buffer is
aligned to a cache line, or at least 16 bytes. If the protocol makes
any effort to send nicely sized chunks (e.g. multiples of 16 bytes)
you may be able to do an AltiVec or SSE optimized copy.
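As an illustration of the alignment idea, the sketch below offsets the start of the recv buffer so that the payload, not the header, lands on a 16-byte boundary. The header size is whatever the protocol uses and is passed in as a parameter; rx_buf, rx_buf_init and rx_buf_free are hypothetical names, and this is only one way to arrange it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 16   /* vector-register width; use the cache line size if preferred */

struct rx_buf {
    uint8_t *raw;      /* pointer returned by malloc, needed for free */
    uint8_t *pkt;      /* pass this to recv */
    uint8_t *payload;  /* pkt + hdr_size, aligned to ALIGNMENT */
};

/* Allocate a receive buffer for datagrams of up to max_datagram bytes whose
   payload starts hdr_size bytes in. Returns 0 on success, -1 on failure. */
int rx_buf_init(struct rx_buf *b, size_t hdr_size, size_t max_datagram)
{
    /* Over-allocate so the start can be shifted by up to ALIGNMENT - 1 bytes. */
    b->raw = malloc(max_datagram + ALIGNMENT);
    if (b->raw == NULL)
        return -1;

    /* Where the payload would start with no padding, and how far it is
       from the next aligned address. */
    uintptr_t unaligned = (uintptr_t)b->raw + hdr_size;
    size_t pad = (ALIGNMENT - unaligned % ALIGNMENT) % ALIGNMENT;

    b->pkt = b->raw + pad;
    b->payload = b->pkt + hdr_size;
    return 0;
}

void rx_buf_free(struct rx_buf *b)
{
    free(b->raw);
    b->raw = b->pkt = b->payload = NULL;
}
```

With this layout, recv(sock, b.pkt, max_datagram, 0) leaves the payload at b.payload on an aligned address. The destination offset in the image buffer also has to be aligned for a vector copy to pay off, which is where nicely sized chunks from the protocol would help.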
Glenn.