Re: Asynchronous sock_sendmbuf
- Subject: Re: Asynchronous sock_sendmbuf
- From: Terry Lambert <email@hidden>
- Date: Fri, 23 May 2008 03:17:37 -0700
On May 22, 2008, at 7:10 PM, Eddy Quicksall wrote:
> Maybe the efficiency would still be fairly good, but one thing is for
> sure: it would take lots more memory to have 1000 or so threads just to
> handle what can be handled with a single thread.
>
> Regarding upper level recovery ... any networking software must be aware
> that the connection can be lost at any time. It is not the responsibility
> of the lower layers to recover from a lost connection. For good upper
> layer protocols, this is built into them. For example, iSCSI has lots of
> mechanisms to deal with this.
>
> I'll look into sototcpcb(so). Thanks for that tip.
>
> Regarding "not published API", if I don't want to use extra threads and I
> don't want to poll and I don't want to check every few ms, how would you
> suggest that I implement non-blocking socket calls?
Hard to answer specifically, given you haven't shared much about the
problem you are trying to solve.
One way would be to queue the requests to a pool of one or more worker
threads that make blocking calls.
You realize that blocking only happens if your send queue on a socket
exceeds the amount of data allowed to be pending send on a given
socket at a time, right? It's not like "can block" means "might block
for no reason at all just to teach you to not use blocking calls".
Blocking happens for a reason, and if you avoid giving it the reason,
it simply does what you tell it without blocking.
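To make that concrete in user-space terms (the kernel socket KPI has the
same notion of a per-socket send buffer), something like this shows the
knob involved; the 256 KB figure is just an example, not a recommendation:

#include <sys/socket.h>
#include <stdio.h>

static int size_send_buffer(int sock)
{
    int sndbuf = 256 * 1024;            /* example value only */
    socklen_t len = sizeof(sndbuf);

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sndbuf, len) == -1) {
        perror("setsockopt(SO_SNDBUF)");
        return -1;
    }
    /* Read back the value the kernel actually granted. */
    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len) == -1) {
        perror("getsockopt(SO_SNDBUF)");
        return -1;
    }
    printf("send buffer is %d bytes; keep less than that queued and a "
           "blocking send won't block\n", sndbuf);
    return sndbuf;
}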
If you have an upper level recovery protocol, you can use the acks there
to pace your output to the sockets, and thereby guarantee that you will
never fill a given socket's send queue. One technique that works well
here is called "rate halving".
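Roughly, in code (names and the starting window are invented for
illustration; "rate halving" proper is a TCP loss-recovery technique,
this is just its application-level flavor):

#include <stddef.h>

struct paced_conn {
    size_t window;      /* max unacked bytes we allow ourselves   */
    size_t in_flight;   /* bytes sent but not yet acked upstream  */
};

static int can_send(struct paced_conn *c, size_t len)
{
    return c->in_flight + len <= c->window;
}

static void on_sent(struct paced_conn *c, size_t len)
{
    c->in_flight += len;
}

static void on_upper_level_ack(struct paced_conn *c, size_t len)
{
    c->in_flight -= len;
}

/* On a stall (ack overdue), back off in the spirit of rate halving. */
static void on_stall(struct paced_conn *c)
{
    if (c->window > 4096)
        c->window /= 2;
}

Because you never hand the socket more unacked data than the window
allows, its send queue can't fill, and the sends never block.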
If you think you will occasionally fill a send queue because your
upper level protocol isn't that smart, you can pick a number out of
the 1000 where you think that is probable, and add one.
At that point, you marshall your writes using queues to a pool of that
many work-to-do threads, and they do the blocking writes on your
behalf. Because there are only ever N of these, with N+1 threads, you
always have one available to do work, so there's no starvation.
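In user-space terms the pool looks something like this; in a kext you'd
use the kernel's own thread and synchronization primitives instead, but
the shape is the same (all names invented):

#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>

struct send_req {
    int              fd;     /* socket to write to */
    const void      *buf;    /* data to send       */
    size_t           len;
    struct send_req *next;
};

static struct send_req *head, *tail;
static pthread_mutex_t  qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   qcond = PTHREAD_COND_INITIALIZER;

/* Called from the hot path: O(1), never blocks on the network. */
void enqueue_send(struct send_req *req)
{
    req->next = NULL;
    pthread_mutex_lock(&qlock);
    if (tail) tail->next = req; else head = req;
    tail = req;
    pthread_cond_signal(&qcond);
    pthread_mutex_unlock(&qlock);
}

/* Each of the N+1 workers runs this loop and eats the blocking. */
static void *send_worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&qlock);
        while (head == NULL)
            pthread_cond_wait(&qcond, &qlock);
        struct send_req *req = head;
        head = req->next;
        if (head == NULL) tail = NULL;
        pthread_mutex_unlock(&qlock);

        (void)send(req->fd, req->buf, req->len, 0); /* may block; fine here */
        free(req);
    }
    return NULL;
}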
You could allocate worker threads on demand; this assumes you know for
a fact N won't get large; you'd probably want it administratively bound
at an upper limit anyway.
Another alternative is to use timers to interrupt blocked sends after
a while.
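In user-space terms, SO_SNDTIMEO is exactly that mechanism: a blocked
send returns once the timer fires (with EAGAIN if nothing was written).
The two-second value here is only an example:

#include <sys/socket.h>
#include <sys/time.h>

static int set_send_timeout(int sock)
{
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    return setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}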
Realize, though, that if full send queues are the rule rather than the
exception, your protocol design is probably flawed.
If you can tolerate losses/gaps, you should consider RED queuing.
Random Early Drop lets you shed work that will ultimately not complete
before you have paid to push it through an expensive-to-run stack of
software. The cheapest work is work you don't do.
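The core of the idea, leaving out the queue-size averaging that real RED
does (thresholds here are illustrative):

#include <stdlib.h>

#define RED_MIN  64        /* start dropping above this queue depth  */
#define RED_MAX  256       /* drop everything above this queue depth */

static int red_should_drop(unsigned queue_depth)
{
    if (queue_depth <= RED_MIN)
        return 0;
    if (queue_depth >= RED_MAX)
        return 1;
    /* drop probability grows linearly between the two watermarks */
    double p = (double)(queue_depth - RED_MIN) / (RED_MAX - RED_MIN);
    return ((double)rand() / RAND_MAX) < p;
}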
In the limit, using upcalls to retry pending sends that are pending
because the last attempt got EWOULDBLOCK is just a poor man's
implementation of polling using opportunistic timers, rather than
explicit timers. This has the disadvantage of you taking "lbolt"s to
run what is, in effect, a soft interrupt (netisr-style) polling for
available unsent data. This happens, even if you have no work to do,
and so it's just useless overhead. If your steady-state is not "full
send queues with even more data pending enqueuing", then these will
mostly have no work to do. If so, you are better off using a non-
periodic oneshot timer, which has the advantage of you being able to
only turn it on if you know you have data in this state. A more
sophisticated program would order a timer list in ascending expiration
order, and include expectation latency in calculating the expire time
(e.g. by keeping a per-connection moving average and using that to
calculate a "retry this send after I expect the send to not fail,
based on this client's past history").
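A rough sketch of that bookkeeping (names and the 1/8 smoothing factor
are illustrative; the factor is the one TCP's SRTT estimate
traditionally uses):

#include <stdint.h>

struct conn_retry {
    uint64_t avg_drain_us;   /* smoothed estimate of queue-drain time */
};

/* Call each time a previously blocked send finally goes through,
 * passing how long the queue took to drain. */
static void update_drain_estimate(struct conn_retry *c, uint64_t measured_us)
{
    if (c->avg_drain_us == 0) {
        c->avg_drain_us = measured_us;
    } else {
        /* EWMA: new = old + (sample - old) / 8 */
        int64_t delta = (int64_t)measured_us - (int64_t)c->avg_drain_us;
        c->avg_drain_us = (uint64_t)((int64_t)c->avg_drain_us + delta / 8);
    }
}

/* When a send returns EWOULDBLOCK, arm the one-shot timer this far out. */
static uint64_t retry_delay_us(const struct conn_retry *c)
{
    return c->avg_drain_us ? c->avg_drain_us : 1000; /* 1 ms default guess */
}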
Also, since no one has bothered mentioning it, doing large data
transfers in your upcalls adds latency. So does doing large data
transfers in what is supposed to be your hot path. So using a quick
hand off of the data transfer to another thread eliminates latency (a
send is essentially a request to address a message, then copy the data
from your buffers to mbufs, and stuff the mbuf chain on a socket send
queue). So more threads don't equal bad performance, just like they
don't mean good performance, either: they are just a tool, best if
used appropriately (like the timers, in the previous example).
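As a sketch, the upcall should do no data movement at all, just note the
state change and wake a worker. This assumes the usual (socket_t,
cookie, waitf) upcall shape; check <sys/kpi_socket.h> for the exact
prototype on your system:

#include <sys/kpi_socket.h>

struct conn_state {
    volatile int writable;   /* set by the upcall, consumed by the worker */
    /* ... plus whatever wakeup primitive you use (wait queue, mtx/cv) */
};

static void my_send_upcall(socket_t so, void *cookie, int waitf)
{
    (void)so;
    (void)waitf;
    struct conn_state *cs = cookie;

    cs->writable = 1;
    /* wake the worker here; the actual sock_sendmbuf() happens on its
     * thread, off the hot path, so the upcall itself stays cheap */
}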
Ultimately, it comes down to knowing your problem space and knowing
appropriate algorithms for mapping code to that space effectively.
Finally, you could also do what you want in user space, where there
are a lot of useful tools, like kqueue, that already report the type
of events you are interested in, but which are simply not available in
kernel space. Being in the kernel doesn't automatically mean faster; for
anything doing sustained communications, you will get a much bigger win
from reducing latency and amortizing fixed costs over as many units as
possible, rather than taking those costs as per-unit hits.
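For example, with kqueue you register interest in EVFILT_WRITE and get
told exactly when the socket can take more data, and how much, with no
polling (user-space sketch):

#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>

static void wait_until_writable(int sock)
{
    int kq = kqueue();
    struct kevent change, event;

    EV_SET(&change, sock, EVFILT_WRITE, EV_ADD | EV_ENABLE, 0, 0, NULL);

    /* Blocks until the socket has send-buffer space; event.data holds
     * the number of bytes that can be written without blocking. */
    if (kevent(kq, &change, 1, &event, 1, NULL) == 1) {
        /* safe to write up to event.data bytes now */
    }
    close(kq);
}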
-- Terry