RE: Asynchronous sock_sendmbuf
- Subject: RE: Asynchronous sock_sendmbuf
- From: "Eddy Quicksall" <email@hidden>
- Date: Fri, 23 May 2008 09:51:53 -0400
Sorry, Terry ... I never really answered your question regarding my problem.
I'm porting some code that was written for Windows. That code uses WSA
Sockets. With WSASend, you specify a callback that is invoked "when the send
operation has been completed" ... and by "completed" it really only needs to
mean that everything has been copied to internal buffers.
The upper level code is completely system- and transport-independent, so I
can't modify it. Instead, I simulate WSA Sockets entirely on top of
so_send/so_recv and upcalls. Each operating system gets its own simulation
routine.
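For concreteness, here is a minimal sketch of what that simulation layer
might look like on Darwin, assuming the kernel socket KPI in
<sys/kpi_socket.h> (sock_socket and the upcall signature come from that
header; the wsa_sim_* names are purely illustrative):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <sys/kpi_socket.h>

    struct wsa_sim_ctx {
        socket_t so;               /* kernel socket handle               */
        /* ... per-connection state for the WSA simulation ...           */
    };

    /* Upcall: the stack invokes this when the socket has activity.
     * It should do almost no work itself (see the next sketch). */
    static void
    wsa_sim_upcall(socket_t so, void *cookie, int waitf)
    {
        struct wsa_sim_ctx *ctx = cookie;
        (void)so; (void)waitf; (void)ctx;
        /* note that 'ctx' needs service and wake the single thread */
    }

    static errno_t
    wsa_sim_open(struct wsa_sim_ctx *ctx)
    {
        /* The upcall and its cookie are registered at creation time. */
        return sock_socket(PF_INET, SOCK_STREAM, IPPROTO_TCP,
                           wsa_sim_upcall, ctx, &ctx->so);
    }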
Everything runs in a single thread. That thread sleeps when there is no
I/O, and the upcall wakes it up when needed. Technically, I suppose the
upcall runs on a thread, but it is not one of my threads. I understand the
technique of creating extra worker threads, but that has its limits given
the potentially high number of connections.
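A sketch of that sleep/wake pattern, assuming the BSD msleep()/wakeup()
primitives and a lck_mtx allocated elsewhere (any equivalent event mechanism
would do; the names are again illustrative):

    #include <sys/param.h>       /* PSOCK          */
    #include <sys/systm.h>       /* msleep, wakeup */
    #include <kern/locks.h>      /* lck_mtx_t      */
    #include <sys/kpi_socket.h>

    static lck_mtx_t *g_lock;          /* protects g_work_pending       */
    static int        g_work_pending;  /* set by upcalls, cleared below */

    static void
    wsa_sim_upcall(socket_t so, void *cookie, int waitf)
    {
        (void)so; (void)cookie; (void)waitf;
        /* Do no real work here: note the event and wake the thread. */
        lck_mtx_lock(g_lock);
        g_work_pending = 1;
        wakeup(&g_work_pending);
        lck_mtx_unlock(g_lock);
    }

    static void
    wsa_sim_service_loop(void)
    {
        lck_mtx_lock(g_lock);
        for (;;) {
            while (!g_work_pending)
                msleep(&g_work_pending, g_lock, PSOCK, "wsasim", NULL);
            g_work_pending = 0;
            lck_mtx_unlock(g_lock);
            /* ... walk the connections, do the pending sends/recvs ... */
            lck_mtx_lock(g_lock);
        }
    }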
I have this mostly working now, but I still need to understand the
so_send/so_recv/upcall machinery better to be sure. If a system upgrade or a
difference between BSD implementations changes things, that is acceptable,
because all of the changes are isolated in one source file (the file that
makes the actual socket calls).
I know that Apple enthusiasts don't want to acknowledge Microsoft, so I
don't blame you if you don't want to help ... but this software must run on
an Apple (I'm starting with BSD for the moment, however).
Eddy
-----Original Message-----
From: Terry Lambert [mailto:email@hidden]
Sent: Friday, May 23, 2008 6:18 AM
To: Eddy Quicksall
Cc: Igor Mikushkin; email@hidden
Subject: Re: Asynchronous sock_sendmbuf
On May 22, 2008, at 7:10 PM, Eddy Quicksall wrote:
> Maybe the efficiency would still be fairly good, but one thing is for
> sure: it would take lots more memory to have 1000 or so threads just
> to handle what can be handled with a single thread.
>
> Regarding upper level recovery ... any networking software must be
> aware that the connection can be lost at any time. It is not the
> responsibility of the lower layers to recover from a lost connection.
> For good upper layer protocols, this is built in. For example, iSCSI
> has lots of mechanisms to deal with this.
>
> I'll look into sototcpcb(so). Thanks for that tip.
>
> Regarding "not published API": if I don't want to use extra threads,
> don't want to poll, and don't want to check every few ms, how would
> you suggest that I implement non-blocking socket calls?
Hard to answer specifically, given you haven't shared much about the
problem you are trying to solve.
One way would be to queue the requests to a pool of one or more worker
threads that make blocking calls.
You realize that blocking only happens if your send queue on a socket
exceeds the amount of data allowed to be pending send on a given
socket at a time, right? It's not like "can block" means "might block
for no reason at all just to teach you to not use blocking calls".
Blocking happens for a reason, and if you avoid giving it the reason,
it simply does what you tell it without blocking.
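As an illustration of "avoid giving it the reason to block", here is a
hedged sketch of a non-blocking send attempt through the kernel KPI.
MSG_DONTWAIT and the EWOULDBLOCK handling are standard BSD semantics; check
the sock_sendmbuf documentation for who owns the mbuf chain on error before
reusing one:

    #include <sys/socket.h>      /* MSG_DONTWAIT, struct msghdr */
    #include <sys/errno.h>       /* EWOULDBLOCK                 */
    #include <sys/kpi_socket.h>
    #include <sys/kpi_mbuf.h>

    static errno_t
    try_send(socket_t so, mbuf_t chain, size_t *sentlen)
    {
        struct msghdr msg = { 0 };   /* no address: connected TCP socket */

        errno_t err = sock_sendmbuf(so, &msg, chain, MSG_DONTWAIT, sentlen);
        if (err == EWOULDBLOCK) {
            /* The send buffer is full.  Arrange to retry later (write
             * upcall or one-shot timer); see the KPI's ownership rules
             * for the mbuf chain on error before retrying with it. */
        }
        return err;
    }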
If you have an upper level recovery protocol, you can use the acks there to
pace your output to the sockets, and thereby guarantee that you will
never fill a given socket's send queue. One technique that works well
here is called "rate halving".
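A trivial illustration of ack-based pacing (this is just a per-connection
credit counter, not the "rate halving" algorithm itself; all names and the
window size are assumptions):

    /* 'credits' starts at the chosen window size per connection. */
    struct conn_pacer {
        unsigned credits;          /* messages we may still send     */
    };

    static int
    pacer_can_send(const struct conn_pacer *p)
    {
        return p->credits > 0;
    }

    static void
    pacer_on_send(struct conn_pacer *p)
    {
        p->credits--;              /* one more message outstanding   */
    }

    static void
    pacer_on_upper_level_ack(struct conn_pacer *p)
    {
        p->credits++;              /* the peer consumed one message  */
    }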
If you think you will occasionally fill a send queue because your
upper level protocol isn't that smart, you can pick the number of
connections (out of the 1000) for which you think that is probable, and
add one.
At that point, you marshal your writes through queues to a pool of that
many work-to-do threads, and they do the blocking writes on your
behalf. Because only N of those can ever block at once, with N+1 threads
you always have one available to do work, so there's no starvation.
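A minimal user-space sketch of that queue-plus-worker-pool structure, using
pthreads for brevity (the same shape applies with kernel threads; all names
are illustrative):

    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/socket.h>

    struct write_req {
        int               fd;      /* socket to write to */
        const void       *buf;     /* data to send       */
        size_t            len;
        struct write_req *next;
    };

    static pthread_mutex_t  q_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t   q_cv   = PTHREAD_COND_INITIALIZER;
    static struct write_req *q_head, *q_tail;

    /* Producer: the hot path just enqueues and returns immediately. */
    void queue_write(struct write_req *req)
    {
        req->next = NULL;
        pthread_mutex_lock(&q_lock);
        if (q_tail)
            q_tail->next = req;
        else
            q_head = req;
        q_tail = req;
        pthread_cond_signal(&q_cv);
        pthread_mutex_unlock(&q_lock);
    }

    /* Each of the N+1 workers: dequeue and do the blocking send. */
    void *writer_thread(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&q_lock);
            while (q_head == NULL)
                pthread_cond_wait(&q_cv, &q_lock);
            struct write_req *req = q_head;
            q_head = req->next;
            if (q_head == NULL)
                q_tail = NULL;
            pthread_mutex_unlock(&q_lock);

            /* Blocking is confined to this worker; it cannot stall
             * the producer or the other workers. */
            (void)send(req->fd, req->buf, req->len, 0);
            free(req);
        }
        return NULL;
    }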
You could allocate worker threads on demand; this assumes you know for
a fact that N won't get large, and you'd probably want it administratively
bounded at an upper limit anyway.
Another alternative is to use timers to interrupt blocked sends after
a while.
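At the plain BSD sockets level, the usual way to bound how long a blocking
send can stall is a send timeout, after which the blocked send() gives up
(returning a partial count or failing with EWOULDBLOCK). This sketch assumes
user space; an analogous socket-option call exists in the kernel KPI:

    #include <sys/socket.h>
    #include <sys/time.h>

    /* Limit how long a blocking send may wait for send-buffer space. */
    static int set_send_timeout(int fd, int seconds)
    {
        struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
        return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
    }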
Realize, though, that if full send queues are the rule rather than the
exception, your protocol design is probably flawed.
If you can tolerate losses/gaps, you should consider RED queuing.
Random Early Drop lets you avoid spending effort on work that will
ultimately never complete, before you have pushed it through an
expensive-to-run stack of software. The cheapest work is work you
don't do.
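A simplified sketch of the early-drop idea (real RED computes the drop
probability from an averaged queue length; the thresholds and the linear
ramp here are arbitrary illustrations):

    #include <stdlib.h>

    #define Q_MIN   64     /* below this, never drop        */
    #define Q_MAX  256     /* at or above this, always drop */

    /* Decide, before doing any expensive work, whether to drop this
     * new unit of work based on current queue occupancy. */
    static int red_should_drop(unsigned queue_len)
    {
        if (queue_len < Q_MIN)
            return 0;
        if (queue_len >= Q_MAX)
            return 1;
        /* Drop probability ramps linearly from 0 to 1 between marks. */
        unsigned span = Q_MAX - Q_MIN;
        return (unsigned)(rand() % span) < (queue_len - Q_MIN);
    }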
In the limit, using upcalls to retry sends that are pending because the
last attempt got EWOULDBLOCK is just a poor man's implementation of
polling with opportunistic timers rather than explicit timers. This has
the disadvantage of you taking "lbolt"s to run what is, in effect, a
soft interrupt (netisr-style) poll for unsent data. That happens even
when you have no work to do, so it's just useless overhead. If your
steady state is not "full send queues with even more data pending
enqueuing", then these polls will mostly find no work to do. If so, you
are better off using a one-shot timer, which has the advantage that you
only turn it on when you know you have data in this state. A more
sophisticated program would order a timer list in ascending expiration
order, and include the expected latency in calculating the expiration
time (e.g. by keeping a per-connection moving average and using it to
compute a "retry this send once I expect it not to fail, based on this
client's past history").
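A sketch of that per-connection moving average, assumed here to be a simple
exponentially weighted average of how long the send queue takes to drain;
the 1/8 gain, the default, and the field names are illustrative:

    #include <stdint.h>

    struct conn_timing {
        uint64_t avg_drain_usec;   /* smoothed send-queue drain time */
    };

    /* Update the moving average with a newly observed drain time. */
    static void timing_observe(struct conn_timing *t, uint64_t sample_usec)
    {
        if (t->avg_drain_usec == 0) {
            t->avg_drain_usec = sample_usec;
        } else {
            /* avg += (sample - avg) / 8 */
            int64_t delta = (int64_t)sample_usec - (int64_t)t->avg_drain_usec;
            t->avg_drain_usec = (uint64_t)((int64_t)t->avg_drain_usec + delta / 8);
        }
    }

    /* When a send hits EWOULDBLOCK, arm the one-shot timer this far out. */
    static uint64_t timing_retry_delay_usec(const struct conn_timing *t)
    {
        return t->avg_drain_usec ? t->avg_drain_usec : 1000; /* 1 ms default */
    }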
Also, since no one has bothered mentioning it, doing large data
transfers in your upcalls adds latency. So does doing large data
transfers in what is supposed to be your hot path. A quick hand-off of
the data transfer to another thread avoids that latency (a send is
essentially a request to address a message, copy the data from your
buffers to mbufs, and stuff the mbuf chain on a socket send queue). So
more threads don't mean bad performance, just as they don't mean good
performance either: they are just a tool, best used appropriately (like
the timers in the previous example).
Ultimately, it comes down to knowing your problem space and knowing
appropriate algorithms for mapping code to that space effectively.
Finally, you could also do what you want in user space, where there
are a lot of useful tools, like kqueue, that already report the type
of events you are interested in, but which are simply not available in
kernel space. Being in the kernel doesn't automatically mean faster; for
anything doing sustained communications, you will get a much bigger win
from reducing latency and amortizing fixed costs over as many units as
possible, rather than taking a per-unit hit.
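For example, a hedged user-space sketch of waiting for writability with
kqueue (EVFILT_WRITE with a one-shot registration), which is the sort of
event report referred to above; error handling is trimmed:

    #include <sys/event.h>
    #include <sys/socket.h>

    /* Sleep in kevent() until 'fd' has room in its send buffer again.
     * 'kq' is a descriptor previously obtained from kqueue(). */
    static int wait_until_writable(int kq, int fd)
    {
        struct kevent change, event;

        /* One-shot: the filter fires once when fd can accept more data. */
        EV_SET(&change, fd, EVFILT_WRITE, EV_ADD | EV_ONESHOT, 0, 0, NULL);

        /* Register the change and wait for one event in a single call. */
        return kevent(kq, &change, 1, &event, 1, NULL);
    }

    /* Typical use: attempt send() on a non-blocking socket; on
     * EWOULDBLOCK, call wait_until_writable(kq, fd) and retry. */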
-- Terry