RE: Asynchronous sock_sendmbuf
- Subject: RE: Asynchronous sock_sendmbuf
- From: "Eddy Quicksall" <email@hidden>
- Date: Fri, 23 May 2008 14:25:29 -0400
Thanks for the input.
I'm not using sock_sendmbuf, I'm using so_send.
I already invoke the callback if so_send shows that everything was copied.
But if everything was not copied, then I can't invoke the callback yet. I have
found that I get an upcall when the TCP ACKs arrive. At that time I check
whether the ACK belongs to the most recent outstanding send. If so, I go
back to my main thread, where I invoke the callback.
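Roughly the shape of it, sketched against the published kpi_socket.h upcall
(untested and heavily simplified; the struct, field, and function names are
made up, and how I actually decide that the last send is fully acked is
elided):

#include <sys/kpi_socket.h>
#include <sys/systm.h>

/* Illustrative per-connection state; the names are invented for this sketch. */
struct conn {
    socket_t   so;
    size_t     unacked;                     /* bytes given to the socket, not yet acked */
    void     (*send_done)(struct conn *);   /* the WSASend-style completion callback */
};

/*
 * Upcall registered when the socket is created with sock_socket().  It runs
 * on socket activity, which includes incoming ACKs freeing send-buffer
 * space, so it is the place to notice that an earlier partial send has
 * drained.  Keep it minimal: just wake the single service thread.
 */
static void
conn_upcall(socket_t so, void *cookie, int waitf)
{
#pragma unused(so, waitf)
    struct conn *c = cookie;
    wakeup(c);                              /* service thread sleeps on c */
}

/*
 * Service-thread side: after waking, fire the completion callback once the
 * outstanding send has been fully acknowledged.
 */
static void
conn_check_send_complete(struct conn *c)
{
    if (c->unacked == 0 && c->send_done != NULL)
        c->send_done(c);
}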
Eddy
-----Original Message-----
From: Vincent Lubet [mailto:email@hidden]
Sent: Friday, May 23, 2008 12:22 PM
To: Eddy Quicksall
Cc: darwinKernel Dev
Subject: Re: Asynchronous sock_sendmbuf
Eddy,
There is no such upcall for when data has been copied into the
internal buffer. If the call to sock_sendmbuf() -- or sock_send() --
succeeds then you can be sure the data has been copied. If you really
need to emulate this WSA callback, why not simply invoke the callback
at the point where sock_sendmbuf() returns?
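Something along these lines, for instance (untested sketch; the wrapper and
the callback type are made up, sock_sendmbuf() is the only real KPI call in
it):

#include <sys/kpi_socket.h>
#include <sys/kpi_mbuf.h>

/* Sketch of a WSASend-style completion callback type (not part of the KPI). */
typedef void (*send_complete_cb)(void *ctx, size_t sent, errno_t err);

static errno_t
send_with_completion(socket_t so, mbuf_t data, send_complete_cb cb, void *ctx)
{
    size_t  sent = 0;
    errno_t err  = sock_sendmbuf(so, NULL, data, 0, &sent);

    /*
     * Whether it succeeded or not, the operation is over from the caller's
     * point of view: on success the data has been copied/queued by the
     * socket layer, which is all "completed" means in the WSASend sense,
     * so fire the callback right away with the result.
     */
    cb(ctx, sent, err);
    return err;
}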
Vincent
On May 23, 2008, at 6:51 AM, Eddy Quicksall wrote:
> Sorry, Terry ... I never really answered your question regarding my
> problem.
>
> I'm porting some code that was written for Windows. That code uses WSA
> Sockets. With WSASend, you specify a callback that will occur "when
> the send
> operation has been completed" ... by "completed" it really only
> needs to
> mean everything has been copied to internal buffers.
>
> The upper level code is totally system and transport independent so
> I can't
> modify it. So I simulate WSA Sockets completely using so_send/
> so_recv and
> upcalls. All different operating systems will use a different
> simulation
> routine.
>
> Everything runs in a single thread. That thread will sleep when
> there is no
> I/O and the upcall will wake that up if it needs to. Technically I
> guess the
> upcall may be a thread but it is not one of my threads. I understand
> the
> technique of making other worker threads but that has its limits due
> to a
> high number of potential connections.
>
> I have this mostly working now but still need to understand more
> about the
> so_send/so_recv/up_call stuff to be sure. If there is a system
> upgrade or
> difference in other BSD implementations then that is acceptable
> because all
> of the changes will be isolated into one source file (the file that
> does the
> actual socket calls).
>
> I know that Apple enthusiasts don't want to recognize Microsoft so I
> don't
> blame you if you don't want to help ... but this software must run
> on an
> Apple (I'm starting with BSD at this moment, however).
>
> Eddy
>
> -----Original Message-----
> From: Terry Lambert [mailto:email@hidden]
> Sent: Friday, May 23, 2008 6:18 AM
> To: Eddy Quicksall
> Cc: Igor Mikushkin; email@hidden
> Subject: Re: Asynchronous sock_sendmbuf
>
> On May 22, 2008, at 7:10 PM, Eddy Quicksall wrote:
>> Maybe the efficiency would still be fairly good, but one thing is
>> for sure.
>> It would take lots more memory to have 1000 or so threads just to
>> handle
>> what can be handled with a single thread.
>>
>> Regarding upper level recovery ... any networking software must be
>> aware
>> that the connection can be lost at any time. It is not the
>> responsibility
>> of the lower layers to recover from a lost connection. For good
>> upper layer
>> protocols, this is built into them. For example, iSCSI has lots of
>> mechanisms to deal with this.
>>
>> I'll look into sototcpcb(so). Thanks for that tip.
>>
>> Regarding " not published API", if I don't want to use extra threads
>> and I
>> don't want to poll and I don't want to check every few ms, how would
>> you
>> suggest that I implement non-blocking socket calls?
>
> Hard to answer specifically, given you haven't shared much about the
> problem you are trying to solve.
>
> One way would be to queue the requests to a pool of one or more worker
> threads that make blocking calls.
>
> You realize that blocking only happens if your send queue on a socket
> exceeds the amount of data allowed to be pending send on a given
> socket at a time, right? It's not like "can block" means "might block
> for no reason at all just to teach you to not use blocking calls".
> Blocking happens for a reason, and if you avoid giving it the reason,
> it simply does what you tell it without blocking.
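The amount allowed to be pending at a time is the socket's send buffer, which
the kernel socket KPI can also inspect or grow; a sketch, with error handling
omitted:

#include <sys/kpi_socket.h>
#include <sys/socket.h>

/*
 * Sketch: a blocking send only blocks once the data queued on the socket
 * would exceed the send buffer, so sizing SO_SNDBUF to the largest burst
 * you expect is one way to avoid giving it a reason to block.
 */
static void
grow_send_buffer(socket_t so, int bytes)
{
    int cur = 0, len = sizeof(cur);

    (void)sock_getsockopt(so, SOL_SOCKET, SO_SNDBUF, &cur, &len);
    if (cur < bytes)
        (void)sock_setsockopt(so, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes));
}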
>
> If you have an upper level recovery protocol, you can use the acks
> there to
> pace your output to the sockets, and thereby guarantee that you will
> never fill a given sockets send queue. One technique that works well
> here is called "rate halving".
>
> If you think you will occasionally fill a send queue because your
> upper level protocol isn't that smart, estimate the number N of your
> 1000 or so connections for which that is probable at once, and add one.
>
> At that point, you marshall your writes using queues to a pool of that
> many work-to-do threads, and they do the blocking writes on your
> behalf. Because there are only ever N of these, with N+1 threads, you
> always have one available to do work, so there's no starvation.
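A minimal user-space sketch of that pool shape, just to make the structure
concrete (the names and queueing are invented; a kernel-side version would
use kernel threads and its own locking):

#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>

/* One queued write request (illustrative). */
struct send_req {
    int               fd;
    const void       *buf;
    size_t            len;
    struct send_req  *next;
};

static struct send_req *queue_head, *queue_tail;
static pthread_mutex_t  queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   queue_cv   = PTHREAD_COND_INITIALIZER;

/* Producer side: hand a write off to the pool and return immediately. */
static void
submit_send(struct send_req *r)
{
    r->next = NULL;
    pthread_mutex_lock(&queue_lock);
    if (queue_tail != NULL)
        queue_tail->next = r;
    else
        queue_head = r;
    queue_tail = r;
    pthread_cond_signal(&queue_cv);
    pthread_mutex_unlock(&queue_lock);
}

/* Each of the N+1 workers blocks in send() so the main thread never does. */
static void *
send_worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        while (queue_head == NULL)
            pthread_cond_wait(&queue_cv, &queue_lock);
        struct send_req *r = queue_head;
        queue_head = r->next;
        if (queue_head == NULL)
            queue_tail = NULL;
        pthread_mutex_unlock(&queue_lock);

        (void)send(r->fd, r->buf, r->len, 0);   /* may block; that's the point */
        free(r);
    }
    return NULL;
}

The pool itself is then just N+1 pthread_create() calls on send_worker at
startup.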
>
> You could allocate worker threads on demand; this assumes you know for
> a fact N won't get large; you'd probably want it administratively bound
> at an upper limit anyway.
>
> Another alternative is to use timers to interrupt blocked sends after
> a while.
>
> Realize, though, that if full send queues are the rule rather than the
> exception, your protocol design is probably flawed.
>
> If you can tolerate losses/gaps, you should consider RED queuing.
> Random Early Drop lets you discard work that will ultimately be unable
> to complete before you have pushed it through an expensive-to-run
> stack of software. The cheapest work is work you
> don't do.
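A toy illustration of the early-drop decision (the thresholds and the linear
probability are made up; real RED works on a smoothed average queue length):

#include <stdbool.h>
#include <stdlib.h>

/*
 * Toy RED-style early drop: below MIN_Q never drop, above MAX_Q always drop,
 * and in between drop with a probability that rises linearly with depth.
 */
#define MIN_Q  64U
#define MAX_Q  256U

static bool
should_drop(unsigned queue_depth)
{
    if (queue_depth <= MIN_Q)
        return false;
    if (queue_depth >= MAX_Q)
        return true;
    return (unsigned)(random() % (MAX_Q - MIN_Q)) < (queue_depth - MIN_Q);
}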
>
> In the limit, using upcalls to retry pending sends that are pending
> because the last attempt got EWOULDBLOCK is just a poor man's
> implementation of polling using opportunistic timers, rather than
> explicit timers. This has the disadvantage of you taking "lbolt"s to
> run what is, in effect, a soft interrupt (netisr-style) polling for
> available unsent data. This happens, even if you have no work to do,
> and so it's just useless overhead. If your steady-state is not "full
> send queues with even more data pending enqueuing", then these will
> mostly have no work to do. If so, you are better off using a non-
> periodic oneshot timer, which has the advantage of you being able to
> only turn it on if you know you have data in this state. A more
> sophisticated program would order a timer list in ascending expiration
> order, and include expectation latency in calculating the expire time
> (e.g. by keeping a per-connection moving average and using that to
> calculate a "retry this send after I expect the send to not fail,
> based on this client's past history").
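A sketch of the moving-average bookkeeping that implies (the constants and
names are arbitrary; alpha = 1/8, in the same spirit as TCP's smoothed RTT):

#include <stdint.h>

/*
 * Per-connection smoothed estimate of how long a full send queue takes to
 * drain, used to pick the one-shot retry delay.
 */
struct conn_timing {
    uint64_t avg_drain_usec;    /* exponentially weighted moving average */
};

static void
record_drain_time(struct conn_timing *t, uint64_t sample_usec)
{
    if (t->avg_drain_usec == 0) {
        t->avg_drain_usec = sample_usec;        /* first sample seeds the average */
    } else {
        /* avg += (sample - avg) / 8 */
        int64_t delta = (int64_t)sample_usec - (int64_t)t->avg_drain_usec;
        t->avg_drain_usec = (uint64_t)((int64_t)t->avg_drain_usec + delta / 8);
    }
}

static uint64_t
retry_delay_usec(const struct conn_timing *t)
{
    /* Retry a little after the queue is expected to have drained. */
    return t->avg_drain_usec + t->avg_drain_usec / 4;
}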
>
> Also, since no one has bothered mentioning it, doing large data
> transfers in your upcalls adds latency. So does doing large data
> transfers in what is supposed to be your hot path. So using a quick
> hand off of the data transfer to another thread eliminates latency (a
> send is essentially a request to address a message, then copy the data
> from your buffers to mbufs, and stuff the mbuf chain on a socket send
> queue). So more threads don't equal bad performance, just like they
> don't mean good performance, either: they are just a tool, best if
> used appropriately (like the timers, in the previous example).
>
> Ultimately, it comes down to knowing your problem space and knowing
> appropriate algorithms for mapping code to that space effectively.
>
> Finally, you could also do what you want in user space, where there
> are a lot of useful tools, like kqueue, that already report the type
> of events you are interested in, but which are simply not available in
> kernel space. Being in the kernel doesn't automatically mean faster;
> you will get a much bigger win for anything doing sustained
> communications from reducing latency and amortizing fixed costs over
> as many units as possible, rather than taking a per-unit hit for a
> cost.
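For the user-space route, the relevant kqueue piece looks roughly like this
(error handling omitted):

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

/*
 * Sketch: ask kqueue to report when sock_fd has send-buffer space again
 * (EVFILT_WRITE) instead of polling or parking a thread in a blocking send.
 */
static void
wait_until_writable(int kq, int sock_fd)
{
    struct kevent change, event;

    EV_SET(&change, sock_fd, EVFILT_WRITE, EV_ADD | EV_ONESHOT, 0, 0, NULL);
    (void)kevent(kq, &change, 1, &event, 1, NULL);  /* blocks until writable */
    /* event.data now holds the bytes of send-buffer space available. */
}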
>
> -- Terry
>