Re: Reducing latency in TCP-based streams
- Subject: Re: Reducing latency in TCP-based streams
- From: Marshall Eubanks <email@hidden>
- Date: Wed, 11 Mar 2009 08:18:51 -0400
Hello;
On Mar 11, 2009, at 7:20 AM, Thomas Tempelmann wrote:
I am writing an app which is time-critical in transmitting small
amounts of data between two peers over the Internet (i.e. WAN, not
just a local network).
I am trying to understand the latencies that may be involved.
I think you need to dig into this more than you are going to get from
a mailing list. If you are going to use TCP, you will need to understand
it better.
I need some clarifications on how the TCP network stack on OS X
packages data going in and out. I am looking to optimize my code so
that it causes the least possible latency.
I think it is the BSD stack, and quite standards compliant.
First, I understand that sending and receiving UDP datagrams is very
straight-forward: If I create a datagram and send it out, it gets
into the I/O queue immediately and gets sent out as one packet over,
for instance, Ethernet. No delays involved, if the queue is
otherwise empty. Correct?
All of those steps take time, and there may be hardware buffers in,
e.g., the NIC card. Everything induces a delay. I suspect you mean
queuing, which is a different matter.
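For reference, the whole UDP send path from the application's point of
view is a single call. Here is a minimal sketch in C against the BSD
sockets API (the address, port, and payload are placeholders for
illustration, not anything from your application):

/* Minimal UDP send: one sendto() call becomes one datagram on the wire,
   assuming the payload fits within the path MTU. Address and port are
   placeholders. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int send_one_datagram(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(5000);                      /* placeholder port */
    inet_pton(AF_INET, "192.0.2.1", &peer.sin_addr);  /* placeholder address */

    const char payload[] = "small time-critical message";
    ssize_t n = sendto(s, payload, sizeof(payload), 0,
                       (struct sockaddr *)&peer, sizeof(peer));
    close(s);
    return (n < 0) ? -1 : 0;
}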
For TCP there are extra packets being sent (for example, 3 packets to
set up a connection), so even if you are the only source of traffic your
data packets may be queued.
Run tcpdump and look at the traffic being generated even when you are
not doing anything. You can never assume that your packets will not be
queued, and you should not assume that packet delays are fixed or
predictable.
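For example, "sudo tcpdump -n -i en0 tcp" (substitute your own interface
name) will show every TCP segment leaving and arriving on that interface,
including the handshake, ACKs, and retransmissions you did not generate
yourself.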
For TCP, though, I assume this works completely differently: Since the
APIs we get to use in an app (i.e. CFStreams, or BSD sockets or
whatever they're called), see the data as an unpackaged stream, and
since TCP is most effective if it uses Ethernet packets at their
full capacity, there's some delaying and collecting going on before
data ends up on the Ethernet and beyond.
I try to understand those collection mechanisms in order to see if I
need to adjust for them.
There are several good books about TCP and networking...
This all assumes that the amount of data per time is still far below
the capacity of the network, i.e. delivery is generally possible
without congestion.
Here's an example:
Let's assume there's no outgoing network data in queue.
Now I send 2000 bytes to the outgoing network stream.
An Ethernet packet takes roughly 1500 bytes.
You need to worry about and avoid fragmentation, which will slow your
data. You cannot trust path MTU discovery to be available, so you need
to set an MTU yourself.
The IP header is 20 bytes (+ options)
The TCP header is 20 bytes (+ options)
The UDP header is 8 bytes
The standard data MTU for Ethernet is 1500 bytes. However, for IP that
is the total packet size, and some of that has to go to your IP
headers, so you don't get to use 1500 bytes for your actual data.
Worse, if your packets go through tunnels then extra headers are added
and even a 1500 byte packet will have to be fragmented by the tunnel,
which also can add latency. I generally recommend 1450 bytes for the
data MTU for UDP to be safe.
This assumes IPv4 by the way - IPv6 headers are bigger.
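As a worked version of that arithmetic (the 1450-byte figure is just my
conservative recommendation above, not a protocol constant):

/* Payload budget for a single datagram over Ethernet with IPv4.
   Header sizes are the no-options minimums quoted above. */
enum {
    ETHERNET_DATA_MTU = 1500,
    IPV4_HEADER       = 20,   /* + options */
    TCP_HEADER        = 20,   /* + options */
    UDP_HEADER        = 8,

    MAX_UDP_PAYLOAD   = ETHERNET_DATA_MTU - IPV4_HEADER - UDP_HEADER, /* 1472 */
    MAX_TCP_PAYLOAD   = ETHERNET_DATA_MTU - IPV4_HEADER - TCP_HEADER, /* 1460 */
    SAFE_UDP_PAYLOAD  = 1450  /* leaves room for tunnel overhead */
};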
Hence, a first packet with the first 1500 bytes from my stream could
be sent out immediately.
For the rest, 500 bytes, some layer might decide to hold off sending
it out because there might be more coming, to fill up another
Ethernet packet. But after some waiting, it'll surely decide that
it's now time to send it out despite it being a smaller-than-optimal
packet.
Is that basically how it works?
If so, I wonder if I can assume that the network stack is able to
notice when the physical layer is idle, and then sends out the small
(500 byte) packet without further waiting, since it would not pay off
to wait any longer when there's an open window for data anyway
(let's ignore potential Ethernet collisions here).
You can force that. An application can force delivery of segments to
the output stream using the push operation provided by TCP and setting
the PSH flag to ensure that data will be delivered immediately to the
application layer by the receiving transport layer. This is how telnet
sends single keystrokes, for example.
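At the BSD sockets level on OS X, the usual way for an application to get
that behavior is to disable Nagle's algorithm with the TCP_NODELAY socket
option, so small writes are not held back waiting to coalesce with later
data. This is a sketch of my suggestion, not something the stream API does
for you; if you are using CFStreams you would first have to get at the
native socket handle (kCFStreamPropertySocketNativeHandle) to set it:

/* Disable Nagle's algorithm so small writes go out immediately instead of
   being coalesced while waiting for outstanding ACKs. Error handling
   trimmed; sketch only. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

int disable_nagle(int tcp_socket)
{
    int one = 1;
    return setsockopt(tcp_socket, IPPROTO_TCP, TCP_NODELAY,
                      &one, sizeof(one));
}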
Or will there be a waiting period in any case, thus sending out the
500 bytes later than if I had stuffed it up with another 1000 bytes
right away so that it gets sent without delay?
That's basically what worries me: Can I decrease latency by making
sure I stuff more data into the stream to ensure prompt delivery of
my actual payload?
In a simpler example:
What if I only send 10 bytes once every second, but I want
these 10 bytes delivered ASAP, using TCP not UDP.
You haven't really said anything about what you are trying to do, but
if I were you I would consider using UDP with RTP to provide for
sequence numbers on the data, so the application at the far end would
know if data were dropped. If your data sizes are small, you should
consider RTP header compression as well.
In general, don't reinvent the wheel, especially if you are not sure
what the wheel is actually doing.
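If full RTP is more than you want to take on, even a hand-rolled sequence
number and timestamp in front of each datagram lets the far end detect
loss and reordering. The sketch below is only a simplified stand-in for
what RTP already gives you; the header layout is made up for illustration:

/* Simplified application-level header prepended to each UDP datagram,
   standing in for the sequence numbering RTP provides. Layout is
   illustrative only. */
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

struct app_header {
    uint32_t seq;        /* incremented per datagram, network byte order */
    uint32_t timestamp;  /* sender timestamp, network byte order */
};

/* Copies header + payload into out_buf; returns the total length. */
size_t build_datagram(uint32_t seq, uint32_t timestamp,
                      const void *payload, size_t payload_len,
                      uint8_t *out_buf)
{
    struct app_header h;
    h.seq = htonl(seq);
    h.timestamp = htonl(timestamp);
    memcpy(out_buf, &h, sizeof(h));
    memcpy(out_buf + sizeof(h), payload, payload_len);
    return sizeof(h) + payload_len;
}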
Regards
Marshall
--
Thomas Tempelmann, http://www.tempel.org/
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Macnetworkprog mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden