This blog post follows on from the one I wrote back in November 2019, which introduced the topic of measuring Office 365 network performance and focused on DNS optimization and best practice.
This post focuses on TCP best practices. In the next post I will move on to capturing and analysing a TCP connection to Office 365, and comparing the information available in the packet capture to TCP best practice.
TCP – Best Practices
Before we progress any further we need to talk a little about how TCP works. TCP is a connection-oriented protocol, meaning the two endpoints in any TCP connection talk to each other to negotiate the connection and ensure that packets are delivered. Both the data transfer rate and the detection and retransmission of lost packets are managed as part of the connection.
Part of that negotiation involves managing the rate at which data is sent from one end to the other. This rate is managed dynamically and is inversely proportional to the latency between the two ends, so there is a formula that defines the maximum theoretical data transfer rate for any given round trip time (RTT). This is:
Maximum throughput (in Bits Per Second) = TCP Window Size (in Bytes) * 8 bits / RTT in seconds
So for example, using a TCP window size of 64kB (I’ll discuss this more later) and a RTT of 20ms the maximum achievable throughput is 26.2Mbps. If the RTT doubles to 40ms, then the throughput halves to 13.1Mbps and so on.
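The formula above is easy to sanity-check yourself. Here is a minimal Python sketch of the calculation (the function name is my own, not from any library):

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Maximum theoretical TCP throughput: window (bytes) * 8 bits / RTT (s)."""
    return window_bytes * 8 / rtt_seconds

# 64kB window at 20ms RTT -> ~26.2Mbps
print(max_tcp_throughput_bps(64 * 1024, 0.020) / 1_000_000)  # 26.2144
# Doubling the RTT to 40ms halves the throughput -> ~13.1Mbps
print(max_tcp_throughput_bps(64 * 1024, 0.040) / 1_000_000)  # 13.1072
```

Note how the throughput falls off linearly as the RTT rises, which is why latency matters so much.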
TCP Round Trip Time
Because of the link between latency and throughput, the maximum acceptable RTT really depends on what throughput you would consider unacceptable. For example, a 100ms RTT with a 64kB window gives a maximum theoretical throughput of around 5.2Mbps, which is pretty poor… I think if you want Office 365 content to render reasonably quickly you should be looking for a typical RTT of less than 50ms for the user experience not to be terrible. If you have a TCP window scaling factor of 4 or 8 you may find a higher RTT acceptable.
TCP Window Size
What is the TCP window size? It is the amount of data a sender will transmit to a receiver without receiving an acknowledgement packet. The starting value of the window size is set by the Windows operating system based on the speed of the interface through which the TCP connection is established; for 100Mbps and 1Gbps interfaces it is 64kB (kilobytes).
TCP Window size scaling factor
The TCP window size is carried in a 2-byte header field, meaning it can never exceed 65535 bytes (64kB). In modern networks this has been deemed inadequate, so when TCP options were introduced in RFC 1323 one of the options made available allowed a multiplier to be applied to the TCP receive window. This multiplier is called the TCP window scaling factor and is fixed during the handshake at the start of a TCP connection.
The scaling factor obviously has an impact on the maximum throughput of a TCP connection as it increases the number of bytes that can be sent before waiting for an acknowledgement.
The scaling factor works as a binary bit left shift on the TCP Window Size and has the effect of multiplying the Window Size by 2^Scaling_Factor. For example a scaling factor of 4 multiplies the window size by 16, a scaling factor of 8 multiplies the window size by 256.
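The bit-shift behaviour is straightforward to demonstrate. A minimal sketch (the function name is my own):

```python
def scaled_window(window_bytes: int, scale_factor: int) -> int:
    """Apply the RFC 1323 window scale option: a binary left shift,
    i.e. multiply the advertised window by 2^scale_factor."""
    return window_bytes << scale_factor

print(scaled_window(64 * 1024, 4))  # 1048576  (64kB x 16 = 1MB)
print(scaled_window(64 * 1024, 8))  # 16777216 (64kB x 256 = 16MB)
```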
Recent Microsoft operating systems use a default scale factor of 8, and Microsoft consider a value of either 4 (described as “restricted”) or 8 (described as “normal”) acceptable.
The maximum TCP throughput with TCP Window Size Scaling taken into account is calculated as:
Maximum throughput (in Bits Per Second) = TCP Window Size (in Bytes) * 2^Scaling_Factor * 8 bits / RTT in seconds
Thus with the default window size of 64kB and the default scaling factor of 8 (a multiplier of 256), we have maximum throughputs of:
For 20ms latency: ( 64 * 1024 * 256 * 8 ) / ( 0.02 * 1024 * 1024 * 1024 ) = 6.25Gbps
For 40ms latency: ( 64 * 1024 * 256 * 8 ) / ( 0.04 * 1024 * 1024 * 1024 ) = 3.13Gbps
For 100ms latency: ( 64 * 1024 * 256 * 8 ) / ( 0.1 * 1024 * 1024 * 1024 ) = 1.25Gbps
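The three figures above can be reproduced with a short Python sketch. Note the division by 1024^3, matching the binary-Gbps convention used in the calculations above:

```python
def max_scaled_throughput_gbps(window_bytes: int, scale_factor: int,
                               rtt_seconds: float) -> float:
    """Scaled window (bytes) * 8 bits / RTT, expressed in binary Gbps."""
    bits = (window_bytes << scale_factor) * 8
    return bits / rtt_seconds / (1024 ** 3)

for rtt in (0.02, 0.04, 0.1):
    gbps = max_scaled_throughput_gbps(64 * 1024, 8, rtt)
    print(f"{rtt * 1000:.0f}ms: {gbps:.2f}Gbps")  # 6.25, 3.13, 1.25
```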
Such a throughput clearly sounds ludicrous, especially when most WAN sites run at 100Mbps or 1Gbps at most. But note that the scaling factor is fixed at the start of the TCP connection, while the window size itself adjusts to fit the available bandwidth. So for a user on a 100Mbps circuit where the scaling factor is set by the Windows OS to 8, the effective window will inevitably settle much lower, around the bandwidth-delay product of the path, which is roughly 250kB for 100Mbps and 20ms.
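As a quick sanity check, the window a connection actually needs to keep a link full is the bandwidth-delay product. A minimal sketch (my own helper, not an OS API):

```python
def bdp_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: the window needed to keep the link full."""
    return link_bps * rtt_seconds / 8

# 100Mbps link at 20ms RTT
print(bdp_bytes(100_000_000, 0.020) / 1000)  # 250.0 (kB)
```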
TCP Maximum Segment Size
TCP has a so-called maximum segment size (MSS): the amount of data that can be placed into a TCP packet after the TCP header. By definition this is the IP MTU minus the IP header and TCP header sizes. On most networks the IP MTU is 1500 bytes, the IP header is 20 bytes and the TCP header is 20 bytes, so the MSS will most likely be 1460 bytes.
It can be less, however. Cisco CAPWAP, for example, uses an MTU of 1300 bytes, giving an MSS of 1260 bytes, and some providers of FTTC circuits also use a lower MTU.
But what effect does a lower MTU, and the consequent lower MSS, have on TCP? Because the congestion window is counted in segments, if I have two circuits running at the same line rate but one with an MSS of 1460 bytes and the other with an MSS of only 730 bytes, the first can move roughly twice as much data in the same number of round trips.
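A rough Python sketch of both points, assuming 20-byte IP and TCP headers with no options (the helper names are my own):

```python
def mss_from_mtu(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """MSS is the MTU minus the IP and TCP headers."""
    return mtu - ip_header - tcp_header

print(mss_from_mtu(1500))  # 1460 - standard Ethernet
print(mss_from_mtu(1300))  # 1260 - e.g. Cisco CAPWAP

# With the congestion window counted in segments, the payload moved per
# round trip scales with the MSS - halving the MSS halves the data moved:
cwnd_segments = 10
print(cwnd_segments * 1460)  # 14600 bytes per round trip
print(cwnd_segments * 730)   #  7300 bytes per round trip
```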
So it's important to check the MSS on a TCP connection and, if it's less than 1440 bytes, to make sure you understand why and that the reason is valid.
Packet Loss & Duplicate Acknowledgement Packets
TCP does not start transmitting at the full line rate of the path between sender and receiver; rather it ramps up. It does this using a scalable congestion window, classically starting at one segment.
It first sends a single packet and waits for the acknowledgement; when this is received it increments the congestion window to two.
It then sends two packets, waits for the acknowledgements, and increases the congestion window to four.
Assuming no packets are lost, the congestion window keeps doubling in this manner each round trip until it reaches the receive window size.
When a packet acknowledgement is not received the TCP congestion window either restarts from its initial value or restarts from half its current value depending on the TCP implementation.
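The slow-start ramp described above can be sketched in a few lines of Python (a simplified model, ignoring ssthresh and congestion avoidance):

```python
def slow_start_rounds(receive_window_segments: int) -> list:
    """Segments sent per round trip during slow start: the congestion
    window doubles each RTT until it reaches the receive window."""
    cwnd, rounds = 1, []
    while cwnd < receive_window_segments:
        rounds.append(cwnd)
        cwnd *= 2  # each acknowledged segment grows cwnd by one -> doubles per RTT
    rounds.append(receive_window_segments)
    return rounds

print(slow_start_rounds(64))  # [1, 2, 4, 8, 16, 32, 64]
```

This is why short-lived connections over high-latency paths never get near line rate: every doubling costs a full round trip.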
Consequently, packet loss is significant to TCP and slows throughput.
Packet loss in a TCP session can be estimated by analysing the number of so-called “duplicate acknowledgement” packets sent: a duplicate acknowledgement is sent by a TCP receiver when the packet it receives is not the next one it expected based on the TCP sequence numbers.
Ideally packet loss should be well below 1% – so to get an indication of this you can compare the number of duplicate acknowledgement packets with the total packets in the TCP connection over a given period. Note that data packets will be traveling in the opposite direction to the duplicate acknowledgement packets.
Packet Loss & Selective Acknowledgement Packets
Traditionally, upon loss of a packet, TCP would resend that packet and all packets after it, not merely the lost packet. For example, if packets 1-100 were received correctly, packet 101 was lost, and packets 102-1000 were received correctly, TCP would resend every packet from 101 to 1000. This was highly inefficient.
To fix this limitation RFC 2018 introduced selective acknowledgements – these allowed the receiver to acknowledge discontiguous blocks of packets that were correctly received without implying the lost packet had been received. This allows the sender to resend the lost packets without having to resend packets correctly received.
It is important to correctly determine whether a duplicate acknowledgement is a genuine duplicate acknowledgement or a selective acknowledgement, as the packets look similar (Wireshark flags both as duplicate acknowledgements by default).
Suppose you send 1000 packets, packet number 30 is lost, all other packets are received, and packet 30 is then resent. In this scenario packets 31 through 1000 will all be acknowledged via selective acknowledgements; if these are interpreted as duplicate acknowledgements you will believe you have 97% packet loss when in fact you have only 0.1%.
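The arithmetic behind that worked example, as a quick Python check:

```python
total = 1000
lost = 1              # packet 30 is the only genuine loss
sacked = total - 30   # packets 31..1000 acknowledged via SACK blocks

apparent_loss = sacked / total * 100  # if SACKs are miscounted as dup ACKs
actual_loss = lost / total * 100

print(f"apparent: {apparent_loss:.0f}%, actual: {actual_loss:.1f}%")
# apparent: 97%, actual: 0.1%
```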
I should explain at the end of this post that Microsoft don't really provide robust, well-argued guidance on the metrics required for good Office 365 performance. They don't, for example, state anywhere that an RTT of less than 50ms is a requirement for accessing SharePoint. So much of what I have provided above is a mix of my own opinions, material gleaned from Microsoft's many guidance papers, and sitting through a couple of Microsoft-run O365 network assessments.
I have tried to be pragmatic. A 10ms RTT to SharePoint would be better than 50ms, but would a user really notice the difference, and how much would such a target cost to achieve? I personally don't believe 10ms is a financially viable target, and in all probability it is unachievable unless you are within 20-30 miles of the data centre hosting Office 365.