
The Internet is tremendously diverse. No single site or path is typical or representative.
So we make the same types of measurements to and from a large number of sites; N sites yield on the order of N-squared paths over which measurements can be made.
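As a rough illustration of where the N-squared figure comes from, the sketch below enumerates every ordered (source, destination) pair among a set of sites. The site names are hypothetical placeholders, and excluding a site paired with itself is an assumption, which gives N*(N-1) directed paths rather than exactly N*N.

```python
from itertools import permutations

def measurement_paths(sites):
    """Return every ordered (source, destination) pair of distinct sites."""
    return list(permutations(sites, 2))

# Hypothetical site names, used only for illustration.
sites = ["site-a", "site-b", "site-c", "site-d"]
paths = measurement_paths(sites)

# Each direction counts separately, since A->B and B->A can behave differently.
print(len(paths))  # 4 * 3 = 12 directed paths
```

The number of measurable paths grows roughly with the square of the number of sites, so adding a handful of sites adds many paths.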
NPD runs a variety of requests.
This study uses TCP, which is real-world traffic, operates on fine-grained time scales, and has congestion control. However, TCP sessions are complex and can self-interfere (but ACKs don't).
There are about 35 sites in North America, which generated 24*24 paths. There were two studies, with 21 sites common to both. The December 1994 study covered 730,000 packets; the November and December 1995 study covered 4.85 million packets. The data is analyzed automatically.
It is very important to understand cause and effect: RTTs, measurement drops, and unusual network behavior. The analysis program understands about 9 different TCP implementations. All but Solaris and Linux are Tahoe/Reno variants. It can also deal with checksum errors.
If the whole world were Trumpet/Winsock or Linux 1.0, Metcalfe would be right. Independent TCP implementation is really hard, but vital to Internet stability. There is a problem with retransmission in Windows 95 and NT, but it is really only a problem for the systems running them, not for the network.
According to the stats, the biggest locus of loss appears to be traffic from the US into Europe, though all regions showed an increase in loss.
It is clear from this work that if a packet is lost, then it is very likely (close to 50% of the time) that the packet preceding it along that same path was also lost.
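The burstiness claim is a statement about conditional loss. A minimal sketch of how such a figure could be computed from a per-packet loss record follows; the function name, the input format (a list of booleans in send order), and the example trace are assumptions made for illustration, not the analysis program used in the study.

```python
def loss_stats(lost):
    """lost: list of booleans in send order, True where the packet was lost.

    Returns (unconditional loss rate,
             fraction of lost packets whose predecessor was also lost).
    """
    if not lost:
        return 0.0, 0.0
    unconditional = sum(lost) / len(lost)
    # For each lost packet that has a predecessor, note whether that
    # predecessor was lost as well.
    prev_flags = [prev for prev, cur in zip(lost, lost[1:]) if cur]
    conditional = sum(prev_flags) / len(prev_flags) if prev_flags else 0.0
    return unconditional, conditional

# Made-up trace: packets 2, 3 and 7 are lost (0-based).
trace = [False, False, True, True, False, False, False, True, False, False]
unconditional, conditional = loss_stats(trace)
print(unconditional, conditional)
```

When the conditional figure is far above the unconditional loss rate, losses are arriving in bursts rather than independently.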
It appears that fixing TCP timers and deploying SACK would stop most of the redundant retransmissions. This would only affect a very small number (2%) of the cases measured in this study.
One-Way Transit Time (OTT): OTT is more useful than RTT when measuring network dynamics.
If you eliminate bad clocks, reordered packets, compressed packets, and traces with TTL shifts, you can then argue that the remaining OTT variation reflects queuing/congestion. Look for the time scale with the most variation.
The time scale where activity is most noticeable is between 0.1 and 1 seconds, but the range is still large.
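One simplified way to look for the time scale with the most variation is sketched below: bin the OTT samples into intervals of width tau, take the median OTT in each bin, and measure how much the medians change between consecutive bins. This is an illustrative assumption about the procedure, not the study's exact method; the function names, the choice of the median, and the synthetic samples are all hypothetical, and the input is assumed to be already filtered for bad clocks and reordering.

```python
from statistics import median

def variation_at_scale(samples, tau):
    """samples: list of (send_time_s, ott_s) pairs; tau: bin width in seconds.

    Returns the median absolute change in per-bin median OTT between
    consecutive non-empty bins (0.0 if fewer than two bins have data).
    """
    bins = {}
    for t, ott in samples:
        bins.setdefault(int(t // tau), []).append(ott)
    medians = [median(bins[k]) for k in sorted(bins)]
    if len(medians) < 2:
        return 0.0
    return median(abs(b - a) for a, b in zip(medians, medians[1:]))

def busiest_scale(samples, scales):
    """Return the candidate time scale showing the largest variation."""
    return max(scales, key=lambda tau: variation_at_scale(samples, tau))

# Synthetic samples: one packet every 0.1 s, with the OTT flipping between
# 50 ms and 150 ms every 0.5 s.
samples = [(i / 10, 0.05 if (i // 5) % 2 == 0 else 0.15) for i in range(40)]
print(busiest_scale(samples, [0.5, 1.0, 2.0]))
```

In this synthetic case the 0.5 s scale stands out, because wider bins each average over both OTT levels and so show no change from bin to bin.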
It appears that most of the time the available bandwidth is small.
Allen Hannan: It seems that the stats from the first study are higher than in the second study? Vern will address that as we go.
Bill Norton: What sort of instrumentation would we like to have in the Internet? Vern wants to be sure that we are providing an archive of information about network performance that will give a good and recent indication of how the network works.
Curtis Villamizar: How does the RTT relate to the times of high variation? I don't know. It's a great question.
Allan Hannan: What about RED? RED will "help a lot" by reducing the loss outages. It should decrease a burst of loss.