DNS response CDF

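# Plot the empirical CDF of the DNS response times; pch selects the symbol, cex scales its size
plot(ecdf(data2$V1), xlab="time (seconds)", ylab="Cumulative probability", pch=20, cex=0.25)
# Add background grid lines to make values easier to read off
grid()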

 

ECDF plot of the data

 


Transcript

Now, we might also want to compute the cumulative distribution function. So now we can call the empirical cumulative distribution function on our data. We can specify our X label, our Y label, and which symbol we want to use, and off we go. We also specified the size of the symbol. And we called grid to put in these background grid lines.

So now, using this plot, it is very easy to see that, in fact, 80% of the time our delays were extremely small. But then there is a little plateau here, then it climbs a bit more, then we have another plateau, and then it slowly climbs up here until we're up around 95% or so at 0.2 seconds, or 200 milliseconds. So that means around 90% of the time we're below a tenth of a second, and about 95% of the time we're below two-tenths of a second.

The great thing we saw is that most of the queries were answered very fast, but we also have these ones out here that were taking quite a long time to be found. And, of course, if you're interested in DNS, you can think about: "How can we make this perform better?" So we can introduce DNS proxies, we can look at our caching times, etc. But as we saw, it was very useful to be able to make the plot show the data that we want to show.
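
The percentages read off the plot in the transcript can also be computed directly, because ecdf() returns a function that can be evaluated at any time value. The following is a minimal sketch of that idea. The vector delays is a made-up stand-in for data2$V1 (the lecture's data is not reproduced here), so the numbers it prints are illustrative only.

```r
# Hypothetical stand-in for data2$V1: DNS response times in seconds.
# These values are invented for illustration; they are not the lecture's data.
delays <- c(0.002, 0.003, 0.003, 0.004, 0.005,
            0.006, 0.008, 0.010, 0.050, 0.250)

# ecdf() returns a step function: Fn(t) is the fraction of samples <= t.
Fn <- ecdf(delays)

below_100ms <- Fn(0.1)   # fraction of queries answered within 0.1 s
below_200ms <- Fn(0.2)   # fraction of queries answered within 0.2 s

# quantile() goes the other way: the delay below which 95% of samples fall
# (using R's default interpolation between neighbouring samples).
p95 <- quantile(delays, probs = 0.95, names = FALSE)

print(c(below_100ms = below_100ms, below_200ms = below_200ms, p95 = p95))
```

With the real measurements, replacing delays with data2$V1 should give the same kind of figures that were estimated by eye from the plot.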