Traffic analysis & spotting

X. Wang, D. S. Reeves, and S. F. Wu, ‘Inter-Packet Delay Based Correlation for Tracing Encrypted Connections Through Stepping Stones’, in Proceedings of the 7th European Symposium on Research in Computer Security, London, UK, 2002, pp. 244–263, Available at http://dl.acm.org/citation.cfm?id=646649.699363.

Y. J. Pyun, Y. Park, D. S. Reeves, X. Wang, and P. Ning, ‘Interval-based flow watermarking for tracing interactive traffic’, Computer Networks, 56 (5):1646–1665, March 2012, DOI:10.1016/j.comnet.2012.01.017.

Scott E. Coull, “Traffic Analysis”, In H. van Tilborg and S. Jajodia (Eds.) Encyclopedia of Cryptography and Security (2nd Edition). Springer Publishing. 2011. pp. 1311–1313. http://www.scottcoull.com/Traffic_Analysis.pdf


Slide Notes

X. Wang and D. S. Reeves, ‘Robust correlation of encrypted attack traffic through stepping stones by manipulation of interpacket delays’, CCS '03 Proceedings of the 10th ACM conference on Computer and communications security, 2003, pp. 20-29, DOI:10.1145/948109.948115, Available at http://portal.acm.org/citation.cfm?doid=948109.948115.

Peng, P. Ning, D. S. Reeves, and X. Wang, ‘Active Timing-Based Correlation of Perturbed Traffic Flows with Chaff Packets’, in Proceedings of the Second International Workshop on Security in Distributed Computing Systems (SDCS) (ICDCSW’05) - Volume 02, Washington, DC, USA, 2005, pp. 107–113, DOI:10.1109/ICDCSW.2005.30, Available at http://dx.doi.org/10.1109/ICDCSW.2005.30.

Young June Pyun, Young Hee Park, Douglas S. Reeves, Xinyuan Wang and Peng Ning. Interval-based Flow Watermarking for Tracing Interactive Traffic. In Computer Networks Journal, 56(5):1646-1665, March 2012. Other papers are at: http://cs.gmu.edu/~xwangc/

Charles V. Wright, Fabian Monrose, and Gerald M. Masson, "Towards better protocol identification using profile HMMs", JHU Technical Report JHU-SPAR051201, http://www.cs.jhu.edu/~cwright/hmm-techreport.pdf

Charles V. Wright, Fabian Monrose, and Gerald M. Masson, "On Inferring Application Protocol Behaviors in Encrypted Network Traffic", JHU Technical Report JHU-SPAR060315, http://www.cs.jhu.edu/~cwright/hmm-techreport2.pdf

He, P. Venkitasubramaniam, and L. Tong, ‘Packet Scheduling Against Stepping-stone Attacks with Chaff’, in Proceedings of the 2006 IEEE Conference on Military Communications, Piscataway, NJ, USA, 2006, pp. 356–362, Available at http://dl.acm.org/citation.cfm?id=1896579.1896634.

He and L. Tong, ‘Detecting Information Flows: Improving Chaff Tolerance by Joint Detection’, presented at the 41st Annual Conference on Information Sciences and Systems, 2007. CISS ’07., 2007, pp. 51–56, DOI:10.1109/CISS.2007.4298272, Available at http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4298272


Transcript

[slide417] There's been lots of work done in this area on watermarking traffic. You know about spread spectrum coding? Anyone know about spread spectrum? Anyone who doesn't know about spread spectrum coding? Okay. Spread spectrum coding basically takes the idea that I can apply a code that's very long, and even if there is noise, because the code is so powerful, I can recover from that noise. So if we look at wireless LAN, for instance, one of the approaches to wireless LAN is I took the traffic I wanted to send, I took a pseudo-random noise (PRN) source, and for each bit that I wanted to send, I sent a very long code word, perhaps 128 bits or something like that. At the receiver, I had the same PRN running. I took the received data, and I XORed it against my code, and out comes, hopefully, the bit that I put in. And the reason it works is that I take one bit, and I output, let's say, 64 chips, as they're called. So one bit in generates this random-looking set of 64 bits. It turns out, and you can do the analysis, that this gives you enormous protection against errors, because a lot of these chips have to be in error to move the symbol that's being sent from a 1 to a 0.

All of this technology dates back to the time of World War II, and one of the people who invented it was an actress called Hedy Lamarr. She invented it to be able to control torpedoes. It's based upon an observation that she and a musician friend of hers made about player pianos, where you have holes cut in the paper, air goes through the holes, it makes the hammer of the piano hit the strings, and it makes a sound.

Well, it turns out that this gain that you get, the statistical gain, means that you can put labels on packets that are very, very, very strong. What does that mean? I have a source sending here, and what do I do? I modulate the distance between the packets, i.e., their delays relative to each other, to encode the label I want to put on this stream of packets.
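The spreading and despreading steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`prn_chips`, `spread`, `despread`, `flip_random`) are hypothetical, Python's `random.Random` stands in for the shared PRN source, and a majority vote over the 64 chips plays the role of the receiver's decision. It is not how any particular wireless LAN standard implements it.

```python
import random

def prn_chips(seed, n):
    """Pseudo-random chip sequence; sender and receiver share the seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(n)]

def spread(bits, seed, chips_per_bit=64):
    """XOR every data bit with its own run of chips_per_bit PRN chips."""
    chips = prn_chips(seed, len(bits) * chips_per_bit)
    return [bits[i // chips_per_bit] ^ c for i, c in enumerate(chips)]

def despread(received, seed, chips_per_bit=64):
    """XOR with the same PRN sequence, then majority-vote within each block."""
    chips = prn_chips(seed, len(received))
    bits = []
    for start in range(0, len(received), chips_per_bit):
        block = received[start:start + chips_per_bit]
        code = chips[start:start + chips_per_bit]
        ones = sum(r ^ c for r, c in zip(block, code))
        bits.append(1 if ones * 2 > chips_per_bit else 0)
    return bits

def flip_random(chips, fraction, seed):
    """Model channel noise by flipping a given fraction of the chips."""
    rng = random.Random(seed)
    out = list(chips)
    for i in rng.sample(range(len(out)), int(len(out) * fraction)):
        out[i] ^= 1
    return out
```

For example, `despread(flip_random(spread(msg, seed=42), 0.2, seed=7), seed=42)` should still return `msg` with overwhelming probability, because more than half of the 64 chips in a block would have to be corrupted before the vote for that bit changes.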
The packets flow through some network to some destination over here. The packets arrive at the destination. Here's the media call, but anywhere along here, if I sniff the traffic, I will see the label that I applied to those packets. And it turns out, even if I put those packets through mixers, the label is very likely to remain there. This is really, really cool, mathematically interesting technology. The disadvantage, however, is that it comes with a big cost, because I need to modify a lot of packets over a long period of time to be able to attach a long label. But it's extremely powerful.

And there are people who have been looking at applying it. Some time ago, there was a country in the Middle East that was having an election, and there was a lot of effort by the government to prevent communications from inside the country to people outside the country. And I had a student here who was interested in providing communications from inside to outside the country for news regarding these political activities. What he was doing was actually sending encrypted voice traffic as voice traffic, so that reporters could talk to people outside the country and spread the news about what was going on. Initially he found that within a matter of minutes, each of his connections would get terminated. Then he went to ever stronger methods of trying to hide his traffic. And then he got to the point that weeks would go by before they'd be able to figure out his new scheme, and then they would cut the traffic. In the end, of course, the police showed up and physically arrested the people who were at the sites that he was communicating with, and off they went, and who knows what happened then.

But this interest in being able to hide traffic is a very strong one, for political reasons, for espionage, for lots of other reasons. And there are many people who are interested in it. It's an interesting area.
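The packet-labelling idea from this slide, including its cost in packets, can be sketched roughly as follows. Everything here is an assumption made for illustration: the function names (`chip_signs`, `embed_label`, `add_jitter`, `detect_label`) are hypothetical, the source is assumed to send at a nominal 20 ms gap that it may lengthen or shorten by 2 ms, and the network is modelled as independent Gaussian jitter on each gap. The published watermarking schemes in the reading list are more elaborate than this.

```python
import random

def chip_signs(seed, n):
    """Secret +1/-1 spreading sequence shared by the marker and the detector."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(n)]

def embed_label(label_bits, seed, base_gap_ms=20.0, amplitude_ms=2.0,
                chips_per_bit=128):
    """Return inter-packet gaps; each label bit nudges chips_per_bit gaps."""
    signs = chip_signs(seed, len(label_bits) * chips_per_bit)
    gaps = []
    for i, c in enumerate(signs):
        bit_sign = 1 if label_bits[i // chips_per_bit] else -1
        gaps.append(base_gap_ms + amplitude_ms * bit_sign * c)
    return gaps

def add_jitter(gaps, sigma_ms, seed):
    """Assumed network model: independent Gaussian noise on every gap."""
    rng = random.Random(seed)
    return [max(0.0, g + rng.gauss(0.0, sigma_ms)) for g in gaps]

def detect_label(observed_gaps, seed, chips_per_bit=128):
    """Correlate the gap deviations with the secret sequence, block by block."""
    signs = chip_signs(seed, len(observed_gaps))
    mean_gap = sum(observed_gaps) / len(observed_gaps)
    bits = []
    for start in range(0, len(observed_gaps), chips_per_bit):
        block = observed_gaps[start:start + chips_per_bit]
        code = signs[start:start + chips_per_bit]
        corr = sum((g - mean_gap) * c for g, c in zip(block, code))
        bits.append(1 if corr > 0 else 0)
    return bits
```

With these assumed defaults, an 8-bit label occupies 8 × 128 = 1024 inter-packet gaps, which illustrates the cost mentioned in the transcript: a long label means modifying many packets over a long time. In exchange, the per-gap change (2 ms) can be smaller than the jitter on any single gap and the correlation over 128 gaps should still recover each bit.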