Exploring Simple Detection Techniques for DNS-over-HTTPS Tunnels
Carmen Kwan, Paul Janiszewski, Shela Qiu, Cathy Wang, Cecylia Bocovich
https://dl.acm.org/doi/10.1145/3473604.3474563
The paper explores ways to distinguish the DNS over HTTPS (DoH) traffic of a DNS tunnel (namely dnstt) from ordinary browser-generated DoH traffic. Even though DoH is encrypted, censors may be able to infer the use of a tunnel by looking at side-channel features like traffic timing and volume. The authors build data sets of both circumvention and non-circumvention DoH traffic, using Selenium to drive Firefox to Alexa global top sites. The non-circumventor data set captures the DoH produced by Firefox while visiting sites. The circumventor data set captures all the traffic of a Firefox instance configured to use dnstt as a proxy (so it contains not only the browser’s DNS queries and responses, but also the tunneled contents of the sites). Analysis of these two data sets turns up three traffic features (average payload length, packet rate in packets/s, and throughput in bytes/s) and thresholds that distinguish dnstt from browser DoH with nearly 100% precision and 70–80% recall. To give an example of a feature threshold, over a short time window, only 1% of browser DoH has an average packet length of more than 70 bytes, but 56% of dnstt DoH does. The tests require observation of a few hundred to a few thousand packets before declaring a detection result.
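As a rough illustration of this kind of threshold test, the following Python sketch computes the three features over one window of observed packets and flags the window if any feature exceeds its threshold. It is not the authors' code. The function and field names are hypothetical, the 70-byte default is taken from the example above, and the packet rate and throughput thresholds are left unset because this summary does not state them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# One observed packet: (timestamp in seconds, payload length in bytes).
Packet = Tuple[float, int]


@dataclass
class WindowFeatures:
    avg_payload_len: float  # bytes per packet
    packet_rate: float      # packets per second
    throughput: float       # bytes per second


def window_features(packets: List[Packet]) -> WindowFeatures:
    """Compute the three features over one window of packets."""
    if len(packets) < 2:
        raise ValueError("need at least two packets to measure a rate")
    times = [t for t, _ in packets]
    total_bytes = sum(length for _, length in packets)
    duration = max(times) - min(times)
    if duration <= 0:
        raise ValueError("packets must span a positive time interval")
    return WindowFeatures(
        avg_payload_len=total_bytes / len(packets),
        packet_rate=len(packets) / duration,
        throughput=total_bytes / duration,
    )


def looks_like_tunnel(
    features: WindowFeatures,
    max_avg_payload_len: Optional[float] = 70.0,  # from the example above
    max_packet_rate: Optional[float] = None,      # assumption: not given here
    max_throughput: Optional[float] = None,       # assumption: not given here
) -> bool:
    """Flag the window if any feature with a configured threshold exceeds it."""
    checks = [
        (features.avg_payload_len, max_avg_payload_len),
        (features.packet_rate, max_packet_rate),
        (features.throughput, max_throughput),
    ]
    return any(limit is not None and value > limit for value, limit in checks)
```

A caller would collect a window of packets from one DoH connection, pass it to window_features, and then call looks_like_tunnel on the result. In practice the window would need to cover the few hundred to few thousand packets mentioned above.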
Having observed that dnstt is distinguishable by its use of large packets and high data rates, the authors modify dnstt to diminish these signals, imposing a rate limit of 500 packets/s in both directions and a downstream data capacity of 100 bytes per packet. (Packets on the wire will actually be bigger than 100 bytes because of DNS encoding, HTTP, and TLS overhead.) The modified dnstt successfully avoids detection based on the average payload length feature, but remains detectable by the packet rate and throughput features. The authors test the user experience of browsing through the rate-limited tunnel on 100 sites selected from the Umbrella 1 million: throughput decreases by a factor of 27 and page load time increases by a factor of 23. While the low speed of the more detection-resistant tunnel may be uncomfortable for browsing, it could still be useful for low-rate applications such as bootstrapping a circumvention system.
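To make the two limits concrete, here is a minimal Python sketch of capping payload size and pacing packets. It is only an illustration and not the authors' modification (dnstt itself is written in Go). The function names are hypothetical, and the constants are the 500 packets/s and 100 bytes quoted above. The sketch computes send times rather than actually sleeping, so it stays deterministic.

```python
from typing import List, Tuple

MAX_PACKETS_PER_SECOND = 500  # rate limit quoted in the summary
MAX_PAYLOAD_BYTES = 100       # downstream data capacity per packet


def chunk_payload(data: bytes, max_len: int = MAX_PAYLOAD_BYTES) -> List[bytes]:
    """Split data into chunks of at most max_len bytes."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [data[i:i + max_len] for i in range(0, len(data), max_len)]


def schedule(
    data: bytes,
    start: float = 0.0,
    rate: int = MAX_PACKETS_PER_SECOND,
    max_len: int = MAX_PAYLOAD_BYTES,
) -> List[Tuple[float, bytes]]:
    """Assign each chunk an earliest send time so that at most `rate`
    packets are sent per second (evenly spaced pacing)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    interval = 1.0 / rate
    chunks = chunk_payload(data, max_len)
    return [(start + i * interval, chunk) for i, chunk in enumerate(chunks)]


def max_payload_throughput(
    rate: int = MAX_PACKETS_PER_SECOND,
    max_len: int = MAX_PAYLOAD_BYTES,
) -> int:
    """Upper bound on tunneled payload bytes per second under both limits."""
    return rate * max_len
```

Under these two limits, the downstream payload can never exceed 500 × 100 = 50,000 bytes/s before encoding overhead, which bounds how fast any page can load through the tunnel.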
Although it is not the main focus of the paper, the authors find that dnstt does not disguise its TLS fingerprint, which is fairly uncommon and distinctive of programs written in Go. They made a fork of dnstt that uses uTLS for TLS camouflage.
Thanks to the authors for reviewing a draft of this summary.