Paper summary: Triplet Censors: Demystifying Great Firewall's DNS Censorship Behavior (FOCI 2020)

Triplet Censors: Demystifying Great Firewall’s DNS Censorship Behavior
Anonymous, Arian Akhavan Niaki, Nguyen Phong Hoang, Phillipa Gill, Amir Houmansadr
https://censorbib.nymity.ch/#Anonymous2020a
https://www.usenix.org/conference/foci20/presentation/anonymous (video and slides)
https://gfw.report/publications/foci20_dns/en/ (code and data)

The paper is a study of DNS injection by the GFW. While there have been many similar studies, this one goes further in its methodology, finds interesting new behavior, and explains phenomena that past work could not. The most striking observations are that different groups of domains are poisoned by different subsets of the poison IP address pool; and that there are (at least) three different DNS injectors, each with its own network fingerprint, responding to distinct but overlapping subsets of domains.

The primary experiment is nine months of querying a million domains every two hours. The queries are sent from outside China to a controlled VPS inside China, taking advantage of the bidirectionality of DNS injection. The overall trend is for more domains to be blocked over time, increasing from 24000 in September 2019 to 24600 in May 2020. Examining the hour-by-hour changes reveals certain implementation details: for example, an apparent change of blocking pattern from *youtube.com to the more specific *.youtube.com resulted in the sudden unblocking of about 50 domains. Before 2019-11-23, DNS injections drew from a pool of 1510 phony IP addresses, but on that date the pool suddenly shrank to 216. Curiously, the phony IP address in an injection is not drawn uniformly from the whole pool; each domain draws from only a subset of it. Domains can be organized into groups according to which subset of phony IP addresses they use. Group 4, for example, consists of 33 Google-related domains, all of which are poisoned from a subset of only four IP addresses. The IP addresses making up the total pool are not random: most of them belong to US-based organizations like Facebook, Dropbox, and Twitter, though most do not point to a live host.
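The two observations above, the pattern change and the per-domain IP subsets, can be illustrated with a short Python sketch. It is not the authors' code. The function names, the sample domains, and the use of glob-style matching via fnmatch are assumptions made for illustration; the GFW's actual matching rules are not known to work this way. The grouping step also assumes each domain has been sampled often enough to reveal its full subset of phony addresses.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def blocked(domain, pattern):
    """Return True if `domain` matches a glob-style pattern (assumed semantics)."""
    return fnmatchcase(domain.lower(), pattern.lower())


def group_by_ip_subset(observations):
    """Group domains by the exact set of injected IP addresses seen for each.

    `observations` is an iterable of (domain, injected_ip) pairs.
    Returns a dict mapping frozenset(ips) -> sorted list of domains.
    """
    ips_per_domain = defaultdict(set)
    for domain, ip in observations:
        ips_per_domain[domain].add(ip)

    groups = defaultdict(list)
    for domain, ips in ips_per_domain.items():
        groups[frozenset(ips)].append(domain)
    return {ips: sorted(domains) for ips, domains in groups.items()}


if __name__ == "__main__":
    # Hypothetical names: the looser pattern also catches unrelated domains
    # that merely end in "youtube.com"; the stricter one does not.
    for name in ["www.youtube.com", "notyoutube.com"]:
        print(name, blocked(name, "*youtube.com"), blocked(name, "*.youtube.com"))

    # Made-up observations using documentation-range addresses (RFC 5737).
    obs = [
        ("a.example", "192.0.2.1"), ("a.example", "192.0.2.2"),
        ("b.example", "192.0.2.2"), ("b.example", "192.0.2.1"),
        ("c.example", "198.51.100.7"),
    ]
    for ips, domains in group_by_ip_subset(obs).items():
        print(sorted(ips), domains)
```

Under these assumptions, a domain that matches only the looser pattern would become unblocked when the rule is tightened, and domains that share an identical set of injected addresses fall into the same group.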

There is more than one DNS injector. The authors provide robust network fingerprints for three, using features such as the flags in the IP and DNS headers, and trends in the IP ID and IP TTL fields. The injectors handle different (but overlapping) subsets of domains and draw from different (but overlapping) IP address pools, corresponding to the domain groups mentioned earlier. Injector 1 handles the fewest domains, but the most popular. Injector 3’s domains are a subset of Injector 2’s. Injector 1 uses incrementing TTLs and Injector 2’s TTLs are random, but Injector 3 does something weird: it reflects the IP TTL of the query in its response. The query's initial TTL must therefore be at least twice the hop distance to the injector for the injected response to make it back to the sender. Taking this TTL quirk into account, all three injectors lie at the same hop distance from the probe host, and timing measurements are consistent with all three being co-located.
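The "twice the distance" arithmetic can be made concrete with a small Python sketch. It uses a deliberately simplified model in which a packet survives n hops if and only if its TTL is at least n; real routers may differ by one at the boundary. The function names and the default reply TTL of 64 are illustrative assumptions, not values from the paper.

```python
def reflected_reply_returns(initial_ttl, hops):
    """True if a TTL-reflecting injector's reply can reach the sender.

    Simplified model: a packet survives `n` hops iff its TTL is at least `n`.
    The query arrives at the injector with TTL initial_ttl - hops; the
    injector copies that value into its reply, which must cross `hops` again.
    """
    remaining = initial_ttl - hops
    return remaining >= hops


def fixed_ttl_reply_returns(initial_ttl, hops, reply_ttl=64):
    """Same check for an injector that chooses its own reply TTL (assumed 64)."""
    return initial_ttl >= hops and reply_ttl >= hops


def hops_from_min_ttl(min_working_ttl, reflects_ttl):
    """Estimate hop distance from the smallest query TTL that drew a reply."""
    return min_working_ttl // 2 if reflects_ttl else min_working_ttl


if __name__ == "__main__":
    hops = 10  # hypothetical distance to the injector
    for ttl in (10, 19, 20):
        print(ttl, fixed_ttl_reply_returns(ttl, hops), reflected_reply_returns(ttl, hops))
    print(hops_from_min_ttl(20, reflects_ttl=True), hops_from_min_ttl(10, reflects_ttl=False))
```

In this model, an injector that reflects the TTL first answers at twice the query TTL that a non-reflecting injector at the same location would need, so halving that threshold puts both at the same estimated hop distance.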

The authors then do a separate, one-time, multi-path experiment, querying a single blocked domain name against a random IP address in virtually every network prefix announced in China, 36146 addresses and 417 ASes in total. 92% of prefixes are affected by at least one injector, and 62% are affected by all three. 4% are affected by yet a fourth injector, whose fingerprint does not match that of the other three.
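A rough Python sketch of the bookkeeping behind such an experiment follows: choosing one random address per announced prefix, then tallying which injector fingerprints answered. The function names, the injector labels, and the input format are hypothetical; the sketch does not send any queries and is not the authors' tooling.

```python
import ipaddress
import random


def random_address(prefix, rng=random):
    """Pick one address uniformly at random from an IPv4 or IPv6 prefix."""
    net = ipaddress.ip_network(prefix, strict=False)
    offset = rng.randrange(net.num_addresses)
    return net[offset]


def coverage(seen_by_prefix, known=("injector1", "injector2", "injector3")):
    """Summarize which injectors answered per prefix.

    `seen_by_prefix` maps prefix -> set of injector labels observed there.
    Returns the fraction of prefixes hit by any injector, and by all `known`.
    """
    total = len(seen_by_prefix)
    if total == 0:
        raise ValueError("no prefixes measured")
    any_hit = sum(1 for seen in seen_by_prefix.values() if seen)
    all_hit = sum(1 for seen in seen_by_prefix.values() if set(known) <= set(seen))
    return {"any": any_hit / total, "all_known": all_hit / total}


if __name__ == "__main__":
    # Documentation-range prefix (RFC 5737), used here only as a placeholder.
    print(random_address("192.0.2.0/24"))

    # Made-up results for four hypothetical prefixes.
    results = {
        "p1": {"injector1", "injector2", "injector3"},
        "p2": {"injector1"},
        "p3": set(),
        "p4": {"injector4"},
    }
    print(coverage(results))
```

A prefix that sees only an unrecognized fingerprint, like the hypothetical "injector4" above, still counts toward "any" but not toward "all_known".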

Thanks to the authors for reviewing a draft of this summary.