Decentralized Control: A Case Study of Russia
Reethika Ramesh, Ram Sundara Raman, Matthew Bernhard, Victor Ongkowijaya, Leonid Evdokimov, Anne Edmundson, Steven Sprecher, Muhammad Ikram, Roya Ensafi
This paper is an in-depth study of network censorship in Russia. The authors characterize the censorship using a combination of direct and remote measurements, testing the entries of a government blocklist to find out where and how they are blocked. The overarching observation is that the Russian government has built an effective information-controls system, despite a network structure that is less centralized than that of China or Iran. The authors fear that Russia may serve as a model for other governments that want to censor but do not have tight control over their network infrastructure. Decentralization poses challenges both for censorship measurement (a single vantage point is not enough) and for circumvention (what works for one person may not work for another).
The censorship model in Russia is one of centralized policy and distributed implementation. The federal service Roskomnadzor maintains the list of what should be blocked, but ISPs are independently responsible for enforcing the blocks, using technical means of their choosing. The authors of the paper obtained several leaked, digitally signed copies of the Roskomnadzor blocklist, consisting of IP addresses, IP subnets, domain names, and domain wildcards. They used the most recent leaked blocklist, dated April 24, 2019, as the input list for all of their active measurements, prefiltered to remove non-responsive destinations (about 25% of the total).
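To make the composition of the blocklist concrete, here is a minimal sketch of how entries of the four kinds mentioned above (IP addresses, IP subnets, domain names, and domain wildcards) could be told apart. It is not the authors' tooling: the function names `classify_entry` and `summarize`, the assumption that wildcards are written with a leading `*.`, and the sample entries are all hypothetical.

```python
import ipaddress
from collections import Counter


def classify_entry(entry: str) -> str:
    """Return 'ip', 'subnet', 'wildcard', or 'domain' for one blocklist entry.

    Assumption: wildcards use a leading '*.' (the real list format may differ).
    """
    entry = entry.strip()
    if entry.startswith("*."):
        return "wildcard"
    try:
        # Succeeds only for a bare IPv4 or IPv6 address.
        ipaddress.ip_address(entry)
        return "ip"
    except ValueError:
        pass
    if "/" in entry:
        try:
            # strict=False tolerates host bits set in the prefix.
            ipaddress.ip_network(entry, strict=False)
            return "subnet"
        except ValueError:
            pass
    # Anything else is treated as a plain domain name.
    return "domain"


def summarize(entries):
    """Count how many entries fall into each category."""
    return Counter(classify_entry(e) for e in entries)


if __name__ == "__main__":
    # Hypothetical sample using documentation-reserved addresses and names.
    sample = ["192.0.2.1", "198.51.100.0/24", "*.example.com", "example.org"]
    print(summarize(sample))
```

A breakdown like this matters for measurement design, since IP and subnet entries can only be tested at the network layer, while domain entries can also be tested through DNS and HTTP.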
The authors worked with local activists to get access to direct measurement vantage points: 6 data center VPSes and 14 residential Internet connections. They complemented these with remote measurement vantage points: 718 for Quack (which uses echo servers) and 357 for Satellite (which uses open DNS resolvers). From these measurement locations they scanned the entries on the blocklist, finding diversity not only in how much is blocked but also in how the blocking is done. Most locations blocked over 90% of the blocklist, but some residential locations blocked only about half, and one data center location blocked almost none (Figure 3). Blocking is implemented in many different ways: IP/port blocking, TCP reset, DNS filtering; sometimes with a blockpage and sometimes without. Residential networks are more likely to show a blockpage, and are overall more highly censored than data center networks.
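The variety of blocking methods can be illustrated with a small decision sketch. It is only a simplified illustration under stated assumptions, not the classification procedure used in the paper: the `Observation` record, its field names, and the order in which signals are checked are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical summary of one probe from one vantage point."""
    dns_tampered: bool      # DNS answer differs from a trusted control answer
    tcp_connected: bool     # TCP handshake to the destination completed
    reset_received: bool    # connection was torn down by a TCP RST
    blockpage_seen: bool    # response body matched a known blockpage


def classify(obs: Observation):
    """Return (method, blockpage) for one observation.

    Signals are checked in the order a connection proceeds:
    name resolution first, then connection setup, then data transfer.
    """
    if obs.dns_tampered:
        method = "DNS filtering"
    elif not obs.tcp_connected:
        method = "IP/port blocking"
    elif obs.reset_received:
        method = "TCP reset"
    else:
        method = "none observed"
    return method, obs.blockpage_seen


if __name__ == "__main__":
    probe = Observation(dns_tampered=False, tcp_connected=True,
                        reset_received=True, blockpage_seen=False)
    print(classify(probe))
```

Because each ISP chooses its own mechanism, running a classifier like this on the same blocklist entry from different vantage points can yield different labels, which is the heterogeneity the summary describes.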
The authors also characterize the entries of the blocklist. In the style of McDonald et al., they find that about half of blocked domains are hosted on a CDN, and almost all of those are on Cloudflare (Table III). Using the techniques of Weinberg et al., they find that the two most common languages of blocked sites are Russian and English, and the two most common topic categories are gambling and pornography (Table IV). The well-known zapret-info repository has maintained a detailed timeline of changes to the government blocklist since 2012. The authors of the paper found it to be in almost complete agreement with their own leaked samples. Analyzing trends over time shows that the growth in the number of blocklist entries has accelerated in the last year, with fewer duplicate entries.
Thanks to Reethika Ramesh for comments on a draft of this summary.