CensorWatch: On the Implementation of Online Censorship in India
Divyank Katira, Gurshabad Grover, Kushagra Singh, Varun Bansal
https://censorbib.nymity.ch/#Katira2023a
https://cis-india.github.io/censorwatch/
CensorWatch was a research project to measure network censorship by ISPs in India using a mobile app running on the phones of volunteers throughout the country. It was the largest empirical study of censorship in India to date. The project was tailored to the Indian experience, and so was able to get better coverage of regions and sites than general-purpose measurement platforms like Censored Planet and OONI. It improved on the state of the art for India-focused studies, including Yadav et al. 2018 and Singh et al. 2020, which used single vantage points in various ISPs, and Chinmayi SK 2020, which focused on the state of Manipur and used OONI Probe for network measurements. The increased coverage enabled the researchers to measure the variation of blocking across the country, both in what is blocked and how it is blocked. They uncovered evidence of ISPs blocking sites without legal authorization, which is a violation of the law. The study ran mostly from August to December 2020.
Censorship in India is decentralized: the government or courts issue blocking orders, which ISPs must then enforce. In some cases, ISPs are obligated to keep the orders secret—only certain orders that originate in the courts are a matter of public record. This means that there is no comprehensive list of what is even legally supposed to be blocked in India: testing (from multiple ISPs) is the only way to discover the effective blocklists. The authors started with an existing known blocklist, and augmented it with a list of 4,000 sites leaked by a whistleblower in 2020. The CensorWatch test list consisted of 10,372 sites. 7,336 were online and reliable enough for comparative analysis, and of these, 6,787 had at least one instance of blocking.
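The proportions implied by these counts can be checked with a few lines of arithmetic. The snippet below uses only the three figures quoted above; the variable names are illustrative and do not come from the paper.

```python
# Counts quoted in the summary above; variable names are illustrative only.
total_sites = 10_372        # sites on the CensorWatch test list
reliable_sites = 7_336      # online and reliable enough for comparative analysis
blocked_somewhere = 6_787   # reliable sites with at least one instance of blocking

# Share of the test list that was usable for comparison.
reliable_share = reliable_sites / total_sites
# Share of the usable sites that were blocked in at least one measurement.
blocked_share = blocked_somewhere / reliable_sites

print(f"usable for analysis: {reliable_share:.1%}")
print(f"blocked at least once: {blocked_share:.1%}")
```

Roughly seven in ten listed sites were usable, and the large majority of those were blocked somewhere, which is expected given that the test list was assembled from known and leaked blocklists.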
The CensorWatch app did standard HTTP, DNS, and TLS SNI tests, comparing network responses with those obtained from an uncensored control server. HTTP filtering is the most common method of censorship, seen in 64/71 ASes tested, despite its low effectiveness on the contemporary Internet. (Cf. Master and Garman 2023 §4.2.) SNI filtering was seen in 16/64 ASes tested, and DNS-based blocking (either false responses from an ISP name server or middlebox injection) in 10/64. SNI and DNS blocking tend to be used by larger ISPs. The researchers even found instances of transit censorship, where a user in one ISP sees another ISP’s block page, because of how traffic is routed. The app did not test for IP address or TCP port blocking, though it is now known there have been at least a few orders to block IP addresses.
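The idea of comparing a vantage point's responses against a control can be sketched as follows. This is a minimal illustration and not the CensorWatch app's actual logic: the `Observation` fields, the `classify` function, and the 30% body-length tolerance are all assumptions made for the example, and real measurements need more care (CDNs, for instance, legitimately return different IP addresses to different clients).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class Observation:
    """What one measurement saw for one site (hypothetical record layout)."""
    dns_answers: FrozenSet[str]     # IP addresses returned for the site's name
    http_status: Optional[int]      # None if the request failed or was reset
    http_body_len: Optional[int]    # response body length in bytes, if any
    tls_handshake_ok: bool          # did a handshake carrying the site's SNI complete?


def classify(test: Observation, control: Observation,
             body_tolerance: float = 0.3) -> Set[str]:
    """Return the set of protocols on which the test differs from the control."""
    anomalies: Set[str] = set()

    # DNS: no overlap with the control's answers hints at a forged response.
    if control.dns_answers and not (test.dns_answers & control.dns_answers):
        anomalies.add("dns")

    # HTTP: only meaningful if the control itself got a response.
    if control.http_status is not None:
        if test.http_status is None or test.http_status != control.http_status:
            anomalies.add("http")
        elif control.http_body_len and test.http_body_len is not None:
            # A block page is usually much shorter or longer than the real page.
            diff = abs(test.http_body_len - control.http_body_len)
            if diff / control.http_body_len > body_tolerance:
                anomalies.add("http")

    # SNI: the control completed a TLS handshake but the test could not.
    if control.tls_handshake_ok and not test.tls_handshake_ok:
        anomalies.add("sni")

    return anomalies


# Example with made-up values (documentation-range IP addresses).
control = Observation(frozenset({"192.0.2.1"}), 200, 1000, True)
suspect = Observation(frozenset({"198.51.100.7"}), 200, 120, False)
print(sorted(classify(suspect, control)))
```

Keeping the comparison a pure function over recorded observations separates data collection on the phone from analysis, so thresholds can be revisited later without re-running measurements.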
Users were asked to self-report their region in the app. The researchers also tried IP geolocation using ipinfo.io, but as the geolocation disagreed with the self-reported location in 98/331 tests, they doubted its accuracy and ended up using only the self-reported locations.
There are substantial disparities in what sites are blocked in different ISPs. Appendix A shows that most ISPs block around 5,000 to 7,000 of the tested sites, though some block many fewer. Section 6 exhibits cases of ISPs blocking without legal justification: one site that was supposed to be blocked for only 16 days remained blocked long after; one was specifically ordered unblocked by a court and not all ISPs unblocked it; and a small number of ISPs apparently decided on their own to block YouTube, Telegram, or GitHub.