Paper summary: Citation Filtered: Iran's Censorship of Wikipedia (2013)

Citation Filtered: Iran’s Censorship of Wikipedia
Nima Nazeri, Collin Anderson
https://repository.upenn.edu/handle/20.500.14332/37498
https://web.archive.org/web/20181226101810/http://citationfiltered.org/
https://github.com/collina/citationreference

This report describes an experiment, conducted in 2013, to find out which pages of the Persian-language edition of Wikipedia were blocked in Iran at the time, and analyzes the results. The researchers obtained a complete listing of wiki page titles from https://dumps.wikimedia.org/ and simulated accessing them over HTTP from proxy servers in Iran. This was before Wikimedia sites went all-HTTPS in 2015, when censors could still selectively block individual web pages. The HTTP requests were sent to a server the researchers controlled, not to the actual Wikipedia servers. A URL was counted as censored when its request failed to arrive at the server, or when the client received the 403 block page characteristic of Iran.
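
The detection rule in the last sentence can be sketched as a small classifier. This is only an illustration of the two stated conditions, not the researchers' code: the ProbeResult type, its field names, and the assumption that the measurement server itself never answers 403 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """One simulated page access (hypothetical record layout)."""
    url: str
    client_status: Optional[int]  # HTTP status the client saw; None if no response
    seen_by_server: bool          # did the request show up in the server's log?


def classify(probe: ProbeResult) -> str:
    # Condition 1: the client received a 403, taken here to be the block page.
    # Assumption: the measurement server never returns 403 on its own.
    if probe.client_status == 403:
        return "blocked (block page)"
    # Condition 2: the request never arrived at the measurement server.
    if not probe.seen_by_server:
        return "blocked (request dropped)"
    return "not blocked"


# Hypothetical probes, for illustration only.
probes = [
    ProbeResult("http://example.test/wiki/A", 200, True),
    ProbeResult("http://example.test/wiki/B", 403, False),
    ProbeResult("http://example.test/wiki/C", None, False),
]
labels = [classify(p) for p in probes]
```

A real measurement would also have to separate censorship from ordinary network failures, for instance by retrying; the sketch ignores that.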

Out of 800,000 Wikipedia URLs tested, 1,187 were found to be blocked (963 unique pages, ignoring redirects). The researchers identify two causes of blocking: a generic keyword filter that applies to any site, not just Wikipedia; and Wikipedia-specific article blocks. 202 pages (92 unique) are blocked because of the keyword filter; 985 (871 unique) are specific page blocks. The report breaks the blocked pages into 10 categories and analyzes the most common categories.
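
Counting "unique pages, ignoring redirects" amounts to mapping each blocked title to the article it redirects to and counting the distinct targets. A minimal sketch, assuming a redirect table is available as a dictionary; the function names and the toy titles are hypothetical, and the report may have done this differently.

```python
def resolve(title, redirects):
    """Follow redirects until a non-redirect title is reached (cycle-safe)."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


def unique_pages(blocked_titles, redirects):
    """Set of distinct redirect targets among the blocked titles."""
    return {resolve(t, redirects) for t in blocked_titles}


# Toy data: two redirects that both end at article "A".
redirects = {"A_alias": "A", "A_old": "A_alias"}
blocked = ["A", "A_alias", "A_old", "B"]
unique = unique_pages(blocked, redirects)  # 4 blocked URLs, 2 unique pages
```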

N category
403 civil and political subjects
189 sex and sexuality
136 religious matters
93 human rights
59 arts and culture
51 media and journalists
19 academia
7 non-sexual profane topics
3 drugs and alcohol
3 other issues
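
As a quick consistency check, the category counts above can be summed and turned into shares. The numbers are copied from the table; the variable names are just for illustration.

```python
# Category counts as listed in the table above.
counts = {
    "civil and political subjects": 403,
    "sex and sexuality": 189,
    "religious matters": 136,
    "human rights": 93,
    "arts and culture": 59,
    "media and journalists": 51,
    "academia": 19,
    "non-sexual profane topics": 7,
    "drugs and alcohol": 3,
    "other issues": 3,
}

total = sum(counts.values())  # 963, matching the number of unique blocked pages

# Print each category with its share of the total, largest first.
for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{n:4d}  {n / total:6.1%}  {name}")
```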

511 (53%) of the 963 unique blocked pages are biographies. As you might expect, pages about opposition politics are often blocked, but so are some pages about the prevailing politics, presumably because they contain criticism. There is evidence of overblocking: for example, the article on Bikini Atoll is blocked because its title contains the keyword بیکینی (bikini).
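
The Bikini Atoll case is what a plain substring match on the URL would produce. The sketch below assumes the filter behaves as if it compared keywords against the percent-decoded URL, which is a guess about the mechanism; the page titles and host are placeholders, and only the keyword itself comes from the report.

```python
from urllib.parse import quote, unquote

# The one keyword named above; a real filter list would be longer.
KEYWORDS = ["بیکینی"]


def matches_keyword(url, keywords=KEYWORDS):
    """True if any keyword occurs anywhere in the percent-decoded URL."""
    decoded = unquote(url)
    return any(k in decoded for k in keywords)


# Placeholder host and titles, not real article names.
base = "http://example.test/wiki/"
atoll_url = base + quote("Example_atoll_بیکینی")
other_url = base + quote("Example_atoll")

blocked_atoll = matches_keyword(atoll_url)  # keyword appears in the title
blocked_other = matches_keyword(other_url)  # no keyword in the title
```

Because the match ignores context, any title containing the keyword is caught, whatever the article is actually about.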

The report concludes that the blocking of Wikipedia articles in Iran is not justified under Article 19 of the International Covenant on Civil and Political Rights, to which Iran is a party.