Bleeding Wall: A Hematologic Examination on the Great Firewall (FOCI 2024)

Sakamoto, Elson Wedwards
https://censorbib.nymity.ch/#Sakamoto2024a

This paper discovers and investigates an out-of-bounds memory read vulnerability in the DNS injection system of the Great Firewall. When you send a DNS query for a censored hostname through the network border of China, the GFW injects a DNS response with a fake IP address. By crafting a particular kind of DNS query, it was possible to cause the DNS injector to include a small amount of its own memory in the injected response. The contents of leaked memory included network protocols (other traffic that had passed by the injector), and in a small number of cases, x86_64 Linux stack frames. The GFW has several different kinds of DNS injector; this vulnerability affected only one of them. It was fixed in October–November 2023.

The vulnerability is easy to understand. It is reminiscent of gfw-looking-glass.sh from way back in 2010. DNS names are represented as a sequence of length-prefixed labels. In a DNS message, the name example.com looks like \x07example\x03com\x00, where \x07 tells you the length of the example label, \x03 tells you the length of the com label, and \x00 marks the end of the name. The GFW’s DNS parser (here I’m only talking about the one injector that was vulnerable) did not check that label length prefixes stayed inside the bounds of the packet. Also, DNS labels are supposed to have a maximum length of 63 bytes, but the parser didn’t enforce that; instead, it interpreted any value up to \xff as a label length. (For those who know DNS, this means the parser did not support name compression.) If you replaced a label length prefix in a normal DNS query with \xff, the parser would continue reading past the end of the packet, into its own memory, and include that memory in its injected DNS response.
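To make the label encoding and the missing checks concrete, here is a minimal Python sketch. It is not the GFW's code, which is not public. The function names are made up for illustration, and the out-of-bounds read is only simulated by placing unrelated bytes after the packet in one buffer.

```python
def encode_name(hostname: str) -> bytes:
    """Encode a hostname as length-prefixed DNS labels plus a zero terminator."""
    out = bytearray()
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes")
        out.append(len(raw))
        out += raw
    out.append(0)
    return bytes(out)


def naive_name_end(memory: bytes, start: int) -> int:
    """Hypothetical unchecked walk: trusts every length byte, ignores packet bounds."""
    pos = start
    while memory[pos] != 0:
        pos += 1 + memory[pos]
    return pos + 1


def checked_name_end(memory: bytes, start: int, packet_end: int) -> int:
    """The same walk with the two missing checks added."""
    pos = start
    while True:
        if pos >= packet_end:
            raise ValueError("name runs past the end of the packet")
        length = memory[pos]
        if length == 0:
            return pos + 1
        if length > 63:
            raise ValueError("label length exceeds 63")
        pos += 1 + length


# Simulated memory: a malformed name followed by unrelated neighboring bytes.
packet = encode_name("www.google.com")[:-1] + b"\xff"
memory = packet + b"S" * 255 + b"\x00"

print(naive_name_end(memory, 0), len(packet))  # the naive end lies beyond the packet
```

In this sketch the unchecked walk treats \xff as a 255-byte label and ends well past the 16-byte packet, while the checked walk rejects the same input with a ValueError.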

The authors describe two payload formats that worked to recover memory from the GFW. Format 1 looks like \x03www\x06google\x03com\xff (replacing the final label length, which should normally be \x00, with \xff). Format 2 looks like \xffwww.google.com: it replaces the first label length with \xff and converts all other label length bytes to literal dot characters. But just because you asked for \xff (255) bytes doesn’t mean you would actually get that many. Some other mechanism limited the total size of injected responses to 158 bytes. Using a very short query hostname (e.g. 4.tt), the authors were able to leak a maximum of 124 bytes per query. Over three days in October 2023, they sent several billion queries and recovered over 1 TB of data.
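The two name encodings, and where the 124-byte figure plausibly comes from, can be sketched in a few lines of Python. The function names are hypothetical. The size accounting is my own assumption rather than something the paper states in this form: a 12-byte DNS header, the name field echoed back, the leaked bytes, and a 16-byte answer record like the one in the sample response. I only apply it to Format 1.

```python
def format1(hostname: str) -> bytes:
    # Normal length-prefixed labels, but the terminator is 0xff instead of 0x00.
    out = bytearray()
    for label in hostname.split("."):
        out.append(len(label))
        out += label.encode("ascii")
    out.append(0xFF)
    return bytes(out)


def format2(hostname: str) -> bytes:
    # First length byte is 0xff; the other separators stay as literal dots.
    return b"\xff" + hostname.encode("ascii")


RESPONSE_CAP = 158  # total injected response size, from the text
HEADER_LEN = 12     # fixed DNS header size
ANSWER_LEN = 16     # assumption: one A record with a 2-byte compressed name


def leak_budget(name_field: bytes) -> int:
    """Bytes left over for leaked memory under the assumed layout (Format 1)."""
    return RESPONSE_CAP - HEADER_LEN - len(name_field) - ANSWER_LEN


print(format1("www.google.com"))
print(format2("www.google.com"))
print(leak_budget(format1("4.tt")))            # 124
print(leak_budget(format1("www.google.com")))  # 114
```

Under these assumptions, 4.tt leaves 158 - 12 - 6 - 16 = 124 bytes, which matches the reported maximum, and www.google.com leaves 114 bytes.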

Below is a sample of what a DNS response with leaked memory looks like. It’s Figure 4(a) from the paper. This one contains part of an HTTP request (some bytes have been redacted with XX). The bytes starting with c0 0c at the end are the GFW’s answer section with a fake IP address (157.240.20.8).

00 00 81 80 00 01 00 01 00 00 00 00 03 77 77 77  .............www
06 67 6f 6f 67 6c 65 03 63 6f 6d ff 2f 66 61 76  .google.com./fav
69 63 6f 6e 2e 69 63 6f 20 48 54 54 50 2f 31 2e  icon.ico HTTP/1.
31 0d 0a 55 73 65 72 2d 41 67 65 6e 74 3a 20 XX  1..User-Agent: X
XX XX XX XX XX XX XX XX XX XX XX XX XX XX 0d 0a  XXXXXXXXXXXXXX..
48 6f 73 74 3a 20 XX XX XX XX XX XX XX XX XX XX  Host: XXXXXXXXXX
XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX  XXXXXXXXXXXXXXXX
XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX  XXXXXXXXXXXXXXXX
XX XX XX XX XX XX XX XX XX XX XX XX XX XX c0 0c  XXXXXXXXXXXXXX..
00 01 00 01 00 00 00 4d 00 04 9d f0 14 08        .......M......
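
To check the reading of the last 16 bytes, here is a small Python sketch that decodes them as a standard DNS resource record. The field layout is the usual one for an A record; the variable names are mine.

```python
import ipaddress
import struct

# Last 16 bytes of the sample response, copied from the dump above.
answer = bytes.fromhex("c00c000100010000004d00049df01408")

# NAME (2 bytes), TYPE, CLASS, TTL, RDLENGTH in network byte order.
name, rtype, rclass, ttl, rdlength = struct.unpack("!HHHIH", answer[:12])

# Top two bits set means NAME is a compression pointer; the rest is an offset.
is_pointer = (name & 0xC000) == 0xC000
name_offset = name & 0x3FFF

address = ipaddress.IPv4Address(answer[12:12 + rdlength])

print(is_pointer, name_offset, rtype, rclass, ttl, address)
# True 12 1 1 77 157.240.20.8
```

The pointer offset of 12 refers to the first byte after the DNS header, which is where the query name starts. The record is type A, class IN, with a TTL of 77 seconds and the fake address 157.240.20.8.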

The memory contains recognizable network protocols, such as HTTP and SMTP. At least some of it is traffic that passes through the GFW, which the researchers demonstrated by sending their own, specially tagged traffic through the firewall, and then recovering it with DNS queries. The GFW leaking traffic to third parties, besides the obvious privacy problems, could enable off-path attacks such as cookie theft. The fact that responses can be much larger than queries makes the GFW a more effective DoS amplifier. In a very small number of leaks (fewer than 1 in 100,000), the researchers found byte patterns consistent with Linux x86_64 stack frames.

The researchers were still running measurements when the vulnerability started to be patched. They had observation points in multiple countries, whose paths into China went through major Internet exchange points in Beijing, Guangzhou, and Shanghai. In early September 2023, all paths had the vulnerability. When they started formal measurements in late October, paths through Beijing had already been patched: only Guangzhou and Shanghai still had memory leaks. Then, as they watched, Guangzhou was patched on 2023-10-30, and Shanghai was patched on 2023-10-31 and 2023-11-01. The DNS injection system is no longer susceptible to the same kind of malformed DNS query.