Paper summary: Poking a Hole in the Wall: Efficient Censorship-Resistant Internet Communications by Parasitizing on WebRTC (CCS 2020)

Poking a Hole in the Wall: Efficient Censorship-Resistant Internet Communications by Parasitizing on WebRTC
Diogo Barradas, Nuno Santos, Luís Rodrigues, Vítor Nunes

This paper presents a censorship circumvention design called Protozoa. Protozoa belongs to the class of systems that use what the authors call “multimedia covert streaming”: disguising a channel to look like the transmission of an audio or video stream. Past such systems have either mimicked the surface-level features of an encrypted media stream (e.g. SkypeMorph), which gives rise to dead-parrot attacks; or they have encoded data into the audio/video signal in a way that survives media compression (e.g. CovertCast), which comes with a loss of efficiency and the challenge of matching packet size and timing features. The main innovation of Protozoa is that while it tunnels through a genuine video streaming application, it doesn’t actually exchange properly encoded video streams. Instead, it takes an input video stream (such as the webcam video) as a carrier, scoops out its encoded video bitstream, and replaces it with covert data. The recipient extracts the covert data and throws away the video stream container. This is all done without modifying the sizes or timing of video stream packets, so the traffic characteristics of Protozoa are identical to those of the carrier video. Overall encryption of the media stream prevents an observer from seeing that any traffic replacement has happened. The design, which the authors call “encoded media tunneling,” allows for both higher performance and better resistance to traffic analysis. Encoded media tunneling in some ways resembles Slitheen, which also uses an independent carrier traffic generator and opportunistically replaces part of the traffic with covert data.
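
The size-preserving replacement described above can be illustrated with a small sketch. It is not Protozoa's actual wire format: the 2-byte length prefix, the zero padding, and the function names `embed` and `extract` are assumptions made for illustration. The point it shows is that each outgoing frame keeps exactly the length the video encoder produced, while its contents are swapped for queued covert bytes.

```python
import struct

# Hypothetical framing (an assumption, not Protozoa's format): each replaced
# frame starts with a 2-byte big-endian count of the covert bytes it carries.
HEADER = struct.Struct("!H")
MAX_CHUNK = 0xFFFF  # largest count the 2-byte header can express


def embed(frame_payload: bytes, queue: bytearray) -> bytes:
    """Return a same-length frame whose contents are covert data from `queue`."""
    capacity = len(frame_payload) - HEADER.size
    if capacity < 0:
        # Frame too small to hold even the header: pass it through untouched.
        return frame_payload
    chunk = bytes(queue[:min(capacity, MAX_CHUNK)])
    del queue[:len(chunk)]  # consume what was sent
    padding = bytes(capacity - len(chunk))  # zero-fill the unused space
    return HEADER.pack(len(chunk)) + chunk + padding


def extract(frame_payload: bytes) -> bytes:
    """Recover the covert bytes carried by one frame (empty if none)."""
    if len(frame_payload) < HEADER.size:
        return b""
    (count,) = HEADER.unpack_from(frame_payload)
    return frame_payload[HEADER.size:HEADER.size + count]


# Carrier frames as the encoder might emit them; only their sizes matter here.
carrier = [bytes(10), bytes(1), bytes(6)]
pending = bytearray(b"hello world")
sent = [embed(frame, pending) for frame in carrier]
received = b"".join(extract(frame) for frame in sent)
```

In this sketch the covert throughput is bounded by whatever sizes the carrier video happens to produce, which is the trade-off the design accepts in exchange for leaving packet sizes and timing untouched.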

The authors build a prototype of the system using a version of Chromium that is modified to permit hooking the video transport layer and replacing the video bitstream. They do most of their testing with Whereby, a WebRTC video conferencing service. The client and proxy first share a meeting room identifier out of band. Both parties then enter the meeting room in the modified Chromium and start a meeting. Protozoa takes over the video stream and starts replacing content. Using an established service like Whereby has the advantage that most concerns about WebRTC fingerprinting do not apply: the WebRTC stack comes from a browser, and the browser automatically uses the service’s own signaling servers and STUN/TURN servers. The authors build a data set of synthetic traffic and evaluate detectability using a machine learning classifier. The classifier distinguishes Protozoa-tunnelled traffic from ordinary video traffic barely better than random chance, which is expected, given how it works.
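
One common way to quantify how close a classifier is to chance is the area under the ROC curve (AUC), where 0.5 means the classifier cannot separate the two classes and 1.0 means it separates them perfectly. The summary above does not say which metric the authors report, so the following is only a sketch of the general idea; the function name `auc` and the toy scores are illustrative assumptions, not data from the paper.

```python
def auc(covert_scores, legit_scores):
    """Probability that a randomly chosen covert sample receives a higher
    classifier score than a randomly chosen legitimate one (ties count half).
    This pairwise formulation is equivalent to the area under the ROC curve."""
    wins = 0.0
    for c in covert_scores:
        for l in legit_scores:
            if c > l:
                wins += 1.0
            elif c == l:
                wins += 0.5
    return wins / (len(covert_scores) * len(legit_scores))


# Toy scores: a classifier that separates the classes perfectly...
separable = auc([0.9, 0.8], [0.1, 0.2])
# ...and one whose scores are distributed identically for both classes.
indistinguishable = auc([0.2, 0.4, 0.6], [0.2, 0.4, 0.6])
```

A tunnel whose packet sizes and timing match the carrier video should push a traffic-analysis classifier toward the second case.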

What makes Whereby a suitable media streaming service is that it establishes a direct peer-to-peer WebRTC connection between the two meeting participants, and both peers know to extract data from the video stream and not treat it like actual video. Protozoa would not work with services that intercept the media stream at a middlebox and attempt to decode it, as Discord is reported to do. Reliability and retransmission of data within the (potentially unreliable) media tunnel is handled by the OS kernels at both ends, as with a VPN. The system doesn’t have any inherent defense against insider attack or proxy enumeration; as with other covert tunnels, you need to take care that the IP addresses of Protozoa proxies do not become known to censors.

Thanks to the authors for commenting on a draft of this summary.