Detecting Probe-resistant Proxies
Sergey Frolov, Jack Wampler, Eric Wustrow
https://censorbib.nymity.ch/#Frolov2020a
This research finds weaknesses in probe-resistant proxy protocols, like obfs4 and Shadowsocks, that make them more prone to detection by active probing than previously thought. These are protocols that require the client to prove knowledge of a secret before using the proxy. Although probe-resistant proxy servers are designed not to respond to unauthorized clients, they may have characteristic timeouts or disconnection behaviors that distinguish them from non-proxies.
The evaluated protocols have in common that the server reads some number of bytes from the client, then checks the authentication on those bytes. A typical code pattern is the following:
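    client, _ := listener.Accept()
    // SetDeadline takes an absolute time, so add the timeout to the current time.
    client.SetDeadline(time.Now().Add(30 * time.Second))
    buf := make([]byte, 50)
    // io.ReadFull returns once all 50 bytes have arrived, or with an error
    // if the deadline passes or the client disconnects first.
    _, err := io.ReadFull(client, buf)
    if err != nil {
        client.Close()
        return
    }
    if !checkAuthentication(buf) {
        client.Close()
        return
    }
    // client is authorized, server may now respond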
The server reads exactly 50 bytes from the client, then checks the client’s credentials. If the client doesn’t send 50 bytes before the timeout, the server closes the connection. If the credentials are bad, the server closes the connection. Consider what happens when an unauthorized client sends 49, 50, or 51 bytes.
- With 49 bytes, the server times out after 30 seconds and closes the connection with a FIN.
- With 50 bytes, the server closes the connection immediately with a FIN (io.ReadFull succeeds but checkAuthentication fails).
- With 51 bytes, the server closes the connection immediately, but with a RST, not a FIN.
Why a RST in the 51-byte case? It’s a peculiarity of Linux: if a user-space process closes a socket without draining the kernel socket buffer, the kernel sends a RST instead of a FIN. Put together, these distinctive timeout and FIN/RST thresholds form a fingerprint of the probe-resistant protocol, despite the server never sending application data.
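The three cases can be reproduced on a loopback socket. The following is a minimal sketch, not the authors' probing tool: `toyServer` and `probe` are hypothetical names, the toy server rejects every client, and the 30-second timeout is shortened to 500 ms so the demo finishes quickly. Whether the 51-byte probe sees a RST depends on the server's TCP stack; Linux behaves as described above.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"os"
	"syscall"
	"time"
)

// toyServer (hypothetical) reproduces the read-50-bytes-then-check pattern
// on a random loopback port and returns the address it listens on.
func toyServer(timeout time.Duration) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return
			}
			go func() {
				defer c.Close() // every path ends in Close, as in the pattern above
				c.SetDeadline(time.Now().Add(timeout))
				buf := make([]byte, 50)
				if _, err := io.ReadFull(c, buf); err != nil {
					return // fewer than 50 bytes before the deadline
				}
				// A real server would check credentials here; this one rejects everyone.
			}()
		}
	}()
	return ln.Addr().String(), nil
}

// probe (hypothetical) sends n zero bytes and reports how the server ended
// the connection ("FIN", "RST", "DATA", or "TIMEOUT") and after how long.
func probe(addr string, n int, wait time.Duration) (string, time.Duration, error) {
	conn, err := net.DialTimeout("tcp", addr, wait)
	if err != nil {
		return "", 0, err
	}
	defer conn.Close()
	start := time.Now()
	if _, err := conn.Write(make([]byte, n)); err != nil {
		return "", 0, err
	}
	conn.SetReadDeadline(start.Add(wait))
	m, err := conn.Read(make([]byte, 1))
	elapsed := time.Since(start)
	switch {
	case m > 0:
		return "DATA", elapsed, nil // server sent application data
	case errors.Is(err, io.EOF):
		return "FIN", elapsed, nil // orderly close
	case errors.Is(err, syscall.ECONNRESET):
		return "RST", elapsed, nil // reset by peer
	case errors.Is(err, os.ErrDeadlineExceeded):
		return "TIMEOUT", elapsed, nil // we gave up before the server reacted
	}
	return "", elapsed, err
}

func main() {
	addr, err := toyServer(500 * time.Millisecond)
	if err != nil {
		panic(err)
	}
	for _, n := range []int{49, 50, 51} {
		how, after, err := probe(addr, n, 2*time.Second)
		fmt.Println(n, "bytes:", how, "after", after.Round(10*time.Millisecond), err)
	}
}
```

A real prober would send random rather than zero bytes and repeat each probe several times to average out network jitter.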
The authors evaluate six protocols, counting the two Shadowsocks implementations separately: obfs4; Lampshade (used in Lantern); Shadowsocks (the Python implementation and the Outline implementation, both with AEAD ciphers); MTProto (used in Telegram); and obfuscated SSH (used in Psiphon). They test a pool of known proxies, as well as a large number of endpoints derived from a random ZMap scan (1.5 million) and from a passive ISP tap (0.4 million). They send these endpoints a selection of probes of different lengths, and from the responses they derive simple decision trees for identifying probe-resistant proxy servers. The root of each tree is always “discard endpoints that send application data in response to any probe.”
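To make the idea of such a decision tree concrete, here is a small sketch for the toy 50-byte server from the code sample, not for any of the real protocols. The `Reaction` and `Signature` types, the `matches` function, and the one-second tolerance are all assumptions made for illustration; the paper's actual trees use protocol-specific thresholds that are not reproduced here.

```go
package main

import (
	"fmt"
	"time"
)

// Reaction is what a prober observed for a single probe.
type Reaction struct {
	Kind  string        // "DATA", "FIN", "RST", or "TIMEOUT" (prober gave up)
	After time.Duration // time until the server reacted
}

// Signature describes the toy server's expected behavior (hypothetical).
type Signature struct {
	Threshold int           // bytes read before the authentication check
	Timeout   time.Duration // server-side read timeout
	Slack     time.Duration // tolerance for measurement noise
}

// near reports whether got is within slack of want.
func near(got, want, slack time.Duration) bool {
	diff := got - want
	if diff < 0 {
		diff = -diff
	}
	return diff <= slack
}

// matches walks a tiny decision tree over observations keyed by probe length.
func matches(sig Signature, obs map[int]Reaction) bool {
	// Root: discard endpoints that send application data in response to any probe.
	for _, r := range obs {
		if r.Kind == "DATA" {
			return false
		}
	}
	below, okBelow := obs[sig.Threshold-1]
	at, okAt := obs[sig.Threshold]
	above, okAbove := obs[sig.Threshold+1]
	if !okBelow || !okAt || !okAbove {
		return false // not enough probes to decide
	}
	// One byte short: FIN, but only once the server's timeout expires.
	if below.Kind != "FIN" || !near(below.After, sig.Timeout, sig.Slack) {
		return false
	}
	// Exactly at the threshold: immediate FIN.
	if at.Kind != "FIN" || !near(at.After, 0, sig.Slack) {
		return false
	}
	// One byte over: immediate RST.
	return above.Kind == "RST" && near(above.After, 0, sig.Slack)
}

func main() {
	sig := Signature{Threshold: 50, Timeout: 30 * time.Second, Slack: time.Second}
	obs := map[int]Reaction{
		49: {Kind: "FIN", After: 30 * time.Second},
		50: {Kind: "FIN", After: 20 * time.Millisecond},
		51: {Kind: "RST", After: 20 * time.Millisecond},
	}
	fmt.Println("looks like the toy proxy:", matches(sig, obs))
}
```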
The decision trees classify a few endpoints from the ZMap and ISP tap sets as proxies. In the case of obfuscated SSH, the authors confirmed with Psiphon developers that 7 of the 8 identified proxies actually were proxies operated by the developers. In some other cases, there is corroborating evidence that the endpoints really are proxies, even if not direct confirmation. By far the most difficult protocol to identify is MTProto, because it never times out and never closes the connection. The authors recommend this strategy for the best probe resistance: when a client fails to authenticate, just keep reading from it forever.
For the most part, the developers of the examined protocols have fixed the identified flaws, at least by continuing to read from the client and not closing the connection immediately when there’s an authentication failure. They may still have a timeout instead of reading forever, but the server will have identical reactions to the three cases examined above.
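The fix can be sketched as follows, assuming the same toy 50-byte handshake as in the code sample. This is an illustration rather than any project's actual patch: `handle`, `closeDelay`, and the always-failing `checkAuthentication` are hypothetical. On any failure the server keeps reading and discarding until the deadline or until the client hangs up; a timeout of 0 means no deadline at all, which corresponds to reading forever. The demo uses an in-memory `net.Pipe`, so it shows only that the three probe lengths are closed at the same time, not the FIN/RST distinction; over TCP, it is the drained socket buffer that avoids the RST on Linux.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// checkAuthentication is a hypothetical stand-in that rejects every client.
func checkAuthentication(buf []byte) bool { return false }

// handle reads the 50-byte handshake and reports whether the client is
// authorized. On any failure it keeps draining the connection instead of
// closing right away. A timeout of 0 means "never time out".
func handle(client net.Conn, timeout time.Duration) bool {
	if timeout > 0 {
		client.SetDeadline(time.Now().Add(timeout))
	}
	buf := make([]byte, 50)
	_, err := io.ReadFull(client, buf)
	if err == nil && checkAuthentication(buf) {
		client.SetDeadline(time.Time{}) // clear the handshake deadline
		return true
	}
	// Short read, timeout, or bad credentials: discard whatever else arrives
	// until the deadline passes or the client closes, then close.
	io.Copy(io.Discard, client)
	client.Close()
	return false
}

// closeDelay sends n bytes to a hardened server over an in-memory pipe and
// measures how long the server takes to close the connection.
func closeDelay(n int, timeout time.Duration) time.Duration {
	client, server := net.Pipe()
	defer client.Close()
	start := time.Now()
	go handle(server, timeout)
	client.Write(make([]byte, n))
	io.Copy(io.Discard, client) // returns once the server closes its end
	return time.Since(start)
}

func main() {
	for _, n := range []int{49, 50, 51} {
		d := closeDelay(n, 300*time.Millisecond)
		fmt.Println(n, "bytes: closed after", d.Round(10*time.Millisecond))
	}
}
```

One caveat of this sketch: bytes that arrive at the very moment the deadline expires can still be unread at close time, so a deployment that wants to rule out the RST entirely would favor the no-deadline variant.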
Thanks to Sergey Frolov for commenting on a draft of this summary and providing the code sample.