Some time ago I had an idea: how would a DPI system handle an unusually large amount of HTTP header data if the Host header comes after all the other headers? How large a reassembly buffer do DPI systems usually have?
A SKAT DPI developer from VAS Experts told me they have an 8 KiB buffer, so I decided to check whether HTTP header padding could be used as a DPI circumvention method and how well websites handle it.
Let’s look at the HTTP header buffer sizes of common web servers.
Nginx has a 32 KiB overall buffer by default, and each header line can’t be longer than 8 KiB.
Default: large_client_header_buffers 4 8k;
Sets the maximum number and size of buffers used for reading large client request header. A request line cannot exceed the size of one buffer, or the 414 (Request-URI Too Large) error is returned to the client. A request header field cannot exceed the size of one buffer as well, or the 400 (Bad Request) error is returned to the client. Buffers are allocated only on demand. By default, the buffer size is equal to 8K bytes. If after the end of request processing a connection is transitioned into the keep-alive state, these buffers are released.
Apache httpd has an 8190-byte limit for each header field (LimitRequestFieldSize) and no(?) overall limit.
This directive specifies the number of bytes that will be allowed in an HTTP request header.
According to https://stackoverflow.com/a/6160643/9974656, IIS 6 has a 16 KiB limit for each header.
According to the same answer, Tomcat 7 has an 8190-byte overall limit.
Testing top 10000 sites
I decided to run a test against the Alexa top 10000 websites and see which of them stop working correctly with 14 KB and 18 KB of padding data.
Common CURL HTTP request
Just a regular curl HTTP request without padding, but with a browser’s User-Agent and a proper Accept header.
curl --max-time 15 -s -I -X GET --compressed -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" "$1"
8456 3xx
 805 2xx
 164 4xx
  53 5xx
Total HTTP responses: 9478
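A tally like the one above can be produced from a plain list of status codes. The following is a minimal sketch, not the script from the archive linked at the end of this post: the function name tally_status_classes is made up here, and it assumes each response has already been reduced to its three-digit status code, one per line (curl's -w '%{http_code}' option can print that). Lines that are not a valid status code, such as the 000 curl reports for a failed connection, are skipped.

```shell
# Read HTTP status codes (one per line) on stdin and print a count per
# status class, most frequent first, followed by the total number of
# valid responses. Hypothetical helper, not the author's original script.
tally_status_classes() {
    awk '
        # Only count well-formed codes 100-599; "000" and junk are ignored.
        /^[1-5][0-9][0-9]$/ { count[substr($0, 1, 1) "xx"]++; total++ }
        END {
            for (c in count) print count[c], c | "sort -rn"
            close("sort -rn")   # flush the sorted classes before the total
            print "Total HTTP responses:", total + 0
        }
    '
}

# Usage: printf '301\n200\n301\n' | tally_status_classes
```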
14 KB of padding data
Now the curl request has 14 additional X-Padding headers, each with 1 KB (1000 bytes) of a's as its value.
AAAS="$(head -c 1000 < /dev/zero | tr '\0' 'a')"
curl --max-time 15 -s -I -X GET --compressed \
  -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0" \
  -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" \
  -H "X-Padding01: $AAAS" \
  -H "X-Padding02: $AAAS" \
  -H "X-Padding03: $AAAS" \
  -H "X-Padding04: $AAAS" \
  -H "X-Padding05: $AAAS" \
  -H "X-Padding06: $AAAS" \
  -H "X-Padding07: $AAAS" \
  -H "X-Padding08: $AAAS" \
  -H "X-Padding09: $AAAS" \
  -H "X-Padding10: $AAAS" \
  -H "X-Padding11: $AAAS" \
  -H "X-Padding12: $AAAS" \
  -H "X-Padding13: $AAAS" \
  -H "X-Padding14: $AAAS" \
  "$1"
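The near-identical -H options can also be generated in a loop, which makes it easy to try other padding sizes. This is only a sketch: the helper names padding_headers and padding_wire_bytes are invented for illustration, and feeding the headers to curl relies on the -H @- form (read headers from standard input), which needs curl 7.55.0 or later.

```shell
# Print N padding headers, one per line, each carrying 1000 'a' characters,
# mirroring the X-PaddingNN headers used in this test.
padding_headers() {
    count=$1
    pad=$(head -c 1000 < /dev/zero | tr '\0' 'a')
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'X-Padding%02d: %s\n' "$i" "$pad"
        i=$((i + 1))
    done
}

# Bytes the N headers occupy on the wire: each header line plus the
# two-byte CRLF that terminates it in an HTTP/1.1 request.
padding_wire_bytes() {
    padding_headers "$1" |
        awk '{ total += length($0) + 2 } END { print total + 0 }'
}

# Usage (assumes curl >= 7.55.0 for "-H @-"):
#   padding_headers 14 | curl --max-time 15 -s -I -X GET --compressed -H @- "$1"
```

By this count, 14 headers take 14210 bytes and 18 headers take 18270 bytes, so the two test sizes sit on either side of 16 KiB (16384 bytes).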
8159 3xx
 732 2xx
 476 4xx
  66 5xx
Total HTTP responses: 9433
As we can see, with 14 KB of padding data about 3.4% more websites returned a 4xx or 5xx error code than with the non-padded request.
18 KB of padding data
The request is the same as the 14 KB one, with four additional lines:
  -H "X-Padding15: $AAAS" \
  -H "X-Padding16: $AAAS" \
  -H "X-Padding17: $AAAS" \
  -H "X-Padding18: $AAAS" \
7029 3xx
1732 4xx
 590 2xx
  68 5xx
Total HTTP responses: 9419
Going beyond the 16 KiB limit breaks a lot of websites: about 16.8% more servers responded with a 4xx or 5xx error code than with the stock curl request.
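The percentages can be rechecked from the three tallies. The sketch below uses a made-up helper, error_share, and assumes the comparison is the share of 4xx and 5xx responses among all HTTP responses in each run. Subtracting the baseline share from the padded ones gives differences close to the 3.4% and 16.8% quoted in the text.

```shell
# Percentage of responses that were 4xx or 5xx, to two decimal places.
# Arguments: 4xx count, 5xx count, total HTTP responses.
error_share() {
    awk -v c4="$1" -v c5="$2" -v total="$3" \
        'BEGIN { printf "%.2f\n", 100 * (c4 + c5) / total }'
}

error_share 164 53 9478     # baseline, no padding
error_share 476 66 9433     # 14 KB of padding
error_share 1732 68 9419    # 18 KB of padding
```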
This method could work in some cases: either against DPI systems with a reassembly buffer smaller than 16 KiB, or with websites and web servers that have a lax configuration.
The scripts I used: http-padding-test.7z (5.3 MB)
Note: this test did not move the Host header after the padding. That matters only for censorship circumvention, not for checking how much header data a web server accepts. The omission was not deliberate; I simply forgot about it.
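For reference, a request with the Host header placed after the padding could be built by hand instead of with curl. This is an untested sketch of that idea, not part of the measurements above: the function name request_with_host_last is hypothetical, and sending the result assumes a tool such as nc is available for plain HTTP on port 80.

```shell
# Print a raw HTTP/1.1 GET request in which the Host header is the last
# header, after N padding headers of 1000 'a' characters each.
# Every line ends in CRLF, as HTTP/1.1 requires.
request_with_host_last() {
    host=$1
    count=$2
    pad=$(head -c 1000 < /dev/zero | tr '\0' 'a')
    printf 'GET / HTTP/1.1\r\n'
    printf 'Connection: close\r\n'
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'X-Padding%02d: %s\r\n' "$i" "$pad"
        i=$((i + 1))
    done
    # Host goes last, followed by the empty line that ends the headers.
    printf 'Host: %s\r\n\r\n' "$host"
}

# Usage (assumes nc is installed; example.com is a placeholder):
#   request_with_host_last example.com 14 | nc example.com 80
```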