Idea: DNSBL-like DNS server to use with a PAC file

@ilyaigpetrov, @darkk and I are thinking about a DNSBL-like server used as storage for the censorship list: it returns one predefined DNS reply if the domain or IP address is blocked and another reply if it is not.
It could be queried from a Proxy Auto-Configuration (PAC) file using the dnsResolve function.

Example:
A query for blocked.com.anticensorshipdnsbl.com returns 192.0.2.1.
A query for notblocked.com.anticensorshipdnsbl.com returns 192.0.2.2.

Side note: while private ranges like 192.168.0.0/16 or 10.0.0.0/8, or the loopback range 127.0.0.0/8, may seem more appropriate, some DNS resolvers filter out responses that contain addresses from these ranges. For example, such a filter (DNS rebinding protection) is available in dnsmasq and can be enabled in OpenWrt.
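
As a rough illustration of the lookup above, a stub PAC file might look like the sketch below. The zone and the reply address are taken from the example; the proxy address proxy.example:3128 is a placeholder and not something proposed in this discussion. dnsResolve is the standard PAC helper supplied by the browser.

```javascript
// Minimal PAC stub, a sketch only.
// Assumption: "proxy.example:3128" is a placeholder proxy, not a real service.
var DNSBL_ZONE = "anticensorshipdnsbl.com"; // zone from the example above
var BLOCKED_REPLY = "192.0.2.1";            // reply meaning "blocked"
var PROXY = "PROXY proxy.example:3128";

function FindProxyForURL(url, host) {
  // dnsResolve() is provided by the PAC environment. It returns the IP
  // address as a string; a failed lookup typically yields null.
  var answer = dnsResolve(host + "." + DNSBL_ZONE);
  if (answer === BLOCKED_REPLY) {
    return PROXY;
  }
  // Any other reply, including a failed lookup, is treated as "not blocked".
  return "DIRECT";
}
```

Note that every request blocks on the DNSBL lookup here, which is exactly the availability concern listed under the cons.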

This method:

  • Does not require additional software on the client
  • Allows instant list updates without the client redownloading the list
  • Requires only a very small stub PAC file
  • Could be combined with other PAC file content to decrease the DNSBL load
  • Could be used to circumvent the 1 MB PAC file size limit and various memory limits in browsers
  • Is harder to block due to its unusual nature (it uses only DNS); if it is blocked, switching to another DNS server is likely to make it work again
  • Could be cached on recursive resolvers for a predefined time period

Cons:

  • The DNSBL server must be stable, fast and always available. A DNSBL server that does not reply will cause freezes of several seconds in the browser at best and a completely broken web at worst.
  • The DNSBL server load will be high if this is implemented without any filtering: every domain, blocked or not, would be sent to the server.
  • The DNS cache is not as effective as many may think, especially for large numbers of random “subdomains” of a single domain.
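
One way to reduce the load mentioned above is to skip obviously local names and to remember verdicts inside the PAC script. The sketch below assumes that global variables persist between FindProxyForURL calls (this depends on the browser) and uses a placeholder proxy address. The isLocalName helper and its suffix list are illustrative, not part of the proposal.

```javascript
// Sketch: local skip list plus an in-script verdict cache.
// Assumptions: globals survive between calls; "proxy.example:3128" is a placeholder.
var DNSBL_ZONE = "anticensorshipdnsbl.com";
var BLOCKED_REPLY = "192.0.2.1";
var PROXY = "PROXY proxy.example:3128";
var verdictCache = {}; // host -> true (blocked) / false (not blocked)

// Hypothetical helper: names that never need a DNSBL lookup.
function isLocalName(host) {
  return host.indexOf(".") === -1 || /\.(local|lan)$/.test(host);
}

function isBlocked(host) {
  if (Object.prototype.hasOwnProperty.call(verdictCache, host)) {
    return verdictCache[host];
  }
  // A failed lookup is cached as "not blocked" as well.
  var blocked = dnsResolve(host + "." + DNSBL_ZONE) === BLOCKED_REPLY;
  verdictCache[host] = blocked;
  return blocked;
}

function FindProxyForURL(url, host) {
  if (isLocalName(host)) return "DIRECT";
  return isBlocked(host) ? PROXY : "DIRECT";
}
```

The trade-off is that a cached verdict stays until the PAC script is reloaded, which weakens the "instant list updates" property.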

The story behind me raising this discussion is the following: I’m seeing more and more minor websites blocked because their IP addresses were blacklisted in the ongoing attempts to ban Telegram (at least, the IP addresses are attributed to that court decision). Currently the number of IP addresses banned one by one is 1.8M (plus 150k domains and several subnets).

I was under the impression that PAC file size is limited to 1 MiB in modern browsers, and that this was the reason @valdikss stripped the results of the attempted “telegramocide” out of the antizapret PAC.

My requirements for a circumvention tool are:

  • to use the PAC file to configure the browser, and nothing but the browser, for circumvention, leaving the OS network configuration intact
  • to route the minimum required amount of traffic through the proxy, for performance reasons

So my idea was to put a Bloom filter (or xor filter) into the PAC file as a pre-filter, to avoid a separate blocking DNS query for each and every request, and to bring the “ground truth” to the browser by means of a DNSBL that responds to ${domain}.${ip}.blocklist.rkngov.рф.
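
A minimal sketch of that pre-filter idea follows. Everything concrete in it is an assumption made for illustration: the filter size, the number of hashes, the seeded FNV-1a hash, the blocklist.example zone (standing in for the zone named above), the "blocked" reply and the proxy address. A real PAC file would ship a pre-computed bit array generated on the server rather than calling bloomAdd.

```javascript
// Sketch of a Bloom-filter pre-filter in front of the DNSBL lookup.
// All constants below are illustrative placeholders.
var BLOOM_BITS = 1024;   // filter size in bits
var BLOOM_HASHES = 3;    // number of hash functions
var DNSBL_ZONE = "blocklist.example";
var BLOCKED_REPLY = "192.0.2.1";
var PROXY = "PROXY proxy.example:3128";

var bloom = [];
for (var i = 0; i < BLOOM_BITS; i++) bloom.push(0);

// 32-bit FNV-1a, seeded so that one function yields several hashes.
// Math.imul needs an ES2015-capable PAC engine.
function fnv1a(str, seed) {
  var h = (0x811c9dc5 ^ seed) >>> 0;
  for (var j = 0; j < str.length; j++) {
    h ^= str.charCodeAt(j);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

function bloomAdd(key) {
  for (var k = 0; k < BLOOM_HASHES; k++) bloom[fnv1a(key, k) % BLOOM_BITS] = 1;
}

// false means "definitely not listed"; true means "possibly listed".
function bloomHas(key) {
  for (var k = 0; k < BLOOM_HASHES; k++) {
    if (!bloom[fnv1a(key, k) % BLOOM_BITS]) return false;
  }
  return true;
}

bloomAdd("blocked.example"); // demo entry only

function FindProxyForURL(url, host) {
  var ip = dnsResolve(host);
  if (!ip) return "DIRECT"; // sketch: unresolvable hosts are not checked
  // Most requests stop here without touching the DNSBL.
  if (!bloomHas(host) && !bloomHas(ip)) return "DIRECT";
  // Possible match: ask the DNSBL for the ground truth.
  var answer = dnsResolve(host + "." + ip + "." + DNSBL_ZONE);
  return answer === BLOCKED_REPLY ? PROXY : "DIRECT";
}
```

Because a Bloom filter has false positives but no false negatives, the DNSBL is queried only for listed entries plus a small fraction of unlisted ones.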

Probably, the pre-filter should only be filled with IP addresses that respond on 80/tcp and/or 443/tcp, and should only include domains that are alive and respond to http/https queries. But that’s a job for zgrab/zmap, so it’s trivial.

Yet I’m still unsure whether that’s a useful approach, given that @ilyaigpetrov has found a reasonable way to circumvent the 1 MiB limit in Chrome.

On the other hand, Firefox plugins can’t update the PAC file (according to @ilyaigpetrov), so a DNSBL may in theory be useful for that case. We would still have to update the pre-filter one way or another, and it’s unclear to me what update frequency is practical in Firefox.

Chrome on Windows uses the WinINet (= IE) proxy configuration settings, which have a 1 MB size limit; this limit can be circumvented only for Chrome itself, with an extension. You can’t configure a PAC file larger than 1 MB system-wide, since both Windows and Chrome enforce the 1 MB file size limit.

Firefox, as I recall, doesn’t have a file size limit (at least not a 1 MB one), but it has tight dynamic memory limits: I had to optimize the AntiZapret PAC file to make it work with older 32-bit ESR versions for (crazy) people on Windows XP.
Firefox takes the HTTP Expires header into account, caches the PAC file properly and updates it only when necessary.

“Probably, the pre-filter should only be filled with IP addresses responding to 80/tcp and/or 443/tcp”

I’ve done a quick zmap. Out of 1_844_367 IP addresses blacklisted with blockType=“ip”:

  • 150_345 respond to tcp/80
  • 165_015 respond to tcp/443
  • 88_899 respond both to tcp/80 and to tcp/443
  • 226_461 respond at least to one of tcp/80 or tcp/443
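
The four counts above are consistent with each other under inclusion-exclusion, which the short check below reproduces. It uses only the numbers from the list; the percent helper is just for display.

```javascript
// Counts copied from the zmap results above.
var total = 1844367;  // IP addresses with blockType="ip"
var tcp80 = 150345;
var tcp443 = 165015;
var both = 88899;

// Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
var either = tcp80 + tcp443 - both; // 226461, matching the last bullet

// Share of the whole list, formatted with two decimals.
function percent(part, whole) {
  return (100 * part / whole).toFixed(2) + "%";
}

console.log("either port:", either, percent(either, total));
console.log("tcp/80:", tcp80, percent(tcp80, total));
```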

Given that a website has to listen on tcp/80 to be reachable without HSTS preload, it’s safe to assume that only the 150k IP addresses responding on tcp/80 are worth adding to the PAC file.
That’s still a lot compared to the 3k selected IP addresses in the antizapret PAC.
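
To get a feel for whether roughly 150k addresses could fit into a pre-filter, the standard Bloom filter sizing formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2 can be evaluated. The 1% false-positive target is an arbitrary assumption, and the estimate ignores the overhead of encoding the bit array as text inside a PAC file.

```javascript
// Standard Bloom filter sizing; n = number of entries, p = false-positive rate.
function bloomBits(n, p) {
  return Math.ceil(-n * Math.log(p) / (Math.LN2 * Math.LN2));
}

// Optimal number of hash functions for a filter of `bits` bits.
function bloomHashes(n, bits) {
  return Math.round(bits / n * Math.LN2);
}

var n = 150345;                 // addresses responding on tcp/80
var bits = bloomBits(n, 0.01);  // assumed 1% false-positive rate
console.log(bits + " bits, about " + Math.round(bits / 8 / 1024) + " KiB, " +
            bloomHashes(n, bits) + " hash functions");
```

Under these assumptions the raw filter comes out well under 200 KiB, so it would fit within a 1 MB PAC file even with some text-encoding overhead.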