💾 Archived View for tilde.pink › ~kaction › log › 2022-01-22.1.gmi captured on 2023-09-28 at 16:12:46. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
For a long time I have been using a large /etc/hosts file to block ads and trackers, courtesy of the "hblock" project. While it is obvious that reading a 16 MB file on every invocation of a network-related program is sub-optimal, it worked well enough that fixing it was never a priority for me.
https://github.com/hectorm/hblock
Recently I started to fool around with the Go language and stumbled upon issue #50511: parsing of /etc/hosts is so inefficient that it prevented me from writing anything network-related in Go. So a bug in Go nudged me to do the right thing with my DNS setup.
https://github.com/golang/go/issues/50511
The right thing is to replace /etc/hosts with a DNS cache that has its own hostname-to-IP resolution database. It probably violates some standard, but it listens on 127.0.0.1, after all.
Unfortunately, dnscache(8) from the djbdns suite does not support this functionality, but thanks to the simplicity and good design of djbdns, it took me just an hour to develop the patch. With this patch, dnscache(8) looks the hostname up in $ROOT/hosts.cdb before doing any actual DNS network queries. Database keys are hostnames in the binary format of the DNS protocol, as illustrated in the following diagram for the "google.com" domain; values are just IP addresses (exactly 4 bytes).
             6 letters              3 letters
     -------------------------    -------------
| \6 | g | o | o | g | l | e | \3 | c | o | m | \0
--------------------------------------------------
                   12 bytes key
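The key layout in the diagram can be sketched in C as follows. This is only an illustration of the wire format, not code taken from the patch: the function name dns_encode_name and its signature are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: encode a dotted hostname
 * such as "google.com" into DNS wire format (length-prefixed labels ending
 * with a zero byte). Returns the key length, or 0 on error. */
size_t dns_encode_name(const char *host, unsigned char *out, size_t cap)
{
    size_t n = 0;                       /* bytes written so far */

    assert(host != NULL && out != NULL);
    while (*host != '\0') {
        const char *dot = strchr(host, '.');
        size_t len = dot ? (size_t)(dot - host) : strlen(host);

        /* Labels are 1..63 bytes; leave room for the final zero byte. */
        if (len == 0 || len > 63 || n + 1 + len + 1 > cap)
            return 0;
        out[n++] = (unsigned char)len;  /* length prefix, e.g. \6 */
        memcpy(out + n, host, len);     /* label bytes, e.g. "google" */
        n += len;
        host += len;
        if (*host == '.')
            host++;                     /* skip the separator */
    }
    if (n == 0 || n + 1 > cap)
        return 0;
    out[n++] = 0;                       /* root label terminates the name */
    return n;
}
```

Called with "google.com" and a buffer of at least 12 bytes, it writes the 12-byte key from the diagram and returns 12.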
You can download both the patch and the utility program that generates the constant database from /etc/hosts here:
../static/patches/djbdns-hosts.patch
../static/utils/djbdns-hosts.c
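For readers who only want the idea, here is a rough sketch of the per-line step such a generator has to perform: turn one /etc/hosts line into the 4-byte value and a hostname. It is not the linked djbdns-hosts.c. The name parse_hosts_line is hypothetical, only the first hostname on a line is handled, and IPv6 entries are skipped because values are exactly 4 bytes.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not the author's djbdns-hosts.c: parse one line of
 * /etc/hosts of the form "ADDRESS NAME" into the 4-byte database value and
 * the first hostname. Returns 1 on success, 0 for comments, blank lines,
 * and non-IPv4 entries. */
int parse_hosts_line(const char *line, unsigned char ip[4], char host[256])
{
    char addr[64];

    assert(line != NULL);
    /* The field widths bound both copies; '#' starts a comment. */
    if (sscanf(line, " %63[^ \t\n#] %255[^ \t\n#]", addr, host) != 2)
        return 0;
    /* inet_pton stores the address in network byte order, i.e. the four
     * bytes in the order they appear in dotted notation. */
    return inet_pton(AF_INET, addr, ip) == 1;
}
```

A line like "0.0.0.0 googleanalytics.com" would yield four zero bytes as the value and "googleanalytics.com" as the name, which still has to be converted to the binary key format before it goes into the database.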
Caveat. This patch looks the hostname up in a case-sensitive way, while for some insane reason the DNS standard says that hostnames are case-insensitive. It means that if your "hosts.cdb" resolves "googleanalytics.com" to 0.0.0.0, the enemy can still sneak in disguised as "GoogleAnaLytics.com". Fixing it would require one more malloc/free pair per request, so I won't do it until I see it in the wild.
With this setup, I can run DNS queries starting from the root servers instead of using the cache of my ISP, Google (8.8.8.8) or Cloudflare (1.1.1.1), which is slightly more privacy-friendly. Sure, nothing is private in clear-text DNS, but for an ISP, analyzing transit traffic is more involved than logging requests directed to its own servers.
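For illustration, the case folding itself could look like the sketch below. It is not part of the patch, and the name dns_name_lower is made up. It copies the name into a caller-supplied buffer instead of allocating, which works on paper because a DNS name is at most 255 bytes on the wire; whether that fits the dnscache code path is an assumption. The database keys would have to be stored in lower case as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: copy a wire-format name into
 * a caller-supplied buffer, folding ASCII letters to lower case. Assumes
 * the input is a properly terminated name. Returns the name length
 * including the final zero byte, or 0 if a label is too long or the
 * buffer is too small. */
size_t dns_name_lower(const unsigned char *name, unsigned char *out, size_t cap)
{
    size_t n = 0;

    assert(name != NULL && out != NULL);
    for (;;) {
        size_t len = name[n];           /* label length byte */

        if (len > 63 || n + 1 + len > cap)
            return 0;
        out[n] = name[n];               /* length bytes are copied as-is */
        n++;
        if (len == 0)
            return n;                   /* root label: end of name */
        for (size_t i = 0; i < len; i++, n++) {
            unsigned char c = name[n];
            out[n] = (c >= 'A' && c <= 'Z') ? (unsigned char)(c + 32) : c;
        }
    }
}
```

Only label bytes are folded; length bytes are left alone, so a label that happens to be 65 to 90 bytes long cannot be confused with a letter (such lengths are rejected anyway).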