For months I had a nagging problem at home. I would set a local DNS override for nas.lan pointing to my Synology, and the override would stick for a day or two, then silently disappear. No errors. Nothing in the Pi-hole dashboard to indicate a change. The NAS would still answer to its IP, of course, but my laptop would get NXDOMAIN for the hostname and I would grumble and set the override again.

The setup

Pi-hole running in a docker-compose stack on a Raspberry Pi 4. The compose file was boringly normal:

services:
  pihole:
    image: pihole/pihole:2024.03
    restart: unless-stopped
    environment:
      TZ: "America/Denver"
      WEBPASSWORD: "redacted"
    volumes:
      - ./etc-pihole:/etc/pihole
      - ./etc-dnsmasq:/etc/dnsmasq.d
    ports:
      - "53:53/tcp"
      - "53:53/udp"
      - "8080:80/tcp"
    cap_add:
      - NET_ADMIN

My DNS overrides went into the web UI under “Local DNS” and ended up in /etc/pihole/custom.list. Everything was in the mounted volume, so in theory they should survive a container restart. In practice they did not.

Investigation

First I just watched the file. On a normal boot:

watch -n 2 'wc -l ./etc-pihole/custom.list'

It held at 3 lines for a while, then dropped to 0 and stayed there. That happened at around 02:40, which is not a time I am usually awake. I looked at the container logs:

docker logs pihole --since 24h | grep -iE 'custom|gravity|update'

Gravity ran at 03:30 every day. The gravity update is Pi-hole’s blocklist refresh. That was not the 02:40 event.

A more precise check with stat:

stat ./etc-pihole/custom.list
# Modify: 2024-04-18 02:40:11
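
stat only shows the most recent modification, so it cannot say whether the rewrite happens every night or only sometimes. A minimal sketch of a polling logger that keeps a history instead, assuming GNU stat and date on the host; the function names and the log path are made up for illustration:

```shell
# Print "<line count> <mtime in epoch seconds>" for a file, or "missing" if absent.
# Assumes GNU stat (stat -c %Y).
file_state() {
    if [ -e "$1" ]; then
        printf '%s %s\n' "$(wc -l < "$1" | tr -d ' ')" "$(stat -c %Y "$1")"
    else
        echo missing
    fi
}

# Poll a file and append a timestamped line to a log whenever its state changes.
# $1: file to watch, $2: log path, $3: poll interval in seconds (default 2),
# $4: maximum number of polls (default 0, meaning run until interrupted)
log_changes() {
    file=$1
    log=$2
    interval=${3:-2}
    max=${4:-0}
    prev=''
    i=0
    while [ "$max" -eq 0 ] || [ "$i" -lt "$max" ]; do
        cur=$(file_state "$file")
        if [ "$cur" != "$prev" ]; then
            printf '%s %s\n' "$(date '+%F %T')" "$cur" >> "$log"
            prev=$cur
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Left running in a tmux session as log_changes ./etc-pihole/custom.list ./custom-list.log, this appends one line each time the line count or mtime changes, so the history is still there in the morning.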

So something was rewriting this file at almost exactly 02:40, which smelled like a scheduled job. I looked at systemd timers on the host:

systemctl list-timers --all
# NEXT                        UNIT                     ACTIVATES
# Thu 2024-04-18 02:40:00 MDT docker-pihole-rebuild... docker-pihole-rebuild.service

There it was: a docker-pihole-rebuild.timer I had written six months ago and forgotten about. It ran docker-compose pull && docker-compose up -d. On days when the upstream image had changed, the container was recreated, and somewhere in the init path the container was rebuilding custom.list from a different source of truth.

Why the recreate nuked the overrides

Here is the part that confused me. The volume was bind-mounted; the file should have been persistent. Reading Pi-hole’s init script in the image, I found that on startup it writes a fresh custom.list from the entries in the FTLCONF_LOCAL_DNS environment variable. If that env var is empty, it writes nothing and leaves the existing file alone. My env was empty, so on most days the file was preserved.

But Pi-hole 2024.03 introduced a change: if the file is older than the package install, the script rewrites it with a default set of comments and then keeps going. Add a timezone detail that made the container rebuild date after a pull look newer than custom.list’s mtime, and I ended up in a state where on about one in three rebuilds the script decided to rewrite. I had to read the startup script to see this:

docker run --rm --entrypoint cat pihole/pihole:2024.03 /opt/pihole/bash_functions.sh | less

Not my finest moment.
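
For the record, the kind of mtime comparison described above can be sketched in a few lines of shell. This is an illustration of the check, not Pi-hole's actual code; older_than_epoch is a made-up name, and it assumes GNU stat:

```shell
# Succeed if the file's mtime is strictly older than a reference time.
# $1: file path, $2: reference time in seconds since the epoch
# Assumes GNU stat (stat -c %Y prints the mtime as epoch seconds).
older_than_epoch() {
    [ "$(stat -c %Y "$1")" -lt "$2" ]
}
```

The reference time could come from docker image inspect --format '{{.Created}}' pihole/pihole:2024.03, converted to epoch seconds with GNU date -d. Both sides are then plain epoch seconds, which sidesteps the confusion that comes from comparing formatted local times across timezones.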

Fix

Two changes:

  1. Move local DNS entries out of custom.list and into a file under /etc/dnsmasq.d/ that is also bind-mounted but that Pi-hole does not touch:

    cat ./etc-dnsmasq/99-local.conf
    # local hostnames
    address=/nas.lan/192.168.30.12
    address=/printer.lan/192.168.30.15
    
  2. Remove the silly nightly rebuild timer. The restart: unless-stopped policy is enough to keep the container running. I update the image when I feel like it, not at 02:40.
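
With the entries living in a dnsmasq snippet, the file can be generated from a plain list kept in source control. A minimal sketch, assuming a hypothetical hosts.txt with one "hostname ip" pair per line; hosts_to_dnsmasq is a made-up helper, not part of Pi-hole:

```shell
# Convert "hostname ip" lines on stdin into dnsmasq address= directives.
# Blank lines, comment lines starting with #, and lines without an IP are skipped.
hosts_to_dnsmasq() {
    echo '# local hostnames (generated, do not edit by hand)'
    # The "|| [ -n "$name" ]" part handles a final line with no trailing newline.
    while read -r name ip rest || [ -n "$name" ]; do
        case $name in
            ''|'#'*) continue ;;
        esac
        [ -n "$ip" ] || continue
        printf 'address=/%s/%s\n' "$name" "$ip"
    done
}
```

Usage would be something like hosts_to_dnsmasq < hosts.txt > ./etc-dnsmasq/99-local.conf, followed by a container restart so dnsmasq rereads the file. One caveat: address=/nas.lan/... also answers for any name under nas.lan; dnsmasq's host-record=nas.lan,192.168.30.12 is the exact-match form if that matters.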

Reflection

This is one of those “I wrote a helpful thing for myself six months ago and then forgot” stories. The override disappearing was the obvious symptom. The real bug was that I had layered two sources of truth for DNS entries (web UI and env var) and a third source of automation (the timer) that knew about neither.

If I were setting up Pi-hole fresh today I would keep all the hostnames in a dnsmasq snippet under source control, stop using the web UI for anything, and let the container be as stateless as possible. See my post on DNS-01 challenges with split-horizon DNS for why I stopped trying to have a “smart” DNS setup at home in the first place.