We had been running nftables for a couple of years without incident on the edge boxes. So when my colleague Priya pushed a rule to allow a new monitoring agent through, and two hours later all of our Prometheus scrapes started timing out, I did not immediately blame the firewall.

The change looked fine:

nft insert rule inet filter input tcp dport 9100 accept

Innocent enough. Allow port 9100, node_exporter’s default. But that insert is the thing that got us. In nftables, insert prepends: the rule goes to the top of the chain, not the bottom, so every rule that was already there is now evaluated after it. That includes our counter rules, our established/related stanza, and crucially a jump into a chain that set connection marks we relied on downstream.
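The difference is easy to see in a throwaway table. The following is a minimal sketch, not part of our ruleset: the file name demo.nft, the table name demo and the chain name order-test are made up for illustration.

```nft
# demo.nft (hypothetical file name): load with `nft -f demo.nft` as root on a
# scratch machine, inspect with `nft list table inet demo`, then clean up
# with `nft delete table inet demo`.
table inet demo {
    # Regular chain (no hook), so it never sees real traffic.
    chain order-test {
    }
}

# add appends: these two rules end up in the order written.
add rule inet demo order-test ct state established,related accept
add rule inet demo order-test tcp dport 22 accept

# insert prepends: this rule is listed first, ahead of both rules above.
insert rule inet demo order-test tcp dport 9100 accept
```

Listing the table afterwards should show the port 9100 rule at the top of the chain even though it was the last command in the file.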

How it looked in practice

I ran nft list ruleset and diffed against the version in git:

diff <(nft list ruleset) ops/nft/edge.nft

The diff showed the new rule at position zero in the input chain. Our usual idiom is to add (which appends) or to use explicit positions with index or handle when we care. Priya’s runbook said add, but she had been in a hurry and typed insert from muscle memory. The syntax does not complain; it just silently puts the rule first.
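For reference, both add and insert accept a location specifier: index N (zero-based, counted from the top of the chain) or handle H (the ID that nft -a list chain prints next to each rule). A sketch, assuming a reasonably recent nft and a chain whose rule at index 2 is the jump into mark-scraper-traffic; handle 17 is a made-up value, not one from our boxes.

```nft
# Location specifiers, written as they would appear in a file loaded with
# nft -f (on the command line, prefix each with `nft`).
# The two commands are alternatives; loading both would add the rule twice.

# add + index: place the new rule AFTER the rule at index 2 (zero-based).
add rule inet filter input index 2 tcp dport 9100 accept

# insert + handle: place the new rule BEFORE the rule with handle 17
# (hypothetical value; look it up with `nft -a list chain inet filter input`).
insert rule inet filter input handle 17 tcp dport 9100 accept
```

Handles stay attached to a rule for its lifetime, while an index shifts whenever a rule above it is added or removed, so handles are the safer choice when the chain may have changed since it was last listed.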

The failure mode was subtle because port 9100 itself kept working. What broke depended on a rule further down the same chain, which before the change looked like this:

chain input {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    iifname "lo" accept
    jump mark-scraper-traffic
    tcp dport { 22, 80, 443 } accept
    # ... etc
}

The mark-scraper-traffic chain set ct mark on flows from our Prometheus VIP, and a later policy routing rule used that mark to pin the reply path. Because accept is a terminal verdict, the new rule accepted the SYN and evaluation stopped before jump mark-scraper-traffic ran, so the flow never got its mark. The reply then left via the default path, which had an asymmetric MTU. You can probably guess how much fun that was to diagnose.

What the counters told me

The real clue was that the counter tied to the mark-scraper-traffic jump had stopped incrementing on the affected host. I have a habit of adding counters aggressively because the overhead is negligible and the debugging payoff is huge:

# named counter, declared at table level; rules reference it with: counter name "scraper-in"
counter scraper-in { }
# ...
jump mark-scraper-traffic comment "hits counted upstream"

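A quick nft list counters showed the frozen number. Combined with the nft --handle list ruleset output, which showed the new rule sitting first in the input chain despite carrying the newest handle, it was obvious. (Handles are IDs assigned in creation order, not positions.) Five minutes to confirm, twenty minutes to roll back cleanly.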

The fix and the policy

Rolling back was easy:

nft delete rule inet filter input handle 42
nft add rule inet filter input tcp dport 9100 accept

The policy change was the more interesting part. We now do two things:

  1. All nftables edits go through a file in git. No interactive nft commands on prod. nft -f /etc/nftables.d/edge.nft is the only way to apply.
  2. The file has a comment block at the top that explicitly lists the order-sensitive rules with a one-liner on why order matters. Future me will thank present me.
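A sketch of what such a file can look like. The rule set is abbreviated, the wording of the comment block is illustrative rather than copied from our repo, and the flush assumes this file owns the entire ruleset on the host.

```nft
# ORDER-SENSITIVE RULES in chain "input" (keep in sync with the rules below):
#   - jump mark-scraper-traffic
#       must come before any accept that can match scraper flows; an earlier
#       accept ends evaluation and the flow never gets its ct mark.
#   - tcp dport 9100 accept
#       must stay below the jump for the same reason.

# Assumption: nothing else on the host manages nftables tables, so flushing
# makes each load replace the previous state instead of stacking duplicates.
flush ruleset

table inet filter {
    chain mark-scraper-traffic {
        # body omitted here; sets ct mark on scraper flows
    }

    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iifname "lo" accept
        jump mark-scraper-traffic
        tcp dport 9100 accept
        tcp dport { 22, 80, 443 } accept
    }
}
```

Running nft -c -f /etc/nftables.d/edge.nft parses and validates the file without applying it, which makes a cheap check before a change is merged.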

I also wrote a small check that fails CI if the live ruleset differs from the checked-in version in anything other than counter values. It uses nft -j list ruleset and filters out the counter fields:

nft -j list ruleset | jq 'del(.. | .packets?, .bytes?)'

That’s now in a systemd timer that posts to our alert channel if drift appears.
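A drift check along these lines can also be done on the plain text listing. The sketch below is a text-based variant, not the jq pipeline above and not our production script: it assumes the expected state is stored as a saved nft list ruleset listing, and the function names normalize and check_drift are made up for illustration.

```shell
#!/bin/sh
# Text-based drift check (illustrative sketch, hypothetical function names).

# Zero out counter values so that only structural changes show up in the diff.
normalize() {
    sed -E 's/packets [0-9]+ bytes [0-9]+/packets 0 bytes 0/g'
}

# check_drift EXPECTED_FILE
# Compares a saved `nft list ruleset` listing with the live ruleset.
# Returns 0 if they match apart from counter values; otherwise prints a
# unified diff and returns diff's non-zero status.
check_drift() {
    expected=$1
    tmp=$(mktemp) || return 2
    normalize < "$expected" > "$tmp"
    nft list ruleset | normalize | diff -u "$tmp" -
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

A timer job could then call check_drift on the saved listing and post the diff output to the alert channel whenever the return status is non-zero.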

Reflection

The mistake was not a bug in nftables. The tool did exactly what the command said. The mistake was that our team had internalized the iptables rule of “-I inserts at the top, -A appends” but had drifted into using nftables without agreeing on a convention. When the mental model is “rules go in the order I type them”, insert feels like a synonym for add, and it is not.

If you are adopting nftables from iptables muscle memory, I would strongly recommend banning insert and delete by position in anything that runs outside your laptop. Use file-based rulesets and handles, or name your chains so that the intent is obvious. Related: see my post on testing routers with Linux network namespaces — I now test ruleset changes in a netns before they land on the edge.