Shell retry with exponential backoff and jitter
I got tired of rewriting “retry this curl up to 5 times” in deploy scripts. This version is short, adds jitter (so 100 parallel CI jobs do not synchronize), logs each attempt to stderr, and preserves the command’s exit code.
#!/usr/bin/env bash
# retry [-n max] [-s initial_sleep] [-f factor] -- cmd args...
retry() {
  local max=5 sleep_s=1 factor=2
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -n) max="$2"; shift 2 ;;
      -s) sleep_s="$2"; shift 2 ;;
      -f) factor="$2"; shift 2 ;;
      --) shift; break ;;
      *)  break ;;
    esac
  done
  local n=0 rc=0 jitter
  while (( n < max )); do
    "$@" && return 0
    rc=$?
    n=$((n + 1))
    if (( n >= max )); then break; fi
    # random extra sleep in [0, sleep_s) so parallel jobs don't retry in lockstep
    jitter=$((RANDOM % (sleep_s > 1 ? sleep_s : 1)))
    echo "retry: attempt $n failed (rc=$rc), sleeping $((sleep_s + jitter))s" >&2
    sleep "$((sleep_s + jitter))"
    sleep_s=$((sleep_s * factor))
  done
  return "$rc"
}
# Usage:
retry -n 6 -s 2 -- curl --fail -sSL https://example.com/health
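The loop above uses additive jitter (base sleep plus a random extra). A "full jitter" variant sleeps a uniform random amount in [0, cap) instead, which spreads a thundering herd even more evenly. A minimal sketch using `shuf` from GNU coreutils — the helper name `full_jitter` is mine, not part of the script above:

```shell
# full_jitter CAP — print a uniform random integer in [0, CAP).
# Uses shuf (GNU coreutils) when available; falls back to $RANDOM.
full_jitter() {
  local cap=$(( $1 > 1 ? $1 : 1 ))
  if command -v shuf >/dev/null 2>&1; then
    shuf -i 0-$((cap - 1)) -n 1
  else
    echo $((RANDOM % cap))
  fi
}
```

Inside the loop you would replace the two sleep lines with `sleep "$(full_jitter "$sleep_s")"` and keep the same `sleep_s=$((sleep_s * factor))` growth.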
The $RANDOM jitter is whole-second and crude, but fine for CI. (Resist swapping in awk's srand(): called with no argument it seeds from the clock in whole seconds, so 100 parallel jobs launched in the same second would all draw the same "random" jitter.) On give-up, retry returns the last command's exit code, which lets callers distinguish timeout vs 500 vs DNS failure.
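Because the exit code survives, a caller can branch on curl's documented codes: 6 means the host could not be resolved, 22 means an HTTP error (>= 400, with --fail), 28 means the operation timed out. A hypothetical `classify` helper:

```shell
# classify RC — map a curl exit code to a short label.
# 6 = DNS resolution failed, 22 = HTTP >= 400 with --fail, 28 = timed out.
classify() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "dns-failure" ;;
    22) echo "http-error" ;;
    28) echo "timeout" ;;
    *)  echo "failed (rc=$1)" ;;
  esac
}

classify 22   # → http-error
```

In a deploy script: `retry -n 6 -s 2 -- curl --fail -sSL "$url"; classify $?`.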
See also /snippets/bash-trap-cleanup-tmpdir/.