I got tired of rewriting “retry this curl up to 5 times” in deploy scripts. This version is short, adds jitter (so 100 parallel CI jobs do not synchronize), logs each attempt to stderr, and preserves the command’s exit code.

#!/usr/bin/env bash
# retry [-n max] [-s initial_sleep] [-f factor] -- cmd args...
retry() {
    local max=5 sleep_s=1 factor=2
    while [[ $# -gt 0 ]]; do
        case "$1" in
            # "shift 2 || return" guards against a missing option value,
            # which would otherwise re-match the same flag forever.
            -n) max="$2";       shift 2 || return 2 ;;
            -s) sleep_s="$2";   shift 2 || return 2 ;;
            -f) factor="$2";    shift 2 || return 2 ;;
            --) shift; break ;;
            *)  break ;;
        esac
    done

    local n=0 rc=0 jitter
    while (( n < max )); do
        "$@" && return 0
        rc=$?
        n=$((n + 1))
        if (( n >= max )); then break; fi
        # Jitter in 0..sleep_s-1 (always 0 when sleep_s is 1);
        # the ternary avoids a divide-by-zero if sleep_s is 0.
        jitter=$((RANDOM % (sleep_s > 1 ? sleep_s : 1)))
        echo "retry: attempt $n failed (rc=$rc), sleeping $((sleep_s + jitter))s" >&2
        sleep $((sleep_s + jitter))
        sleep_s=$((sleep_s * factor))
    done
    return "$rc"
}

# Usage:
retry -n 6 -s 2 -- curl --fail -sSL https://example.com/health
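
With -n 6 -s 2 and the default factor of 2, the base sleeps between attempts are 2, 4, 8, 16 and 32 seconds, plus jitter. To see the exit-code behavior, here is a small sketch using a hypothetical flaky helper that fails its first two runs (it assumes the retry function above is already sourced):

```shell
#!/usr/bin/env bash
# Hypothetical flaky command: fails its first two runs, succeeds on the
# third. retry returns 0 as soon as one attempt succeeds; on total
# failure it propagates the last attempt's exit code instead.
attempts=0
flaky() {
    attempts=$((attempts + 1))
    (( attempts >= 3 ))   # exit 0 only from the third call on
}

retry -n 5 -s 1 -- flaky && echo "succeeded after $attempts attempts"
# → succeeded after 3 attempts

retry -n 2 -s 1 -- false
echo "gave up with rc=$?"
# → gave up with rc=1
```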

The $RANDOM jitter has whole-second granularity (sleep gets an integer argument), which is crude but fine for CI. For a smoother distribution, replace it with awk 'BEGIN{srand(); print int(rand()*N)}', keeping in mind that awk's default srand() seed only changes once per second, so jobs launched simultaneously should mix something like the PID into the seed. On give-up the function returns the last attempt's exit code, which lets callers distinguish, say, a curl timeout (28) from an HTTP error under --fail (22) from a DNS failure (6).
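
A sketch of that awk-based jitter as a drop-in for the $RANDOM line; n stands in for the current sleep_s, and seeding with the PID plus the epoch second is an assumption on my part to keep concurrent jobs from colliding:

```shell
#!/usr/bin/env bash
# Uniform jitter in 0..n-1 via awk's floating-point rand().
# Mixing the PID into the seed keeps two jobs launched within the
# same wall-clock second from drawing identical jitter values.
sleep_s=5
jitter=$(awk -v seed="$(( $$ + $(date +%s) ))" -v n="$sleep_s" \
    'BEGIN { srand(seed); print int(rand() * n) }')
echo "jitter=$jitter"   # some integer between 0 and 4
```

On bash 5.1+ you could also just swap $RANDOM for $SRANDOM, which is reseeded from the kernel rather than sharing a 16-bit sequence.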

See also /snippets/bash-trap-cleanup-tmpdir/.