Trait objects vs generics: when I finally picked a side

In Rust, you have two ways to abstract over behavior: generics (monomorphized, zero-cost, no dynamic dispatch) or trait objects (dynamically dispatched, one implementation shared by all callers, slightly slower). Everyone tells you “use generics by default,” but there’s a real cost to generics that the “zero-cost abstraction” slogan glosses over. I’ve bounced between the two for a while. Here’s the rule I’ve settled on.

First, a quick reminder of the mechanics. Generics:

fn process<T: Handler>(h: T, msg: Message) -> Result<Response, Error> {
    h.handle(msg)
}

The compiler generates a separate copy of process for every concrete T you use — process::<FooHandler>, process::<BarHandler>, etc. Each call is a direct function call.

Trait objects:

fn process(h: &dyn Handler, msg: Message) -> Result<Response, Error> {
    h.handle(msg)
}

One copy of process. The call to h.handle goes through a vtable — a pointer indirection per method call. The trait object &dyn Handler is a fat pointer: one pointer to the data, one to the vtable.
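The fat-pointer layout is easy to check. Below is a minimal sketch, assuming a simplified stand-in for Handler (it takes a &str and returns a usize so the example doesn't need Message, Response, or Error); LenHandler, pointer_sizes, and call_dyn are hypothetical names made up for illustration.

```rust
use std::mem::size_of;

// Simplified stand-in for the article's Handler trait (an assumption for
// this sketch; the real trait takes a Message and returns a Result).
trait Handler {
    fn handle(&self, msg: &str) -> usize;
}

// Hypothetical concrete handler with no fields.
struct LenHandler;

impl Handler for LenHandler {
    fn handle(&self, msg: &str) -> usize {
        msg.len()
    }
}

/// Returns (size of a plain reference, size of a trait-object reference) in bytes.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&LenHandler>(), size_of::<&dyn Handler>())
}

/// Calls `handle` through the vtable half of the fat pointer.
fn call_dyn(h: &dyn Handler, msg: &str) -> usize {
    h.handle(msg)
}
```

On the usual targets, a plain reference is one machine word and &dyn Handler is two: the data pointer plus the vtable pointer.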

The standard advice is “generics are faster, use them.” This is true on a per-call basis: a vtable call is an indirect call that costs on the order of a nanosecond and usually blocks inlining, while a direct call can be inlined away entirely. But the cost of generics is hidden: code size. If you have a generic function that compiles to 500 bytes of code, and you use it with 20 different types, that’s roughly 10 KB of code in your binary. If that function is then inlined into its callers, the code repeats at every call site. I’ve seen release binaries bloat from 40MB to 110MB because someone went generic-happy on an HTTP handler framework.

The binary size issue isn’t just aesthetic. Larger binaries tend to have worse instruction cache behavior. A monomorphized hot loop that doesn’t fit in the L1 instruction cache can run slower than a trait-object hot loop that does.

My rule, after a year of going back and forth:

Use generics when:

The function is small (inlined into callers anyway)
The concrete type matters for optimization (e.g., iterator adapters, where the compiler can inline and vectorize the loop)
The number of concrete instantiations is low (maybe 3-4)
You want the caller to see the concrete type (for trait method visibility, for Sized reasons, etc.)

Use trait objects when:

The function is large
The number of concrete instantiations is high or unbounded (plugins, user-provided handlers)
You want to store a heterogeneous collection (Vec<Box<dyn Trait>>)
You don’t need the extreme performance (the call itself isn’t hot)
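The trait is object-safe, which newer Rust documentation calls dyn-compatible (no generic methods and no Self in return types, unless those methods are bounded by where Self: Sized); this one is a prerequisite rather than a preference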
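The object-safety condition is less restrictive than it sounds, because individual methods can opt out of the vtable. Here is a minimal sketch, assuming a hypothetical Shape trait (not from the codebase discussed here) with one method that returns Self and one generic method, both bounded by where Self: Sized.

```rust
// Hypothetical trait for illustration only.
trait Shape {
    fn area(&self) -> f64;

    // Returning Self would normally rule out `dyn Shape`. The
    // `where Self: Sized` bound keeps this method out of the vtable,
    // so the trait as a whole can still be used as a trait object.
    fn scaled(&self, factor: f64) -> Self
    where
        Self: Sized;

    // The same bound works for a generic method.
    fn describe_with<F: Fn(f64) -> String>(&self, f: F) -> String
    where
        Self: Sized,
    {
        f(self.area())
    }
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    fn scaled(&self, factor: f64) -> Self {
        Square(self.0 * factor)
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
    fn scaled(&self, factor: f64) -> Self {
        Rect(self.0 * factor, self.1 * factor)
    }
}

/// Heterogeneous collection: only `area` is callable through `dyn Shape`.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

The trade-off is that scaled and describe_with are only callable when the concrete type is known. Through a Box<dyn Shape> you only get area.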

Here’s an example of when I switched. We had a handler registry:

pub trait Handler {
    fn handle(&self, msg: Message) -> Result<Response, Error>;
}

pub struct Registry<H: Handler> {
    handlers: HashMap<String, H>,
}

This forced every user of Registry to pick a single handler type. That wasn’t what they wanted — they wanted to register different kinds of handlers under different routes. I changed it to:

pub struct Registry {
    handlers: HashMap<String, Box<dyn Handler + Send + Sync>>,
}

impl Registry {
    pub fn register(&mut self, name: &str, handler: Box<dyn Handler + Send + Sync>) {
        self.handlers.insert(name.to_string(), handler);
    }

    pub fn dispatch(&self, name: &str, msg: Message) -> Result<Response, Error> {
        self.handlers
            .get(name)
            .ok_or(Error::Unknown)?
            .handle(msg)
    }
}

This is now much more useful. The vtable dispatch on handle is trivial compared to the cost of even a simple handler’s work. And the binary is smaller because dispatch isn’t monomorphized.
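To make the payoff concrete, here is a self-contained sketch of the boxed registry holding two unrelated handler types. The definitions of Message, Response, and Error are placeholders I made up for the example (the real ones aren't shown in this post), and Echo, Upper, and build_registry are hypothetical.

```rust
use std::collections::HashMap;

// Placeholder types: assumptions for this sketch, not the real definitions.
#[derive(Debug, PartialEq)]
pub struct Message(pub String);
#[derive(Debug, PartialEq)]
pub struct Response(pub String);
#[derive(Debug, PartialEq)]
pub enum Error {
    Unknown,
}

pub trait Handler {
    fn handle(&self, msg: Message) -> Result<Response, Error>;
}

#[derive(Default)]
pub struct Registry {
    handlers: HashMap<String, Box<dyn Handler + Send + Sync>>,
}

impl Registry {
    pub fn register(&mut self, name: &str, handler: Box<dyn Handler + Send + Sync>) {
        self.handlers.insert(name.to_string(), handler);
    }

    pub fn dispatch(&self, name: &str, msg: Message) -> Result<Response, Error> {
        self.handlers
            .get(name)
            .ok_or(Error::Unknown)?
            .handle(msg)
    }
}

// Two unrelated handler types (hypothetical) living in the same map.
struct Echo;
struct Upper;

impl Handler for Echo {
    fn handle(&self, msg: Message) -> Result<Response, Error> {
        Ok(Response(msg.0))
    }
}

impl Handler for Upper {
    fn handle(&self, msg: Message) -> Result<Response, Error> {
        Ok(Response(msg.0.to_uppercase()))
    }
}

/// Registers both handler types under different routes.
fn build_registry() -> Registry {
    let mut registry = Registry::default();
    registry.register("echo", Box::new(Echo));
    registry.register("upper", Box::new(Upper));
    registry
}
```

With the generic Registry<H>, build_registry would not compile, because Echo and Upper are different types and H can only be one of them.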

Going the other way, I once had:

pub fn process(iter: &mut dyn Iterator<Item = u64>) -> u64 {
    iter.sum()
}

This worked but benchmarks showed it was about 3x slower than the generic version:

pub fn process<I: Iterator<Item = u64>>(iter: I) -> u64 {
    iter.sum()
}

In the generic version, the compiler inlines the iterator’s next() call at every iteration, and can often optimize the whole loop into SIMD. In the trait-object version, next() is a vtable call per element, which kills any hope of SIMD. For iterators and other small, called-in-tight-loops traits, generics are the right answer.
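If you want to measure this yourself, a rough harness might look like the sketch below. I've renamed the two functions to sum_dyn and sum_generic so they can coexist in one file; time_it and compare are hypothetical helpers. A single Instant measurement is noisy, so treat this as a starting point and use a real benchmarking tool for numbers you intend to act on.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Trait-object version: `next()` is called through the vtable.
pub fn sum_dyn(iter: &mut dyn Iterator<Item = u64>) -> u64 {
    iter.sum()
}

/// Generic version: monomorphized per iterator type, so `next()` can be inlined.
pub fn sum_generic<I: Iterator<Item = u64>>(iter: I) -> u64 {
    iter.sum()
}

/// Runs `f` once and returns its result with the elapsed wall-clock time.
/// `black_box` discourages the optimizer from discarding the computation.
fn time_it(f: impl FnOnce() -> u64) -> (u64, Duration) {
    let start = Instant::now();
    let out = black_box(f());
    (out, start.elapsed())
}

/// Sums the same data through both versions: (dyn result, generic result).
fn compare(data: &[u64]) -> ((u64, Duration), (u64, Duration)) {
    let via_dyn = time_it(|| sum_dyn(&mut data.iter().copied()));
    let via_generic = time_it(|| sum_generic(data.iter().copied()));
    (via_dyn, via_generic)
}
```

Both calls return the same sum. Only the timings differ, and how much they differ depends on the iterator, the optimization level, and the target.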

A specific footgun: async fn in traits. For a long time, async methods in traits required the async_trait crate, which rewrites each method to return a Pin<Box<dyn Future>> and so allocates once per call. As of Rust 1.75 you can write async fn directly in traits, but each impl returns its own anonymous future type, so a trait with async fn methods is not object-safe and dyn Trait does not work with it. The alternatives don’t fully close the gap: return-position impl Trait in traits is the same mechanism with the same restriction, and the trait-variant crate generates a Send-bounded variant of a trait rather than a dyn-capable one. To get dyn dispatch you still have to box the future, either by hand or via async_trait, and this area is still evolving. I’ve ended up using async_trait for a while longer even though it allocates, because the ergonomics are proven and the allocation cost is tiny relative to what async handlers typically do.
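For reference, this is roughly the shape a dyn-friendly async method takes when you box the future by hand, which is close to what async_trait generates for you. It's a sketch under assumptions: AsyncHandler, Echo, and Upper are hypothetical, the message type is simplified to &str, and poll_once is a toy helper that only works for futures that complete on their first poll (a real program would use an executor).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical trait: returning a boxed future keeps it usable as a trait object.
trait AsyncHandler {
    fn handle<'a>(&'a self, msg: &'a str) -> Pin<Box<dyn Future<Output = String> + Send + 'a>>;
}

struct Echo;
struct Upper;

impl AsyncHandler for Echo {
    fn handle<'a>(&'a self, msg: &'a str) -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
        // One heap allocation per call: the cost discussed in the text.
        Box::pin(async move { format!("echo: {msg}") })
    }
}

impl AsyncHandler for Upper {
    fn handle<'a>(&'a self, msg: &'a str) -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
        Box::pin(async move { msg.to_uppercase() })
    }
}

// Waker that does nothing; enough to poll a future that is immediately ready.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy stand-in for an executor: polls once, returns None if the future is pending.
fn poll_once<'a, T>(mut fut: Pin<Box<dyn Future<Output = T> + Send + 'a>>) -> Option<T> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```

Because handle returns a concrete boxed type, Vec<Box<dyn AsyncHandler>> works the same way as any other collection of trait objects.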

A final note: impl Trait as an argument type, as in fn process(h: impl Handler), is syntactic sugar for generics. It’s NOT a trait object. If you want dyn dispatch, you have to write dyn explicitly, as in &dyn Handler or Box<dyn Handler>. I’ve watched a colleague switch a function from impl Trait to dyn Trait and be confused about why their benchmarks suddenly showed a performance regression: they’d moved from generic to dyn without realizing it. The Rust syntax for this is a little subtle.
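One way to see the difference is to ask the compiler what type it thinks it has inside each function. The sketch below assumes a simplified Handler trait and a hypothetical Foo type; static_type_of is a helper I made up. The exact strings returned by std::any::type_name are not guaranteed to be stable, so this is a debugging aid and nothing to build logic on.

```rust
use std::any::type_name;

// Simplified stand-in for the article's Handler trait (assumption).
trait Handler {
    fn handle(&self, msg: &str) -> String;
}

struct Foo;

impl Handler for Foo {
    fn handle(&self, msg: &str) -> String {
        format!("foo: {msg}")
    }
}

/// Reports the static type behind a reference, as the compiler sees it.
fn static_type_of<T: ?Sized>(_: &T) -> &'static str {
    type_name::<T>()
}

/// `impl Handler` in argument position: sugar for the generic version below.
fn via_impl(h: impl Handler) -> &'static str {
    static_type_of(&h)
}

/// Explicit generic: monomorphized per concrete H, same as `via_impl`.
fn via_generic<H: Handler>(h: H) -> &'static str {
    static_type_of(&h)
}

/// Trait object: one compiled copy, the concrete type is erased.
fn via_dyn(h: &dyn Handler) -> &'static str {
    static_type_of(h)
}
```

Called with Foo, the first two report the concrete type, while via_dyn only ever sees dyn Handler.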

A year on, I don’t think there’s a default answer. “Use generics by default” is good advice for small utility code. For framework code, trait objects are often a better fit. The right question is always “how hot is this call, and how much do I care about binary size?” Pick the tool that matches.