System Design · 1 min read

Designing Event-Driven Backends That Don't Bite Back

Event-driven architecture is powerful, but the trade-offs only surface under load. A field guide to consistency, idempotency, and operational sanity.

Event-driven systems feel elegant on a whiteboard and ruthless in production. The moment you stop reasoning about ordering, idempotency, and failure, the system starts quietly lying to you. Here's how I design them to stay honest.

Asynchronous does not mean eventually consistent everywhere

Not every workflow needs events, and not every event needs to be asynchronous. The instinct to "make everything event-driven" usually produces a distributed system that nobody can debug. Be deliberate about where the boundary is.

Events are a commitment to eventual consistency. Only pay that cost where the decoupling is worth it.

Idempotency is the real contract

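Consumers must be safe to retry. Always. Under at-least-once delivery, any event can arrive more than once: a consumer crashes after doing the work but before acknowledging it, and the broker redelivers. Design as though duplicates are guaranteed, because effectively they are.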

  1. Give every event a stable, unique identifier.
  2. Make consumer side-effects idempotent on that identifier.
  3. Track processed IDs in a durable store, ideally written in the same transaction as the side-effect so the two cannot drift apart.
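
The three steps above can be sketched as follows. This is a minimal illustration, not a production consumer: it assumes SQLite as the durable store, and the table names, the `handle_deposit` function, and the event fields (`event_id`, `account`, `amount`) are all hypothetical. The point it shows is that the processed-ID record and the side-effect commit in one transaction.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) tables used by this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS balances ("
        "account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def handle_deposit(conn, event):
    """Apply a deposit once per event_id.

    Returns True if the event was applied, False if it was a duplicate.
    """
    with conn:  # one transaction: commit on success, roll back on exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event["event_id"],),
        )
        if cur.rowcount == 0:
            # The ID was already recorded, so this is a redelivery: do nothing.
            return False
        # Side-effect, committed together with the processed-ID row above.
        conn.execute(
            "INSERT OR IGNORE INTO balances (account, amount) VALUES (?, 0)",
            (event["account"],),
        )
        conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ?",
            (event["amount"], event["account"]),
        )
    return True
```

Calling `handle_deposit` twice with the same event changes the balance only once. If the side-effect lives outside the database (an email, a call to another service), the same-transaction trick no longer applies, and the downstream call itself has to accept the event ID as an idempotency key.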

Ordering: define the contract, then defend it

Most systems need ordering within an entity, not globally. Partition your stream by entity key so that every event for a given entity lands on the same partition and is consumed in sequence. You get per-key ordering without paying for global ordering, which you almost never actually need.
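
To make the partitioning idea concrete, here is a small sketch of key-based routing. It is not any particular broker's partitioner: `partition_for`, `route`, and the `entity_key` field are hypothetical names, and a stable hash (SHA-256) stands in for whatever hash the real system uses.

```python
import hashlib


def partition_for(entity_key: str, num_partitions: int) -> int:
    """Map an entity key to a partition index using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used to keep the mapping stable.
    """
    digest = hashlib.sha256(entity_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route(events, num_partitions):
    """Append each event to its partition, preserving arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for event in events:
        index = partition_for(event["entity_key"], num_partitions)
        partitions[index].append(event)
    return partitions
```

Because a key always maps to the same partition, events for one entity stay in the order they were produced, while different entities can be processed in parallel. The mapping depends on `num_partitions`, so changing the partition count moves keys around and the ordering guarantee does not hold across that change.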

Observability for the gaps

The hardest bugs live in the space between producers and consumers. Instrument that boundary explicitly: emit metrics for produced-vs-consumed lag, and alert on dead-letter queue depth before it becomes an incident.
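
A rough sketch of what checking that boundary might look like. Everything here is an assumption for illustration: the `BoundarySnapshot` fields, the function names, and the default thresholds are hypothetical, and in a real system the numbers would come from the broker's own offset and queue metrics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundarySnapshot:
    produced_offset: int  # latest offset written by producers
    consumed_offset: int  # latest offset committed by the consumer
    dlq_depth: int        # messages currently in the dead-letter queue


def consumer_lag(snapshot: BoundarySnapshot) -> int:
    """Number of events produced but not yet consumed."""
    return max(0, snapshot.produced_offset - snapshot.consumed_offset)


def evaluate_alerts(
    snapshot: BoundarySnapshot,
    max_lag: int = 1000,      # illustrative threshold, tune per workload
    max_dlq_depth: int = 0,   # alert on the first dead-lettered message
) -> List[str]:
    """Return a human-readable alert for each threshold that is exceeded."""
    alerts = []
    lag = consumer_lag(snapshot)
    if lag > max_lag:
        alerts.append(f"consumer lag {lag} exceeds {max_lag}")
    if snapshot.dlq_depth > max_dlq_depth:
        alerts.append(
            f"dead-letter queue depth {snapshot.dlq_depth} exceeds {max_dlq_depth}"
        )
    return alerts
```

Defaulting the dead-letter threshold to zero is a deliberate choice in this sketch: a single dead-lettered event is cheap to investigate, a backlog of thousands is an incident.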

When to walk away

If you find yourself building a saga with eight compensating transactions to undo what a single transactional database operation would have handled, step back. Events are a tool, not a religion. Use them where the decoupling earns its complexity, and nowhere else.
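
For contrast, this is roughly what the single-transaction alternative looks like when both sides of the operation live in one database. The sketch assumes SQLite and a hypothetical `accounts` table and `transfer` function; the rollback replaces every compensating step a saga would need.

```python
import sqlite3


def open_accounts():
    """Create a hypothetical accounts table with two seeded rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        [("a", 100), ("b", 0)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically.

    Raises ValueError on insufficient funds; in that case the
    transaction is rolled back and neither balance changes.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
```

This only works while both accounts sit behind the same transactional boundary. Once they are owned by separate services, the saga becomes necessary, which is exactly the cost to weigh before splitting them.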