About

I’m Dr. Nobel Khandaker. Zero Downtime is where I write about distributed systems, reliability engineering, and the day-to-day practice of keeping software running when the world is doing its best to knock it over.

Most posts here will be one of three kinds:

  • Failure analysis — postmortems with the names and blame stripped out, so the underlying lesson can travel.
  • Reliability patterns — backpressure, idempotency, exactly-once illusions, and where their assumptions quietly break.
  • Tooling notes — small observations on instruments I find indispensable.

Contact