I've built and operated a variety of stateful distributed systems, cutting across the consistency-availability, latency-throughput, and price-performance spectra. I have taught Rust workshops worldwide, including at Mozilla (where Rust came from), Microsoft, and BMW, as well as workshops on concurrency and distributed systems. I get maniacal pleasure from invalidating my assumptions, 303/808/909s, and exploring.


theoretical performance guide

A thorough overview of various timeless theoretical aspects of systems performance.

error handling in correctness-critical rust projects

A step-by-step walkthrough covering the origin of many bugs in Rust projects, and how to use error types carefully to increase the chances of correctly handling expected failures.

using simulation to build jepsen-proof distributed systems

A quick overview of the simulation method - a technique for building distributed, actor, event, and message-oriented systems in a way that will quickly find race conditions, instead of waiting for them to happen in production.

Rust Programming Tipz

A collection of guidelines for effective Rust programming.

Fear and Loathing in Lock-Free Programming

An introduction to lock-free programming, with tongue-in-cheek warnings about cognitive complexity traps.

Reliable Systems Series: Model-Based Testing

An introduction to model-based testing, which applies generative testing techniques to complex systems.
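As a rough illustration of the idea (not code from the article), the sketch below drives a small implementation and a trusted model with the same generated operations and compares every result. `SortedVecSet`, `next_random`, and `check_against_model` are hypothetical names invented for this example; the model is the standard library's `BTreeSet`, and a tiny seeded generator stands in for a real generative testing library.

```rust
use std::collections::BTreeSet;

/// Hypothetical implementation under test: a set backed by a sorted Vec.
#[derive(Default)]
pub struct SortedVecSet {
    items: Vec<u8>,
}

impl SortedVecSet {
    /// Returns true if the key was newly inserted.
    pub fn insert(&mut self, key: u8) -> bool {
        match self.items.binary_search(&key) {
            Ok(_) => false,
            Err(pos) => {
                self.items.insert(pos, key);
                true
            }
        }
    }

    /// Returns true if the key was present and removed.
    pub fn remove(&mut self, key: u8) -> bool {
        match self.items.binary_search(&key) {
            Ok(pos) => {
                self.items.remove(pos);
                true
            }
            Err(_) => false,
        }
    }

    pub fn contains(&self, key: u8) -> bool {
        self.items.binary_search(&key).is_ok()
    }
}

/// Small deterministic generator (an LCG) so any failure replays from its seed.
fn next_random(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

/// Applies the same generated operations to the model and the implementation,
/// failing on the first step where their answers differ.
pub fn check_against_model(seed: u64, steps: usize) -> Result<(), String> {
    let mut rng = seed;
    let mut model: BTreeSet<u8> = BTreeSet::new();
    let mut real = SortedVecSet::default();

    for step in 0..steps {
        let op = next_random(&mut rng) % 3;
        // A small key space makes collisions (the interesting cases) common.
        let key = (next_random(&mut rng) % 16) as u8;
        let (expected, actual) = match op {
            0 => (model.insert(key), real.insert(key)),
            1 => (model.remove(&key), real.remove(key)),
            _ => (model.contains(&key), real.contains(key)),
        };
        if expected != actual {
            return Err(format!(
                "seed {} step {}: op {} key {}: model={} real={}",
                seed, step, op, key, expected, actual
            ));
        }
    }
    Ok(())
}
```

Calling `check_against_model(seed, 1000)` over a range of seeds explores many operation sequences, and a reported seed and step make a failing sequence reproducible.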

Hardening Kubernetes on the DCOS with etcd-mesos

A brief article I wrote describing some of the work that went into etcd-mesos.


FOSDEM 20: sled and rio - modern database engineering with io_uring

sled is an embedded database that takes advantage of modern lock-free indexing and flash-friendly storage. rio is a pure-Rust io_uring library unlocking the Linux kernel's new asynchronous IO interface. This short talk will cover techniques that have been used to take advantage of modern hardware and kernels while optimizing for long-term developer happiness in a complex, correctness-critical Rust codebase.

RustFest Paris 2018: Building Reliable Infrastructure in Rust

The wild success of testing tools like Jepsen is a wake-up call that we’re approaching systems engineering from a fundamentally bug-prone perspective. Why don’t we find these devastating bugs on our laptops before opening pull requests? Rust’s compiler gives us wonderful guarantees about memory safety, but as soon as we open files or sockets, all hell seems to break loose. This talk will show you how to apply techniques from the distributed systems and database worlds in a way that maximizes the number of bugs found per CPU cycle and reduces the amount of bias that we hardcode into our tests.


Some things I have built or contributed to.


High-performance next-generation buzzword-stuffed embedded database written in Rust.


A pure-Rust misuse-resistant library for using the new Linux io_uring interface, which is going to completely replace epoll for networking and file IO.


A simple tool for generating flamegraphs without requiring any Perl scripts or processing pipes. Additional support for Rust projects.


Fully distributed load balancer written in Erlang for helping services on Mesos communicate seamlessly with each other. Uses a CRDT-based overlay network based on HyParView for failure detection.


A framework for managing very large MySQL systems. We created this at Tumblr to manage O(hundreds of TB) of primary business data, accessible at sub-millisecond latencies :)


A terminal-based personal organizer, mind-mapper, task tracker, and time-series visualizer written in Rust.


(work in progress) A horizontally scalable linearizable KV/object/log store written in Rust, with a replication algorithm verified using TLA+.


(work in progress) Verification of lock-free and distributed algorithms comprising a highly reliable, high-performance distributed stateful system.


CRDT (eventual consistency) library for distributed systems. Thoroughly tested using quickcheck.
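For readers new to CRDTs, here is a minimal sketch of one of the simplest, a grow-only counter. It is an illustration only and not the API of the library described above; `GCounter` and its methods are names chosen for this example. Each replica increments its own slot, and merging takes the per-replica maximum, so merges can arrive in any order, any number of times, and replicas still converge.

```rust
use std::collections::HashMap;

/// Illustrative grow-only counter (G-Counter); names here are assumptions,
/// not taken from any particular crate.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct GCounter {
    // replica id -> number of increments observed from that replica
    counts: HashMap<String, u64>,
}

impl GCounter {
    /// Records one increment performed locally by `replica`.
    pub fn increment(&mut self, replica: &str) {
        *self.counts.entry(replica.to_string()).or_insert(0) += 1;
    }

    /// The counter's value is the sum over all replicas.
    pub fn value(&self) -> u64 {
        self.counts.values().sum()
    }

    /// Merges another replica's state by taking the per-replica maximum.
    /// This operation is commutative, associative, and idempotent.
    pub fn merge(&mut self, other: &GCounter) {
        for (replica, &count) in &other.counts {
            let slot = self.counts.entry(replica.clone()).or_insert(0);
            *slot = (*slot).max(count);
        }
    }
}
```

Commutativity and idempotence of `merge` are exactly the kind of properties that a generative testing tool can check over many random states.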


Rust bindings for RocksDB, a highly configurable LSM embedded database. Currently used in production at several large internet services. My favorite user is TiKV, a horizontally scalable linearizable KV store written in Rust, powering the MySQL-compatible TiDB horizontally scalable database.


Horizontally scalable linearizable KV store based on Raft, on which I've performed fault injection.


A self-healing distributed etcd supervisor, running on Mesos. Built when I was on the Kubernetes team at Mesosphere, allowing us to deploy full Kubernetes clusters on top of Mesos with the click of a mouse.


In the process of fault injecting systems I've built on top of etcd, I've found several interesting bugs in etcd itself. I've provided minor architectural guidance that was incorporated in etcd v3.


A monolithic horizontally scalable Postgres-compatible database written in Go, with configurable snapshot and serializable snapshot isolation levels. I was an early contributor, and wrote a high-performance histogram-based metric system for measuring interesting operational statistics, which I later extracted into loghisto.

event gateway

Dataflow for serverless systems, containers, and services. Plug your things into each other easily. Make any service event-driven. I built a prototype and implemented the distributed bits of the initial system.


A high-performance histogram implementation for understanding the latency tail of a system, either in production or development. Does not rely on sampling methods, which break down in real systems work. Uses logarithmically bucketed histograms.
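A minimal sketch of logarithmic bucketing, written in Rust for illustration and not taken from the library itself: every value lands in a bucket whose width is a fixed fraction of its lower bound, so each measurement is counted (no sampling) while memory stays small and relative error stays bounded. `LogHistogram`, its `precision` parameter, and the method names are all assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// Illustrative log-bucketed histogram; names and layout are assumptions.
pub struct LogHistogram {
    // bucket index -> number of recorded values in that bucket
    buckets: BTreeMap<i32, u64>,
    total: u64,
    // relative bucket width, e.g. 0.01 for roughly 1% resolution
    precision: f64,
}

impl LogHistogram {
    pub fn new(precision: f64) -> Self {
        assert!(precision > 0.0, "precision must be positive");
        LogHistogram {
            buckets: BTreeMap::new(),
            total: 0,
            precision,
        }
    }

    /// Bucket i covers values in [(1 + precision)^i, (1 + precision)^(i + 1)).
    fn index(&self, value: f64) -> i32 {
        (value.ln() / (1.0 + self.precision).ln()).floor() as i32
    }

    fn lower_bound(&self, index: i32) -> f64 {
        (1.0 + self.precision).powi(index)
    }

    /// Counts a single positive measurement, e.g. a latency in microseconds.
    pub fn record(&mut self, value: f64) {
        assert!(value > 0.0 && value.is_finite(), "value must be positive");
        let idx = self.index(value);
        *self.buckets.entry(idx).or_insert(0) += 1;
        self.total += 1;
    }

    /// Returns the lower bound of the bucket holding the p-th percentile
    /// (p in 0..=100), or None if nothing has been recorded.
    pub fn percentile(&self, p: f64) -> Option<f64> {
        if self.total == 0 {
            return None;
        }
        let target = ((p / 100.0) * self.total as f64).ceil().max(1.0) as u64;
        let mut seen = 0u64;
        for (&idx, &count) in &self.buckets {
            seen += count;
            if seen >= target {
                return Some(self.lower_bound(idx));
            }
        }
        None
    }
}
```

With `precision` set to 0.01, a reported percentile is within a few percent of the true value, and covering a range from 1 to 1,000,000 takes fewer than 1,400 buckets.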

open source coin

Aims to be a decentralized github, bolted to a token. I built the first versions of the consensus protocol atop a simulator that rapidly teased out race conditions. Implemented in Haskell.