
Reasoning with Async Rust


This is a written version of my “Reasoning with Async Rust” talk, presented at Rustikon 2025. You can find the slides here.

Have you ever found a bug in your code that only happened one in a hundred thousand times? You were only able to reproduce it by running your code in an infinite loop. What’s more, you’ve never managed to reproduce it on your machine: it only happens in CI and even worse, may be in production.

If so, it was probably a species of concurrency bug. Concurrency bugs are hard to spot, impossible to test, and tricky to fix.

In the Rust ecosystem, we pride ourselves on type safety and testability: with these tools, we can catch bugs as early as possible, before they hit our production systems. Sadly, types and tests can’t protect us from concurrency bugs. But there are features of the Rust ecosystem that we can use: methodologies that help us design programs in a way that makes bugs easier to spot. Async programming is one such methodology.

Async programming helps us structure our code in a way that makes us think about concurrency more clearly.

In this post, I will present a high-level mental picture of what async programming is and how we can think about problems with it. By the end, you’ll be an expert in spotting concurrency bugs and designing safe code. You’ll be reassured that with async Rust, you won’t be kept awake at night worrying about concurrency bugs.

Concurrency examples

But what is concurrency, and why does it result in bugs?

We often want our programs to do multiple things “at the same time”: for example, a server handling many requests at once, or a UI staying responsive while work happens in the background.

When we have multiple things happening at the same time, the actual outcome of our programs can be difficult to predict.

To explore this, we’re going to look at a toy example of concurrency: we’re going to cook breakfast.

Breakfast

Our breakfast has two components: eggs and bacon. To keep things even simpler, we’re going to follow a recipe.

“Crack then fry the eggs. Meanwhile, fry the bacon for five minutes, or until crispy.”

Let’s try and write this without async programming, using basic Rust.

fn breakfast() {
    cook_eggs();
    fry_bacon();
}
fn cook_eggs() {
    crack_eggs();
    fry_eggs();
}
fn crack_eggs() {
    println!("Started cracking egg.");
    random_sleep();
    println!("Finished cracking egg.");
}
fn random_sleep() { ... }
fn fry_eggs() { ... } // Prints and sleeps
fn fry_bacon() { ... } // Prints and sleeps
  • To keep things simple, we’ll simulate cooking with a random_sleep.

  • We print "Started" and "Finished" before and after cooking.

  • We have functions for each cooking process.
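To make the snippet above runnable, here’s one way to fill in the elided helpers. The random sleep is simulated with the standard library’s RandomState hasher so the sketch needs no external crates; the talk’s actual implementation may well differ.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::thread::sleep;
use std::time::Duration;

// A pseudo-random number of milliseconds (0-99), derived from the
// randomly-keyed hasher in std, to avoid pulling in an RNG crate.
fn random_millis() -> u64 {
    RandomState::new().build_hasher().finish() % 100
}

fn random_sleep() {
    sleep(Duration::from_millis(random_millis()));
}

fn crack_eggs() {
    println!("Started cracking egg.");
    random_sleep();
    println!("Finished cracking egg.");
}

fn fry_eggs() {
    println!("Started frying egg.");
    random_sleep();
    println!("Finished frying egg.");
}

fn fry_bacon() {
    println!("Started frying bacon.");
    random_sleep();
    println!("Finished frying bacon.");
}

fn cook_eggs() {
    crack_eggs();
    fry_eggs();
}

fn main() {
    cook_eggs();
    fry_bacon();
}
```

Because everything runs sequentially, the six lines always print in the same order, no matter how long each sleep takes.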

Here’s the output:

Started cracking egg.
Finished cracking egg.
Started frying egg.
Finished frying egg.
Started frying bacon.
Finished frying bacon.

Async / Await

Let’s introduce the async and await keywords.

async fn breakfast() {
    cook_eggs().await;
    fry_bacon().await;
}
async fn cook_eggs() {
    crack_eggs().await;
    fry_eggs().await;
}
async fn crack_eggs() {
    println!("Started cracking egg.");
    random_sleep().await;
    println!("Finished cracking egg.");
}
async fn fry_eggs() { ... }
async fn fry_bacon() { ... }
async fn random_sleep() { ... }
  • We wrap our functions with the async keyword.

  • We call await after the call to random_sleep.

  • In our breakfast and cook_eggs functions, we call await on each of the async function calls.

If you look at the output, you’ll see it’s exactly the same.

Started cracking egg.
Finished cracking egg.
Started frying egg.
Finished frying egg.
Started frying bacon.
Finished frying bacon.

What have we done, aside from making our code more verbose and possibly less efficient?

We have started to structure our code in terms of concurrency: we’ve introduced a sequential dependency of one step on another.

You can think of async as introducing a block, and await as sequentially composing blocks. Each block is composed of smaller blocks: cook_eggs is composed of crack_eggs and then fry_eggs, and each of those is composed of two println! statements (started and finished in green), surrounding a random sleep.

We can look at the blocks to predict our output. Each println! block corresponds to a statement in our console.

You may think this is obvious. Why would we want to go through the effort of walking through our code in this way? If so, wait a bit: we haven’t actually done anything concurrently yet.

Join

“Meanwhile, fry the bacon.”

Let’s make a start: we want eggs and bacon to cook at the same time.

We can describe this with join!:

async fn breakfast() {
    join!(cook_eggs(), fry_bacon());
}

Looking at the output, we see that the bacon is indeed frying while the egg is being cracked and fried.

Started frying bacon.
Started cracking egg.
Finished cracking egg.
Started frying egg.
Finished frying bacon.
Finished frying egg.

Will we see this output every time we run this program?

No: we might see that the eggs are cracked first before the bacon starts to fry, or that the bacon finishes frying before the eggs.

We can draw join! in a diagram by putting the blocks side by side. This means that they can happen at the same time.

By “at the same time”, we actually mean that their individual blocks can run in any order. Have a look at the entire diagram for our breakfast function for an example:

On the right is a possible execution, obtained by ordering the blocks. In this execution, we start cracking the eggs before we start frying the bacon. The bacon finishes frying before the eggs have been cracked.

This is quite profound. When we think about concurrency, we’re not thinking about time at all. Instead, we only think about the order of execution.

We can use our diagram to predict all possible outcomes. We start by picking a block to execute from either the left or right side of the join!, then repeat the process and choose another block.
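To make “all possible outcomes” concrete, we can even count them: interleaving a fixed sequence of m blocks with a fixed sequence of n blocks means choosing which of the m + n slots the second sequence occupies, giving binomial(m + n, n) orderings. Here’s a small sketch of that count (my addition, treating each println! as one block: the four egg lines joined with the two bacon lines):

```rust
// Count interleavings of two sequential sequences of blocks:
// choose which n of the m + n slots the second sequence fills.
fn interleavings(m: u64, n: u64) -> u64 {
    // binomial(m + n, n), computed incrementally.
    let mut result = 1;
    for i in 1..=n {
        result = result * (m + i) / i;
    }
    result
}

fn main() {
    // Eggs print 4 lines in a fixed order; bacon prints 2.
    println!("{}", interleavings(4, 2)); // 15 possible outputs
}
```

This assumes any interleaving can occur at runtime; in practice the executor and the random sleeps decide which one you actually see.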

Select

This is closer to our original recipe. There’s one step we’re missing:

“Fry bacon for five minutes, or until crispy.”

We can set a timer for five minutes and write a crisp_bacon function. But how can we compose these?

If we use join!, we might end up charring the bacon: what if it crisps up before five minutes are up? Or we might end up cooking it for more than five minutes: what if it never crisps up at all?

Instead of join!, we can achieve what we want using select!. The select! macro lets us run blocks concurrently, but the entire operation terminates as soon as either block completes.

async fn fry_bacon() {
    println!("Started frying bacon.");
    // This is simplified
    select! {
        () = timer() => (),
        () = crisp_bacon() => ()
    };
    println!("Finished frying bacon.");
}

async fn timer() { ... }
async fn crisp_bacon() { ... }

Here is an example of the output of select.

Started frying bacon.
Started timer.
Started crisping bacon.
Finished timer.
Finished frying bacon.

This isn’t the only possible output: we can predict the rest by drawing a diagram and interleaving blocks. We can represent select as boxes drawn side by side, but use the SELECT keyword.

Here’s an expanded view of frying bacon. On the right is a possible outcome. The timer finishes before the bacon has finished crisping.
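As an aside, we can mimic this “first to finish wins” behaviour synchronously with threads and a channel. This is only an analogy of my own, not how select! works under the hood: unlike select!, the losing thread isn’t cancelled, it just keeps running in the background.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (sender, receiver) = mpsc::channel();

    // The five-minute timer, scaled down for the demo.
    let timer_sender = sender.clone();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        let _ = timer_sender.send("Finished timer.");
    });

    // Crisping takes longer than the timer in this run.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(200));
        let _ = sender.send("Finished crisping bacon.");
    });

    // Like select!, we take whichever result arrives first.
    let first = receiver.recv().unwrap();
    println!("{first}");
}
```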

With this, we’ve fully described cooking breakfast:

The four primitives of async, await, join! and select! are all we need to write concurrent code.

But not only do they make concurrent code easy to write, they make it easy to reason about. When we think in terms of async programming, we think of composing many small blocks to create a large program. We can predict all possible outcomes simply by thinking about the possible orders in which those blocks can execute.

Concurrency bugs

We can use this way of thinking to find concurrency bugs. But we haven’t actually seen a bug yet, have we?

In fact, there’s no way for bugs to creep into our code so far. While there are many different orders in which eggs and bacon can be cooked, all of these are valid ways of cooking breakfast.

Concurrency bugs creep into our programs through very specific avenues, the most common of which is shared state.

To demonstrate, let’s add some shared state to our breakfast recipe.

Suppose that to cook an egg and fry bacon, you need a spoon.

“Cook the eggs using a spoon: use a spoon to crack them and then fry them. Meanwhile, use a spoon to fry the bacon.”

Sharing mutable state

Let’s assume that our kitchen is severely underequipped and there is only one spoon to use. We can try and change our function signatures accordingly to accept a &mut Spoon value.

async fn breakfast() {
  let mut spoon = find_spoon();
  join!(cook_eggs(&mut spoon), fry_bacon(&mut spoon))
}

async fn cook_eggs(spoon: &mut Spoon) { ... }
async fn fry_bacon(spoon: &mut Spoon) { ... }

The Rust compiler has our backs: we can’t share this spoon because only one function can hold a mutable reference at once, and these two functions are running at the same time.

Shared state

Instead we can use something called a mutex to restrict access to our spoon. A mutex is a construct that only allows one thing to access the spoon at once. To access it, we lock it to obtain a guard (we’ll call this spoon for convenience). To release it, we drop that guard.

As long as something has a guard, nothing else can lock the mutex. Future calls to lock will need to wait for the guard to be released.
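The mutex in this post is an async one from a runtime library, but the standard library’s synchronous std::sync::Mutex follows the same guard discipline, and we can check it directly:

```rust
use std::sync::{Arc, Mutex};

struct Spoon;

fn main() {
    let spoon_mutex = Arc::new(Mutex::new(Spoon));

    // Lock the mutex to obtain a guard.
    let spoon = spoon_mutex.lock().unwrap();

    // While the guard is alive, nothing else can lock the mutex.
    assert!(spoon_mutex.try_lock().is_err());

    // Dropping the guard releases the mutex...
    drop(spoon);

    // ...so it can be locked again.
    assert!(spoon_mutex.try_lock().is_ok());

    println!("Only one holder of the spoon at a time.");
}
```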

We can clone the mutex and pass it around. As an example, let’s pass the spoon to cook_eggs and fry_bacon. One of these will lock the mutex, use the spoon and release it, then the other will do the same.

type SpoonMutex = Arc<Mutex<Spoon>>;

async fn cook_eggs(spoon_mutex: SpoonMutex) {
   let spoon = spoon_mutex.lock().await;
   ...
   drop(spoon);
}

async fn fry_bacon(spoon_mutex: SpoonMutex) { ... }
async fn breakfast() {
  let spoon_mutex = find_spoon();
  join!(cook_eggs(spoon_mutex.clone()),
        fry_bacon(spoon_mutex))
}

Looking at the output, frying the bacon and cooking the eggs now happen strictly one after another, though either one can still go first.

Started cracking egg.
Finished cracking egg.
Started frying egg.
Finished frying egg.
Started frying bacon.
Finished frying bacon.

This in itself doesn’t seem so dangerous.

Deadlocks

But shared state can be dangerous. To demonstrate, let’s introduce another utensil: a pan. In order to fry eggs or fry bacon, we need to use a pan.

“Cook the eggs using a spoon: use a spoon to crack them and then fry them using a pan. Meanwhile, use a spoon and use a pan to fry the bacon.”

As with the spoon, we can introduce a pan mutex and lock it.

async fn cook_eggs(spoon_mutex: SpoonMutex,
                   pan_mutex: PanMutex) {
    let spoon = spoon_mutex.lock().await;
    crack_eggs().await;
    let pan = pan_mutex.lock().await;
    fry_eggs().await;
    drop(spoon);
    drop(pan);
}

Since we need both a spoon and a pan, we can get these locks concurrently.

async fn fry_bacon(spoon_mutex: SpoonMutex,
                   pan_mutex: PanMutex) {
    let (spoon, pan) = join!(spoon_mutex.lock(), pan_mutex.lock());
    ...
    drop(spoon);
    drop(pan);
}

If we run this code, we get breakfast. At least, we get breakfast most of the time. But every so often, we only get as far as cracking eggs.

Started cracking egg.
Finished cracking egg.

Can anyone spot the problem? Let’s draw what’s happening and walk through it.

Going through the blocks:

  • cook_eggs locks the spoon.

  • crack_eggs cracks the eggs.

  • fry_bacon locks the pan.

  • cook_eggs tries to lock the pan, while fry_bacon tries to lock the spoon.

At the end of this sequence, fry_bacon is waiting for the spoon, which can only be released by cook_eggs once it has fried the eggs. At the same time, cook_eggs is waiting for the pan, which can only be released by fry_bacon. There’s no step that either function can take: we’re at what’s called a deadlock.

This is a classic concurrency problem. If you managed to spot it earlier, it probably means you’ve seen it before in another context.

Livelocks

How can we go about solving it?

One idea is to try locking the spoon and pan, and retry if we fail to obtain both.

let (spoon, pan) = loop {
    let maybe_spoon = try_lock_spoon(&spoon_mutex).await;
    let maybe_pan = try_lock_pan(&pan_mutex).await;
    match (maybe_spoon, maybe_pan) {
        (Some(spoon), None) => drop(spoon),
        (None, Some(pan)) => drop(pan),
        (Some(spoon), Some(pan)) => break (spoon, pan),
        (None, None) => (),
    }
};
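The try_lock_spoon and try_lock_pan helpers are elided above; here’s a sketch of the same retry loop in synchronous form, using std::sync::Mutex::try_lock (the async version would use a runtime mutex instead). With no other task contending, the loop succeeds on its first iteration:

```rust
use std::sync::{Mutex, MutexGuard};

struct Spoon;
struct Pan;

// Hypothetical helpers: Some(guard) if the lock was free, None otherwise.
fn try_lock_spoon(mutex: &Mutex<Spoon>) -> Option<MutexGuard<'_, Spoon>> {
    mutex.try_lock().ok()
}

fn try_lock_pan(mutex: &Mutex<Pan>) -> Option<MutexGuard<'_, Pan>> {
    mutex.try_lock().ok()
}

fn main() {
    let spoon_mutex = Mutex::new(Spoon);
    let pan_mutex = Mutex::new(Pan);

    // The retry loop from the post, in synchronous form: keep trying
    // until we hold both guards, releasing any partial acquisition.
    let (_spoon, _pan) = loop {
        let maybe_spoon = try_lock_spoon(&spoon_mutex);
        let maybe_pan = try_lock_pan(&pan_mutex);
        match (maybe_spoon, maybe_pan) {
            (Some(spoon), None) => drop(spoon),
            (None, Some(pan)) => drop(pan),
            (Some(spoon), Some(pan)) => break (spoon, pan),
            (None, None) => (),
        }
    };
    println!("Got both the spoon and the pan.");
}
```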

We’ll do this in both our fry_eggs and fry_bacon functions.

async fn fry_eggs(...) {
    let (spoon, pan) = loop {...}
    ...
}

async fn fry_bacon(...) {
    let (spoon, pan) = loop {...}
    ...
}

Unfortunately, we still get the same output. This time, however, our laptop heats up a lot.

Started cracking egg.
Finished cracking egg.

Let’s draw a diagram to explore what’s happening.

While it’s possible for our program to exit correctly, it’s also possible for the spoon and pan to be locked alternately by either fry_eggs or fry_bacon.

In catching one concurrency bug, we’ve unwittingly introduced another.

There’s no clear solution to it: we need to rethink our problem and design it in a different way.

Atomic state

One possible approach is to store the spoon and pan in the same mutex, and get both of them or none.

type SpoonAndPanMutex = Arc<Mutex<(Spoon, Pan)>>;
async fn cook_eggs(mutex: SpoonAndPanMutex) {
    let spoon_and_pan = mutex.lock().await;
    ...
    drop(spoon_and_pan);
}
async fn fry_bacon(mutex: SpoonAndPanMutex) {
    let spoon_and_pan = mutex.lock().await;
    ...
    drop(spoon_and_pan);
}
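Here’s the same idea as a runnable sketch, using threads and std::sync::Mutex in place of async tasks: because both utensils sit behind one lock, neither task can ever hold one utensil while waiting for the other.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct Spoon;
struct Pan;

fn main() {
    // Both utensils live behind a single mutex, so each task either
    // holds both or holds neither: no partial state to deadlock on.
    let mutex = Arc::new(Mutex::new((Spoon, Pan)));

    let eggs_mutex = Arc::clone(&mutex);
    let eggs = thread::spawn(move || {
        let _spoon_and_pan = eggs_mutex.lock().unwrap();
        println!("Cooked the eggs.");
    });

    let bacon_mutex = Arc::clone(&mutex);
    let bacon = thread::spawn(move || {
        let _spoon_and_pan = bacon_mutex.lock().unwrap();
        println!("Fried the bacon.");
    });

    eggs.join().unwrap();
    bacon.join().unwrap();
}
```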

Actors

But there are other ways of designing programs: another solution is to decide not to share state at all. Instead of sharing state, we can write actors.

What is an actor? Imagine you’re a chef, but you’re in a dark room on your own. There are no doors or windows, only a small channel in the ceiling from which you get your messages.

When a message comes in from this channel, you read it and diligently cook using your single pan and spoon. Then you sit and wait for the next message.

This system of rooms and channels is a bit like pneumatic post.

Like pneumatic post tubes, channels have two ends. A channel has a sender end and a receiver end.

let (sender, receiver) = channel::<Message>(42);

The chef holds onto the receiver end of the channel.

async fn chef_actor(mut receiver: Receiver<Message>) {
    let mut spoon = Spoon; // internal mutable state
    let mut pan = Pan;
    while let Some(msg) = receiver.next().await {
       ...
    }
}

When it gets a message, it performs the cooking task using its own spoon and pan. It never hands these out.

match msg {
    // use the state
    CrackEggs => crack_eggs(&mut spoon).await,
    FryEggs => fry_eggs(&mut spoon, &mut pan).await,
    FryBacon => fry_bacon(&mut spoon, &mut pan).await,
}
async fn crack_eggs(spoon: &mut Spoon) { ... }
async fn fry_eggs(spoon: &mut Spoon, pan: &mut Pan) { ... }
async fn fry_bacon(spoon: &mut Spoon, pan: &mut Pan) { ... }

The sending end is used by anything that wants to send instructions to the chef. In this case, we’ll code two functions: one that instructs the chef to crack and fry eggs and another that instructs it to fry bacon.

async fn send_cook_eggs(mut sender: Sender<Message>) {
    sender.send(CrackEggs).await;
    sender.send(FryEggs).await;
}

async fn send_fry_bacon(mut sender: Sender<Message>) {
    sender.send(FryBacon).await;
}

Our entire system joins all the actors together.

async fn breakfast() {
    let (sender, receiver) = channel::<Message>(42);
    join!(
        chef_actor(receiver),
        send_cook_eggs(sender.clone()),
        send_fry_bacon(sender),
    );
}
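Here’s the whole actor system as a runnable synchronous sketch, with std::sync::mpsc and a thread standing in for the async channel and task (the post’s version uses an async channel, but the ownership story is the same: the chef alone owns the spoon and pan).

```rust
use std::sync::mpsc;
use std::thread;

struct Spoon;
struct Pan;

enum Message {
    CrackEggs,
    FryEggs,
    FryBacon,
}

// The chef owns its spoon and pan outright and handles one message
// at a time; it returns how many messages it handled.
fn chef_actor(receiver: mpsc::Receiver<Message>) -> usize {
    let _spoon = Spoon;
    let _pan = Pan;
    let mut handled = 0;
    while let Ok(msg) = receiver.recv() {
        match msg {
            Message::CrackEggs => println!("Cracked the eggs."),
            Message::FryEggs => println!("Fried the eggs."),
            Message::FryBacon => println!("Fried the bacon."),
        }
        handled += 1;
    }
    handled
}

fn main() {
    let (sender, receiver) = mpsc::channel();
    let chef = thread::spawn(move || chef_actor(receiver));

    let egg_sender = sender.clone();
    egg_sender.send(Message::CrackEggs).unwrap();
    egg_sender.send(Message::FryEggs).unwrap();
    sender.send(Message::FryBacon).unwrap();

    // Dropping every sender closes the channel, letting the chef return.
    drop(egg_sender);
    drop(sender);

    let handled = chef.join().unwrap();
    println!("The chef handled {handled} messages.");
}
```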

We can draw this out in a diagram and look at the combinations of sending and receiving messages.

With actor systems we can still get into deadlocks, but it can sometimes be easier to reason about messages than about shared state and locks.

Summary

“He fixes radios by thinking!”

— Surely You're Joking, Mr. Feynman!

We’ve come a long way: we first explored the fundamentals of concurrency, and learned to think of our code in terms of composable blocks. We learned that we can order these blocks to predict the output of our code. We then used this technique to hunt down several concurrency bugs. Finally, we looked at a couple of approaches to shared state that could resolve our bugs.

Armed with async programming, we’re much better at spotting concurrency bugs than we were before.

Async programming is incredibly elegant. With just four primitives, we’ve changed how we express our programs and re-engineered how we think about them.

With async Rust, we often get bogged down by syntax, confusing error messages, or complex constructs. But at its core, async programming is really just about thinking.

I strongly encourage you to give async programming a try. When you do, take a moment to think about how your code is composed, and reason about concurrency through that lens of composition.


Thank you

I hope you enjoyed this post. If you’d like to play around further, you can find the source code below.

Source code: github.com/zainab-ali/rustikon-2025

Questions

These questions were asked by the awesome Rustikon audience.

I’ve worked with async programming, but never encountered deadlocks or livelocks. Doesn’t this mean concurrency bugs are uncommon?

Deadlocks and livelocks are some examples of concurrency bugs, but they do tend to be uncommon. More often, concurrency bugs are introduced by errors and retries. An unexpected order of events, once error handling is brought in, might make things happen more often than they should and leave you in odd states. As an analogy, you might start with two raw eggs, but end up with three fried ones, or double the slices of bacon.

They aren’t related solely to mutexes either: any form of shared state can result in concurrency bugs.

How do actors share state?

In short, they don’t. The mutable spoon and pan are owned by the actor and not shared with any other bits of code.

The only thing that is passed from and to the actor is a message. Messages shouldn’t contain any shared state, and should instead be immutable values.

In the actor example, if we crack eggs, then fry bacon, won’t we end up with eggy bacon in our pan?

In our write-up of cooking the eggs, we didn’t use a pan to crack them: we only used a spoon. Assume you’re cracking the eggs into a bowl, which we didn’t describe, and taking them out of the bowl when frying.

If we did want to crack eggs directly into the pan and subsequently fry them, we shouldn’t send two separate CrackEggs and FryEggs messages. This could result in eggy bacon, as described. Instead, we could send a single CookEggs message.

What happens to the eggs and bacon in the actor example?

Our breakfast example is purposefully simple and just discards the eggs and bacon. In a real problem, the eggs and bacon would be used for another purpose. A common pattern is to send them back to the originator via a oneshot channel passed with the cook message.