kebab-case: Infinity in the palm of your hand- the pull model

Notional machines

The technical term for such an intuition is a notional machine.

A notional machine is a model that lets us reason about a piece of code by giving us a straightforward set of steps to use to break it down. It’s a model, not the real thing — in fact the steps can be very different to how the underlying code actually executes.

The pull model

With that in mind, we’re going to build our own notional machine for the fs2.Stream. For want of a better name, let’s term it the pull model. The pull model will help us reason, both in the finite and the infinite, and give us a much better grasp of how streams compose.

But first, we need to think of streams in a different way, aside from as infinite lists.

A stream is a program. Each operator — take, drop or repeat — is a series of steps in that program.

Like all programs, a stream doesn’t do anything unless it is run. In order to run the program, we compile it by calling any of the compile operations. You can think of compiling as converting a complete stream program into a running Scala program.

Let’s take compile.last as an example. compile.last requests elements of the stream one by one until the stream finishes. In fs2 terms, “requesting an element” is known as pulling on the stream.

The steps

Now that we can think of streams as programs, we can think about the steps involved in running them.

To figure out the steps in an entire stream, we need to look at each operator (take or drop) individually.

An operator can do a limited set of things:

It can pull on the operator upstream.
If it does so, it will either receive an element outputted from upstream, or be told that the upstream is done.
It can output an element back downstream.
It can indicate that it is done.

An example

There’s was a lot to unpack here: what’s meant by pulling and outputting, or upstream and downstream for that matter? Let’s walk through an example.

Stream("Mao", "Popcorn")
  .repeat
  .take(3)
  .compile
  .last

This stream is a program. It’s run by the call to compile.last, which will pull on the stream until it finishes.

When a stream is run, the last operator is always pulled on first.

The last operator here is take(3). Let’s have a look at it.

take

We can describe take(3) in terms of pulls and outputs.

take(3) is first pulled on by its downstream, compile.last. When that happens:

It pulls on its upstream. In this case, that’s Stream("Mao", "Popcorn").repeat.
It receives the element "Mao" and outputs it downstream.

take(3) keeps track of the number of elements already pulled. On the first three pulls, it pulls on its upstream to receive Mao, Popcorn and Mao respectively. But on the fourth pull, it doesn’t bother with upstream. Instead, it indicates that it’s done.

Pull diagrams

The take(3) operator involves multiple pulls and outputs, and a couple of different scenarios: that’s a lot of information to absorb. A picture is worth a thousand words, and thankfully we can express these steps beautifully in a diagram.

Here’s how we might draw out those steps:

Each box on the left is a single operator. There are three boxes: the upstream, the take(3) operator, and the downstream respectively.

We start reading the diagram from the bottom left corner and work our way to the right, following the arrows:

An upwards arrow corresponds to a pull.
A downwards arrow corresponds to an output.
A green tick corresponds to an operator being done.

This is only one case of take(3), hooked up to an infinite upstream. What if the upstream was empty?

We can describe that in a diagram too:

When an empty upstream is pulled on, it indicates that it’s done. take(3) propagates this downstream.

Repeat

The next upstream operator is repeat. We can describe it in step form too. When pulled on:

It pulls on its upstream.
If its upstream outputs a value, it outputs that downstream.
If its upstream is done, things get interesting: the upstream is “repeated” by being re-instantiated. It pulls on the new upstream to receive the first value, then outputs that value.

We can draw these two cases in diagrams. First, the case when upstream outputs a value:

Then the interesting case, when upstream is done:

The stream apply

Finally, let’s walk through Stream("Mao", "Popcorn"), the top operator in our stream, and arguably the most simple.

The first time it’s pulled on, it outputs "Mao".
The second time it’s pulled on, it outputs "Popcorn".
The third time it’s pulled on, it indicates it is done.

Composition

We now know the steps involved in each operator — take(3), repeat and Stream("Mao", "Popcorn"). But what are the steps for our entire stream?

The great thing about these steps is that they compose: when an operator pulls on its upstream, we execute the series of steps for that upstream operator. And if the upstream operator pulls on its upstream, we execute that upstream’s steps.

We can see this clearly in our pull diagrams: composing operators is as simple as matching up their pulls and outputs:

Our diagram speaks for itself: not only can we predict the output of our program, we can describe how it’s run.

An intuition for infinity

Armed with the pull model, we can now reason about infinite streams.

We can decompose a stream into its constituent operators, and decompose those operators into pull and output steps.

By using pull diagrams, we can quite literally paint a picture of how a stream works, and use that picture to walk through a stream’s evaluation.

Viewed in this way, an infinite stream is actually an infinitely recurring process: a series of steps that never stops repeating. We can turn infinite streams into finite ones and back simply by composing operators: adding an operator changes the steps executed.

Infinity and beyond

We’ve an intuition for infinity, but the pull model actually gives us much more. We’ve also gained a tool for reasoning about all other operators in the fs2 library.

We can see how any new operator slots into our existing picture by examining its steps. And if we’re trying to hunt for an operator for a specific task, we can easily describe the steps that it involves and search from there.