Notional machines
The technical term for such an intuition is a notional machine.
A notional machine is a model that lets us reason about a piece of code by giving us a straightforward set of steps to use to break it down. It’s a model, not the real thing — in fact the steps can be very different to how the underlying code actually executes.
The pull model
With that in mind, we’re going to build our own notional machine
for the fs2.Stream
. For want of a better name, let’s
term it the pull model. The pull model will help us reason, both in the
finite and the infinite, and give us a much better grasp of how
streams compose.
But first, we need to think of streams in a different way, aside from as infinite lists.
A stream is a program. Each operator —
take
, drop
or repeat
— is a
series of steps in that program.
Like all programs, a stream doesn’t do anything unless it is
run. In order to run the program, we compile it by calling
any of the compile
operations. You can think of
compiling as converting a complete stream program into a running Scala
program.
Let’s take compile.last
as an example. compile.last
requests elements of the
stream one by one until the stream finishes. In fs2
terms, “requesting
an element” is known as pulling on the stream.
The steps
Now that we can think of streams as programs, we can think about the steps involved in running them.
To figure out the steps in an entire stream, we need to look at each operator
(take
or drop
) individually.
An operator can do a limited set of things:
It can pull on the operator upstream.
If it does so, it will either receive an element outputted from upstream, or be told that the upstream is done.
It can output an element back downstream.
It can indicate that it is done.
An example
There’s was a lot to unpack here: what’s meant by pulling and outputting, or upstream and downstream for that matter? Let’s walk through an example.
Stream("Mao", "Popcorn") .repeat .take(3) .compile .last
This stream is a program. It’s run by the call to
compile.last
, which will pull on the stream until it
finishes.
When a stream is run, the last operator is always pulled on first.
The last operator here is take(3)
. Let’s have a look at
it.
take
We can describe take(3)
in terms of pulls and
outputs.
take(3)
is first pulled on by its downstream, compile.last
. When that happens:
It pulls on its upstream. In this case, that’s
Stream("Mao", "Popcorn").repeat
.It receives the element
"Mao"
and outputs it downstream.
take(3)
keeps track of the number of elements already pulled. On the first
three pulls, it pulls on its upstream to receive Mao
,
Popcorn
and Mao
respectively. But on the
fourth pull, it doesn’t bother with upstream. Instead, it indicates
that it’s done.
Pull diagrams
The take(3)
operator involves multiple pulls
and outputs, and a couple of different scenarios: that’s a lot of
information to absorb. A picture is worth a
thousand words, and thankfully we can express these steps beautifully in
a diagram.
Here’s how we might draw out those steps:
Each box on the left is a single operator. There are three boxes:
the upstream, the take(3)
operator, and the downstream respectively.
We start reading the diagram from the bottom left corner and work our way to the right, following the arrows:
An upwards arrow corresponds to a pull.
A downwards arrow corresponds to an output.
A green tick corresponds to an operator being done.
This is only one case of take(3)
, hooked up to an
infinite upstream. What if the
upstream was empty?
We can describe that in a diagram too:
When an empty upstream is pulled on, it indicates that it’s
done. take(3)
propagates this downstream.
Repeat
The next upstream operator is repeat
. We can describe
it in step form too. When pulled on:
It pulls on its upstream.
If its upstream outputs a value, it outputs that downstream.
If its upstream is done, things get interesting: the upstream is “repeated” by being re-instantiated. It pulls on the new upstream to receive the first value, then outputs that value.
We can draw these two cases in diagrams. First, the case when upstream outputs a value:
Then the interesting case, when upstream is done:
The stream apply
Finally, let’s walk through Stream("Mao", "Popcorn")
,
the top operator in our stream, and arguably the most simple.
The first time it’s pulled on, it outputs
"Mao"
.The second time it’s pulled on, it outputs
"Popcorn"
.The third time it’s pulled on, it indicates it is done.
Composition
We now know the steps involved in each operator —
take(3)
, repeat
and
Stream("Mao", "Popcorn")
. But what are the steps for our
entire stream?
The great thing about these steps is that they compose: when an operator pulls on its upstream, we execute the series of steps for that upstream operator. And if the upstream operator pulls on its upstream, we execute that upstream’s steps.
We can see this clearly in our pull diagrams: composing operators is as simple as matching up their pulls and outputs:
Our diagram speaks for itself: not only can we predict the output of our program, we can describe how it’s run.
An intuition for infinity
Armed with the pull model, we can now reason about infinite streams.
We can decompose a stream into its constituent operators, and decompose those operators into pull and output steps.
By using pull diagrams, we can quite literally paint a picture of how a stream works, and use that picture to walk through a stream’s evaluation.
Viewed in this way, an infinite stream is actually an infinitely recurring process: a series of steps that never stops repeating. We can turn infinite streams into finite ones and back simply by composing operators: adding an operator changes the steps executed.
Infinity and beyond
We’ve an intuition for infinity, but the pull model actually gives us much more. We’ve also gained a tool for reasoning about all other operators in the fs2 library.
We can see how any new operator slots into our existing picture by examining its steps. And if we’re trying to hunt for an operator for a specific task, we can easily describe the steps that it involves and search from there.