Blocking and computing
The snooze task didn’t do any calculating (its only operation was a Thread.sleep), so its thread didn’t occupy a processor in order for the task to progress. As long as we had an unbounded thread pool, an unlimited number of snooze tasks could run at once, each on its own thread. These sorts of tasks are known as “blocking”: the application sits and waits for them to complete, but doesn’t actually do any calculations. Blocking tasks are rare, and you should be very reluctant to write one.
The factorial task, on the other hand, did a lot of multiplication. Each task occupied one of my eight processors as it ran, so only eight of those tasks could run at the same time. These sorts of tasks are termed “compute-intensive”. While factorial doesn’t resemble a typical Scala application, it’s more similar to one than snooze.
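As a reminder, here’s a sketch of how the snooze and factorial tasks from the previous section might have looked. The exact definitions (the factorial input size in particular) are assumptions based on the descriptions above, not the original code:

import cats.effect.IO
import cats.syntax.parallel._

// A sketch of the earlier tasks (assumed shapes, not the original definitions).
// snooze blocks its thread for two seconds; factorial does a lot of multiplication.
// The input size 50000 is a placeholder.
val snooze: IO[Unit] = IO(Thread.sleep(2000L))
val factorial: IO[BigInt] = IO((1 to 50000).map(BigInt(_)).product)

val tenSnoozes: IO[Unit] = List.fill(10)(snooze).parSequence.void
val tenFactorials: IO[Unit] = List.fill(10)(factorial).parSequence.void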
Varying it up
This difference between blocking and compute-intensive tasks poses a problem for us: what if we want to run a load of factorial and snooze tasks at the same time?
val snoozeAndCompute: IO[Unit] = List(tenFactorials, tenSnoozes).parSequence.void
Which runtime should we choose? If we use the basicRuntime, each task will be given its own thread. This is good for the blocking snooze task, but bad for factorial. But if we use a boundedRuntime, our snooze task will block a thread that factorial could use to progress.
time(snoozeAndCompute).unsafeRunSync()(boundedRuntime(numProcessors))
// res12: String = "The task took 6 seconds."
As expected, using a boundedRuntime isn’t ideal. How can we give the blocking snooze task unlimited scaling, but bound the factorial task at eight threads?
Thankfully, there’s a way to get the best of both worlds. Instead of having just one thread pool, we could have two: an unbounded thread pool for blocking tasks and a bounded one for compute tasks.
It turns out that the cats-effect 3 IORuntime supports this exact use case. Let’s take a closer look at the setup code for the boundedRuntime to see how. Here’s a simplified version:
def boundedRuntime(numThreads: Int): IORuntime =
  IORuntime(
    compute = IORuntime.createDefaultComputeThreadPool(numThreads),
    blocking = IORuntime.createDefaultBlockingExecutionContext()
  )
The IORuntime accepts two thread pool arguments: compute and blocking. It uses these thread pools for the compute-intensive and blocking operations respectively.
We can access the compute thread pool using the compute field. This gives us an ExecutionContext:
boundedRuntime(numProcessors).compute
// res13: ExecutionContext = cats.effect.unsafe.WorkStealingThreadPool@4da4ceec
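Because compute is just an ordinary ExecutionContext, it isn’t limited to IO. As a small illustrative sketch (not from the original text; the names computePool and futureOnComputePool are mine), we can hand it to anything that expects an ExecutionContext, such as a scala.concurrent.Future:

import scala.concurrent.Future

// The compute pool is a plain ExecutionContext, so other code can share it.
val computePool = boundedRuntime(numProcessors).compute
val futureOnComputePool: Future[BigInt] =
  Future((1 to 1000).map(BigInt(_)).product)(computePool)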
A proper snooze
You might be a bit confused by this: there are two pools in the IORuntime, but haven’t we only been thinking about one?

So far, we’ve thought of the basicRuntime and boundedRuntime functions as configuring a single pool. In actual fact, they configure two: they both have a hard-coded unbounded blocking pool. It’s just that we never used it.
By default, cats-effect’s IO will always use the compute pool; this is the pool we set a bound on in boundedRuntime. If we want to tap into the blocking pool, we must use a different constructor: the aptly named IO.blocking.
Here’s a better snooze function:
val betterSnooze: IO[Unit] = IO.blocking(Thread.sleep(2000L))

val tenBetterSnoozes: IO[Unit] = List.fill(10)(betterSnooze).parSequence.void
Let’s run a few better snoozes using our boundedRuntime.
time(tenBetterSnoozes).unsafeRunSync()(boundedRuntime(numProcessors))
// res14: String = "The task took 2 seconds."
Our previous tenSnoozes task took four seconds on the boundedRuntime because it was run on the bounded compute pool: with only eight compute threads, the ten two-second snoozes had to run in two batches. On the other hand, tenBetterSnoozes only takes two seconds: it’s run on the unbounded blocking pool, so all ten snoozes sleep at the same time.
A better work-sleep balance
What happens if we interleave blocking operations with compute-intensive ones?
Let’s build a task composed of both:
val betterSnoozeAndCompute: IO[Unit] = List(tenFactorials, tenBetterSnoozes).parSequence.void
time(betterSnoozeAndCompute).unsafeRunSync()(boundedRuntime(numProcessors))
// res15: String = "The task took 3 seconds."
It’s much faster: the threads in the bounded compute pool no longer need to handle the Thread.sleep, and the unbounded blocking pool lets the betterSnooze tasks scale without limit.
The global IORuntime
We’ve explored a lot with our basicRuntime and boundedRuntime functions. But we really wanted to know about IORuntime.global. What’s special about it?
In actual fact, you’ve already used it: the global runtime is effectively a runtime with a compute pool bounded at the number of available processors. In other words, it’s the same as the boundedRuntime(numProcessors) we settled on earlier.
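We can check this by passing IORuntime.global explicitly, just as we did with our own runtimes. Here’s a sketch (I haven’t reproduced the timing output, but it should match the bounded runtime’s behaviour):

import cats.effect.unsafe.IORuntime

// Running on the global runtime behaves like boundedRuntime(numProcessors):
// compute-intensive work is bounded, blocking work scales on its own pool.
time(betterSnoozeAndCompute).unsafeRunSync()(IORuntime.global)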
Whenever you need to use a thread pool, you can rarely do better than importing IORuntime.global and making use of it. The cats-effect IOApp does this for you, so in most cases you don’t even need to know that the IORuntime exists.
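For completeness, here’s a minimal sketch of an IOApp; the object name Main and the choice of task are illustrative:

import cats.effect.{IO, IOApp}

// IOApp wires in the global IORuntime for us, so we never call
// unsafeRunSync or mention an IORuntime ourselves.
object Main extends IOApp.Simple {
  def run: IO[Unit] = betterSnoozeAndCompute
}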