Blocking and computing
The snooze task didn’t do any calculating; its only operation was a Thread.sleep, so its thread didn’t occupy a processor in order for the task to progress. As long as we had an unbounded thread pool, an unlimited number of snooze tasks could run at once, each on its own thread. These sorts of tasks are known as “blocking”: the application sits and waits for them to complete, but doesn’t actually do any calculations. Blocking tasks are rare, and you should be very reluctant to write one.
The factorial task, on the other hand, did a lot of multiplication. Each task occupied one of my eight processors as it ran, so only eight of those tasks could run at the same time. These sorts of tasks are termed “compute-intensive”. While factorial may not resemble a typical Scala application, it’s more similar to one than snooze is: most application code spends its time computing rather than sleeping.
Varying it up
This difference between blocking and compute-intensive tasks poses a problem for us: what if we want to run a load of factorial tasks and a load of snooze tasks at the same time?
val snoozeAndCompute: IO[Unit] =
  List(tenFactorials, tenSnoozes).parSequence.void
Which runtime should we choose?
If we use the basicRuntime, each task will be given its own thread. This is good for the blocking snooze task, but bad for factorial. But if we use a boundedRuntime, each snooze task will block a thread that factorial could use to progress.
time(snoozeAndCompute).unsafeRunSync()(boundedRuntime(numProcessors))
// res12: String = "The task took 6 seconds."
As expected, using a
boundedRuntime isn’t ideal.
How can we give the blocking snooze task unlimited scaling, but cap the factorial task at eight threads?
Thankfully, there’s a way to get the best of both worlds. Instead of having just one thread pool, we could have two: an unbounded thread pool for blocking tasks and a bounded one for compute tasks.
It turns out that cats-effect 3 supports this exact use case. Let’s take a closer look at the setup code for boundedRuntime to see how. Here’s a simplified version:
def boundedRuntime(numThreads: Int): IORuntime =
  IORuntime(
    compute = IORuntime.createDefaultComputeThreadPool(numThreads),
    blocking = IORuntime.createDefaultBlockingExecutionContext()
  )
IORuntime accepts two thread pool arguments: compute and blocking. It uses these thread pools for the compute-intensive and blocking operations respectively.
We can access the compute thread pool using the compute field. This gives us an ExecutionContext:
boundedRuntime(numProcessors).compute
// res13: ExecutionContext = cats.effect.unsafe.WorkStealingThreadPool@4da4ceec
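One use for this field: IO’s evalOn combinator runs a task on an explicit ExecutionContext, such as a runtime’s compute pool. Here’s a minimal sketch, using the global runtime’s compute pool for illustration (any IORuntime’s compute field would do):

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global // brings the global IORuntime into scope
import scala.concurrent.ExecutionContext

// Pin a task to an explicitly chosen pool.
val pool: ExecutionContext = global.compute

// Reports the name of the thread the task actually ran on.
val threadName: IO[String] =
  IO(Thread.currentThread.getName).evalOn(pool)
```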
A proper snooze
You might be a bit confused by this: there are two pools in the
IORuntime, but haven’t we only been thinking about one?
So far, we’ve thought of the basicRuntime and boundedRuntime functions as configuring a single pool. In actual fact, they configure two: they both have a hard-coded unbounded blocking pool. It’s just that we never used it.
By default, cats-effect’s IO will always use the compute pool; this is the pool we set a bound on in boundedRuntime. If we want to tap into the blocking pool, we must use a different constructor: the aptly named IO.blocking.
Here’s a better snooze function:
val betterSnooze: IO[Unit] =
  IO.blocking(Thread.sleep(2000L))

val tenBetterSnoozes: IO[Unit] =
  List.fill(10)(betterSnooze).parSequence.void
Let’s run a few better snoozes using our boundedRuntime:
time(tenBetterSnoozes).unsafeRunSync()(boundedRuntime(numProcessors))
// res14: String = "The task took 2 seconds."
The tenSnoozes task took four seconds on the boundedRuntime because it was run on the bounded compute pool. On the other hand, tenBetterSnoozes takes two seconds: it’s run on the unbounded blocking pool.
A better work-sleep balance
What happens if we interleave blocking operations with compute-intensive ones?
Let’s have a task composed of both:
val betterSnoozeAndCompute: IO[Unit] =
  List(tenFactorials, tenBetterSnoozes).parSequence.void
time(betterSnoozeAndCompute).unsafeRunSync()(
  boundedRuntime(numProcessors)
)
// res15: String = "The task took 3 seconds."
It’s much faster: the threads in the bounded compute pool no longer
need to handle the
Thread.sleep, and the unbounded
blocking pool lets the
betterSnooze task scale unlimitedly.
The global IORuntime
We’ve explored a lot with our basicRuntime and boundedRuntime functions. But we really wanted to know about the global IORuntime. What’s special about it?
In actual fact, you’ve already used it: the global runtime is
effectively a runtime with a compute pool bounded at the number of
available processors. In other words, it’s the same as the
boundedRuntime(numProcessors) we settled on earlier.
Whenever you need to use a thread pool, you can rarely do better than grabbing IORuntime.global and making use of it. IOApp does this for you, so in most cases you don’t even need to know that the global runtime exists.
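For instance, a minimal application might look like this; IOApp wires up the global runtime behind the scenes, so the code never mentions it:

```scala
import cats.effect.{IO, IOApp}

object Main extends IOApp.Simple {
  // run is executed on IORuntime.global without us ever naming it.
  def run: IO[Unit] = IO.println("Hello from the global runtime!")
}
```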