Blocking and computing
The snooze task didn’t do any calculating — its only
operation was a Thread.sleep — so its
thread didn’t occupy a processor in order for the task to progress. As
long as we had an unbounded thread pool, an unlimited
number of snooze tasks could run at once, each on their
own thread. These sorts of tasks are known as “blocking”: the
application sits and waits for them to complete, but doesn’t actually
do any calculations. Blocking tasks are rare, and you should be very
reluctant to write one.
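As a reminder, snooze and tenSnoozes looked roughly like this. This is a sketch rather than the exact earlier code: a two-second Thread.sleep wrapped in a plain IO, run ten times in parallel.
val snooze: IO[Unit] = IO(Thread.sleep(2000L))

val tenSnoozes: IO[Unit] = List.fill(10)(snooze).parSequence.void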
The factorial task, on the other hand, did a lot of
multiplication. Each task occupied one of my eight processors as it ran, so
only eight of those tasks could run at the same time. These sorts of tasks are
termed “compute-intensive”. While factorial doesn’t
resemble a typical Scala application, it’s more similar to one than
snooze.
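For reference, factorial and tenFactorials were along these lines. Again this is only a sketch: the exact computation doesn’t matter, and any multiplication-heavy BigInt workload behaves the same way.
// the size of the product here is a guess; it just needs to keep a processor busy
val factorial: IO[BigInt] = IO((1 to 20000).foldLeft(BigInt(1))(_ * _))

val tenFactorials: IO[Unit] = List.fill(10)(factorial).parSequence.void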
Varying it up
This difference between blocking and compute-intensive tasks poses
a problem for us: what if we want to run a load of factorial
and snooze tasks at the same time?
val snoozeAndCompute: IO[Unit] = List(tenFactorials, tenSnoozes).parSequence.void
Which runtime should we choose?
If we use the basicRuntime, each task will be given its own thread. This is good for the blocking snooze task, but bad for factorial, which gains nothing from having more threads than processors. If we use a boundedRuntime instead, our snooze tasks will block threads that factorial could use to progress.
time(snoozeAndCompute).unsafeRunSync()(boundedRuntime(numProcessors))
// res12: String = "The task took 6 seconds."
As expected, using a boundedRuntime isn’t ideal.
How can we give the blocking snooze task unlimited scaling, but
bound the factorial task at eight threads?
Thankfully, there’s a way to get the best of both worlds. Instead of having just one thread pool, we could have two: an unbounded thread pool for blocking tasks and a bounded one for compute tasks.
It turns out that the cats-effect 3 IORuntime supports
this exact use case. Let’s take a closer look at the setup code for
the boundedRuntime to see how. Here’s a simplified version:
def boundedRuntime(numThreads: Int): IORuntime =
  IORuntime(
    compute = IORuntime.createDefaultComputeThreadPool(numThreads),
    blocking = IORuntime.createDefaultBlockingExecutionContext()
  )
The IORuntime accepts two thread pool arguments:
compute and blocking. It uses these
thread pools for the compute-intensive and blocking operations
respectively.
We can access the compute thread pool using the
compute field. This gives us an ExecutionContext:
boundedRuntime(numProcessors).compute
// res13: ExecutionContext = cats.effect.unsafe.WorkStealingThreadPool@4da4ceec
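That ExecutionContext can be handed to anything that needs one. For example, and this is a sketch rather than part of the original walkthrough, we can shift an individual IO onto it with evalOn:
val runtime = boundedRuntime(numProcessors)

// evalOn runs this IO on the given pool, then shifts back afterwards
val onComputePool: IO[Unit] =
  IO.println("running on the bounded compute pool").evalOn(runtime.compute)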
A proper snooze
You might be a bit confused by this: there are two pools in the
IORuntime, but haven’t we only been thinking about one?
So far, we’ve thought of the basicRuntime and
boundedRuntime functions as configuring a single
pool. In actual fact, they configure two: they both have a hard-coded
unbounded blocking pool. It’s just that we never used it.
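Written in the same simplified style as boundedRuntime above, a basicRuntime might look something like this. This is a sketch, assuming a cached and therefore effectively unbounded thread pool for compute; the point to notice is the identical hard-coded blocking pool.
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

def basicRuntime: IORuntime =
  IORuntime(
    // a cached pool grows as needed, so every task can have its own thread
    compute = ExecutionContext.fromExecutor(Executors.newCachedThreadPool()),
    // the same unbounded blocking pool as boundedRuntime
    blocking = IORuntime.createDefaultBlockingExecutionContext()
  )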
By default, cats-effect’s IO will always use the
compute pool — this is the pool we set a bound on in
boundedRuntime. If we want to tap into the blocking
pool, we must use a different constructor: the aptly named IO.blocking.
Here’s a better snooze function:
val betterSnooze: IO[Unit] = IO.blocking(Thread.sleep(2000L))

val tenBetterSnoozes: IO[Unit] = List.fill(10)(betterSnooze).parSequence.void
Let’s run a few better snoozes using our
boundedRuntime.
time(tenBetterSnoozes).unsafeRunSync()(boundedRuntime(numProcessors))
// res14: String = "The task took 2 seconds."
Our previous tenSnoozes task took four seconds on the
boundedRuntime because it ran on the bounded compute pool: ten
two-second snoozes across eight threads need two batches.
tenBetterSnoozes, on the other hand, takes only two seconds: it runs
on the unbounded blocking pool, so all ten snoozes sleep at once.
A better work-sleep balance
What happens if we interleave blocking operations with compute-intensive ones?
Let’s have a task composed of both:
val betterSnoozeAndCompute: IO[Unit] = List(tenFactorials, tenBetterSnoozes).parSequence.void
time(betterSnoozeAndCompute).unsafeRunSync()(
  boundedRuntime(numProcessors)
)
// res15: String = "The task took 3 seconds."
It’s twice as fast as snoozeAndCompute was: the threads in the bounded
compute pool no longer need to handle the Thread.sleep
calls, and the unbounded blocking pool lets the betterSnooze tasks
scale without limit.
The global IORuntime
We’ve explored a lot with our basicRuntime and
boundedRuntime functions. But we really wanted to know
about IORuntime.global.
What’s special about it?
In actual fact, you’ve already used it: the global runtime is
effectively a runtime with a compute pool bounded at the number of
available processors. In other words, it’s the same as the
boundedRuntime(numProcessors) we settled on
earlier.
Whenever you need to use a thread pool, you can rarely do better
than importing IORuntime.global and making use of it.
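In practice that means the implicit instance in cats.effect.unsafe.implicits, which unsafeRunSync() picks up automatically. As a quick sketch, reusing the time helper and task from earlier:
import cats.effect.unsafe.implicits.global

// no explicit runtime argument: the global one is supplied implicitly
time(betterSnoozeAndCompute).unsafeRunSync()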
The cats-effect IOApp does this for you, so in most
cases you don’t even need to know that the IORuntime
exists.
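As a minimal sketch (the object name is made up), an IOApp looks like this: you describe your program as an IO, and IOApp supplies the runtime and runs it for you.
import cats.effect.{IO, IOApp}

object SnoozeApp extends IOApp.Simple {
  // no unsafeRunSync() and no IORuntime in sight
  def run: IO[Unit] = betterSnoozeAndCompute
}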