Advanced multi-tasking in Python: Applying and benchmarking thread pools and process pools in 6 lines of code
Safely and easily apply multi-tasking to your code
Why execute something sequentially when your machine can multi-task? Using threads or processes you can greatly increase the speed of your code by running things simultaneously. This article will show you a safe and easy way to implement this wonderful technique in Python. At the end of this article you’ll:
- understand which tasks are suitable for multi-tasking
- know when to apply a thread pool or a process pool
- be able to brag to coworkers and friends about speeding up execution 10x with just a few simple lines of code
Before we begin I’d strongly suggest first taking a look at this article to understand how Python works under the hood: why isn’t Python multi-threaded to begin with? It shows the problem we’re trying to solve in this article. I also recommend checking out this article that explains the difference between threads and processes.
Why use a pool?
Pools are ideal for applications where you want to cap the number of workers. Imagine you run a function in an API that creates 5 workers to handle the provided data. What if your API suddenly receives 500 requests in a single second? Creating 2,500 workers that all perform heavy tasks may kill your computer.
Pools prevent your computer from being killed like that by limiting the number of workers that can be created. In the API example, you might want to create a pool with a maximum of 50 workers. What happens when 500 requests come in? Only 50 workers get created. Remember that each request takes 5 workers? This means that only the first 10 requests get handled. Once a worker is done and returns to the pool, it can be sent out again.
In summary: pools make sure that no more than a certain number of workers are active at any given time.
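To illustrate (with hypothetical numbers and a hypothetical handle_request function): below we submit 2,500 tasks to a pool capped at 50 workers. The pool never runs more than 50 at once; the remaining tasks simply wait in the queue until a worker frees up.

```python
import concurrent.futures
import time

def handle_request(request_id: int) -> int:
    time.sleep(1)  # stand-in for heavy work
    return request_id

# 2,500 tasks submitted at once, but at most 50 ever run simultaneously;
# the rest wait in the pool's internal queue until a worker is free
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(handle_request, range(2500)))
```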
Creating the pool
We can easily create pools for both threads and processes with the concurrent.futures library. In the part below we’ll get into the code. In the end, we’ll have an example of how to:
- create a pool
- limit the maximum number of workers in the pool
- map the target function to the pool so that workers can execute it
- collect the results of the functions
- wait for all workers to finish before continuing
Setup
As you can read in this article, threading is more suitable for IO-bound tasks (waiting concurrently) while processes are best suited for CPU-heavy tasks (using more CPUs). In order to properly test our code, we’ll define two types of functions: one is IO-heavy, the other is CPU-heavy:
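Here’s a minimal sketch of two such functions (the exact bodies are an assumption; the names io_heavy_task and cpu_heavy_task, the printout parameter, and the ~1-second IO wait are chosen to match the benchmarks below):

```python
import time

def io_heavy_task(num: int, printout: bool = False) -> int:
    """IO-heavy: spends its time waiting, barely using the CPU."""
    time.sleep(1)  # simulates waiting on a network call or disk read
    if printout:
        print(f"io_heavy_task({num}) done")
    return num

def cpu_heavy_task(num: int, printout: bool = False) -> int:
    """CPU-heavy: spends its time computing, keeping one core busy."""
    total = 0
    for i in range(10_000_000):
        total += i % (num + 1)
    if printout:
        print(f"cpu_heavy_task({num}) done")
    return total
```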
These are our target functions: the functions that do the heavy lifting. Notice that both have a printout parameter. This is not particularly useful, but we’ll need it in order to demonstrate that we can pass additional (keyword) arguments.
Thread pooling IO heavy tasks
We’ll start with the I/O-heavy function. We have a list of 100 numbers that we would like to pass to this function. Sequentially the code looks like this:
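A sketch of the sequential version (assuming the io_heavy_task defined above):

```python
def sequential(numbers) -> int:
    the_sum = 0
    for num in numbers:
        the_sum += io_heavy_task(num, printout=False)
    return the_sum

the_sum = sequential(range(100))  # ~100 seconds: one call at a time
```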
If we execute the code above it will take approximately 100 seconds (100 calls of roughly 1 second each). Let’s use a ThreadPool to run the same code:
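Again a sketch, under the same assumptions as the functions above:

```python
import concurrent.futures

def thread_pooled(numbers) -> int:
    the_sum = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        # one task per number; at most 10 run at the same time
        futures = [
            executor.submit(io_heavy_task, num, printout=False)
            for num in numbers
        ]
        # collect each result as soon as its task completes
        for future in concurrent.futures.as_completed(futures):
            the_sum += future.result()
    return the_sum

the_sum = thread_pooled(range(100))  # ~10 seconds with 10 workers
```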
As you can see we use concurrent.futures.ThreadPoolExecutor. We allow at most 10 workers and then loop through our range, submitting a task for each number. As soon as each task completes we add its result to the the_sum variable, which we return at the end. Executing this code takes about 10 seconds. This is not surprising: with 10 workers we should be roughly 10 times as fast as running the code sequentially.
Below you’ll find the same code, just formatted differently.
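A sketch under the same assumptions:

```python
from functools import partial
import concurrent.futures

task = partial(io_heavy_task, printout=False)
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    the_sum = sum(executor.map(task, range(100)))
```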
We define the call to our function as a partial function so we can map it to the executor. The last line then executes the partial function for every value in the range and sums the results.
CPU heavy tasks
When it comes to the code for mapping the CPU-heavy function to a ProcessPool we can be brief: the code is very similar.
Just swap out the target function in the partial (to cpu_heavy_task) and switch ThreadPoolExecutor to ProcessPoolExecutor. That’s it!
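The same sketch with those two changes (note the __main__ guard, which process pools need on platforms that spawn new interpreters):

```python
from functools import partial
import concurrent.futures

if __name__ == "__main__":
    task = partial(cpu_heavy_task, printout=False)
    with concurrent.futures.ProcessPoolExecutor(max_workers=10) as executor:
        the_sum = sum(executor.map(task, range(100)))
```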
Benchmarking
Let’s put these functions to the test! We’ll execute the IO-heavy and CPU-heavy functions sequentially, with threads, and with processes.
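The timing harness isn’t shown here; a minimal version could look like this (assuming the sequential and thread_pooled helpers sketched above):

```python
import time

def benchmark(label: str, fn, *args) -> None:
    """Runs fn once and prints how long it took."""
    start = time.perf_counter()
    fn(*args)
    print(f"{label}: took {time.perf_counter() - start:.2f} seconds")

if __name__ == "__main__":
    benchmark("sequential", sequential, range(100))
    benchmark("threaded", thread_pooled, range(100))
```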
Here are the results:

IO heavy function
sequential: took 100.44 seconds
threaded: took 10.04 seconds (max pool size = 10)
processed: took 10.20 seconds (max pool size = 10)
CPU heavy function
sequential: took 27.89 seconds
threaded: took 26.65 seconds (max pool size = 10)
processed: took 6.58 seconds (max pool size = 10)
As explained in this article, these results are as expected. Sequential execution is, of course, the slowest, running all of the function calls one by one.
Threading the IO-heavy function is 10 times faster because we have 10 times as many workers. Processing the IO-heavy function is about as fast as the 10 threads; it’s a little bit slower because processes are more ‘expensive’ to set up. Notice that, although both are roughly equally fast, threads are the better option here since they provide the ability to share resources.
When we benchmark the CPU-heavy function we see that threading is about as fast as the sequential method. This is due to the GIL, as explained in this article. Processes are much more efficient at handling CPU-heavy tasks, resulting in a ~4.2x speed-up.
Also, notice that our pool size was fairly small in these cases and can be tweaked (depending on your target function) to create an even faster program!
Conclusion
As you have read in this article, it’s pretty easy to work with pools. In addition, I hope to have shed some light on why and how to use them. Sometimes multi-tasking is just not enough; check out this or this article to learn how to compile a small part of your code for a 100x speed increase (which can also be multi-tasked).
If you have suggestions/clarifications please comment so I can improve this article. In the meantime, check out my other articles on all kinds of programming-related topics like these:
- Multi-tasking in Python: speed up your program 10x by executing things simultaneously
- Create a fast auto-documented, maintainable and easy-to-use Python API in 5 lines of code with FastAPI
- Python to SQL — UPSERT Safely, Easily and Fast
- Create and publish your own Python package
- Create Your Custom, private Python Package That You Can PIP Install From Your Git Repository
- Virtual environments for absolute beginners — what is it and how to create one (+ examples)
- Dramatically improve your database insert speed with a simple upgrade
Happy coding!
— Mike
P.S. Like what I’m doing? Follow me!