Multi-tasking in Python: Speed up your program 10x by executing things simultaneously

Step-by-step guide to apply threads and processes to speed up your code

Mike Huls

Nov 18, 2021 — 7 min read

An army of workers to help us execute faster (image by Brian McGowan on Unsplash)

This article focuses on speeding up your program by making it do multiple things at the same time. We don’t have to idle while our program waits for an API response, for example; we can do something else in that same time! We’ll also get into how to apply more CPUs to speed up calculation times. At the end of this article you’ll:

understand the different ways of multi-tasking
know when to apply which technique
be able to speed up your own code by using the code examples

Before we begin I’d strongly suggest checking out the article below. It explains how Python works under the hood and why it isn’t as fast as other languages. It also reveals why Python isn’t multi-threaded to begin with. You’ll have a better understanding of the problem we’re trying to solve in this article. Let’s code!

Threads and processes

Python can multi-task in two ways: threading and multiprocessing. On the surface they appear very alike but are fundamentally different. In the parts below we’ll examine both by using two simple metaphors. Our goal is to get an understanding of the differences between threads and processes so that we know when to use which.

Threading is like making breakfast

Let’s make some breakfast: we’ll need a boiled egg, some toast and a cup of coffee so we have 4 tasks:

toast bread
boil water
boil egg
switch on the coffee maker

This is what we’re trying to make in as little time as possible (image by Eiliv-Sonas Aceron on Unsplash)

How would you go about this? One way is to perform each task sequentially: first toast the bread, then boil water and an egg, and then switch on the coffee maker. Although this process is easy to understand, in the end it just leaves us with some cold toast, a cold egg and a hot cup of coffee. Alternatively we could perform some tasks simultaneously: we’ll switch on the coffee maker and toaster and boil some water at the same time.

Let’s simulate this with some code.
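The three breakfast tasks could be sketched like this. It is a minimal sketch rather than the exact original code: the individual durations are assumptions, picked so that they add up to about 17.5 seconds with 8 seconds as the longest task, and TIME_SCALE is a hypothetical knob that shortens the waiting so the simulation finishes quickly.

```python
import time

# Hypothetical knob: 1.0 waits in real time, 0.01 runs the simulation 100x faster.
TIME_SCALE = 0.01


def wait(seconds):
    """Simulate waiting on an appliance; our program does no work in the meantime."""
    time.sleep(seconds * TIME_SCALE)


def toast_bread():
    print("toasting bread..")
    wait(8)  # assumed duration
    print("bread toasted")


def boil_water_and_egg():
    print("boiling water..")
    wait(3)  # assumed duration
    print("water boiled, boiling egg..")
    wait(2.5)  # assumed duration
    print("egg boiled")


def make_some_coffee():
    print("switched on the coffee maker..")
    wait(4)  # assumed duration
    print("coffee is ready")
```

Every task spends nearly all of its time in time.sleep, which is what makes it a good stand-in for waiting on I/O.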

We’ll run this code sequentially (one after the other) like this:

toast_bread()
boil_water_and_egg()
make_some_coffee()

Making breakfast sequentially will take around 17.5 seconds. This involves a lot of waiting! Let’s multitask using threads:

The code is pretty straightforward: we’ll create a thread for each task and append all of them to a list. Then we’ll start each thread in the list and wait for all of them to finish (this is what t.join() does). Making breakfast concurrently takes around 8 seconds!
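The threaded version described above might look like the sketch below. The task durations, the TIME_SCALE knob and the shared finished list are assumptions added for illustration; threading.Thread, start() and join() are the standard library calls the text refers to.

```python
import threading
import time

TIME_SCALE = 0.01  # hypothetical knob: 1.0 waits in real time
finished = []      # names of completed tasks, shared by all threads


def simulate(name, seconds):
    """Simulate a task that is mostly waiting, then record that it is done."""
    time.sleep(seconds * TIME_SCALE)
    finished.append(name)


def toast_bread():
    simulate("toast", 8)  # assumed duration


def boil_water_and_egg():
    simulate("egg", 5.5)  # assumed duration


def make_some_coffee():
    simulate("coffee", 4)  # assumed duration


def make_breakfast_threaded():
    """Run all three tasks in their own thread and return the elapsed time."""
    threads = []
    for job in (toast_bread, boil_water_and_egg, make_some_coffee):
        threads.append(threading.Thread(target=job))

    start = time.perf_counter()
    for t in threads:
        t.start()  # returns immediately; the task keeps running in the background
    for t in threads:
        t.join()   # block until this thread has finished
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = make_breakfast_threaded()
    print(f"threaded breakfast took {elapsed / TIME_SCALE:.1f} simulated seconds")
    print("tasks finished in this order:", finished)
```

Because all three threads wait at the same time, the total is governed by the slowest task instead of by the sum of all tasks.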

Our threaded way of making breakfast (image by Author)

The main take-away is that if there’s a lot of waiting involved (typical in I/O tasks like downloading data, making API requests or writing files) we can use threads to multi-task. Later on in this article we’ll examine why threads are the best option for I/O tasks.

Multiprocessing is like doing your homework

When we apply the same breakfast-making principle to doing our homework we run into a problem: doing math homework is a task that needs constant attention; we can’t start it and wait for it to finish! In order to do multiple subjects of homework at the same time we would need to clone ourselves, right? Multiprocessing does exactly that.

Time to focus and start processing (image by Annie Spratt on Unsplash)

Let’s first translate our homework example to some code. We’ll simulate doing our homework with some CPU-intensive processing; adding up all numbers from 0 to 100 million:
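A minimal sketch of such homework functions is shown below. The function names and the subjects are hypothetical; the only thing taken from the text is the work itself, adding up the numbers from 0 to 100 million in a plain loop.

```python
import time


def add_up_to(limit):
    """Add up the integers 0, 1, ..., limit - 1 in a plain Python loop."""
    total = 0
    for number in range(limit):
        total += number
    return total


# Hypothetical subjects; each one is the same CPU-bound job.
def do_math_homework(limit=100_000_000):
    return add_up_to(limit)


def do_physics_homework(limit=100_000_000):
    return add_up_to(limit)


def do_chemistry_homework(limit=100_000_000):
    return add_up_to(limit)


if __name__ == "__main__":
    # A smaller limit than the 100 million default, so this check finishes quickly.
    start = time.perf_counter()
    for homework in (do_math_homework, do_physics_homework, do_chemistry_homework):
        homework(limit=1_000_000)
    print(f"sequential homework took {time.perf_counter() - start:.2f} seconds")
```

Unlike the breakfast tasks there is no sleeping here: the CPU is busy for the entire duration of each function.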

These are CPU-intensive functions; we have to perform a lot of calculations. We’ll first execute these functions sequentially, then threaded (same code as in the previous part) and then using the code below to spawn processes. Notice that the code looks a lot like the threading-code from the previous part.
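Spawning processes could be sketched as follows, assuming the same summing job as above packed into a single hypothetical add_up_to function. multiprocessing.Process takes the same target and args arguments as threading.Thread, which is why the two versions look so similar.

```python
import multiprocessing
import time


def add_up_to(limit):
    """CPU-bound work: add up the integers below limit in a plain loop."""
    total = 0
    for number in range(limit):
        total += number
    return total


def do_homework_in_processes(limit, workers=3):
    """Run the same job in several processes and return the elapsed time."""
    processes = []
    for _ in range(workers):
        # The return value of the target is discarded; only the timing matters here.
        processes.append(multiprocessing.Process(target=add_up_to, args=(limit,)))

    start = time.perf_counter()
    for p in processes:
        p.start()  # launches a separate Python process
    for p in processes:
        p.join()   # wait for that process to exit
    return time.perf_counter() - start


if __name__ == "__main__":
    limit = 2_000_000  # smaller than the 100 million in the text, to keep this quick
    print(f"using processes: {do_homework_in_processes(limit):.2f} seconds")
```

The if __name__ == "__main__" guard is not optional here: on platforms that start each process in a fresh interpreter, the module is imported again in every child, and without the guard each child would try to spawn processes of its own.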

The benchmarks:

sequentially: 14.22 seconds
using threads: 13.89 seconds
using processes: 6.00 seconds

You’ll see that using processes speeds up execution by quite a lot! The reason for this is that we can use more CPUs.

In this example we have to do 3 * 100 million = 300 million calculations. Threading this doesn’t speed it up because it’s still one CPU that has to perform all 300 million calculations. When using processes, however, we spawn a brand-new instance of Python for each task, which can run on a different CPU. In other words, we’ll use 3 CPUs that perform 100 million calculations each!

Summary

Threads are for making-breakfast-like tasks: they involve a lot of waiting, so one ‘person’ (or CPU) can do several things simultaneously. Processes are for ‘thinking-tasks’; they need you to be there to do the heavy work. Multiprocessing is like creating a clone of yourself so it can do other things while you are working on your task.

Threading and multiprocessing under the hood

Now that we have a clearer understanding of how threads and processes work, let’s talk a bit about the differences between the two.

Let’s find out how they run (image by Erik Mclean on Unsplash)

The GIL — why threads are more suitable for I/O

As discussed, threads are suitable for I/O tasks whereas processes are suitable for CPU-heavy tasks. The reason for this is Python’s infamous GIL, the Global Interpreter Lock. This lock ensures that only one thread executes Python code at any moment; threads that do not hold the lock have to wait. Many I/O operations release the GIL while they wait, which makes threading worthwhile. Check out this article to understand why Python applies the GIL.

In the homework example threading makes no sense because the task involved is not an I/O task. Because of the GIL only one thread can execute at any moment, so threading offers no speed-up. When multiprocessing we create a fresh instance of Python which has its own GIL. This way processes run in parallel, speeding up the execution of our program significantly.
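To see the effect of the GIL for yourself, you could time the same CPU-bound job sequentially and in threads. The sketch below assumes a standard CPython build with the GIL enabled; the function names and the limit are hypothetical, and the exact timings will differ per machine.

```python
import threading
import time


def add_up_to(limit):
    """CPU-bound work that never waits on I/O, so it rarely gives up the GIL."""
    total = 0
    for number in range(limit):
        total += number
    return total


def run_sequentially(limit, jobs=3):
    """Run the job several times in a row and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(jobs):
        add_up_to(limit)
    return time.perf_counter() - start


def run_threaded(limit, jobs=3):
    """Run the job in several threads at once and return the elapsed time."""
    threads = [threading.Thread(target=add_up_to, args=(limit,)) for _ in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    limit = 2_000_000  # hypothetical size, small enough to finish quickly
    print(f"sequentially:  {run_sequentially(limit):.2f} seconds")
    print(f"using threads: {run_threaded(limit):.2f} seconds")
```

With the GIL in place the two numbers should end up close together, in line with the benchmarks earlier in this article.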

The main difference is that threads share memory while processes do not. All threads live inside a single process, so they can read and change the same objects; the CPU simply goes back and forth between them. Each process is a separate instance of Python with its own memory, which can run on a different CPU.

You can think of threading as a single CPU that first executes a few lines of code in thread1, then executes some lines in thread2, then moves on to thread3. Then it executes the next lines in thread1, then thread2, etc. Threading executes multiple tasks concurrently: one worker that switches between tasks. For the user it looks as though things happen simultaneously, but technically this isn’t so.

When you spawn a new process a whole new instance of Python is created, which the operating system can run on a different CPU. Because every process has its own memory, two processes cannot simply share a common object; data has to be passed between them explicitly. Processes run in parallel: there are multiple workers that work on multiple tasks simultaneously.
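The difference in sharing can be made visible with a small experiment. In this sketch shopping_list and the helper functions are hypothetical names; the idea is to change the same module-level list once from a thread and once from a process, and then look at it from the main program.

```python
import multiprocessing
import threading

shopping_list = []  # hypothetical piece of state that we try to share


def add_item(item):
    shopping_list.append(item)


def add_from_thread(item):
    """Append to the list from a thread inside the current process."""
    t = threading.Thread(target=add_item, args=(item,))
    t.start()
    t.join()


def add_from_process(item):
    """Append to the list from a separate process with its own memory."""
    p = multiprocessing.Process(target=add_item, args=(item,))
    p.start()
    p.join()


if __name__ == "__main__":
    add_from_thread("eggs")
    print("after the thread: ", shopping_list)   # the thread changed our list

    add_from_process("bread")
    print("after the process:", shopping_list)   # "bread" is missing: the child
                                                 # only changed its own copy
```

If processes do need to exchange data, the multiprocessing module offers tools such as Queue and Pipe for passing it along explicitly.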

Overhead

Processes take a little more time to spawn. This is the reason the homework example is not three times faster but slightly less; first we’ll have to spawn the processes before we can benefit from the parallelism.
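You can get a rough feel for this overhead by timing how long it takes to start and finish a worker that does nothing at all. This is a minimal sketch with hypothetical helper names; the numbers depend heavily on your operating system and on how it starts processes.

```python
import multiprocessing
import threading
import time


def do_nothing():
    """An empty task, so that we only measure the cost of the worker itself."""
    pass


def time_start_and_join(worker):
    """Start a thread or process, wait for it, and return the elapsed seconds."""
    start = time.perf_counter()
    worker.start()
    worker.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    thread_time = time_start_and_join(threading.Thread(target=do_nothing))
    process_time = time_start_and_join(multiprocessing.Process(target=do_nothing))
    print(f"thread:  {thread_time * 1000:.2f} ms")
    print(f"process: {process_time * 1000:.2f} ms")
```

For long-running jobs like the homework example this start-up cost is small compared to the work itself, but for many tiny tasks it can eat up the gain from running in parallel.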

Even more speed

Multi-tasking can solve a lot of speed-issues in Python but sometimes it’s just not enough. Check out this or this article that show you how to compile a small part of your code for a 100x speed increase.

Conclusion

Threading and multiprocessing can be used to speed up the execution of your code in many, many cases. In this article we’ve explored what threads and processes are, how they work and when to use which. Don’t forget to check out this article on how to apply pools and for benchmarks!

If you have suggestions or clarifications, please comment so I can improve this article. In the meantime, check out my other articles on all kinds of programming-related topics.

Happy coding!

— Mike

P.S: like what I’m doing? Follow me!