Thread Your Python Program with Two Lines of Code

Speed up your program by doing multiple things simultaneously

Better get our threads organized (image by Karen Penroz on Unsplash)

When your program has a lot of tasks that involve waiting, you can speed it up by executing those tasks concurrently instead of one by one. When making breakfast you don’t wait for the coffee machine to finish before cooking an egg. Instead you flick on the coffee maker and pour yourself a cup of orange juice while heating up the pan for the scrambled eggs.

This article shows you how to do precisely that. At the end you’ll be able to safely apply threading in 2 lines of code and achieve a huge speedup in your program. Let’s code!


But first..

This article will detail how to apply threads by applying the same function to a whole list of arguments. Then we’ll check out how to apply different functions in a threaded way.

Are threads going to solve my problem? Understanding concurrency

It is true that in many cases your program can be sped up by doing “multiple things at the same time” but blindly applying threads everywhere isn’t a smart solution. There are two ways to multi-task in Python: multiprocessing and threading:

  • threading runs code concurrently: we have one active CPU that quickly switches between multiple threads
  • multiprocessing runs code in parallel: we have multiple active CPUs that each run their own code (check out the article below)
Applying Python multiprocessing in 2 lines of code
When and how to use multiple cores to execute many times faster

When threading you have one actor that executes all tasks concurrently by switching between them. In the context of the breakfast example from the intro: there is one actor (you) that switches between the coffee maker, the pan and the glass of orange juice.
When multiprocessing you activate multiple actors and give each of them a task. In the breakfast analogy it’s like cloning yourself twice and giving each clone a separate task. Although this will also be much faster than running the tasks one by one, multiprocessing has a bit more overhead: cloning yourself is a lot of effort just to have the clones wait for a pan to heat up!

In short: multiprocessing is the best solution in situations when we have to calculate a lot, threading is more suitable for when we have to wait a lot.
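
To make the "waiting" case concrete, here is a minimal sketch. The function names and the 0.2-second delay are made up for illustration; time.sleep simply stands in for any task that spends its time waiting (a network call, a disk read):

```python
import time
from multiprocessing.pool import ThreadPool


def wait_for_io(task_id: int) -> int:
    # Stand-in for a task that mostly waits (network call, disk read, ...)
    time.sleep(0.2)
    return task_id


def run_sequentially(task_ids: list[int]) -> float:
    # Run the tasks one by one and return the elapsed time in seconds
    start = time.perf_counter()
    for task_id in task_ids:
        wait_for_io(task_id)
    return time.perf_counter() - start


def run_threaded(task_ids: list[int]) -> float:
    # Run the tasks on one thread each and return the elapsed time in seconds
    start = time.perf_counter()
    with ThreadPool(processes=len(task_ids)) as pool:
        pool.map(wait_for_io, task_ids)
    return time.perf_counter() - start


if __name__ == "__main__":
    ids = [1, 2, 3, 4, 5]
    print(f"sequential: {run_sequentially(ids):.2f} secs")  # roughly the sum of all waits
    print(f"threaded:   {run_threaded(ids):.2f} secs")      # roughly the longest single wait
```

The threads overlap their waiting, so the threaded run should take about as long as one task instead of five. For tasks that calculate rather than wait, you should not expect this kind of gain from threads.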

In this article we’ll focus on threading; check out the article below if you’re interested in multiprocessing:

Multi-tasking in Python: Speed up your program 10x by executing things simultaneously
Step-by-step guide to apply threads and processes to speed up your code

Setup

For this article we’ll imagine that our program receives a big list of email addresses that we have to validate. Imagine we’ve set up an API to which we can send an email address and which returns true or false depending on whether the email address is valid.

The most important thing is that we have to send requests and wait for the API to respond. This is a typical task that we can multi-thread: we don’t need extra cores to calculate faster; we just need some extra threads to send multiple email addresses at a time.

For this article we’ll use this list of email addresses:

email_addresses = [ 
  'mikehuls42@gmail.com', 
  'mike@mikehuls.com', 
  'johndoe@some_email.com', 
  'obviously_wrong@address', 
  'otheraddress.com', 
  'thisis@@wrong.too', 
  'thisone_is@valid.com' 
]

And this will be our function that simulates sending the email address to the validation API:

import random
import time

def send_email_address_to_validation_api(email_address: str) -> bool:
  # Simulate the request to the validation API by sleeping between 1 and 2 seconds
  sleep_time = random.random() + 1
  time.sleep(sleep_time)
  # Return True or False depending on the (random) sleep_time
  return sleep_time > 1.5

A. Non-threaded

Let’s first see how we use this function without using threads.

Loop through email addresses

We’ll just loop through the list of our 7 email addresses and send each value to the API; dead simple:

for email_address in email_addresses: 
  is_valid = send_email_address_to_validation_api(email_address=email_address) 
  # do other stuff with the email address and validity

This is pretty easy to understand but is it fast? (spoiler: no). Since we validate each of our 7 email addresses consecutively, and each one takes between 1 and 2 seconds, it takes anywhere between 7 and 14 seconds. I’ve timed it at 11.772 seconds.
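
If you want to time a run like this yourself, a small helper around time.perf_counter is one way to do it. This is a sketch, not necessarily how the number above was measured; timed, validate_all and fake_validate are hypothetical names, and fake_validate is a quick stand-in for the real API call:

```python
import time
from typing import Any, Callable


def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> tuple[Any, float]:
    """Call func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def validate_all(email_addresses: list[str], validate: Callable[[str], bool]) -> list[bool]:
    # Validate the addresses one after another and collect the results
    return [validate(email_address) for email_address in email_addresses]


if __name__ == "__main__":
    # Hypothetical stand-in for the API call: waits briefly, then reports
    # whether the address contains exactly one "@"
    def fake_validate(email_address: str) -> bool:
        time.sleep(0.1)
        return email_address.count("@") == 1

    results, elapsed = timed(validate_all, ["a@b.com", "nope", "c@@d.com"], fake_validate)
    print(results, f"{elapsed:.3f} secs")
```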

Use the map function

In order to better understand the next part we’ll rewrite the code above using Python’s map function:

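results: list[bool] = list(map(send_email_address_to_validation_api, email_addresses))

The code above does exactly the same; it maps the function to the list of addresses, which means that it executes the function for each value in the email_addresses list. The list() call is needed because map is lazy in Python 3: without it, no function call happens until the results are iterated.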
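
Python 3’s built-in map only runs the function once its results are consumed, which is easy to check for yourself. In the sketch below, record_call is a hypothetical stand-in for the validation function that simply remembers each call:

```python
calls: list[str] = []


def record_call(email_address: str) -> bool:
    # Hypothetical stand-in for the validation function: remembers it was called
    calls.append(email_address)
    return "@" in email_address


lazy_results = map(record_call, ["a@b.com", "not-an-address"])
calls_before = len(calls)     # 0: nothing has run yet

results = list(lazy_results)  # consuming the iterator runs the function
calls_after = len(calls)      # 2: one call per address

print(calls_before, calls_after, results)  # 0 2 [True, False]
```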

Let’s add the time to our benchmark:

NON THREADED           11.772 secs

B. Using threads

In this part we check out 3 different ways of applying threads to our function. All make use of a thread pool which can be imported with:

from multiprocessing.pool import ThreadPool

Think of the thread pool as a number of threads that are waiting for a task. A thread pool has a map function that we can use just like in the unthreaded example above. As soon as a thread is finished with the task it returns to the pool, waiting for another task.

The thread pool allows us to apply threads easily and safely by providing a limit on how many threads can exist in the pool.
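
You can observe that limit with a small experiment. The ConcurrencyTracker class below is a hypothetical helper written for this sketch: it counts how many tasks are running at the same moment while a pool of 3 threads works through 10 tasks:

```python
import threading
import time
from multiprocessing.pool import ThreadPool


class ConcurrencyTracker:
    """Hypothetical helper that records how many tasks run at the same time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self.max_active = 0

    def task(self, task_id: int) -> int:
        with self._lock:
            self._active += 1
            self.max_active = max(self.max_active, self._active)
        time.sleep(0.1)  # stand-in for waiting on a slow resource
        with self._lock:
            self._active -= 1
        return task_id


tracker = ConcurrencyTracker()
with ThreadPool(processes=3) as pool:
    pool.map(tracker.task, range(10))

print("most tasks running at once:", tracker.max_active)
```

No matter how long the list of tasks is, the count never exceeds the size of the pool.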

1. Threadpool map

We’ll first switch to the map function supplied by the thread pool.

with ThreadPool(processes=10) as t_pool: 
  results = t_pool.map(send_email_address_to_validation_api, email_addresses)

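As you can see, we define a thread pool with a maximum of 10 workers (the parameter is called processes because ThreadPool shares its interface with the multiprocessing Pool, but the workers are threads). Since we have only 7 email addresses, the map function can start all calls to the function simultaneously. As soon as all workers are done we can assess the results, which is after 1.901 seconds in this case.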
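
The pool’s map returns a regular list in the same order as the input, so you can pair each address with its result using zip. A small sketch, where fake_validate is a hypothetical, faster stand-in for the API call:

```python
import time
from multiprocessing.pool import ThreadPool


def fake_validate(email_address: str) -> bool:
    # Hypothetical stand-in for the API call: shorter addresses respond faster
    time.sleep(len(email_address) / 100)
    return email_address.count("@") == 1


email_addresses = ["thisis@@wrong.too", "a@b.com", "otheraddress.com"]

with ThreadPool(processes=10) as t_pool:
    results = t_pool.map(fake_validate, email_addresses)

# map returns a plain list in input order, so zip pairs inputs and results safely
validity_by_address = dict(zip(email_addresses, results))
print(validity_by_address)
```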

NON THREADED           11.772 secs 
THREADED MAP            1.901 secs

2. Threadpool imap

In the previous example we had to wait for all function calls to finish. This is not the case if we use imap instead of map. The imap function returns an iterator that we can access as soon as the results are available:

strt_time_t_imap = time.perf_counter() 
with ThreadPool(processes=10) as t_pool: 
  for res in t_pool.imap(send_email_address_to_validation_api, email_addresses): 
    print(time.perf_counter() - strt_time_t_imap, 'seconds')

The code above is almost exactly the same. The only differences are the added timing code and the call to imap instead of map on the t_pool (line 3).

If we check out our print results we see this:

1.4051628 seconds 
1.4051628 seconds 
1.7985222 seconds 
1.7985749 seconds 
1.7985749 seconds 
1.7985957 seconds 
1.7986305 seconds

The imap function returns an iterator that we can access as soon as our results are done. These results are returned in order, though. That means that, for example, the second email address has to wait for the first: if the second email address is done in 1.3 seconds and the first one in 1.4, both are returned after 1.4 seconds (as you can see in the print output above).
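
To see the ordering effect in isolation, here is a sketch with fixed delays instead of random ones. The wait_then_return function is made up for this example, and the slowest task is deliberately placed first:

```python
import time
from multiprocessing.pool import ThreadPool


def wait_then_return(delay: float) -> float:
    # Stand-in for a task that takes `delay` seconds
    time.sleep(delay)
    return delay


delays = [0.3, 0.1, 0.2]  # the slowest task comes first on purpose
arrivals = []
start = time.perf_counter()
with ThreadPool(processes=3) as t_pool:
    for delay in t_pool.imap(wait_then_return, delays):
        arrivals.append((delay, time.perf_counter() - start))

for delay, arrived_at in arrivals:
    print(f"task that took {delay} secs was handed over after {arrived_at:.2f} secs")
```

All three results should be handed over at roughly the 0.3-second mark: the two faster tasks are finished earlier but have to wait for the first one.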

Although the validation of the full list of email_addresses is completed in roughly the same time as in the previous example, we can access the results much sooner! The first result is accessible after 1.4 seconds!

NON THREADED           11.772 secs
THREADED MAP            1.901 secs
THREADED IMAP           1.901 secs (first result accessible after 1.4 secs)

3. Threadpool imap_unordered

One more improvement: instead of receiving the results in input order, we’ll receive them in the order in which they finish:

strt_time_t_imap = time.perf_counter() 
with ThreadPool(processes=10) as t_pool: 
  for res in t_pool.imap_unordered(send_email_address_to_validation_api, email_addresses): 
    print(time.perf_counter() - strt_time_t_imap, 'seconds')

With the code above we can access the results as soon as they are available. You can also see this in the print output:

1.0979514 seconds 
1.2382307 seconds 
1.3781070 seconds 
1.4730333 seconds 
1.7439070 seconds 
1.7909826 seconds 
1.9953354 seconds

It’s entirely possible that the last email address completes in 1.09 seconds and is returned first. This is very convenient when the order of the results doesn’t matter.
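
One thing to keep in mind: because the results arrive in completion order, a bare True or False no longer tells you which email address it belongs to. A simple way around this is to return the input together with the result. In the sketch below, fake_validate and validate_with_label are hypothetical helpers, not part of the example above:

```python
import time
from multiprocessing.pool import ThreadPool


def fake_validate(email_address: str) -> bool:
    # Hypothetical stand-in for the API call
    time.sleep(len(email_address) / 100)
    return email_address.count("@") == 1


def validate_with_label(email_address: str) -> tuple[str, bool]:
    # Return the input alongside the result so we can tell the results apart
    return email_address, fake_validate(email_address)


email_addresses = ["thisis@@wrong.too", "a@b.com", "otheraddress.com"]
validity_by_address = {}
with ThreadPool(processes=10) as t_pool:
    for email_address, is_valid in t_pool.imap_unordered(validate_with_label, email_addresses):
        validity_by_address[email_address] = is_valid

print(validity_by_address)
```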

NON THREADED            11.772 secs
THREADED MAP             1.901 secs
THREADED IMAP            1.901 secs (first result accessible after 1.4 secs)
THREADED IMAP_UNORDERED  1.901 secs (first result accessible after 1.09 secs)

4. Different functions

In the previous examples we’ve gone through how to apply the same function in a threaded way but what if we have multiple ones? In the example below we simulate loading a web-page. We have different functions for loading banners, ads, posts and, of course, clickbait:

def load_ad():
    time.sleep(1)
    return "ad loaded"

def load_clickbait():
    time.sleep(1.5)
    return "clickbait loaded"

def load_banner():
    time.sleep(2)
    return "banner loaded"

def load_posts():
    time.sleep(3)
    return "posts loaded"

If we run these consecutively our program will take around 7.5 seconds. We can use the thread pool with its map, imap and imap_unordered functions with a small adjustment. See the imap_unordered example below:

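with ThreadPool(processes=4) as t_pool:  # limit to 4 threads as we only need to execute 4 functions
  # consume the results inside the with block; the pool is shut down when the block exits
  for result in t_pool.imap_unordered(lambda x: x(), [load_ad, load_posts, load_banner, load_clickbait]):
    print(result)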

As you can see, we map a lambda function over a list of functions. Each function in the list is executed by the lambda (the x is a placeholder for each function and the x() executes it). Executed this way, rendering our web page takes only 3.013 seconds.
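
The lambda trick works because none of these functions takes an argument. If each function needs its own arguments, the pool’s apply_async method is one option: it schedules a single call and returns a handle you can ask for the result later. The sketch below uses hypothetical variants of load_posts and load_ad that accept parameters:

```python
import time
from multiprocessing.pool import ThreadPool


def load_posts(user_id: int) -> str:
    # Hypothetical variant of load_posts that needs an argument
    time.sleep(0.3)
    return f"posts loaded for user {user_id}"


def load_ad(category: str, count: int) -> str:
    # Hypothetical variant of load_ad that needs two arguments
    time.sleep(0.1)
    return f"{count} {category} ads loaded"


with ThreadPool(processes=2) as t_pool:
    # apply_async schedules one call and immediately returns a handle
    posts_handle = t_pool.apply_async(load_posts, args=(42,))
    ads_handle = t_pool.apply_async(load_ad, args=("sports", 3))
    # get() blocks until that particular call has finished
    posts = posts_handle.get()
    ads = ads_handle.get()

print(posts)  # posts loaded for user 42
print(ads)    # 3 sports ads loaded
```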

Conclusion

Multithreading with a thread pool is safe and easy to apply. In summary: the ThreadPool class from multiprocessing.pool offers three functions for this. map is a concurrent version of Python’s built-in map and blocks until all results are in. The imap function returns an iterator that yields results in input order, so a slow early call holds back the results after it. The imap_unordered function returns an iterator that yields each result as soon as it’s done, without waiting for any other call to finish first.

I hope this article was as clear as I intended it to be; if not, please let me know what I can do to clarify further. In the meantime, check out my other articles on all kinds of programming-related topics.

Happy coding!

— Mike
