Applying Python multiprocessing in 2 lines of code

When and how to use multiple cores to execute many times faster

Mike Huls

Aug 25, 2022 — 8 min read

Multiple engines executing at the same time (image by Peter Pryharski on Unsplash)

In this article we’ll multi-process a function in just 2 lines of code. In our case this will result in a significant speed-up of our code. First we’ll get into when multiprocessing is a good idea, then we’ll see how to apply 3 types of multiprocessing and discuss when to apply which. Let’s code!

But first..

Before we dive into our subject on how to apply multi-processing we’ll have to prepare a few things. First we’ll discuss some terms and determine when multiprocessing is the way to go.

Then we’ll create an example-function that we can use as a demonstration in this article.

Concurrency vs parallelism — threading vs multiprocessing

There are two ways to “do things at the same time” in Python: threading and multi-processing. In this article we’ll focus on the latter. In short, the difference is:

threading runs code concurrently: we have one active CPU that quickly switches between multiple threads (check out the article below)
multiprocessing runs code in parallel: we have multiple active CPUs that each run their own code

So in this article we’ll demonstrate how to run code in parallel. If you’re interested in the difference between threading and multiprocessing and when to apply which, check out the article below for a more in-depth explanation.

Generally speaking multiprocessing is the right idea if your code involves lots of calculations and if each process is more or less independent (so processes don’t have to wait for each other / need another process’ output).

Creating an example for this article

In this article we’ll pretend to have a company that does image-analysis. Customers can send us one or multiple images, we analyze and send them back.

At the moment we only have one function: get_most_popular_color(); it receives an image path, loads the image and calculates the most common color. It returns the path to the image, the rgb-value of the color and the percentage of pixels that have this color. Check out the source-code here.

This function is suitable for MP because it has to calculate a lot; it has to loop over every pixel in the image.
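The article's own implementation is linked rather than shown. To give a feel for the work involved, here is a minimal sketch of the counting step only, assuming the pixels have already been loaded into a list of RGB tuples. The name most_common_color and the sample data are hypothetical, and image loading is left out entirely.

```python
from collections import Counter


def most_common_color(pixels):
    """Return the most frequent RGB tuple and the percentage of pixels that have it."""
    counts = Counter(pixels)  # one pass over every pixel
    color, count = counts.most_common(1)[0]
    return color, count / len(pixels) * 100


# Hypothetical 4-pixel "image": three red pixels and one blue pixel.
pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (255, 0, 0)]
print(most_common_color(pixels))
```

The cost grows with the number of pixels, which is why a multi-megabyte image keeps one core busy for a noticeable amount of time.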

The code part: applying our function normally and using MP

We receive 12 images from our clients; they are all between 0.2 and 2 MB. I’ve stored all paths to the images in a string-array like below. Before we apply multiprocessing to our function we’ll run the get_most_popular_color() function normally so we’ll have something to compare against.

image_paths = [
    'images/puppy_1.png',
    'images/puppy_2.png',
    'images/puppy_3.png',
    'images/puppy_4_small.png',
    'images/puppy_5.png',
    'images/puppy_6.png',
    'images/puppy_7.png',
    'images/puppy_8.png',
    'images/puppy_9.png',
    'images/puppy_10.png',
    'images/puppy_11.png',
    'images/puppy_12.png',
]

1. Normal way: running consecutively

The most obvious way to process our 12 images is to loop through them and process them one after the other:
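A minimal sketch of such a loop could look like this. The body of get_most_popular_color() here is a hypothetical stand-in that only simulates work, so the example runs without image files; the shortened path list, the helper name process_consecutively and the timing printout are assumptions as well.

```python
import time


def get_most_popular_color(image_path):
    """Hypothetical stand-in: simulates CPU work instead of analyzing a real image."""
    checksum = 0
    for i in range(200_000):
        checksum = (checksum + i * len(image_path)) % 256
    # Fake result in the shape the article describes: (path, rgb, percentage).
    return image_path, (checksum, checksum, checksum), 0.0


image_paths = ["images/puppy_1.png", "images/puppy_2.png", "images/puppy_3.png"]


def process_consecutively(paths):
    start = time.perf_counter()
    results = []
    for path in paths:
        result = get_most_popular_color(path)
        # Each result is printed as soon as its own call returns.
        print(f"{time.perf_counter() - start:.3f}s: {result}")
        results.append(result)
    return results


if __name__ == "__main__":
    process_consecutively(image_paths)
```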

Nothing special here. We just call the function for each image in an array of image_paths. After adding some printing to tell us a bit more about the execution time we get the following output:

As you see we process and print out the results of each image, one after the other. All in all the process takes a bit more than 8 seconds.

When to use this?
This method is suitable if time doesn’t matter and you want your results in order, one after the other. As soon as image1 is ready we can send the results back to the client.

2. Normal way: using map

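Another way of running the function is by applying Python’s map function and collecting all of its results before using them (for example with list(); in Python 3, map on its own is lazy and only runs the calls once something consumes its output). The main difference is that collecting the results blocks until all function calls have executed, meaning that we can only access the results after image 12 is processed. This may seem like a downgrade but it helps us understand the next parts of this article better: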
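As a sketch of this approach, with the same caveats as before: get_most_popular_color() is a hypothetical stand-in that only simulates work, and the helper name process_with_map and the timing printout are assumptions. The call to list() is what forces every function call to finish before any result is used.

```python
import time


def get_most_popular_color(image_path):
    """Hypothetical stand-in: simulates CPU work instead of analyzing a real image."""
    checksum = 0
    for i in range(200_000):
        checksum = (checksum + i * len(image_path)) % 256
    return image_path, (checksum, checksum, checksum), 0.0


image_paths = ["images/puppy_1.png", "images/puppy_2.png", "images/puppy_3.png"]


def process_with_map(paths):
    start = time.perf_counter()
    # list() consumes the lazy map object, so all calls run before we continue.
    results = list(map(get_most_popular_color, paths))
    for result in results:
        # Every line is printed only after the whole batch is done.
        print(f"{time.perf_counter() - start:.3f}s: {result}")
    return results


if __name__ == "__main__":
    process_with_map(image_paths)
```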

In the output below you’ll see that we can only access the results once each function has completed, in other words: using map on the functions blocks the results. You can see the consequence of this in the output:

All lines are printed at 8.324 seconds; the same time the whole batch takes. This proves that all functions have to complete before we can access the results.

When to use this?
When a single customer sends a batch of images we’ll want to process them and send back one message that contains all of the results. We are not going to send an email to a customer for each individual result.

3. Multiprocessing: map

We don’t want to wait for 8 seconds, that takes way too long! In this part we’ll apply a Pool object from the multiprocessing library. This simple and safe solution is very easy to apply in just 2 additional lines of code:

We’ll just take the code from the previous part, add a process pool and use the pool’s map function instead of Python’s default one like in the previous part:
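A minimal sketch of that change, assuming the same hypothetical stand-in for get_most_popular_color() (it only simulates work) and an assumed helper name process_with_pool_map:

```python
import time
from multiprocessing import Pool


def get_most_popular_color(image_path):
    """Hypothetical stand-in: simulates CPU work instead of analyzing a real image."""
    checksum = 0
    for i in range(200_000):
        checksum = (checksum + i * len(image_path)) % 256
    return image_path, (checksum, checksum, checksum), 0.0


image_paths = ["images/puppy_1.png", "images/puppy_2.png", "images/puppy_3.png"]


def process_with_pool_map(paths):
    start = time.perf_counter()
    with Pool() as mp_pool:  # create the process pool
        # Pool.map blocks until every call has finished and returns a list.
        results = mp_pool.map(get_most_popular_color, paths)
    for result in results:
        print(f"{time.perf_counter() - start:.3f}s: {result}")
    return results


if __name__ == "__main__":
    process_with_pool_map(image_paths)
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by launching a fresh interpreter (Windows, and macOS by default), each worker re-imports the main module, and unguarded pool code would run again in every worker.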

Isn’t it amazing that we can run our function in parallel with just 2 extra lines of code? Check out the results below:

The Pool.map function does the same thing as the map approach in the previous part: it executes all function calls and only then can you access the results. You can see this by the fact that all results are printed at 1.873 seconds. The big difference is that it runs the function in parallel: the 12 function calls are spread over the pool’s worker processes (by default roughly one per available CPU core), reducing execution time by more than 4x to under 2 seconds!

When to use this?
Like method #2 the results are blocked (inaccessible) until all functions have completed. Since they all run in parallel now we only have to wait for 2 seconds instead of 8. Still we can’t truly loop through the results like in #1 so this method is suitable for processing batches like in #2.

4. Multiprocessing with an iterator

In the previous part we’ve used the map function but there are alternatives. In this part we’ll check out imap. This function does roughly the same but instead of blocking until all function-calls are complete it returns an iterator that is accessible as soon as one call is finished:
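A sketch of the imap variant, again with a hypothetical stand-in for get_most_popular_color() and an assumed helper name process_with_imap. Note that the loop over the iterator sits inside the with block, because leaving the block shuts the pool down.

```python
import time
from multiprocessing import Pool


def get_most_popular_color(image_path):
    """Hypothetical stand-in: simulates CPU work instead of analyzing a real image."""
    checksum = 0
    for i in range(200_000):
        checksum = (checksum + i * len(image_path)) % 256
    return image_path, (checksum, checksum, checksum), 0.0


image_paths = ["images/puppy_1.png", "images/puppy_2.png", "images/puppy_3.png"]


def process_with_imap(paths):
    start = time.perf_counter()
    results = []
    with Pool() as mp_pool:
        # imap returns an iterator right away; each step of the loop waits
        # only for the next result in input order.
        for result in mp_pool.imap(get_most_popular_color, paths):
            print(f"{time.perf_counter() - start:.3f}s: {result}")
            results.append(result)
    return results


if __name__ == "__main__":
    process_with_imap(image_paths)
```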

As you see the code is almost exactly the same as in the previous part, except we changed map to imap. Adding this one letter has some impact on the results however:

The difference is slight but noticeable: the time at which the functions are done is not the same for each function call like in the previous part. The imap function hands the calls to the pool’s worker processes, which run them in parallel. Then it returns each result in input order as soon as it’s ready. That is the reason why some calls finish so close after one another and others take a bit longer. In this sense imap resembles the ‘normal’ Python way of executing a function like in #1.

When to use this?
When multiple clients each send a picture that we have to process we can now do so in parallel with the imap function. Think of this function as a parallel version of #1; it’s a normal loop but much faster.

5. Multiprocessing with an iterator ignoring input-order

The last method is imap_unordered. The call is almost identical:
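A sketch of the imap_unordered variant. To make the effect visible without real images, the hypothetical stand-in below does less simulated work when the path contains "small"; that rule, the shortened path list and the helper name process_with_imap_unordered are all assumptions for illustration.

```python
import time
from multiprocessing import Pool


def get_most_popular_color(image_path):
    """Hypothetical stand-in: 'small' images get less simulated work."""
    iterations = 50_000 if "small" in image_path else 2_000_000
    checksum = 0
    for i in range(iterations):
        checksum = (checksum + i * len(image_path)) % 256
    return image_path, (checksum, checksum, checksum), 0.0


image_paths = [
    "images/puppy_1.png",
    "images/puppy_2.png",
    "images/puppy_4_small.png",
]


def process_with_imap_unordered(paths):
    start = time.perf_counter()
    results = []
    with Pool() as mp_pool:
        # Results are yielded in completion order, not input order.
        for result in mp_pool.imap_unordered(get_most_popular_color, paths):
            print(f"{time.perf_counter() - start:.3f}s: {result}")
            results.append(result)
    return results


if __name__ == "__main__":
    process_with_imap_unordered(image_paths)
```

Because the order is not guaranteed, each result carries its own image path; that is what lets us match a result back to the client who sent the image.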

Again this call closely resembles the previous part; its only difference is that it returns each result as soon as it’s ready, regardless of the input order:

We still finish in under 2 seconds but the order in which we finish is much different. Notice that images/puppy_4_small.png gets returned first. This is not surprising since this image is much smaller. It doesn’t have to await other function-calls that happen to be slower. Analyzing this output you might even notice that I’ve been lazy and copied our input-images.

When to use this?
This function is an upgraded version of #1 and #4: it resembles a normal for loop but it executes all functions in parallel and gives you access to the results as soon as any function is ready. With this function clients with small images don’t have to wait for big images to finish before receiving their results.

Limiting the number of processes/cores

It is very easy to limit the maximum number of processes that the Pool will run at any given time: just add the processes argument when you instantiate the Pool like below:

with Pool(processes=2) as mp_pool:
    ... rest of the code

Conclusion

Adding multiple processes to make our code run in parallel isn’t difficult; the challenge lies in knowing when to apply which technique. In summary: we’ve looked at three functions of the Pool object from the multiprocessing library. The map function is a parallel version of Python’s built-in map that blocks until all results are ready. The imap function returns an ordered iterator; accessing a result blocks until that result, and every one before it, is ready. The imap_unordered function returns an unordered iterator, making it possible to access each result as soon as it’s done, without waiting for another function call first.

I hope this article was as clear as I intended it to be, but if this is not the case please let me know what I can do to clarify further. In the meantime, check out my other articles on all kinds of programming-related topics.

Happy coding!

— Mike

P.S: like what I’m doing? Follow me!