Prerequisite - Multiprocessing
Multiprocessing allows code to run in parallel, and Python offers two routes to concurrency: the multiprocessing module and the threading module. Python 3.2 introduced a class called ProcessPoolExecutor in the concurrent.futures module to create and manage a pool of processes efficiently. But if Python already had a built-in multiprocessing module, why was a new one introduced? In short, concurrent.futures wraps multiprocessing in a higher-level interface: the same Executor API (submit, map, shutdown) drives both process pools and thread pools, and the executor takes care of starting, tracking, and cleaning up the workers.
Syntax:
concurrent.futures.ProcessPoolExecutor(max_workers=None, mp_context=None, initializer=None, initargs=())
Parameters:
- max_workers: The number of worker processes, i.e. the size of the pool. If it is None, it defaults to the number of processors on the machine. On Windows, max_workers must be at most 61; when it is None there, at most 61 processes are created even if more processors are available.
- mp_context: The multiprocessing context. If it is None, the default multiprocessing context is used. It lets the user control the start method (for example fork or spawn) of the worker processes.
- initializer: An optional callable that is invoked at the start of each worker process.
- initargs: A tuple of arguments passed to initializer.
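To make the parameters concrete, here is a minimal sketch that passes all four of them. The names `init_worker`, `describe`, and the label `"demo"` are hypothetical and chosen only for illustration. The choice of the "spawn" start method is likewise an assumption, not something the article prescribes.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor

# Hypothetical per-process state, filled in once by the initializer.
_worker_label = None


def init_worker(label):
    """Runs once in every worker process before it accepts tasks."""
    global _worker_label
    _worker_label = label


def describe(n):
    """Return the task input together with the state set by init_worker."""
    return n, _worker_label, os.getpid()


def main():
    # Ask explicitly for the "spawn" start method (an illustrative choice).
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(
        max_workers=2,            # size of the pool
        mp_context=ctx,           # controls how workers are started
        initializer=init_worker,  # called once per worker process
        initargs=("demo",),       # arguments for the initializer
    ) as executor:
        for n, label, pid in executor.map(describe, range(4)):
            print(f"task {n} ran in process {pid} with label {label!r}")


if __name__ == "__main__":
    main()
```

Every task should report the label set by the initializer, and the process IDs should come from at most two distinct workers. The `if __name__ == "__main__":` guard matters here, because with "spawn" each worker re-imports the main module.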
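ProcessPoolExecutor Methods: The ProcessPoolExecutor class exposes the following methods to execute calls asynchronously in worker processes.
- submit(fn, *args, **kwargs): Schedules fn(*args, **kwargs) to run and returns a Future object representing the call.
- map(fn, *iterables, timeout=None, chunksize=1): Applies fn to the items of the iterables in worker processes and returns an iterator over the results, in input order.
- shutdown(wait=True, *, cancel_futures=False): Signals the executor to free its resources once the pending futures are done. The cancel_futures argument is available from Python 3.9.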
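The executor's methods can be sketched as follows, using submit(), map(), and shutdown() from concurrent.futures. The helper names `square`, `run_with_submit`, and `run_with_map` are hypothetical and exist only for this illustration.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def square(n):
    """Work that runs inside a worker process."""
    return n * n


def run_with_submit(numbers):
    """submit() schedules one call and returns a Future immediately."""
    executor = ProcessPoolExecutor(max_workers=2)
    try:
        futures = {executor.submit(square, n): n for n in numbers}
        # as_completed() yields futures in the order they finish.
        return {futures[f]: f.result() for f in as_completed(futures)}
    finally:
        # shutdown(wait=True) blocks until pending work is done,
        # then frees the worker processes.
        executor.shutdown(wait=True)


def run_with_map(numbers):
    """map() yields results in the same order as the inputs."""
    with ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(square, numbers))


if __name__ == "__main__":
    print(run_with_submit([1, 2, 3]))
    print(run_with_map([1, 2, 3]))
```

The dictionary built from submit() maps each input to its square, but its key order follows completion order and may change between runs. The list built from map() always follows input order. Using the executor in a with-block, as in `run_with_map`, calls shutdown() automatically on exit.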
The code below demonstrates the use of ProcessPoolExecutor. Notice that, unlike with the multiprocessing module, we do not have to start each process in a loop, keep track of the processes in a list, wait for them with join for synchronization, or release the resources after the processes finish. The executor, used as a context manager, handles all of this under the hood, which makes the code more compact and less error-prone.
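A minimal sketch of such a program is given here. The function name `cube`, the inputs 2 to 6, and the pool size of 5 are assumptions chosen to match the output that follows; they are not taken from the original listing.

```python
from concurrent.futures import ProcessPoolExecutor

values = [3, 4, 5, 6]


def cube(x):
    """Print and return the cube of x; runs inside a worker process."""
    result = x * x * x
    print(f"Cube of {x}:{result}", flush=True)
    return result


def main():
    # The with-block waits for all tasks and releases the workers on exit,
    # so no explicit start(), join(), or cleanup calls are needed.
    with ProcessPoolExecutor(max_workers=5) as executor:
        executor.submit(cube, 2)                    # a single task
        results = list(executor.map(cube, values))  # one task per value
    return results


if __name__ == "__main__":
    main()
```

Because each line is printed by whichever worker picks up the task, the order of the printed lines can differ from run to run.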
Output:
Cube of 2:8
Cube of 3:27
Cube of 6:216
Cube of 4:64
Cube of 5:125
The code below fetches images over the internet by making HTTP requests, using the requests library. The first section of the code calls the API one request at a time, so the download is slow, whereas the second section makes the requests in parallel using multiple processes.
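The sequential-versus-parallel comparison can be sketched as follows. To stay self-contained, this sketch swaps the real HTTP request for `time.sleep`, so it imitates network latency rather than downloading anything. The URLs, the `delay` value, and the function names are hypothetical, and the timings it prints will not match the figures reported in the article.

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor

# Hypothetical placeholder URLs; nothing is actually requested from them.
IMAGE_URLS = [f"https://example.com/image_{i}.jpg" for i in range(6)]


def download(url, delay=0.2):
    """Stand-in for an HTTP fetch: sleep to imitate network latency.

    With the requests library, this body would instead be something like
    `return requests.get(url).content`.
    """
    print(f"[Process ID]:{os.getpid()} Downloading..", flush=True)
    time.sleep(delay)
    return url


def download_sequential(urls):
    """Fetch the URLs one after another in the current process."""
    start = time.perf_counter()
    results = [download(u) for u in urls]
    return results, time.perf_counter() - start


def download_parallel(urls, workers=3):
    """Fetch the URLs through a pool of `workers` processes."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as executor:
        results = list(executor.map(download, urls))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    print("Downloading images with single process")
    _, took = download_sequential(IMAGE_URLS)
    print(f"Single Process Code Took :{took} seconds")
    print("*" * 50)
    print("Downloading images with Multiprocess")
    _, took = download_parallel(IMAGE_URLS, workers=3)
    print(f"Multiprocess Code Took:{took} seconds")
```

Changing the `workers` argument is one way to observe how the pool size affects the elapsed time.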
You can try the various parameters discussed above to see how they affect the speedup. For example, with a process pool of 6 instead of 3, the speedup is more significant.
Output:
Downloading images with single process
Downloading..
Downloading..
Downloading..
Downloading..
Downloading..
Downloading..
Single Process Code Took :1.2382981777191162 seconds
**************************************************
Downloading images with Multiprocess
[Process ID]:118741 Downloading..
[Process ID]:118742 Downloading..
[Process ID]:118740 Downloading..
[Process ID]:118741 Downloading..
[Process ID]:118742 Downloading..
[Process ID]:118740 Downloading..
Multiprocess Code Took:0.8398590087890625 seconds