![]() |
VOOZH | about |
Vectorization in NumPy refers to applying operations on entire arrays without using explicit loops. These operations are internally optimized using fast C/C++ implementations, making numerical computations more efficient and easier to write.
Vectorization is important because it:
Performs element-wise addition across the entire array without using loops, making the operation fast and efficient.
[ 4 6 8 10 12]
Performs element-wise addition of two NumPy arrays.
[5 7 9]
Multiplies each element in the array by a constant value using fast vectorized array operations instead of loops.
[2 4 6 8]
Logical operations such as comparisons can be applied directly to arrays.
[False True True]
Explanation: Performs element-wise comparison, returning a boolean array indicating which elements are greater than 15.
NumPy supports vectorized matrix operations like dot products and matrix multiplications using functions such as np.dot and @.
[[19 22] [43 50]]
Explanation: Performs matrix multiplication (dot product) between a1 and a2.
np.vectorize applies a custom function element-wise to a NumPy array, e.g., computing x² + 2x + 1 for each element efficiently.
[ 4 9 16 25]
Explanation: Performs the operation x**2+2*x+1 element-wise on the array a1 using NumPy’s vectorized arithmetic.
Operations like sum, mean, max are optimized with much faster than the traditional Python approach of looping through elements.
6 2.0
Explanation: Calculates the sum (r1) and mean (r2) of all elements in the array a1 using NumPy’s vectorized aggregation functions.
When working with large datasets, performance matters. In Pandas and NumPy, vectorization is almost always faster than writing manual Python loops. This is because vectorized operations are executed in optimized C code internally, while Python loops run line-by-line in Python (much slower).
Example: We will create a large NumPy array and apply the same operation (multiply each element by 2) using both:
Loop Time: 0.14799761772155762 Vectorized Time: 0.04310011863708496
Explanation:
Vectorization is significantly faster because operations happen in optimized low-level code instead of Python's slow element-by-element loop.