
URL: https://www.geeksforgeeks.org/numpy/filtering-and-aggregating-data-with-numpy/


Filtering and Aggregating Data with NumPy

Last Updated : 14 Feb, 2026

Filtering and aggregating data with NumPy means selecting the required elements from arrays and computing summary values such as the sum, mean or minimum. Both kinds of operation work on whole arrays at once, so numerical data can be analyzed without writing explicit Python loops.

Filtering Data

Filtering data in NumPy is done using boolean conditions applied directly to arrays. The result is a new array containing only those elements that satisfy the given condition.

1. Values Above a Limit: This operation selects all elements whose values are greater than a given number.

  • np.random.randint(1, 11, size=(5,5)) generates random integers from 1 (inclusive) to 11 (exclusive)
  • size=(5,5) creates a 5×5 NumPy array
  • arr > 5 creates a boolean mask where values greater than 5 are marked True
  • arr[arr > 5] extracts and returns only the elements that satisfy the condition
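
A minimal sketch of the steps above. The fixed seed is an assumption added here so that reruns produce the same array; because the values are random, the result will not match the sample output that follows.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so reruns give the same array
arr = np.random.randint(1, 11, size=(5, 5))  # integers from 1 to 10, shape 5x5

mask = arr > 5     # boolean array with the same shape as arr
above = arr[mask]  # 1-D array holding the values where mask is True

print(arr)
print(above)
```

Note that indexing a 2-D array with a 2-D boolean mask returns a flat, one-dimensional array, which is why the output is a single row of values.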

Output
[ 7 6 8 10 6 9 10 9 6 6]

2. Even-Valued Elements: This approach extracts only those elements that are evenly divisible by 2.

  • arr % 2 == 0 creates a boolean mask that is True for even values
  • arr[arr % 2 == 0] returns only the even elements
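
A minimal sketch of the even-value filter, assuming the same kind of random 5×5 array as in the first example (the seed is an addition for repeatability, so the values will differ from the sample output).

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed for repeatable runs
arr = np.random.randint(1, 11, size=(5, 5))

even_mask = arr % 2 == 0  # True where the remainder after dividing by 2 is 0
evens = arr[even_mask]    # 1-D array of the even values

print(evens)
```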

Output
[ 4 2 2 4 2 10 10 4 8 8 4 2 6]

3. Multiple Conditions Combined: This method selects elements that satisfy more than one condition at the same time.
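
A minimal sketch of combining conditions. The two conditions used here (greater than 5 and even) are an assumption: they are consistent with the sample output but are not stated in the text. The seed is also an addition, so the values will differ from that output.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed for repeatable runs
arr = np.random.randint(1, 11, size=(5, 5))

# Assumed conditions: value is greater than 5 AND value is even.
# Each comparison needs its own parentheses because & binds more
# tightly than > and ==.
both = arr[(arr > 5) & (arr % 2 == 0)]

print(both)
```

The element-wise operator & is used instead of the Python keyword and, which does not work element by element on arrays.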


Output
[ 6 8 10 8 10]

4. Divisibility-Based Selection: This technique selects elements divisible by at least one of the specified numbers.

  • (arr % 3 == 0) | (arr % 7 == 0) combines the two conditions with an element-wise logical OR, so a value is kept if it is divisible by 3 or by 7
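
A minimal sketch of the OR condition, again assuming a seeded random 5×5 array (the seed is an addition, so the values will differ from the sample output).

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed for repeatable runs
arr = np.random.randint(1, 11, size=(5, 5))

# Keep values divisible by 3 or by 7 (or by both)
div_mask = (arr % 3 == 0) | (arr % 7 == 0)
divisible = arr[div_mask]

print(divisible)
```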

Output
[9 9 9 9 6 7 3 3 9]

5. Boolean Mask from Another Array: This method uses a separate boolean array to select specific rows.

  • arr1 is a one-dimensional boolean array with one entry per row of arr
  • arr[arr1] returns the rows whose entry in arr1 is True
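
A minimal sketch of row selection with a separate boolean array. The particular True/False pattern in arr1 is hypothetical; it was chosen only so that three rows are selected, as in the sample output. The seed is also an addition, so the values will differ.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed for repeatable runs
arr = np.random.randint(1, 11, size=(5, 5))

# Hypothetical selector: one flag per row, keeping rows 0, 2 and 4
arr1 = np.array([True, False, True, False, True])

selected = arr[arr1]  # 2-D result containing only the flagged rows

print(selected)
```

The selector must have the same length as the first axis of arr (5 here); otherwise NumPy raises an IndexError.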

Output
[[ 9 5 2 5 8]
 [10 3 1 5 3]
 [ 5 10 1 6 4]]

6. Condition Applied to a Single Row: This approach filters elements from a specific row based on a condition.

  • arr[2, :] selects the third row (index 2)
  • arr[2, :] > 5 builds a boolean mask for that row
  • arr[2][...] uses that mask to extract the matching values
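
A minimal sketch of filtering a single row, assuming a seeded random 5×5 array (the seed is an addition, so the values will differ from the sample output).

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed for repeatable runs
arr = np.random.randint(1, 11, size=(5, 5))

row = arr[2, :]        # third row (index 2), 5 values
mask = row > 5         # boolean mask for that row only
picked = arr[2][mask]  # values in the third row that are greater than 5

print(row)
print(picked)
```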

Output
[6 8 7]

Aggregating Data

Aggregation in NumPy refers to computing summary statistics over arrays. Functions such as sum, mean, standard deviation, minimum and maximum help analyze data across the entire array or along specific axes.

1. Total Sum: This operation calculates the sum of all elements in an array and also demonstrates how summation works along rows and columns.

  • np.sum(arr) computes the sum of all elements in the array
  • axis=0 sums down each column, giving one total per column
  • axis=1 sums across each row, giving one total per row
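
A minimal sketch of these calls. The input arrays and the names arr_1d and arr_2d are assumptions, chosen because they reproduce the sample output that follows.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])         # assumed 1-D input
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])  # assumed 2-D input

print(np.sum(arr_1d))          # 15
print(np.sum(arr_2d))          # 21
print(np.sum(arr_2d, axis=0))  # [5 7 9]  (one total per column)
print(np.sum(arr_2d, axis=1))  # [ 6 15]  (one total per row)
```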

Output
15
21
[5 7 9]
[ 6 15]

2. Average Value: This operation computes the mean (average) of array elements across the entire array or along a specified axis.

  • np.mean(arr) calculates the average of all values
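
A minimal sketch of the mean over a whole array and along each axis. The input arrays and their names are assumptions, chosen because they reproduce the sample output that follows.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])         # assumed 1-D input
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])  # assumed 2-D input

print(np.mean(arr_1d))          # 3.0
print(np.mean(arr_2d))          # 3.5
print(np.mean(arr_2d, axis=0))  # [2.5 3.5 4.5]  (mean of each column)
print(np.mean(arr_2d, axis=1))  # [2. 5.]        (mean of each row)
```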

Output
3.0
3.5
[2.5 3.5 4.5]
[2. 5.]

3. Spread of Values: This operation measures how much the values in the array vary from the mean using standard deviation.

  • np.std(arr) calculates the standard deviation of all elements; by default this is the population standard deviation (ddof=0)
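
A minimal sketch of the standard deviation over a whole array and along each axis. The input arrays and their names are assumptions, chosen because they reproduce the sample output that follows.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])         # assumed 1-D input
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])  # assumed 2-D input

std_1d = np.std(arr_1d)            # square root of 2
std_2d = np.std(arr_2d)            # spread over all six values
std_cols = np.std(arr_2d, axis=0)  # one value per column
std_rows = np.std(arr_2d, axis=1)  # one value per row

print(std_1d)
print(std_2d)
print(std_cols)
print(std_rows)
```

Passing ddof=1 to np.std would give the sample standard deviation instead, which produces larger values than those shown.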

Output
1.4142135623730951
1.707825127659933
[1.5 1.5 1.5]
[0.81649658 0.81649658]

4. Smallest and Largest Values: These operations identify the minimum and maximum values present in the array.

  • np.min(arr) returns the smallest value in the array
  • np.max(arr) returns the largest value in the array
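
A minimal sketch of the minimum and maximum over a whole array and along each axis. The input arrays, their names and the separator line are assumptions, chosen because they reproduce the sample output that follows.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])         # assumed 1-D input
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])  # assumed 2-D input

print(np.min(arr_1d))          # 1
print(np.min(arr_2d))          # 1
print(np.min(arr_2d, axis=0))  # [1 2 3]  (smallest value in each column)
print(np.min(arr_2d, axis=1))  # [1 4]    (smallest value in each row)
print("-" * 20)
print(np.max(arr_1d))          # 5
print(np.max(arr_2d))          # 6
print(np.max(arr_2d, axis=0))  # [4 5 6]  (largest value in each column)
print(np.max(arr_2d, axis=1))  # [3 6]    (largest value in each row)
```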

Output
1
1
[1 2 3]
[1 4]
--------------------
5
6
[4 5 6]
[3 6]