
URL: https://www.geeksforgeeks.org/numpy/numpy-interview-questions/


NumPy Interview Questions with Answers

Last Updated : 20 Sep, 2025

NumPy is an open-source Python library used for numerical computing and handling large multi-dimensional arrays efficiently. In interviews, questions on NumPy are often asked to evaluate your understanding of array operations, mathematical functions and performance optimization. Below are some of the most frequently asked interview questions covering key NumPy topics.

1. What is NumPy and how to create a NumPy array?

NumPy is used for numerical and scientific computing. It offers support for arrays, matrices and a variety of mathematical operations that can effectively operate on these arrays.

We can create NumPy arrays using various methods. Here are some common ways to create NumPy arrays:

  1. np.array()
  2. np.zeros()
  3. np.ones()
  4. np.full()
  5. np.arange()
  6. np.linspace()
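
As a minimal sketch of the functions listed above (the shapes and values are illustrative assumptions, not taken from the original article):

```python
import numpy as np

a = np.array([1, 2, 3])         # build an array from a Python list
z = np.zeros((2, 3))            # 2x3 array filled with 0.0
o = np.ones(4)                  # four elements, all 1.0
f = np.full((2, 2), 7)          # 2x2 array filled with the value 7
r = np.arange(0, 10, 2)         # 0, 2, 4, 6, 8 (stop value is exclusive)
l = np.linspace(0.0, 1.0, 5)    # 5 evenly spaced points, endpoints included

print(a, z.shape, o, f, r, l, sep="\n")
```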

2. What are the main features of NumPy?

Here are some of the main features of NumPy:

  1. Fast and Efficient
  2. Mathematical Functions
  3. Broadcasting
  4. Integration with other libraries
  5. Multi-dimensional arrays
  6. Indexing and Slicing
  7. Memory Management

3. How do you calculate the dot product of two NumPy arrays?

To calculate the dot product of two NumPy arrays, we can use the numpy.dot() function or the @ operator:

1. Using numpy.dot() function:

a: The first input array (NumPy array).
b: The second input array (NumPy array).

2. Using the @ operator

For 1-D arrays, both methods return the dot product as a scalar value. For 2-D arrays, they perform matrix multiplication and return an array.
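
A short sketch of both forms, using illustrative arrays a, b and m that are assumptions rather than the article's original example:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

dot_func = np.dot(a, b)   # 1*4 + 2*5 + 3*6
dot_op = a @ b            # same computation via the @ operator
print(dot_func, dot_op)   # 32 32

# With 2-D inputs, both forms perform matrix multiplication.
m = np.array([[1, 2], [3, 4]])
print(m @ m)
```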

4. What is the difference between a shallow copy and a deep copy in NumPy?

NumPy offers two ways to copy an array: a shallow copy (a view, created for example with view() or by slicing) and a deep copy (created with copy()). The main differences between them are:

| Feature | Shallow Copy | Deep Copy |
| --- | --- | --- |
| Definition | A new array that is a view of the original array's data. | A completely new and independent array with its own copy of the data. |
| Memory | References the same memory location as the original array. | Allocates new memory, duplicating the data. |
| Duplication | No actual duplication of data; only references. | Full duplication of data is created. |
| Effect of Changes | Changes in the original array reflect in the shallow copy and vice versa. | Changes in the original array do not affect the deep copy and vice versa. |
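
The difference can be sketched as follows, assuming view() for the shallow copy and copy() for the deep copy (the variable names are illustrative):

```python
import numpy as np

original = np.array([1, 2, 3])

shallow = original.view()   # new array object, same underlying data
deep = original.copy()      # new array object with its own data

original[0] = 99

print(shallow[0])   # 99: the view sees the change
print(deep[0])      # 1: the copy is unaffected
print(np.shares_memory(original, shallow), np.shares_memory(original, deep))
```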

5. How do you reshape a NumPy array?

We can reshape a NumPy array by using the reshape() method or the np.reshape() function. Both change the dimensions of the array while keeping its elements and their total count unchanged.

1. Using the reshape() method:

2. Using the np.reshape() function:

In both cases, original_array is the existing NumPy array you want to reshape and new_shape is a tuple specifying the desired shape of the new array.
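
A minimal sketch of both forms, reusing the names original_array and new_shape from the text (the concrete values are assumptions):

```python
import numpy as np

original_array = np.arange(6)     # [0 1 2 3 4 5]
new_shape = (2, 3)

by_method = original_array.reshape(new_shape)
by_function = np.reshape(original_array, new_shape)
print(by_method)

# One dimension may be given as -1 and is then inferred from the total size.
inferred = original_array.reshape(3, -1)
print(inferred.shape)   # (3, 2)
```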

6. How to perform element-wise operations on NumPy arrays?

To perform element-wise operations on NumPy arrays, you can use standard arithmetic operators. NumPy automatically applies these operations element-wise when you use them with arrays of the same shape (or with shapes that can be broadcast together).
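
The original snippet is not shown here. The sketch below uses two assumed arrays, a and b, chosen so that the results agree with the output that follows:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])

addition = a + b
subtraction = a - b
multiplication = a * b
division = a / b          # true division, produces floats
power = a ** 2            # each element squared

print("Addition:", addition)
print("Subtraction:", subtraction)
print("Multiplication:", multiplication)
print("Division:", division)
print("Power:", power)
```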

Output:

Addition: [ 7 9 11 13 15]
Subtraction: [-5 -5 -5 -5 -5]
Multiplication: [ 6 14 24 36 50]
Division: [0.16666667 0.28571429 0.375 0.44444444 0.5 ]
Power: [ 1 4 9 16 25]

7. How to generate random numbers with NumPy?

NumPy provides a wide range of functions for generating random numbers. You can generate random numbers from various probability distributions, set seeds for reproducibility and more. Here are some common ways to generate random numbers with NumPy:

1. Using np.random.rand()

Generating a Random Float between 0 and 1 using np.random.rand()

2. Using np.random.randint()

Generating a Random Integer within a Range using np.random.randint().

3. Using np.random.randn()

4. Using np.random.seed()

We can set a seed using np.random.seed() to ensure that the generated random numbers are reproducible.
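
The four functions above can be sketched together as follows. The seed value 42 and the ranges are arbitrary assumptions:

```python
import numpy as np

np.random.seed(42)                # fix the seed for reproducibility

u = np.random.rand()              # one float in [0, 1)
i = np.random.randint(1, 10)      # one integer in [1, 10)
n = np.random.randn(3)            # three draws from the standard normal

np.random.seed(42)                # resetting the seed restarts the sequence
u_again = np.random.rand()        # same value as u

print(u, i, n, u == u_again)
```

Newer NumPy code often creates a generator with np.random.default_rng(seed) instead of seeding the global state, but the legacy functions shown here still work.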

8. How can you create a NumPy array from a Python list?

We can create a NumPy array from a Python list using the np.array() constructor provided by NumPy.
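
For example, with an illustrative list named py_list (an assumed name):

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
arr = np.array(py_list)        # 1-D array from a flat list

nested = [[1, 2], [3, 4]]
arr2d = np.array(nested)       # nested lists become a 2-D array

print(type(arr), arr.shape, arr2d.shape)
```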

9. How can you access elements in a NumPy array based on specific conditions?

We can access elements in a NumPy array based on specific conditions using boolean indexing. A condition such as arr > 3 produces a boolean array of True and False values, and indexing with that array returns only the elements where the condition is True.
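
A minimal sketch, assuming an input array chosen to agree with the output below:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

mask = arr > 3            # boolean array: [False False False  True  True]
selected = arr[mask]      # keeps only the elements where mask is True

print("Selected Elements (greater than 3):", selected)
```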

Output:

Selected Elements (greater than 3): [4 5]

10. What are some common data types supported by NumPy?

NumPy supports many data types (dtypes) that specify the kind of data stored in an array. The dtype controls how the data is stored in memory and handled during operations. Some common data types supported by NumPy include:

  1. int
  2. float
  3. complex
  4. bool
  5. object
  6. datetime

11. How can you concatenate two NumPy arrays vertically?

We can concatenate two NumPy arrays vertically (along the rows) using the np.vstack() function or the np.concatenate() function with the axis parameter set to 0. Here's how to do it with both methods:

1. Using np.vstack()

2. Using np.concatenate() with axis
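
Both methods can be sketched as follows, using two assumed single-row arrays:

```python
import numpy as np

a = np.array([[1, 2, 3]])
b = np.array([[4, 5, 6]])

stacked = np.vstack((a, b))                 # stack rows on top of each other
joined = np.concatenate((a, b), axis=0)     # same result along axis 0

print(stacked)
print(np.array_equal(stacked, joined))
```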

12. What is Matrix Inversion in NumPy?

Matrix inversion in NumPy refers to the process of finding the inverse of a square matrix. The identity matrix is produced when multiplying the original matrix by the inverse of the matrix. In other words, if A is a square matrix and A^(-1) is its inverse, then A * A^(-1) = I, where I is the identity matrix.

NumPy provides a convenient function called numpy.linalg.inv() to compute the inverse of a square matrix. Here's how you can use it:
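
The original snippet is not shown here. The sketch below uses the matrix that appears in the output that follows:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [0, 1, 4],
                   [5, 6, 0]])

inverse = np.linalg.inv(matrix)   # raises LinAlgError if the matrix is singular

print("Original Matrix:")
print(matrix)
print("Inverse Matrix:")
print(inverse)

# Sanity check: matrix @ inverse should be (numerically) the identity.
is_identity = np.allclose(matrix @ inverse, np.eye(3))
```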

Output:

Original Matrix:

[[ 1 2 3]
[ 0 1 4]
[ 5 6 0]]

Inverse Matrix:

[[-24. 18. 5.]
[ 20. -15. -4.]
[ -5. 4. 1.]]

13. Define the var and mean function in NumPy.

In NumPy, the var function is used to compute the variance of elements in an array or along a specified axis. Variance is a measure of the spread or dispersion of data points.

  • a: The input array for which you want to calculate the variance.
  • axis: Axis or axes along which the variance is computed. If not specified, the variance is calculated for the whole array. It can be an integer or a tuple of integers to specify multiple axes.
  • dtype: The data type used to compute the variance. If not specified, float64 is used for integer arrays; for floating-point arrays it matches the array's type.

The arithmetic mean (average) in NumPy can be calculated using numpy.mean(). It sums the elements along a specified axis, or over the whole array if no axis is given, and divides the sum by the number of elements.

  • a: The input array for which you want to calculate the mean.
  • axis : The axis or axes along which the mean is computed. If not specified, the mean is calculated over the entire array.
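
A short sketch of both functions on an assumed 2x3 array. Note that np.var() computes the population variance by default (ddof=0):

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

overall_mean = np.mean(data)           # mean of all six elements
column_means = np.mean(data, axis=0)   # one mean per column

overall_var = np.var(data)             # population variance (divides by N)
row_vars = np.var(data, axis=1)        # one variance per row
sample_var = np.var(data, ddof=1)      # sample variance (divides by N - 1)

print(overall_mean, column_means, overall_var, row_vars, sample_var)
```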

14. Convert a multidimensional array to 1D array.

You can convert a multidimensional array to a 1D array, also known as flattening the array, using various methods in NumPy. Two common methods are flatten() and ravel().

1. Using flatten():
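
A minimal sketch, assuming a 3x3 input chosen to agree with the output below:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

flat = arr.flatten()   # always returns a copy
print("one dimensional array", flat)
```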

Output:

one dimensional array [1 2 3 4 5 6 7 8 9]

2. Using ravel():
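
The same assumed 3x3 input, flattened with ravel():

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

raveled = arr.ravel()   # returns a view when the memory layout allows it
print("one dimensional array", raveled)
```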

Output:

one dimensional array [1 2 3 4 5 6 7 8 9]

Both of these methods will flatten the multidimensional array into a 1D array. The primary difference between them:

  • flatten() returns a new copy of the array. Modifications to the flattened array do not affect the original array.
  • ravel() returns a flattened view of the original array whenever possible. Changes made to the raveled array may affect the original array since they share the same data in memory.

15. How can you identify outliers in a NumPy array?

Outliers are data points that deviate significantly from the majority of the data and can adversely affect the results of data analysis. Here are two common approaches to identify them:

Identifying Outliers:

1. Calculate Descriptive Statistics: Compute basic statistics like the mean and standard deviation of the array to understand the central tendency and spread of the data. Points that lie more than a chosen number of standard deviations from the mean are flagged as outliers.
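
The original snippet is not shown here. In the sketch below, the data values and the threshold of 2 standard deviations are assumptions chosen to agree with the output that follows:

```python
import numpy as np

data = np.array([10, 50, 51, 52, 53, 54, 55, 300])

mean = np.mean(data)
std = np.std(data)

z_scores = (data - mean) / std       # distance from the mean in std units
threshold = 2                        # assumed cutoff
outliers = data[np.abs(z_scores) > threshold]

print("Outliers:", outliers)
```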

Output:

Outliers: [300]

2. Using IQR: IQR (Interquartile Range) is the difference between the 75th percentile (Q3) and the 25th percentile (Q1), representing the spread of the middle 50% of the data. Values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR are commonly treated as outliers.
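
A sketch of the IQR rule with the conventional 1.5 multiplier. The data values are assumptions chosen to agree with the output below:

```python
import numpy as np

data = np.array([10, 50, 51, 52, 53, 54, 55, 300])

q1 = np.percentile(data, 25)
q3 = np.percentile(data, 75)
iqr = q3 - q1

lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr
outliers = data[(data < lower_bound) | (data > upper_bound)]

print("Outliers:", outliers)
```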

Output:

Outliers: [ 10 300]

16. How do you remove missing or null values from a NumPy array?

We can remove NaN (missing) values by building a boolean mask with numpy.isnan() and keeping only the elements where the mask is False.
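
A minimal sketch, assuming the input array shown in the output below:

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan, 6])

nan_mask = np.isnan(arr)     # True where the value is NaN
filtered = arr[~nan_mask]    # ~ inverts the mask, keeping valid values

print("Original Array:", arr)
print("Filtered Array (without NaNs):", filtered)
```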

Output:

Original Array: [ 1. 2. nan 4. nan 6.]
Filtered Array (without NaNs): [1. 2. 4. 6.]

We can filter out missing or null data using a masked array or a boolean mask.
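
One way to do this with a masked array is sketched below. The input values are assumptions chosen to agree with the output that follows, and np.ma.masked_invalid() is one of several options:

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, 5])

masked = np.ma.masked_invalid(arr)   # masks NaN and inf entries
clean = masked.compressed()          # 1-D array of the unmasked values

print(clean)
```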

Output:

[1. 2. 4. 5.]

17. What is the difference between slicing and indexing in NumPy?

In NumPy, both slicing and indexing are fundamental operations for accessing and manipulating elements in arrays, but there are some important differences between them.

| Feature | Slicing | Indexing |
| --- | --- | --- |
| Definition | Extracts a range/subset of elements from an array. | Accesses specific elements or subsets from an array. |
| Syntax | Uses a colon (:) inside square brackets (e.g., arr[1:5]). | Uses square brackets with index values (e.g., arr[2], arr[1, 3]). |
| Output | Produces a contiguous block of elements. | Produces a single element or a set of specific elements. |
| Use Case | When you want a continuous slice of data. | When you want random or specific positions. |
| Example | arr[2:6] → elements from index 2 to 5. | arr[0] → first element, arr[[1,3,5]] → elements at indices 1, 3, and 5. |
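
The examples from the table can be sketched on an assumed array of ten values:

```python
import numpy as np

arr = np.arange(10, 20)          # [10 11 12 13 14 15 16 17 18 19]

sliced = arr[2:6]                # elements at indices 2, 3, 4, 5
first = arr[0]                   # single element
picked = arr[[1, 3, 5]]          # elements at indices 1, 3 and 5

print(sliced, first, picked)

# A basic slice is a view of the original data; an index list makes a copy.
print(np.shares_memory(arr, sliced), np.shares_memory(arr, picked))
```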

18. How can you create an array with the same values?

We can create a NumPy array with the same values using various functions and methods depending on your specific needs. Here are a few common approaches:

1. Using numpy.full():

You can use the numpy.full() function to create an array filled with a specific value. This function takes two arguments: the shape of the array and the fill value.

2. Using Broadcasting:

If you want to create an array of the same value repeated multiple times, you can use broadcasting with NumPy.

3. Using list comprehension:

You can also create an array with the same values using a list comprehension and then converting it to a NumPy array.
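
The three approaches can be sketched as follows. The fill value 7 and the shape (2, 3) are arbitrary assumptions, and the broadcasting variant shown is only one possible form:

```python
import numpy as np

value = 7
shape = (2, 3)

full = np.full(shape, value)                          # 1. numpy.full()
bcast = np.zeros(shape, dtype=int) + value            # 2. scalar broadcast over zeros
from_list = np.array([value for _ in range(6)]).reshape(shape)   # 3. list comprehension

print(full)
```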

19. What is a masked array in NumPy?

A masked array in NumPy is a special type of array that includes an additional Boolean mask, which marks certain elements as invalid or masked. This allows you to work with data that has missing or invalid values without having to modify the original data. Masked arrays are particularly useful when dealing with real-world datasets that may have missing or unreliable data points.

Example: Creating and Using a Masked Array
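
The original snippet is not shown here. The sketch below assumes that -999 is a sentinel for missing data, consistent with the output that follows:

```python
import numpy as np

data = np.array([1, 2, -999, 4, 5])

# Mask every entry equal to the sentinel value -999.
masked_data = np.ma.masked_equal(data, -999)
mean_value = masked_data.mean()      # masked entries are ignored

print("Original Data:", data)
print("Masked Data:", masked_data)
print("Mean (ignoring masked values):", mean_value)
```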

Output:

Original Data: [ 1 2 -999 4 5]
Masked Data: [1 2 -- 4 5]
Mean (ignoring masked values): 3.0

20. What is broadcasting in NumPy?

Broadcasting in NumPy is the ability of NumPy to perform arithmetic operations on arrays of different shapes and sizes without explicitly replicating the data.

  • If two arrays have different shapes, NumPy compares their dimensions starting from the last one. Two dimensions are compatible when they are equal or one of them is 1, and the size-1 dimension is stretched virtually to match the other.
  • This makes code more efficient and avoids unnecessary memory usage.

1. Broadcasting Scalar
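
A minimal sketch, assuming an input array chosen to agree with the output below:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

result = arr + 10     # the scalar 10 is broadcast to every element
print(result)
```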

Output:

[11 12 13 14 15]

2. Arrays with Different Shapes
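
A minimal sketch, assuming a (2, 3) array and a (3,) array chosen to agree with the output below:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
b = np.array([10, 20, 30])     # shape (3,)

result = a + b                 # b is broadcast across both rows of a
print(result)
```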

Output:

[[11 22 33]
[14 25 36]]

21. How do you sort a NumPy array in ascending or descending order?

To sort a NumPy array in ascending order we can use numpy.sort(). For descending order, one option is to combine numpy.argsort() with reversed indices. Here's how to do it:

1. Ascending Order: You can use the numpy.sort() function to sort your array in ascending order. The function will return a new sorted array, while still leaving the original array unchanged.
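
A minimal sketch, assuming an unsorted input chosen to agree with the output below:

```python
import numpy as np

arr = np.array([3, 1, 5, 2, 4])

ascending = np.sort(arr)       # returns a sorted copy; arr is unchanged
print("Ascending:", ascending)
```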

Output:

Ascending: [1 2 3 4 5]

2. Sorting in Descending Order: To sort a NumPy array in descending order, you can use the numpy.argsort() function to obtain the indices that would sort the array in ascending order and then reverse those indices to sort in descending order.
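
The original snippet is not shown here. The sketch below first reverses the argsort() indices as described. Because the output that follows contains a NaN in last position, it also shows a variant (an assumption about how that output could arise) that sorts on the negated values so that NaN stays at the end:

```python
import numpy as np

arr = np.array([3, 1, 5, 2, 4])
desc_idx = np.argsort(arr)[::-1]     # ascending indices, reversed
descending = arr[desc_idx]           # [5 4 3 2 1]

# argsort() places NaN last, so reversing would move NaN to the front.
# Sorting the negated values instead keeps NaN at the end.
with_nan = np.array([3.0, 1.0, np.nan, 5.0, 2.0, 4.0])
nan_last = with_nan[np.argsort(-with_nan)]

print("Descending:", nan_last)
```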

Output:

Descending: [ 5. 4. 3. 2. 1. nan]

22. How are NumPy Arrays better than Lists in Python?

NumPy arrays offer several advantages over Python lists when it comes to numerical and scientific computing. Here are some key reasons why NumPy arrays are often preferred:

  1. Performance
  2. Vectorization
  3. Broadcasting
  4. Multidimensional Arrays
  5. Memory Management
  6. Standardization

23. Difference between np.reshape() and np.resize()

| Feature | reshape() | resize() |
| --- | --- | --- |
| Definition | Returns a new view or copy of the array with a new shape. | The ndarray.resize() method modifies the array itself (in-place) to match the new shape; the np.resize() function returns a new array instead. |
| Original Array | Does not change the shape of the original array (a returned view still shares its data). | ndarray.resize() changes the original array directly; np.resize() leaves it unchanged. |
| Return Value | Returns the reshaped array (new object). | ndarray.resize() returns None (operation done in-place); np.resize() returns the resized array. |
| Data Handling | Requires that the total number of elements match the new shape. | If new size is bigger → ndarray.resize() fills with zeros, while np.resize() fills with repeated copies of the data. If smaller → array is trimmed. |
| Memory | Often returns a view (shares data) if possible, else a copy. | Creates/reallocates memory if needed. |
| Use Case | When you want a reshaped version of an array without altering the original. | When you want to permanently change the shape of the array, even if padding or truncating is needed. |
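
A short sketch contrasting reshape() with the two forms of resize(). The arrays are illustrative assumptions, and refcheck=False is passed only so the in-place call does not fail when other references to the array exist:

```python
import numpy as np

a = np.arange(6)
reshaped = a.reshape(2, 3)           # a itself keeps shape (6,)

b = np.arange(4)
repeated = np.resize(b, (2, 3))      # function: new array, data repeated

c = np.arange(4)
ret = c.resize((2, 3), refcheck=False)   # method: in-place, zero padded

print(reshaped.shape, a.shape)
print(repeated)
print(c, ret)
```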

24. Discuss uses of vstack() and hstack() functions?

These functions are used for combining arrays in different dimensions and are widely used in various data processing and manipulation tasks.

| Feature | vstack() | hstack() |
| --- | --- | --- |
| Definition | Stacks arrays vertically (row-wise). | Stacks arrays horizontally (column-wise). |
| Axis | Operates along axis=0. | Operates along axis=1 (axis=0 for 1-D arrays). |
| Requirement | Arrays must have the same number of columns. | Arrays must have the same number of rows. |
| Output Shape | Increases the number of rows. | Increases the number of columns. |
| Example | np.vstack(([1,2,3], [4,5,6])) → [[1,2,3],[4,5,6]] | np.hstack(([1,2,3], [4,5,6])) → [1,2,3,4,5,6] |
| Use Case | Useful when combining data points with same features. | Useful when combining features/variables for same data points. |
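
The table's examples can be run as follows. The 2-D arrays m and n are added as an assumption to show the column-wise case:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack((a, b))      # shape (2, 3): each input becomes a row
h = np.hstack((a, b))      # shape (6,): 1-D inputs are joined end to end

m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])
h2 = np.hstack((m, n))     # shape (2, 4): columns are appended

print(v, h, h2, sep="\n")
```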

25. How to get the eigenvalues and determinant of a matrix?

We can get the eigenvalues of a square matrix using the numpy.linalg.eigvals() method.

The Determinant of a square matrix is a unique number that can be derived from a square matrix. Using the numpy.linalg.det() method, NumPy gives us the ability to determine the determinant of a square matrix.
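
A minimal sketch on an assumed diagonal matrix, whose eigenvalues and determinant are easy to verify by hand. Both routines live in the numpy.linalg module:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

eigenvalues = np.linalg.eigvals(A)   # eigenvalues of a diagonal matrix are its diagonal
determinant = np.linalg.det(A)       # 2 * 3 = 6, up to floating-point rounding

print(eigenvalues, determinant)
```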

26. How to compare two NumPy arrays?

Method 1: Using == operator

Comparing two NumPy arrays with the == operator produces a new boolean array of element-wise results. Calling .all() on that result returns True only if every pair of elements is equal.
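
The original snippet is not shown here. The sketch below uses three assumed arrays chosen to agree with the output that follows:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
c = np.array([1, 2, 4])

same_ab = (a == b).all()   # every element matches
same_ac = (a == c).all()   # last element differs

print(same_ab)
print(same_ac)
```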

Output:

True
False

Method 2: Using array_equal()

This array_equal() function checks if two arrays have the same elements and same shape.
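
A short sketch with assumed arrays. Unlike ==, array_equal() returns False for mismatched shapes instead of trying to broadcast:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
c = np.array([[1, 2, 3]])      # same values, different shape

equal_ab = np.array_equal(a, b)
equal_ac = np.array_equal(a, c)

print(equal_ab, equal_ac)
```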

27. Calculate the QR decomposition of a given matrix using NumPy.

Decomposing a matrix into the form A = QR, where Q is an orthogonal matrix (its columns are orthonormal) and R is an upper-triangular matrix, is known as QR factorization. We can compute the QR decomposition of a given matrix using numpy.linalg.qr().

  • a: the (M, N) matrix to be factored.
  • mode: optional; one of 'reduced' (the default), 'complete', 'r' or 'raw'.
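
A minimal sketch on an assumed 3x2 matrix, using the default mode of numpy.linalg.qr(). The checks confirm the defining properties rather than specific values, since the signs of Q and R can vary:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

Q, R = np.linalg.qr(A)     # default mode: Q is (3, 2), R is (2, 2)

reconstructs = np.allclose(Q @ R, A)              # Q R reproduces A
orthonormal = np.allclose(Q.T @ Q, np.eye(2))     # columns of Q are orthonormal
upper = np.allclose(R, np.triu(R))                # R is upper triangular

print(Q.shape, R.shape, reconstructs, orthonormal, upper)
```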

28. What are ndarrays in NumPy?

An ndarray also known as "N-dimensional array" is a fundamental data structure used in NumPy for effectively storing and manipulating data, particularly numerical data. It is:

  • Multidimensional: Can represent 1D, 2D, 3D or higher-dimensional arrays.
  • Homogeneous: All elements must have the same data type.
  • Efficient: Optimized for mathematical and array-oriented operations.

29. What is Vectorization in Numpy?

Vectorization in NumPy means performing operations on entire arrays or vectors at once without using explicit loops. NumPy internally uses optimized C code, so vectorized operations are much faster than iterating through elements in Python.

  • Eliminates the need for explicit Python for loops.
  • Operations are applied element-wise on the whole array.
  • Improves performance and makes code more concise.
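
The original snippet is not shown here. The sketch below squares an assumed list of numbers both ways, in agreement with the output that follows:

```python
import numpy as np

numbers = [1, 2, 3, 4, 5]

# Explicit Python loop
loop_result = []
for x in numbers:
    loop_result.append(x ** 2)

# Vectorized: the operation is applied to the whole array at once
arr = np.array(numbers)
vectorized_result = arr ** 2

print("Using loop:", loop_result)
print("Using vectorization:", vectorized_result)
```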

Output:

Using loop: [1, 4, 9, 16, 25]
Using vectorization: [ 1 4 9 16 25]

30. Difference between np.copy(), view() and = assignment?

| Feature | = Assignment | .view() | .copy() |
| --- | --- | --- | --- |
| Definition | Just creates a new reference to the same array object. | Creates a shallow copy (new object but shares same data buffer). | Creates a deep copy (new object with its own data). |
| Memory | Same memory location (no duplication). | Different object, but data points to the same memory. | Completely independent memory allocation. |
| Object ID | Both variables have the same object ID. | Different object IDs, but share underlying data. | Different object IDs and separate memory. |
| Effect of Changes | Changes in one array reflect in the other. | Changes in data reflect in both arrays, but attributes like shape are independent. | No effect as arrays are independent. |
| Speed | Fastest (no copy at all). | Faster than deep copy (just metadata copy). | Slower (data duplication happens). |
| Syntax | b = a | b = a.view() | b = a.copy() |
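
The three behaviours can be sketched side by side, with illustrative variable names:

```python
import numpy as np

a = np.array([1, 2, 3])

b = a            # same object
v = a.view()     # new object, shared data
c = a.copy()     # new object, own data

a[0] = 100       # visible through b and v, but not c

v.shape = (3, 1)     # reshaping the view leaves a's shape untouched

print(b is a, v is a, v.base is a)   # True False True
print(b[0], v[0, 0], c[0])           # 100 100 1
print(a.shape, v.shape)              # (3,) (3, 1)
```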

31. What is the difference between shape and size attributes of NumPy array.

| Feature | shape | size |
| --- | --- | --- |
| Definition | Returns a tuple representing the dimensions of the array. | Returns the total number of elements in the array. |
| Type of Output | Tuple of integers, one per dimension. | Integer (single value). |
| Information Provided | Gives details about array dimensions, like number of rows, columns, etc. | Gives overall element count, ignoring shape. |
| Calculation | Tuple shows the length along each axis. | Product of all dimensions in the shape tuple. |
| Example | arr.shape → (3, 4) | arr.size → 12 |
| Use Case | Useful to know the structure of the array. | Useful to know the total elements for operations like reshaping or flattening. |
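
The table's example can be reproduced on an assumed 3x4 array:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

print(arr.shape)   # (3, 4)
print(arr.size)    # 12, i.e. 3 * 4
print(arr.ndim)    # 2, the number of entries in the shape tuple
```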

32. What is difference between python sequences, pandas array and numpy array?

| Feature | Python Sequences (list, tuple) | NumPy Array (ndarray) | Pandas Series/Array |
| --- | --- | --- | --- |
| Data Type | Can hold mixed data types like [1, "a", 3.5]. | Holds homogeneous data (all elements of same dtype). | Mostly homogeneous (like NumPy), but can also hold mixed or object dtype. |
| Dimensionality | Mostly 1D (lists/tuples). Nested lists can simulate higher dimensions but inefficient. | Supports n-dimensional arrays. | Mainly 1D (Series) or 2D (DataFrame); built on top of NumPy. |
| Performance | Slower, not memory-efficient (pure Python objects). | Fast, memory-efficient (C-based implementation). | Slightly slower than NumPy due to extra features but optimized for labeled data. |
| Indexing | Zero-based indexing and supports slicing. | Supports advanced indexing, slicing, boolean masking, broadcasting. | Supports indexing + labels, powerful alignment, missing value handling. |
| Operations | Element-wise operations require loops or comprehensions. | Vectorized element-wise operations, linear algebra, broadcasting. | Vectorized operations (inherited from NumPy) + axis-aware operations. |
| Use Case | General-purpose container. | Numerical computation, scientific computing, machine learning. | Data analysis, tabular data handling, missing values, statistics. |
| Example | lst = [1, 2, 3] | np.array([1, 2, 3]) | pd.Series([1, 2, 3]) |

33. How would you convert a pandas dataframe into NumPy array.

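You can use the DataFrame's .values attribute to convert a Pandas DataFrame into a NumPy array. The DataFrame.to_numpy() method does the same and is the form recommended in the pandas documentation.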
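
The original snippet is not shown here. The sketch below assumes a small DataFrame with columns named A and B, chosen to agree with the output that follows:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

arr = df.values            # NumPy array of the DataFrame's data
same = df.to_numpy()       # equivalent method form

print(arr)
```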

Output:

[[1 4]
[2 5]
[3 6]]

34. How would you reverse a numpy array?

We can reverse a NumPy array using the [::-1] slicing technique.
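
A minimal sketch, assuming an input array chosen to agree with the output below:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

reversed_arr = arr[::-1]   # step of -1 walks the array backwards (returns a view)
print(reversed_arr)
```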

Output:

[5 4 3 2 1]

35. Why NumPy is faster than list?

NumPy arrays are much faster than Python lists because of the way they are implemented:

  • Homogeneous Data: NumPy arrays store elements of the same data type, unlike lists that can store mixed types. This allows NumPy to use fixed-size memory blocks.
  • Contiguous Memory Allocation: NumPy stores data in continuous blocks of memory making element access and operations faster due to better CPU cache utilization.
  • Vectorization: Operations in NumPy are implemented in C and use vectorized code, so computations are applied to the whole array at once instead of looping in Python.
  • Low-Level Optimizations: NumPy relies on optimized C and Fortran libraries (like BLAS, LAPACK) which are much faster than Python's built-in loops.
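
The original benchmark is not shown here. The sketch below is one assumed way to time the same element-wise operation on a list and on an array. The measured times depend on the machine, so they will differ from the figures in the output that follows:

```python
import time
import numpy as np

n = 1_000_000                      # assumed problem size
py_list = list(range(n))
np_arr = np.arange(n)

start = time.time()
list_result = [x * 2 for x in py_list]    # Python-level loop
list_time = time.time() - start

start = time.time()
arr_result = np_arr * 2                   # vectorized in compiled code
arr_time = time.time() - start

print("Python List Time:", list_time)
print("NumPy Array Time:", arr_time)
```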

Output:

Python List Time: 0.05335259437561035
NumPy Array Time: 0.004484653472900391

36. What is the procedure to count the number of times a given value appears in an array of integers?

The bincount() function can be used to count the instances of a given value. It should be noted that the bincount() function takes boolean expressions or non-negative integers as arguments. Integers that are negative cannot be used.
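
A minimal sketch on an assumed array of small non-negative integers. The count_nonzero() line is an added alternative that also works when negative values are present:

```python
import numpy as np

arr = np.array([0, 1, 1, 3, 2, 1, 3])

counts = np.bincount(arr)     # counts[v] is the number of times v occurs
ones = counts[1]              # how often the value 1 appears

alt = np.count_nonzero(arr == 1)   # works for any dtype, including negatives

print(counts, ones, alt)
```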

37. How can you find the maximum or minimum value of an array in NumPy?

Using the max and min functions, we can determine array's maximum or minimum value in NumPy. These operations accept an array as an input and output the array's maximum or minimum value.
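
A minimal sketch, assuming an input array chosen to agree with the output below:

```python
import numpy as np

arr = np.array([1, 2, 3])

max_value = np.max(arr)    # arr.max() is equivalent
min_value = np.min(arr)    # arr.min() is equivalent

print("max value:", max_value)
print("min value:", min_value)
```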

Output:

max value: 3
min value: 1

38. How slicing and indexing can be used for data cleaning?

Both indexing and slicing are useful methods for cleaning data because they let you modify or filter data based on particular criteria or target particular data points for modification. In this example, negative values are located and replaced with zeros using boolean indexing, and a new array containing only the elements greater than 2 is then selected.
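
The original snippet is not shown here. In the sketch below, the two negative input values are assumptions; the rest agrees with the output that follows:

```python
import numpy as np

data = np.array([1, 2, -3, 4, 5, -6, 7])

data[data < 0] = 0            # replace negatives in place via a boolean mask
subset = data[data > 2]       # new array with the elements greater than 2

print("After replacing negatives with zeros:", data)
print("Subset with elements greater than 2:", subset)
```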

Output:

After replacing negatives with zeros: [1 2 0 4 5 0 7]
Subset with elements greater than 2: [4 5 7]

39. How can you find the unique elements in an array in NumPy?

Apply the unique function from the NumPy module to identify the unique elements in an array in NumPy. This function returns the array's unique elements in sorted order.
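
A minimal sketch, assuming an input with duplicates chosen to agree with the output below:

```python
import numpy as np

arr = np.array([5, 1, 2, 2, 3, 7, 4, 4, 6, 6, 1])

unique_values = np.unique(arr)    # sorted, duplicates removed
print(unique_values)
```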

Output:

[1 2 3 4 5 6 7]
