
URL: https://www.geeksforgeeks.org/machine-learning/ml-handle-missing-data-with-simple-imputer/


ML | Handle Missing Data with Simple Imputer

Last Updated : 28 Sep, 2021

SimpleImputer is a scikit-learn class for handling missing data in a dataset used for predictive modeling. It replaces each missing value (NaN by default) with a substitute chosen according to a specified strategy.
An imputer is created by calling the SimpleImputer() constructor, which accepts the following arguments, among others:
 

missing_values : The placeholder that marks a value as missing and therefore to be imputed. The default is np.nan.
strategy : The rule used to compute the replacement value. It can be 'mean' (default), 'median', 'most_frequent' or 'constant'. 'mean' and 'median' work only with numeric data; 'most_frequent' and 'constant' also work with strings.
fill_value : The constant that replaces missing values when strategy='constant'. If left unset, it defaults to 0 for numeric data and "missing_value" for strings or object data.
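
To make the difference between the strategies concrete, here is a minimal sketch that imputes the same single-column array with each one. The array `X`, the `results` dictionary and the fill value of -1 are illustrative assumptions and do not come from the article.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# One column with a single missing entry in the last row (illustrative data).
X = np.array([[1.0], [1.0], [4.0], [10.0], [np.nan]])

results = {}
for strategy in ("mean", "median", "most_frequent"):
    imputer = SimpleImputer(missing_values=np.nan, strategy=strategy)
    # fit_transform learns the column statistic and fills the gap in one step;
    # [-1, 0] picks out the value that replaced the NaN.
    results[strategy] = imputer.fit_transform(X)[-1, 0]

# The 'constant' strategy ignores the data and uses fill_value instead.
constant_imputer = SimpleImputer(strategy="constant", fill_value=-1)
results["constant"] = constant_imputer.fit_transform(X)[-1, 0]

for name, value in results.items():
    print(name, value)
```

With the observed values 1, 1, 4 and 10, the mean is 4.0, the median is 2.5 and the most frequent value is 1.0, so each strategy fills the gap with a different number.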
 


Code: Python code illustrating the use of SimpleImputer class.
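
A minimal sketch of such a program, assuming NumPy and scikit-learn are installed. It uses the same 3x3 data that appears in the output section and the 'mean' strategy; the variable names are illustrative, and the exact printed formatting may differ from the listing in that section.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# 3x3 dataset with one missing value in each column.
data = [[12, np.nan, 34],
        [10, 32, np.nan],
        [np.nan, 11, 20]]
print("Original Data : \n", data)

# Replace every NaN with the mean of its column.
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

# fit() learns the column means; transform() fills in the missing values.
imputer = imputer.fit(data)
imputed = imputer.transform(data)
print("Imputed Data : \n", imputed)
```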
 

Output 
 

Original Data : 

[[12, nan, 34]
[10, 32, nan]
[nan, 11, 20]]


Imputed Data : 

[[12, 21.5, 34]
[10, 32, 27]
[11, 11, 20]]


Remember: The mean, median or most frequent value is computed separately for each column of the matrix, using only the non-missing entries in that column.
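
One way to check the column-wise behaviour is to compare the values learned by the imputer with NumPy's own per-column mean. The sketch below assumes the fitted imputer exposes its fill values through the `statistics_` attribute and reuses the same 3x3 data; the name `column_means` is illustrative.

```python
import numpy as np
from sklearn.impute import SimpleImputer

data = np.array([[12, np.nan, 34],
                 [10, 32, np.nan],
                 [np.nan, 11, 20]])

imputer = SimpleImputer(strategy='mean').fit(data)

# Mean of each column (axis=0), ignoring NaN entries.
column_means = np.nanmean(data, axis=0)

print(imputer.statistics_)  # fill value learned for each column
print(column_means)         # same values computed directly with NumPy
```

Both arrays hold 11, 21.5 and 27, which are the values that replace the missing entries in the first, second and third columns.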
 
