In this article, let's learn how to use the make_pipeline function of scikit-learn in Python.
The make_pipeline() function creates a Pipeline from the provided estimators. It is a shortcut for the Pipeline constructor: naming the estimators is neither required nor allowed. Instead, each step's name is set automatically to the lowercase name of its type. When we want to apply operations to data step by step, we can build a pipeline of all the estimators in sequence.
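To make the automatic naming concrete, here is a minimal sketch that compares make_pipeline() with the explicit Pipeline constructor. The step names "scaler" and "clf" in the second pipeline are arbitrary labels chosen for illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# make_pipeline derives each step name from the lowercased class name.
auto_named = make_pipeline(StandardScaler(), LogisticRegression())
step_names = [name for name, _ in auto_named.steps]
print(step_names)  # ['standardscaler', 'logisticregression']

# The Pipeline constructor needs an explicit (name, estimator) pair per step.
# "scaler" and "clf" are illustrative names, not required ones.
explicit = Pipeline(steps=[("scaler", StandardScaler()), ("clf", LogisticRegression())])
explicit_names = [name for name, _ in explicit.steps]
print(explicit_names)  # ['scaler', 'clf']
```

Both objects are ordinary Pipeline instances; only the way the step names are assigned differs.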
Syntax: make_pipeline(*steps, memory=None, verbose=False)
parameters:
- steps: list of Estimator objects: The scikit-learn estimators to chain together, in order.
- memory: str or object with the joblib.Memory interface, default=None: Used to cache the fitted transformers of the pipeline. No caching is done by default. If a string is given, it is the path to the cache directory. When caching is enabled, the transformers are cloned before they are fitted, so the transformer instance passed to the pipeline cannot be inspected directly. Use the named_steps or steps attribute to inspect the estimators within the pipeline. Caching the transformers is useful when fitting is time-consuming.
- verbose: bool, default=False: If True, the time elapsed while fitting each step is printed as the step completes.
returns:
p: Pipeline: A pipeline object is returned.
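The memory and verbose parameters can be sketched as follows. This is an illustrative example, not part of the original article: the synthetic data from make_classification and the temporary cache directory are assumptions made only to keep the snippet self-contained.

```python
import tempfile

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption for this sketch).
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

scaler = StandardScaler()
with tempfile.TemporaryDirectory() as cache_dir:
    # memory=<path> enables caching of fitted transformers;
    # verbose=True prints the elapsed time for each step during fit.
    pipe = make_pipeline(scaler, LogisticRegression(), memory=cache_dir, verbose=True)
    pipe.fit(X, y)

# With caching enabled, the pipeline fits a clone, so the instance we
# passed in is not the fitted one. Inspect the fitted step via named_steps.
original_is_fitted = hasattr(scaler, "mean_")
fitted_scaler = pipe.named_steps["standardscaler"]
clone_is_fitted = hasattr(fitted_scaler, "mean_")
print(original_is_fitted, clone_is_fitted)
```

A temporary directory is used here so the cache is cleaned up automatically; a persistent path is the more likely choice when the goal is to reuse fitted transformers across runs.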
This example starts by importing the necessary packages and loading the 'diabetes.csv' file. The data is separated into X and y, where X holds the independent feature variables and y holds the dependent (target) variable. train_test_split() splits X and y into training and test sets; test_size is 0.3, so 30% of the data is held out for testing. make_pipeline() creates a pipeline consisting of a standard scaler followed by a logistic regression model, so the scaler runs first and the model second. fit() fits the pipeline on the training data, predict() generates predictions for the test set, and accuracy_score() computes the accuracy of those predictions.
To read and download the dataset click here.
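The steps described above can be sketched as follows. This is a reconstruction under stated assumptions rather than the article's exact listing: it assumes 'diabetes.csv' sits in the working directory, that the target column is named "Outcome", and that random_state=42 is used for the split. The helper name fit_and_score is hypothetical. Because the split seed is an assumption, the accuracy it prints may differ from the reported output.

```python
from pathlib import Path

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_and_score(df, target_col="Outcome", random_state=42):
    """Fit a scaler + logistic regression pipeline and return test accuracy.

    target_col and random_state are assumptions made for this sketch.
    """
    # Independent features X and dependent variable y.
    X = df.drop(columns=[target_col])
    y = df[target_col]

    # Hold out 30% of the rows as the test set.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state
    )

    # The scaler runs first, then the logistic regression model.
    pipe = make_pipeline(StandardScaler(), LogisticRegression())
    pipe.fit(X_train, y_train)

    y_pred = pipe.predict(X_test)
    return accuracy_score(y_test, y_pred)


csv_path = Path("diabetes.csv")  # assumed file location
if csv_path.exists():
    print("accuracy score :", fit_and_score(pd.read_csv(csv_path)))
```

Wrapping the steps in a function keeps the sketch reusable with any DataFrame that has a single target column.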
Output:
accuracy score : 0.7878787878787878