URL: https://towardsdatascience.com/scikit-learn-vs-sklearn-6944b9dc1736/

Is There Any Difference Between Scikit-Learn and Sklearn?

scikit-learn vs sklearn in Python

Photo by Anastasia Zhenina on Unsplash

Introduction

scikit-learn is one of the most widely used Python packages for Machine Learning. However, many newcomers are confused by its naming, because it appears under two distinct names: scikit-learn and sklearn.

In today’s short article, we will discuss whether there’s any difference between the two packages in the first place. Additionally, we’ll discuss whether it matters which one you install and import in your source code.


What is scikit-learn

The project started in 2007 as part of the Google Summer of Code, and the first public release was made in early 2010.

scikit-learn is an open source Machine Learning Python package that offers functionality supporting supervised and unsupervised learning. Additionally, it provides tools for model development, selection and evaluation as well as many other utilities including data pre-processing functionality.

More specifically, scikit-learn’s main functionality includes classification, regression, clustering, dimensionality reduction, model selection and pre-processing. The library is simple to use and, most importantly, efficient, as it is built on NumPy, SciPy and matplotlib.


Is there any difference between scikit-learn and sklearn?

The short answer is no. scikit-learn and sklearn both refer to the same package; however, there are a couple of things you need to be aware of.

Firstly, you can install the package using either the scikit-learn or the sklearn identifier; however, it is recommended to install it through pip using the scikit-learn identifier.

If you install the package using the sklearn identifier and then run pip list you will notice the annoying sklearn 0.0 entry:

$ pip install sklearn
$ pip list
Package       Version
------------- -------
joblib        1.0.1
numpy         1.21.2
pip           19.2.3
scikit-learn  0.24.2
scipy         1.7.1
setuptools    41.2.0
sklearn       0.0
threadpoolctl 2.2.0

Additionally, if you now attempt to uninstall sklearn, only the sklearn entry disappears; scikit-learn itself remains installed:

$ pip uninstall sklearn
$ pip list
Package       Version
------------- -------
joblib        1.0.1
numpy         1.21.2
pip           19.2.3
scikit-learn  0.24.2
scipy         1.7.1
setuptools    41.2.0
threadpoolctl 2.2.0

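Essentially, sklearn is a dummy project on PyPI that in turn installs scikit-learn. Therefore, if you uninstall sklearn you are only removing the dummy package, not the actual library. Note that the sklearn project on PyPI has since been deprecated in favour of scikit-learn, so on a current setup pip install sklearn may simply fail with an error instead of installing anything.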
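
To check which of the two distributions is actually present in an environment, you can also query the installed package metadata from Python rather than reading the pip list output. The following is a minimal sketch that assumes Python 3.8 or newer (for importlib.metadata); the helper name installed_version is made up for illustration and is not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version of a PyPI distribution, or None if absent.

    `installed_version` is a hypothetical helper name used for illustration.
    """
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Check both names that show up in the `pip list` output.
for name in ("scikit-learn", "sklearn"):
    print(f"{name}: {installed_version(name)}")
```

In an environment where only scikit-learn was installed, the second line should report None for sklearn, which mirrors the pip list output after uninstalling the dummy package.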


Regardless of how you installed scikit-learn, you must import it in your code using the sklearn identifier:

import sklearn

If you attempt to import the package using the scikit-learn identifier, you will end up with a SyntaxError:

>>> import sklearn
>>> import scikit-learn
  File "<stdin>", line 1
    import scikit-learn
                 ^
SyntaxError: invalid syntax
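
The SyntaxError follows from Python's naming rules: the name in an import statement must be a valid identifier, and a hyphen is parsed as the minus operator. A small sketch using only the standard library illustrates the difference between the two names, and also shows one way to test whether sklearn is importable without actually importing it:

```python
import importlib.util

# An import statement needs a valid Python identifier; hyphens are not allowed.
print("sklearn".isidentifier())       # True
print("scikit-learn".isidentifier())  # False

# find_spec() returns None when a top-level module cannot be found,
# so this reports whether sklearn is importable without importing it.
spec = importlib.util.find_spec("sklearn")
print("sklearn importable:", spec is not None)
```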

Even if you try to import it with __import__() in order to deal with the hyphen in the package’s name, you will still get a ModuleNotFoundError:

>>> __import__('scikit-learn')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'scikit-learn'
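
The same behaviour can be reproduced with importlib.import_module, the usual programmatic alternative to __import__(). The sketch below uses only the standard library and catches the error instead of letting it propagate; try_import is a hypothetical helper name chosen for this example.

```python
import importlib


def try_import(module_name):
    """Try to import a module by name; return it, or None if it is not found.

    `try_import` is a hypothetical helper used only for illustration.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        print(f"Could not import {module_name!r}: {exc}")
        return None


# 'scikit-learn' is the distribution name, not the module name, so in a
# normal environment this lookup fails even when scikit-learn is installed.
try_import("scikit-learn")
```

The name passed to the import machinery has to be the module name (sklearn), which is independent of the name used on PyPI.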

Final Thoughts

In today’s short article, we attempted to shed some light on scikit-learn and sklearn, since a lot of beginners seem to be confused about which term to use when developing ML functionality in Python.

In general, you are advised to install the library using the scikit-learn identifier (i.e. pip install scikit-learn) but in your source code, you must import it using the sklearn identifier (i.e. import sklearn).



Written By

Giorgos Myrianthous
