Feature engineering has been at the core of nearly every hackathon-winning solution. It has become the de facto option when you're looking to differentiate your solution from the competition. But it's often difficult to engineer new features from the dataset you've been given, and the process consumes a lot of time and energy.
This is where the tool set from Feature Labs comes into play. The company has released "Featuretools", an open-source framework for automating feature engineering.
The company built it around a process called Deep Feature Synthesis (DFS). According to Feature Labs CEO Max Kanter, DFS creates features from raw relational and transactional datasets, such as visits to a website or abandoned cart items, and automatically converts them into predictive signals. The above image gives you a general idea of how the tool works.
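To make the idea concrete, here is a minimal, hand-rolled sketch of the kind of step DFS automates: summarising a child table (transactions) into one row of features per parent (customer). This is not the Featuretools API. The table contents, the `PRIMITIVES` mapping and the `aggregate_features` helper are all hypothetical names chosen for illustration, and the real tool goes further by discovering and stacking such steps across related tables.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: a parent table (customers) and a child table (transactions).
customers = [{"customer_id": 1}, {"customer_id": 2}]
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 5.0},
]

# Illustrative aggregation "primitives": each maps a list of child values to one number.
PRIMITIVES = {"COUNT": len, "SUM": sum, "MEAN": mean, "MAX": max}


def aggregate_features(parents, children, key, column, child_name):
    """Build one feature row per parent by applying every primitive to its child rows."""
    # Group the chosen child column by the parent key.
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[column])

    feature_rows = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        features = {key: parent[key]}
        for name, func in PRIMITIVES.items():
            feature_name = f"{name}({child_name}.{column})"
            if values:
                features[feature_name] = func(values)
            else:
                # No child rows: the count is 0, other aggregates are undefined.
                features[feature_name] = 0 if name == "COUNT" else None
        feature_rows.append(features)
    return feature_rows


feature_matrix = aggregate_features(
    customers, transactions, key="customer_id", column="amount", child_name="transactions"
)
for row in feature_matrix:
    print(row)
```

Each printed row is a candidate set of model inputs for one customer. Writing these aggregations by hand for every column and every relationship is the repetitive work that automated feature engineering aims to remove.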
It works with both Python 2 and 3, and has been designed to fit alongside common frameworks like Pandas for data preparation and scikit-learn for machine learning.
According to their official website, the tool was "tested against 1000 data scientists in three world wide competitions. On average, Feature Labs performed as well as top human competitors and only required 1/10th of the time".
Early customers of the company include Spanish bank BBVA and developers at MIT. In fact, they've published a case study on how BBVA used Featuretools to create a credit card fraud detection system. You can view it here.
Feature engineering is one of the most important steps in any machine learning pipeline. Whether it's differentiating your ML algorithm in a hackathon, or creating features to get the most out of your data as an organization, it's a critical technique.
This release will not only save the user (or company) a lot of time, it will also let them shift their focus to other areas of the data science life cycle. The fact that it's available for Python and can be used with common frameworks is a huge plus.
Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
This is truly awesome. Will save a whole lot of time, but will be interesting to see its practical implementation. Has it been released already?
Hi Fawad, Yes it's available on Feature Lab's website (link is in the article above).
Can you guys write a demo post on this? I've gone through their examples on git but am looking for more information.