URL: https://www.analyticsvidhya.com/blog/2015/09/selection-techniques-ensemble-modelling/

Use Of Forward Selection Techniques for Ensemble Modeling


Learn to use Forward Selection Techniques for Ensemble Modeling

Tavish Srivastava Last Updated : 26 Jun, 2020
3 min read

Introduction

Ensemble methods can add much-needed robustness and accuracy to both supervised and unsupervised problems. Machine learning will keep evolving as computational power becomes cheaper and the volume of data continues to grow. In such a scenario, there is a limit to the improvement you can achieve by sticking with a single framework and trying to improve its predictive power (by modifying variables).

Ensemble Modeling follows the philosophy of 'Unity in Strength', i.e. combining diversified base models strengthens weak models. The success of ensemble techniques spans multiple disciplines such as recommendation systems, anomaly detection, stream mining, and web applications, where the need to combine competing models is ubiquitous.

If you wish to experience the power of ensembles, try using a supervised and an unsupervised model for a single task and merge their results. You'll often find that the merged result performs better than either model alone.

Last week, we talked about a simple method to ensemble multiple learners through neural networks. We created a black box, which took in all learners and gave us the final ensemble predictions. In this article, I will take an alternate route (using R) to solve the same problem, with more control over the ensemble process. I have leveraged a technique discussed in a Cornell paper, "Ensemble Selection from Libraries of Models". The underlying principle remains the same:

An ensemble of diverse, high-performing models is better than any individual model.


Principles involved in the Process

Forward Selection of learners : Imagine a scenario where we have 1000 learner outputs. We start with an empty bag and, in every iteration, add the learner that most improves the bag on the chosen performance metric.

Selection with Replacement : To select a new addition for the bag, we put our hand in the stack of 1000 learners and pull out the best of the lot. Even after a learner has been picked, it goes back into the stack and can be picked again in the next iteration.

Bagging of Ensemble Models : Ensemble learners are prone to over-fitting. To avoid this, we build the ensemble on a sample of the data. Once we are done, we repeat the process on another sample. Finally, we bag all these ensembles together using a simple average of predictions or a majority vote.
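
The first two principles can be sketched in a few lines before we get to the full code. This is only an illustration under assumptions of my own: the function name forward_select, the max_picks cap, the toy data and the use of plain RMSE (instead of the RMSLE used later) are all hypothetical and not taken from the paper or from the code below.

```r
# Greedy forward selection with replacement (illustrative sketch).
# preds: matrix with one column of predictions per learner
# actual: vector with the true target
forward_select <- function(preds, actual, max_picks = 50) {
  rmse <- function(p) sqrt(mean((p - actual)^2))
  picks <- integer(0)
  ens_sum <- numeric(length(actual))   # running sum of the picked predictions
  best_err <- Inf
  while (length(picks) < max_picks) {
    k <- length(picks) + 1
    # Error of the bag if each learner were added; every learner stays eligible
    errs <- apply(preds, 2, function(col) rmse((ens_sum + col) / k))
    i <- unname(which.min(errs))
    if (errs[i] >= best_err) break     # stop when no learner improves the bag
    picks <- c(picks, i)
    ens_sum <- ens_sum + preds[, i]
    best_err <- errs[i]
  }
  list(picks = picks, error = best_err, prediction = ens_sum / length(picks))
}

# Toy data: five noisy learners around a made-up target
set.seed(1)
actual <- rnorm(500)
preds <- sapply(1:5, function(m) actual + rnorm(500, sd = 0.5))
result <- forward_select(preds, actual)
result$picks   # learner indices in pick order; repeats are allowed
```

Because the first pick is simply the best single learner and later picks are accepted only when they lower the error, the bag can never end up worse than the best individual learner on the data it was selected on.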

Understanding the R code

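The R code to ensemble multiple learners is not very easy to follow. Hence, I have broken it into steps, with an explanation for each, for ease of understanding:

Step 1 : Load the train and test files (columns 1 to num_models hold the predictions of each learner; column num_models+1 of train holds the actual target)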

train <- read.csv("train_combined.csv")
test <- read.csv("test_combined.csv")
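
If you don't have such files at hand, a synthetic stand-in lets you run the remaining steps. The sketch below is entirely hypothetical: the column names, the row counts and the noise model are my assumptions. The only things it takes from the later steps are the layout they rely on (one prediction column per learner, followed by the actual target in train) and the need for at least 10000 train rows.

```r
# Hypothetical stand-in for train_combined.csv / test_combined.csv
set.seed(42)
num_models <- 24

make_learner_outputs <- function(n_rows, with_target = TRUE) {
  actual <- rexp(n_rows, rate = 0.1)   # non-negative target, as RMSLE needs
  preds <- sapply(seq_len(num_models), function(m) {
    # Learner m = target + noise; noise grows with m, predictions clipped at 0
    pmax(actual + rnorm(n_rows, sd = 1 + m / 4), 0)
  })
  out <- as.data.frame(preds)
  names(out) <- paste0("model_", seq_len(num_models))
  if (with_target) out$actual <- actual   # target goes in column num_models+1
  out
}

train <- make_learner_outputs(20000)
test  <- make_learner_outputs(5000, with_target = FALSE)
dim(train)
dim(test)
```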

Step 2 : Specify basic parameters like the number of bags/iterations and the number of learners/models

num_models <- 24
itertions <- 1000

Step 3 : Load the library needed for the performance metric (optional)

library(Metrics)
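
The library is optional because the metric is easy to write by hand. Here is a minimal base R sketch of RMSLE (root mean squared logarithmic error), assuming non-negative inputs. The name rmsle_base is my own, chosen so that it does not mask the rmsle function from the Metrics package; as far as I know the two compute the same quantity, but treat that as an assumption and compare them on your own data.

```r
# Base R RMSLE; assumes actual and predicted are non-negative
rmsle_base <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((log1p(actual) - log1p(predicted))^2))
}

# Made-up numbers, only to show the call
actual    <- c(10, 20, 30)
predicted <- c(12, 18, 33)
rmsle_base(actual, predicted)
```

Since the error is computed on log1p of the values, it penalises relative rather than absolute differences, and the order of the two arguments does not change the result.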

Step 4 : Calculating individual performance of models for establishing benchmarks

# Column 1 holds each learner's RMSLE on the full train set; column 2 holds its index
rmsle_mat <- matrix(0,num_models,2)
rmsle_mat[,2] <- 1:num_models
for(i in 1:num_models){
  rmsle_mat[i,1] <- rmsle(train[,i],train[,num_models+1])
  print(rmsle_mat[i,1])
}
# which.min returns a single index even when several learners tie
best_model_no <- which.min(rmsle_mat[,1])
# Benchmark: the best single learner's RMSLE
rmsle(train[,best_model_no],train[,num_models+1])

Step 5 : Using the parameters specified above, apply forward selection with replacement in 1000 bags

# Row t of column j records the learner picked at step t of bag j (0 = unused)
x <- matrix(0,1000,itertions)
# The first column is a zero placeholder; every bag appends one column of predictions
prediction_test <- matrix(0,nrow(test),1)
prediction_train <- matrix(0,nrow(train),1)
for (j in 1:itertions){
  print(j)
  set.seed(j*121)
  # Draw the sample for this bag (capped at the number of rows available)
  train1 <- train[sample(1:nrow(train), min(10000,nrow(train)), replace=FALSE),]
  # Start the bag with the best single learner on this sample
  for(i in 1:num_models){
    rmsle_mat[i,1] <- rmsle(train1[,i],train1[,num_models+1])
  }
  best_model_no <- which.min(rmsle_mat[,1])
  # The error to beat is that of the best single learner, not a fixed value
  rmsle_in <- rmsle_mat[best_model_no,1]
  rmsle_new <- matrix(0,num_models,2)
  rmsle_new[,2] <- 1:num_models
  prediction <- train1[,best_model_no]
  prediction_1 <- test[,best_model_no]
  prediction_2 <- train[,best_model_no]
  t <- 1
  x[t,j] <- best_model_no
  while(t < nrow(x)) {
    t <- t + 1
    # Score the bag with each learner added as its t-th member (with replacement)
    for (i in 1:num_models){
      prediction1 <- (((t-1)*prediction) + train1[,i])/t
      rmsle_new[i,1] <- rmsle(prediction1,train1[,num_models+1])
    }
    rmsle_min <- min(rmsle_new[,1])
    model_no <- which.min(rmsle_new[,1])
    # Stop as soon as no learner strictly improves the bag
    if(rmsle_min >= rmsle_in) {break} else {
      rmsle_in <- rmsle_min
      # Update the running averages on the sample, the test set and the full train set
      prediction <- (((t-1)*prediction) + train1[,model_no])/t
      prediction_1 <- (((t-1)*prediction_1) + test[,model_no])/t
      prediction_2 <- (((t-1)*prediction_2) + train[,model_no])/t
      x[t,j] <- model_no
      print(rmsle_in)
    }
  }
  # Store this bag's ensemble predictions as a new column
  prediction_test <- cbind(prediction_test,prediction_1)
  prediction_train <- cbind(prediction_train,prediction_2)
}
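
Step 5 leaves one column of predictions per bag in prediction_test and prediction_train, but stops short of the final averaging described under "Bagging of Ensemble Models". A minimal sketch of that last step follows. It uses a tiny made-up matrix in place of the real prediction_test so that it runs on its own, and the name final_test is my own.

```r
# Toy stand-in for the matrix built in Step 5: a zero placeholder column
# followed by one prediction column per bag (the values are made up)
prediction_test <- cbind(0, c(10, 20, 30), c(12, 18, 33), c(11, 22, 27))

# Drop the placeholder column and take the simple average across bags
final_test <- rowMeans(prediction_test[, -1, drop = FALSE])
final_test
```

The same call on prediction_train gives the bagged prediction for the train set, which can then be compared with the benchmark from Step 4.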

End Notes

Even though bagging handles the majority of over-fitting cases, it is still good to be cautious about over-fitting in ensemble learners. One possible solution is to set aside one portion of the population untouched and measure performance on this untouched population. The two methods discussed so far are in no way an exhaustive list of possible ensemble techniques. Ensembling is more of an art than a science. Most master Kagglers are masters of this art.
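
A rough sketch of such an untouched holdout is shown below. Everything in it is an assumption made for illustration: the toy data, the 200-row holdout, the helper err_fn, and the fact that "selection" here is reduced to picking the best single learner. In practice the whole of Step 5 would run on the non-holdout rows only.

```r
set.seed(7)
n <- 1000
actual <- rexp(n, rate = 0.1)   # made-up non-negative target
preds <- sapply(1:3, function(m) pmax(actual + rnorm(n, sd = m), 0))

# Rows in holdout_idx are never used while selecting learners
holdout_idx <- sample(n, size = 200)
fit_idx <- setdiff(seq_len(n), holdout_idx)

err_fn <- function(a, p) sqrt(mean((log1p(a) - log1p(p))^2))   # RMSLE

# Select on the fitting rows only
fit_err <- apply(preds[fit_idx, ], 2, err_fn, a = actual[fit_idx])
chosen <- which.min(fit_err)

# Report the error on rows the selection never saw
holdout_err <- err_fn(actual[holdout_idx], preds[holdout_idx, chosen])
c(fit = fit_err[chosen], holdout = holdout_err)
```

A holdout error that is much worse than the error on the fitting rows is a sign that the selection has over-fitted.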

Did you find this article useful? Have you tried anything else to find optimal weights? I'll be happy to hear from you in the comments section below.

If you like what you just read and want to continue your analytics learning, subscribe to our emails, follow us on Twitter or like our Facebook page.

Tavish Srivastava, co-founder and Chief Strategy Officer of Analytics Vidhya, is an IIT Madras graduate and a passionate data-science professional with 8+ years of diverse experience in markets including the US, India and Singapore, domains including Digital Acquisitions, Customer Servicing and Customer Management, and industries including Retail Banking, Credit Cards and Insurance. He is fascinated by the idea of artificial intelligence inspired by human intelligence and enjoys every discussion, theory or even movie related to this idea.


Responses From Readers

snehil mishra

Hi Kunal sir, I'm a diploma holder in electronics engineering and also have 2.5+ years of work experience in the mobile service industry. I completed my BCA this year in distance mode. I'm a working professional right now and I want to become a business analyst across multiple domains. I want to pursue an MBA, but I'm confused about whether I should go for regular mode or distance mode, and also about the specialization. Thanks, snehil mishra

Hi Snehil, Please put such generic posts on the forum. That way you will also get more opinions on your questions from industry leaders. It will also help us keep the article section specific enough. Thanks for following us, Tavish

Hi, thanks for the post. Can you give us the input dataset?

Hi sel, The post is not specific to any particular dataset. In case you need to experiment with this code, you can download a few datasets from Kaggle. Tavish

Monit Gehloy

I don't understand the step 4? What is its need? What models are being talked about when we have not created any model as of yet? And what is it doing with columns of the train set to determine rmsle?
