URL: https://www.analyticsvidhya.com/blog/2018/04/sequence-modelling-an-introduction-with-practical-use-cases/

A Must-Read Introduction to Sequence Modelling (with use cases)

Tavish Srivastava Last Updated : 31 May, 2020
8 min read

Introduction

Artificial Neural Networks (ANN) were supposed to replicate the architecture of the human brain, yet until about a decade ago, the only common feature between ANNs and our brain was the nomenclature of their entities (for instance, neuron). These neural networks were almost useless, as they had very low predictive power and few practical applications.

But thanks to the rapid advancement in technology in the last decade, we have seen the gap being bridged to the extent that these ANN architectures have become extremely useful across industries.

In this article, we will look at the two main advances in the field of artificial neural networks that have made these ANNs more like the human brain.

Table of Contents

  1. Two Main Advances in the Field of ANN
  2. Thought Experiment
  3. Practical Applications of Sequence Modelling
  4. Sequence Generators
  5. Sequence to Sequence NLP Models
  6. Few More Sequence to Sequence Models that go beyond text

Two Main Advances in the Field of ANN

  1. GPUs have immensely improved our computational power, which now enables us to vastly increase the depth and breadth of neural networks. However, we are still far away from reaching the number of neurons our brain has.
  2. ANNs can now process sequence data at both the input and the output nodes. This is how our brain works. Our brain does not solve binary classification to understand complex ideas. We formulate "thoughts" based on a sequence of information given to us, and then our brain expresses each "thought" as an understandable sequence of words.

Can we introduce this concept of "thought" in an ANN? The answer is yes, and we will explore the idea further in this article.

Sequence models have garnered a lot of attention because most of the data in the world today is in the form of sequences: it can be a number sequence, an image pixel sequence, a video frame sequence or an audio sequence.

Over the last 10 years, we have stored 1000s of petabytes (more than 10^9 GB) of unstructured sequence data with little payoff, as we had no way to extract information from such data formats. Luckily, we now have a new family of neural network architectures called sequence models that can turn this data dump into a gold mine.

The scope of this article is not to talk about all the complex mathematics that goes on behind the scenes in sequence modelling or to give you sample code to run (I will park that for later articles), but to give you practical examples of sequence modelling implementations in the industry. These will enable you to identify business problems in your industry that might need this special tool.

To get a better understanding of what this article is about, below is a scenario which I want you to imagine. Put your analytical thinking hats on!

Thought Experiment

Walmrt has appointed you as the head of its new vertical, WalKiosk. The company wants you to lead the development of a self-service (human-less) store where a customer interacts only with Walmrt's Kiosk, which is very similar to a vending machine. They want to install this Kiosk in various locations across the United States.

A key difference between this Kiosk and a normal vending machine is that the Kiosk's display does not show a list of items, but simply an audio-enabled, Google-like search tab. The customer can literally walk up to one of these Kiosks and say or type anything after the keyword "OK Walmrt, xxxxxx". Here is a sample interaction (try to evaluate whether a human can do a better job than this Kiosk):

The customer says, in any spoken language: "OK Walmrt, I want the shoes which Leonardo DiCaprio wore in the 1st scene of the 1st movie he did with Nolan".

The idea is for the Kiosk to do a quick search and, if it finds a convincing answer, reply in the same language as the customer's query with something like: "Leonardo DiCaprio wore black colored Nike shoes of model xxxxx. Click the link on the kiosk to watch a video cut of the scene you asked me to look at. Great news: we currently have the exact same shoe in the same size as you are wearing, and its cost is $200. As you are a loyal customer of Walmrt, I have found a steal deal for you! The new price of the shoe, if you buy it immediately, is $150 for you".

If the customer says "I want to buy it", the Kiosk dispenses the shoe once the customer makes the payment.

The Kiosk finally replies: "Thanks Mr. XYZ for shopping with us today. Please give your valuable feedback for us to improve our service further." The customer writes or speaks feedback on the transaction and leaves.

This simple transaction, which would probably take a good chunk of your time in today's world, will be resolved in less than 2 minutes (if everything works, that is).

Sounds futuristic? Here's a spoiler: all the fancy next-gen functional skills you need to build into this Kiosk will be handled mainly by a single architecture, sequence modelling. Here is a small list of tasks the Kiosk needs to do:

  1. Speech Recognition to understand what the customer is saying
  2. Machine Language Translation from source language to a known language (say English)
  3. Named entity/Subject extraction to find the main subject of the customer's query translated in step 2
  4. Relation Classification to tag relationships between the various entities tagged in step 3
  5. Path Query Answering (similar to Google search) on the entities and relationships found in steps 3 and 4, using a core knowledge graph
  6. Speech Generation to generate answers for the customer with all the relevant information found in step 5
  7. Chatbot skill to have conversational ability and engage with customers just like a human
  8. Text Summarization of customer feedback to work on key challenges/pain points
  9. Product Sales Forecasting to replenish stock

The skills required to create WalKiosk are not limited to these nine steps, but they are good enough to bring out the core idea. Each of these nine skills can be modeled by a single architecture: sequence modelling (but you already knew this).

You can imagine sequence modelling as a black box which stays almost the same; all you need to change is the input and target data for each of the nine skill sets. Leveraging the idea that the model architecture in each step is the same, we can take this a step further and create a single model that takes input in any language and completes the self-service, reporting and inventory management processes all together.
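To make the black-box idea a little more concrete, here is a minimal Python sketch (an illustration only, not an implementation of any real model). It describes each of the nine Kiosk skills as nothing more than a pair of input and target types. The `SequenceTask` class, the `KIOSK_TASKS` list and the `describe` helper are hypothetical names, and the type assigned to each skill is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceTask:
    """One Kiosk skill, described only by what goes in and what comes out."""
    name: str
    input_type: str   # the kind of sequence the model reads
    target_type: str  # the kind of sequence the model is trained to produce


# Assumption: the input/target types below are illustrative guesses,
# chosen from the categories Scalar, Trend, Text, Image, Audio and Video.
KIOSK_TASKS = [
    SequenceTask("Speech recognition", "Audio", "Text"),
    SequenceTask("Machine language translation", "Text", "Text"),
    SequenceTask("Named entity extraction", "Text", "Text"),
    SequenceTask("Relation classification", "Text", "Text"),
    SequenceTask("Path query answering", "Text", "Text"),
    SequenceTask("Speech generation", "Text", "Audio"),
    SequenceTask("Chatbot", "Text", "Text"),
    SequenceTask("Text summarization", "Text", "Text"),
    SequenceTask("Product sales forecasting", "Trend", "Trend"),
]


def describe(task: SequenceTask) -> str:
    """Render a task as 'name: input sequence -> target sequence'."""
    return f"{task.name}: {task.input_type} sequence -> {task.target_type} sequence"


for task in KIOSK_TASKS:
    print(describe(task))
```

The point of the sketch is that nothing about the "model" appears in the task description: under this view, only the data fed in and the data expected out change from one skill to the next.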

If this was not enough to make you Google all about sequence modelling, let's look at an exhaustive list of all the functions sequence modelling is capable of.

Practical Applications of Sequence Modelling

To make sure we cover most of the possible applications of sequence modelling, we will categorize them based on the type of input and output sequences. Inputs and outputs can be one of the following: Scalar, Trend, Text, Image, Audio or Video. If each of these six can be both input and output, we have 6 x 6 = 36 categories in total. However, not every one of these pairs has been explored in depth yet.
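The count of 36 can be checked with a few lines of Python. This is just a small sketch of the arithmetic; the variable names `DATA_TYPES` and `categories` are made up for this example.

```python
from itertools import product

# The six input/output types named in the text
DATA_TYPES = ["Scalar", "Trend", "Text", "Image", "Audio", "Video"]

# Every ordered (input type, output type) pair, including same-type pairs
categories = list(product(DATA_TYPES, repeat=2))

print(len(categories))  # 36
for input_type, output_type in categories[:3]:
    print(f"{input_type} -> {output_type}")
```

The pairs are ordered, so ("Audio", "Text"), which would cover speech recognition, is a different category from ("Text", "Audio"), which would cover speech generation.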

Before moving to the list, pause for a moment and create your own list of applications (you can use our thought experiment as a reference).

Here goes the list:

[Image: table of sequence modelling applications by input and target type]

Reading the table is fairly straightforward:

  • Type is the category of input/target
  • Elements are the number of elements in input/target series
  • Use Cases are the possible applications in the category

We will review a few of these use cases in order to get a grasp of the superpowers that sequence models possess.

First, let's talk about the easiest of the lot: Sequence Generators

These generators generally take scalar inputs. The scalar input can be any random seed/number. Following are a few examples of generators:

[Image: table of example sequence generators]

Note that we can train our model on any specific type of data. For instance, if we train our text generator on a Harry Potter book, it is highly likely that we will get text which is full of imagination and magic, with Harry Potter as the main character. If you are lucky, you might get a chapter that makes sense, and you can enjoy this privileged chapter that no one else has access to!

Another example: if you train the model on Jazz music, you can create new songs in the same genre using this model. Yet another example: if you train the model on images of animals, you might see what cross-breeds could look like.
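As a toy illustration of "scalar seed in, sequence out", the sketch below builds a tiny character-level generator in Python. Note the swap: this is a simple Markov chain (a lookup table of which character follows which context), not a neural sequence model, and it is used here only because it fits in a few lines. The function names `train_char_model` and `generate`, the `order` parameter and the small made-up corpus are all assumptions for this example.

```python
import random
from collections import defaultdict


def train_char_model(text, order=3):
    """Map every `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return dict(model)


def generate(model, seed, length=80):
    """Turn a single scalar seed into a character sequence."""
    rng = random.Random(seed)       # the scalar input drives every random choice
    contexts = sorted(model)        # sorted so the same seed gives the same start
    out = rng.choice(contexts)      # starting context picked by the seed
    order = len(out)
    while len(out) < length:
        followers = model.get(out[-order:])
        if not followers:           # context was only seen at the very end of the corpus
            break
        out += rng.choice(followers)
    return out


# Hypothetical training text; swap in any corpus to change the "genre"
corpus = (
    "the wizard waved the wand and the owl watched "
    "the wizard wave the wand again "
)
model = train_char_model(corpus)
sample = generate(model, seed=7)
print(sample)
```

Changing `seed` gives a different sequence, while changing `corpus` changes the style of everything the generator can produce. That is the same relationship between training data and output described above, only at a far smaller scale.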

Next, let's talk about the favorites: Sequence to Sequence NLP Models

Machine Language Translation has reached new heights and is now competing strongly with human translators. Today, you can find real-time translating machines which are based on the core concept of sequence to sequence models.

Text summarization is another important use case of sequence models. It can significantly reduce the effort of manually reading lengthy customer complaints, monitoring calls and chats for compliance, and reviewing customer feedback on products.

Chatbots are yet another important application and are now widely used in operations, call centers, chat centers and personal assistants such as Siri, Google Home and Alexa.

Finally, we will talk about a few more sequence to sequence models that go beyond text

Speech recognition is currently the category which has attracted the most investment. It is extremely important in tools like personal AI assistants (Alexa, Google Home, etc.) and call center speech recording tools.

Currently we have billion dollar companies whose sole competency is speech recognition. Speech recognition also uses sequence to sequence models extensively. Image Captioning is one of the hottest research fields which has a wide application in the social media industry. Subtitle generation has not reached the stage of production yet, but is being actively researched.

End Notes

A lot of the data science talent today focuses its effort on solving problems that already exist. An equally important task, for any successful data scientist or analyst, is to identify and create new tasks that can be solved analytically. The latter is a very different exercise and does not need a lot of coding experience or mathematical background. All you need to know is what is possible and what is not, using a given tool.

Problem identification is a skill set that is a "must" for any senior analytics professional. I hope this introductory article on sequence learning gave you strong motivation to start searching for new problems in your industry that can be solved using this method.

If you have any ideas or suggestions regarding the topic, do let me know in the comments below!

Tavish Srivastava, co-founder and Chief Strategy Officer of Analytics Vidhya, is an IIT Madras graduate and a passionate data-science professional with 8+ years of diverse experience in markets including the US, India and Singapore, domains including Digital Acquisitions, Customer Servicing and Customer Management, and industry including Retail Banking, Credit Cards and Insurance. He is fascinated by the idea of artificial intelligence inspired by human intelligence and enjoys every discussion, theory or even movie related to this idea.

Responses From Readers

Thank you. Please post simple chatbot model (train+use) implementation using tensorflow in python.

Aishwarya Singh

Hi Ramprasad, You can follow this link for TensorFlow's seq2seq model.

Greetings!!, Thanks a ton for sharing the insights I liked the idea of not reinventing the models when we already have solutions to most of the problems is good point to start with when we are starting the journey in Data Science. I am currently working on converting free text to a cat log or bucket them into categories . Is there a way that you can help with my use case Would appreciate your help
