URL: https://www.analyticsvidhya.com/blog/2018/01/microsoft-drawing-bot-ai/


Microsoft’s New AI Bot can Draw Images Based on Captions

Pranav Dar Last Updated : 20 Jan, 2018

Microsoft has built an AI-powered bot that can draw images based on the text it is given. The image below, published by Microsoft, depicts a yellow and black bird that was generated entirely by the bot.

[Image: bird generated by the drawing bot]

Source: Microsoft

Microsoft is simply calling this new technology the “drawing bot” for now. It can generate images of anything from animals to scenic hillsides, and even outlandish things like flying cars and twisted street lamps. It’s basically the AI version of Pictionary, where you’re supposed to draw something based on cue cards. The only difference is that you type a description for the bot, and it runs its algorithm and gives you the image.

The most exciting part about the technology is that the images generated might not even be of real things. The bird in the image above? It might not even exist; it’s just the machine’s imagined rendering of what a bird looks like. Further, each image that is created contains details that are not provided in the text description.

As for where this bot will be used once it’s made available, Microsoft sees it being used by painters and interior decorators. It could also be used as a voice-activated tool for creating or refining photos (maybe there’s a role for Cortana in there).

To make the AI understand which words go with which pictures, the drawing bot was trained on pairs of images and captions. At its core is a Generative Adversarial Network (GAN), which is made up of two parts:

  • Generator – this generates images based on the text
  • Discriminator – this judges the quality of the generated image against the text description
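
The interplay between the image-generating part and the discriminator can be sketched in a few lines of Python. This is only a toy illustration, not Microsoft's drawing bot. The `embed_caption` text encoder, the single-layer "networks", the vector sizes and the six-number "image" are all assumptions made up for this example, and no training step is shown. It simply pushes one image-caption pair through both parts and computes the standard GAN losses that each part would try to reduce.

```python
import math
import random

random.seed(0)

EMB_DIM = 4    # size of the (hypothetical) caption embedding
NOISE_DIM = 3  # size of the random noise vector
IMG_DIM = 6    # a "picture" here is just 6 numbers in [-1, 1]


def random_matrix(rows, cols):
    """Small random weights for a single linear layer."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vec):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def embed_caption(caption):
    """Hypothetical stand-in for a text encoder: bucket words into a small vector."""
    vec = [0.0] * EMB_DIM
    for word in caption.lower().split():
        vec[sum(ord(ch) for ch in word) % EMB_DIM] += 1.0
    return vec


G_WEIGHTS = random_matrix(IMG_DIM, EMB_DIM + NOISE_DIM)
D_WEIGHTS = random_matrix(1, IMG_DIM + EMB_DIM)


def generator(caption_vec, noise):
    """Generator: caption embedding plus noise -> fake image."""
    return [math.tanh(v) for v in linear(G_WEIGHTS, caption_vec + noise)]


def discriminator(image, caption_vec):
    """Discriminator: probability that (image, caption) is a real, matching pair."""
    logit = linear(D_WEIGHTS, image + caption_vec)[0]
    return 1.0 / (1.0 + math.exp(-logit))


def adversarial_losses(real_image, caption):
    """Return (discriminator_loss, generator_loss) for one image-caption pair."""
    caption_vec = embed_caption(caption)
    noise = [random.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
    fake_image = generator(caption_vec, noise)
    p_real = discriminator(real_image, caption_vec)
    p_fake = discriminator(fake_image, caption_vec)
    # The discriminator wants p_real high and p_fake low.
    d_loss = -math.log(p_real) - math.log(1.0 - p_fake)
    # The generator wants the discriminator to be fooled (p_fake high).
    g_loss = -math.log(p_fake)
    return d_loss, g_loss


# One made-up training pair: a fake "photo" and its caption.
real_image = [0.9, 0.8, -0.2, 0.1, 0.7, -0.5]
d_loss, g_loss = adversarial_losses(real_image, "a yellow and black bird")
print(f"discriminator loss: {d_loss:.3f}, generator loss: {g_loss:.3f}")
```

In a real system both parts would be deep neural networks, and training would alternate between updating the discriminator to lower its loss and updating the generator to lower its own, so that the generated images gradually become harder to tell apart from real ones.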

Microsoft has previously released CaptionBot, which takes images as input and writes captions for them. They followed this up with the Seeing AI tool, which also takes images as input and describes what’s in them. It is aimed especially at blind and low-vision people.

Our take on this

While Google launched a similar AI last year that could create doodles, Microsoft’s version is in a different league altogether. It’s not perfect yet, but one can imagine the future uses for such technology. Xiaodong He, a principal researcher on the project, thinks it might even be used to create animated movies (using pre-written scripts). Following Google’s AutoML Vision launch yesterday, 2018 is already promising to be a big year in the image recognition field.

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
