VOOZH about

URL: https://www.analyticsvidhya.com/blog/2022/09/stable-diffusion-ai-has-taken-the-world-by-storm/

⇱ Stable Diffusion AI has Taken the World By Storm


India's Most Futuristic AI Conference Is Back – Bigger, Sharper, Bolder

  • d
  • :
  • h
  • :
  • m
  • :
  • s

Stable Diffusion AI has Taken the World By Storm

krupesh Last Updated : 05 Sep, 2022
7 min read

This article was published as a part of the Data Science Blogathon.

See that image above?

Amazing, isn’t it?

Looks like an oil painting in melancholic postcard style, right? You bet!

I used AI and some of the same words from the previous sentence to create it in a few minutes.

And we will see how in this article!

 

Introduction

If you are someone who has been keeping track of the latest trends in text-to-image AI, you have surely heard of Stable Diffusion.

It is the latest cutting-edge model, similar to Dall-E 2 and Mid journey.

The interesting part about Stable Diffusion, unlike the other tools, is — it is open-sourced!

This means anyone can use it for free or even create applications from it!

As we speak, people are already joining in using the Google Colab notebooks.

Let us see what sound diffusion is, the steps on how to create your first notebook and your AI-generated images, and see some stunning images I generated using it (along with the prompts)!

Stable Diffusion AI

Stable Diffusion is a text-to-image model created by a collaboration between engineers and researchers from  CompVis, Stability AI, and LAION.

It is based on a model called Latent Diffusion (High-Resolution Image Synthesis with Latent Diffusion Models). The theoretical details are beyond the scope of this article. Perhaps I will cover them in another post.

We need to know that it is one of the most anticipated models to be released recently!

Why?

Because the generated images are a close contest to Open AIs Dall-E 2, perhaps even better, and only a few days back, they open-sourced it, with the code and all the model weights.

That means you can use it for free!

Stable Diffusion Public Release — Stability.Ai

So let’s see how to use it!

How to use Stable Diffusion AI ?

There are a few simple steps to set up.

Step 1: Sign-up or log in

Sign up or log in to Hugging Face: Hugging Face – The AI community building the future. They create the model you are about to use, and they have given access to their Google Colab notebook.

Once you sign up, accept the terms and conditions of using the model.

Step 2: Getting the access token

Now that you are logged in, you need to get the access token you will use later. For this, go to the settings in the account options in the top right corner.

In settings, you will see the option called Access Tokens. Create a new token.

Give your token a suitable name and select the read role.

Copy the generated token to the clipboard; you will need it later in the Jupyter notebook.

Step 3: The notebook

Here you will need to log in to your Google Colab. Once there, access the Stable Diffusion notebook from this link:

Google Colab

Your PC may not have enough capacity to run this model, so connecting it to the hosted runtime is better.

Once it s connected, you are ready to go! Go to the runtime option in the ribbon and Run all cells. You may get a warning saying Google does not author the notebook. Simply click yes and go ahead.

When you reach the 4th cell, you will get the option to enter the access token.

Enter the access token you copied in the previous step. Wait for the Login Successful message.

That is it! You are ready to use the model! The notebook is pretty self-explanatory so that you can try various prompts directly.

Here is the first example they have only with 4 lines of code and the prompt “a photograph of an astronaut riding a horse“.

from torch import autocast
prompt = “a photograph of an astronaut riding a horse.”
with autocast(“cuda”):
image = pipe(prompt)[“sample”][0] # image here is in [PIL format](https://pillow.readthedocs.io/en/stable/)
# Now, to display an image, you can either save it such as:
image.save(f”astronaut_rides_horse.png”)
# or if you’re in a google colab, you can directly display it with
image

This gives the resulting image:

Amazing, isn’t it?

From here on, you are only limited by your imagination. Try various combinations of prompts, art styles, and even artist names. AI can generate them well!

That brings me to the next part.

Be Creative

Even though everything seems simple until now, you will soon understand when trying to give detailed prompts that it is difficult and time-consuming, to say the least!

You can never arrive at a perfect prompt without much trial and error. The AI will not generate the same images with the same prompt each time.

Additionally, you need to know a lot about the art world, the artists, and the different styles used to be able to give good prompts.

True art is knowing how to tell the computer to draw!

Fortunately, there are a host of resources to understand that. There is an entire database of prompts called Lexica, where you can access the community-generated images and their prompts. Here is the link: Lexica

spent hours trying several prompts and observed the corresponding generated images. Here are a  few insights that I learned:

1. Start from the overall idea and then drill down further into the details. For example, you can start the prompt as “a black table in the middle of a grass field” and then give further details like table shape, grass colour, weather, etc.

2. Use several adjectives like intricate, surreal, and otherworldly, to intensify the image details

3. You can give style terms like baroque, medieval style, Deviantart, or even artists like Leonardo da Vinci, Picasso, etc. The AI can understand and even cross-pollinate several styles and artists!

4. Give details such as 4K, 8K image quality, HD, and so on for a detailed image.

5. Specify lighting conditions like natural lighting, dark atmosphere, dystopian sky, etc., to give a greater feel to the image.

6. Be sure not to give adult-oriented or NSFW prompts; in this case, it will flag your content as NSFW and generate a black image.

I generated some images by creating prompts from what I learned (the author generates all images below).

A concept car in a futuristic city

“Concept art by Greg Rutkowski, futuristic car, high tech city, scary, dystopian atmosphere, frightening and creepy, sci-fi, highly detailed, digital painting, art stations, concept art, smooth, sharp focus illustration, artstation HQ”

Ancient Indian University of Nalanda

“Low angle wide shot, the ancient city of Nalanda, library, oasis, temple covered with moss, meadows, sub-tropical, illustration, summer, night time, starry sky, perfectly shaded, soft painting, art by Krenz Cushart and Wenjun lin”

A complex mechanical contraption

“Mechanical concept sketch, Grandfather clock, intricate mechanism, complex dial with the planetary system, Mitchell mohrhauser, 8 k photorealistic, cinematic lighting, hd, high details, dramatic, dark atmosphere, trending on artstation”

A horrifying alien landscape

“fantastic, horrifying exoplanet landscape hyperdetailed 3 d matte painting photo art by Moebius by Zdzisław Beksinski by Jakub Rozalski by Nekro”

Conclusion

I hope you got a lot of material to learn from this post. To summarise, this article shows:

  • An overview of the powerful text-to-image generator Stable Diffusion.
  • The basics of how it works.
  • The practical steps on how you can set up the Jupyter notebook and generate images yourself.
  • The importance of creating prompts, along with some practical examples!

As concluding thoughts, there are a few important takeaways from using text-to-image generators:

  • Text-to-image AI has leapfrogged in the technology world. Whether it is Stable Diffusion, Dall-E 2, or Mid journey, I found them quick in generating awesome images from the prompts!
  • From surreal landscapes or awe-inspiring characters – nothing is out of bounds of its ability to generate images. These can be used for inspiration in video games and even in movies.
  • Even then, generating images is no easy task. One needs to understand how to get the prompts right.
  • The images are not perfect, especially when rendering text or people’s hands. This will improve over time as more training data becomes available.
  • Most of the AI image generators available out there are very expensive!

Thank you for reading till the end! I hope you loved the article on stable diffusion AI as much as I loved creating it for you.

So, do you like the images?

Are you inspired now to try your hand at image-generating AI? Would you like more details about how such an AI works?

I would love to hear your views!

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

Responses From Readers

Flagship Programs

GenAI Pinnacle Program| GenAI Pinnacle Plus Program| AI/ML BlackBelt Program| Agentic AI Pioneer Program

Free Courses

Generative AI| DeepSeek| OpenAI Agent SDK| LLM Applications using Prompt Engineering| DeepSeek from Scratch| Stability.AI| SSM & MAMBA| RAG Systems using LlamaIndex| Building LLMs for Code| Python| Microsoft Excel| Machine Learning| Deep Learning| Mastering Multimodal RAG| Introduction to Transformer Model| Bagging & Boosting| Loan Prediction| Time Series Forecasting| Tableau| Business Analytics| Vibe Coding in Windsurf| Model Deployment using FastAPI| Building Data Analyst AI Agent| Getting started with OpenAI o3-mini| Introduction to Transformers and Attention Mechanisms

Popular Categories

AI Agents| Generative AI| Prompt Engineering| Generative AI Application| News| Technical Guides| AI Tools| Interview Preparation| Research Papers| Success Stories| Quiz| Use Cases| Listicles

Generative AI Tools and Techniques

GANs| VAEs| Transformers| StyleGAN| Pix2Pix| Autoencoders| GPT| BERT| Word2Vec| LSTM| Attention Mechanisms| Diffusion Models| LLMs| SLMs| Encoder Decoder Models| Prompt Engineering| LangChain| LlamaIndex| RAG| Fine-tuning| LangChain AI Agent| Multimodal Models| RNNs| DCGAN| ProGAN| Text-to-Image Models| DDPM| Document Question Answering| Imagen| T5 (Text-to-Text Transfer Transformer)| Seq2seq Models| WaveNet| Attention Is All You Need (Transformer Architecture) | WindSurf| Cursor

Popular GenAI Models

Llama 4| Llama 3.1| GPT 4.5| GPT 4.1| GPT 4o| o3-mini| Sora| DeepSeek R1| DeepSeek V3| Janus Pro| Veo 2| Gemini 2.5 Pro| Gemini 2.0| Gemma 3| Claude Sonnet 3.7| Claude 3.5 Sonnet| Phi 4| Phi 3.5| Mistral Small 3.1| Mistral NeMo| Mistral-7b| Bedrock| Vertex AI| Qwen QwQ 32B| Qwen 2| Qwen 2.5 VL| Qwen Chat| Grok 3

AI Development Frameworks

n8n| LangChain| Agent SDK| A2A by Google| SmolAgents| LangGraph| CrewAI| Agno| LangFlow| AutoGen| LlamaIndex| Swarm| AutoGPT

Data Science Tools and Techniques

Python| R| SQL| Jupyter Notebooks| TensorFlow| Scikit-learn| PyTorch| Tableau| Apache Spark| Matplotlib| Seaborn| Pandas| Hadoop| Docker| Git| Keras| Apache Kafka| AWS| NLP| Random Forest| Computer Vision| Data Visualization| Data Exploration| Big Data| Common Machine Learning Algorithms| Machine Learning| Google Data Science Agent
👁 Av Logo White

Continue your learning for FREE

Forgot your password?
👁 Av Logo White

Enter OTP sent to

Edit

Wrong OTP.

Enter the OTP

Resend OTP

Resend OTP in 45s

👁 Popup Banner
👁 AI Popup Banner