NVIDIA Open Sourced a Video-to-Video Translation Technique using PyTorch – and it is Super Impressive

Pranav Dar Last Updated : 07 May, 2019

Overview

Researchers from NVIDIA have pioneered a novel approach that does video-to-video translation
They have released a PyTorch implementation of the technique on GitHub
The PyTorch code can be used for multiple scenarios, including generating human bodies from given poses!

Introduction

Progress in the field of deep learning and reinforcement learning relies on our capability to recreate the dynamics of real-world scenarios in a simulation environment. I have previously written about an algorithm that transforms images into a completely different category, and another technique that fixes corrupt images in the blink of an eye. Progress, at least in the image processing field, has been constant and promising.

But research in video processing has been painstakingly difficult. For example, can you take a video sequence and predict what will happen in the next frame? This has been explored, but with little success. At least until now.

👁 Image

NVIDIA, already leading the way in using deep learning for image and video processing, has open sourced a technique that does video-to-video translation with impressive results. The goal of this research, as described by the researchers in their paper, is to learn a mapping function from a given input video in order to produce an output video which depicts the contents of the input video with incredible precision (as you can see in the above GIF).

They have released the code on GitHub: a PyTorch implementation of the technique for high-resolution video translation. The code can currently be used for:

Converting semantic labels into realistic real-world videos
Creating multiple outputs for synthesizing people talking from edge maps
Generating a human body from a given pose (not just the structure, but the entire body!)
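
To make the idea of "a mapping function from an input video to an output video" a little more concrete, here is a toy sketch in plain Python. It is not NVIDIA's method and does not use the released PyTorch code. It assumes, purely for illustration, that each output frame depends on the current input frame (a tiny semantic label map) and on the previously generated frame. The names `PALETTE`, `translate_frame`, `translate_video` and the `blend` parameter are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence

LabelFrame = List[List[int]]    # one class id per pixel (a semantic label map)
ImageFrame = List[List[float]]  # one grayscale intensity per pixel

# Hypothetical palette: class id -> intensity. A real model learns this mapping.
PALETTE: Dict[int, float] = {0: 0.0, 1: 0.5, 2: 1.0}


def translate_frame(labels: LabelFrame,
                    prev_output: Optional[ImageFrame],
                    blend: float = 0.5) -> ImageFrame:
    """Map one label frame to an image frame.

    If a previous output frame exists, blend it in so that consecutive
    frames stay temporally consistent (an illustrative assumption).
    """
    output: ImageFrame = []
    for r, row in enumerate(labels):
        out_row: List[float] = []
        for c, class_id in enumerate(row):
            value = PALETTE[class_id]
            if prev_output is not None:
                value = blend * prev_output[r][c] + (1 - blend) * value
            out_row.append(value)
        output.append(out_row)
    return output


def translate_video(label_frames: Sequence[LabelFrame],
                    blend: float = 0.5) -> List[ImageFrame]:
    """Translate a whole video frame by frame, feeding each output forward."""
    outputs: List[ImageFrame] = []
    prev: Optional[ImageFrame] = None
    for labels in label_frames:
        prev = translate_frame(labels, prev, blend)
        outputs.append(prev)
    return outputs


# A two-frame "video", each frame being a 1x2 label map.
video = [[[0, 1]], [[2, 1]]]
result = translate_video(video)
print(result)
```

In the real technique the hand-written palette and blending rule would be replaced by learned neural networks; the sketch only shows the shape of the problem, where a sequence of input frames goes in and a sequence of output frames of the same length comes out.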

👁 Image

The above image is a wonderful illustration of different models (or techniques) used to perform the same task. On the top left is the input source video. Adjacent to that is the pix2pixHD model, the state-of-the-art image-to-image translation approach. On the bottom left is the COVST model and on the bottom right is NVIDIA’s vid2vid technique.

You can browse through the links below to read more about this novel technique and even implement it on your own machine:

GitHub Repository (with Python code)
Multiple videos demonstrating this technique
Research paper (Work in progress – the final version is expected to be released imminently)

Also, be sure to check out the video below, which shows what the open sourced PyTorch code can do:

Our take on this

If you were impressed with our last NVIDIA article on converting a standard video into slow motion, this latest research will leave you stunned. It is not limited to recreating real-world scenarios: it can also predict what will happen in the next few frames. When compared to baseline models like PredNet and MCNet, the vid2vid model produced far superior results.

There are still a few issues with the model, such as its inability to map a turning car, but these will likely be addressed in due course. If this field of research interests you, go through the research paper linked above, download the PyTorch code, and try to replicate the technique on your own machine.

Pranav Dar

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.

URL: https://www.analyticsvidhya.com/blog/2018/08/nvidia-open-sourced-video-to-video-translation-pytorch/
