VOOZH about

URL: https://www.analyticsvidhya.com/blog/2018/12/best-machine-learning-github-repositories-reddit-threads-november-2018/

⇱ Top Machine Learning GitHub Repositories from November 2018


India's Most Futuristic AI Conference Is Back – Bigger, Sharper, Bolder

  • d
  • :
  • h
  • :
  • m
  • :
  • s

Reading list

5 Best Machine Learning GitHub Repositories & Reddit Discussions (November 2018)

Pranav Dar Last Updated : 06 May, 2019
6 min read

Introduction

Coding is among one of the best things about being a data scientist. There are often days when I find myself immersed in programming something from scratch. That exhilarating feeling you get when you see your hard work culminate in a successful model? Exhilarating and unparalleled!

But as a data scientist (or a programmer), its equally important to create checkpoints of your code at various intervals. It’s incredibly helpful to know where you started off from last time so if you have to rollback your code or simply branch out to a different path, there’s always a fallback option. And that’s why GitHub is such an excellent platform.

The previous posts in this monthly series have expounded on why every data scientist should have an active GitHub account. Whether it’s for collaboration, resume/portfolio, or educational purposes, it’s simply the best place to enhance your coding skills and knowledge.

👁 Image

And now let’s get to the core of our article – machine learning code! I have picked out some really interesting repositories which I feel every data scientist should try out on their own.

Apart from coding, there are tons of aspects associated with being a data scientist. We need to be aware of all the latest developments in the community, what other machine learning professionals and thought leaders are talking about, what are the moral implications of working on a controversial project, etc. That is what I aim to bring out in the Reddit discussion threads I showcase every month.

To make things easier for you, here’s the entire collection so far of the top GitHub repositories and Reddit discussions (from April onwards) we have covered each month:

GitHub Repositories

👁 Image

Open AI’s Deep Reinforcement Learning Resource

👁 Image

Keeping our run going of including reinforcement learning resources in this series, here’s one of the best so far – OpenAI’s Spinning Up! This is an educational resource open sourced with the aim of making it easier to learn deep RL. Given how complex it can appear to most folks, this is quite a welcome repository.

The repo contains a few handy resources:

  • An introduction to RL terminology, kinds of algorithms, and basic theory
  • An essay about how to grow into an RL research role
  • A curated list of important papers organized by topic
  • A code repo of short, standalone implementations of key algorithms
  • A few exercises to get your hands dirty

NVIDIA’s WaveGlow

👁 Image

This one is for all the audio/speech processing people out there. WaveGlow is a flow-based generative network for speech synthesis. In other words, it’s a network (yes, a single network!) that can generate impressive high quality speech from mel-spectrograms.

This repo contains the PyTorch implementation of WaveGlow and a pre-trained model to get you started. It’s a really cool framework, and you can check out the below links as well if you wish to delve deeper:

BERT as a Service

👁 Image

We covered the PyTorch implmentation of BERT in last month’s article, and here’s a different take on it. For those who are new to BERT, it stands for Bidirectional Encoder Representations from Transformers. It’s basically a method for pre-training language representations.

BERT has set the NLP world ablaze with it’s results, and the folks at Google have been kind enough to release quite a few pre-trained models to get you on your way.

This repository “uses BERT as the sentence encoder and hosts it as a service via ZeroMQ, allowing you to map sentences into fixed-length representations in just two lines of code”. It’s easy to use, extremely quick, and scales smoothly. Try it out!

Python Implementation of Google’s ‘Quick Draw’ Game

👁 Image

Quick, Draw is a popular online game developed by Google where a neural network tries to guess what you’re drawing. The neural network learns from each drawing, hence increasing it’s already impressive ability to correctly guess the doodle. The developers have built up a HUGE dataset from the amount of drawings users have made previously. It’s an open-source dataset which you can check out here.

And now you can build your own Quick, Draw game in Python with this repository. There is a step-by-step explanation of how to do this. Using this code, you can run an app to either draw in front of the computer’s webcam, or on a canvas.

Visualizing and Understanding GANs

👁 Image

GAN Dissection, pioneered by researchers at MIT’s Computer Science & Artificial Intelligence Laboratory, is a unique way of visualizing and understanding the neurons of Generative Adversarial Networks (GANs). But it isn’t just limited to that – the researchers have also created GANPaint to showcase how GAN Dissection works.

This helps you explore what a particular GAN model has learned by inspecting and manipulating it’s internal neurons. Check out the research paper here and the below video demo, and then head straight to the GitHub repository to dive straight into the code!

Reddit Discussions

👁 Image

Why Gradient Descent is Even Needed in the First Place

Has this question ever crossed your mind while learning basic machine learning concepts? This is one of the fundamental algorithms we come across in our initial learning days and has proven to be quite effective in ML competitions as well. But once you start going through this thread, prepare to seriously question what you’ve studied previously.

What started off as a straight forward question turned into a full-blown discussion among the top minds on Reddit. I thoroughly enjoyed browsing through the comments and I’m sure anyone with interest in this field (and mathematical rigour) will find it useful.

Reverse Engineering a Massive Neural Network

What do you do when the developer of a complex and massive neural network vanishes without leaving behind the documentation needed to understand it? This isn’t a fictional plot, but a rather common situation the original poster of the thread found himself in.

It’s a situation that happens regularly with developers but takes on a whole new level of intrigue when it comes to deep learning. This thread explores the different ways a data scientist can go about examining how a deep neural network model was initially designed. The responses range from practical to absurd, but each adds a layer of perspective which could help you one day if you ever face this predicament.

Debate on TensorFlow 2.0 API

My attention to this thread was drawn by the sheer number of comments (110 at the time of writing) – what in the world could be so controversial about this topic? But when you started scrolling down, the sheer difference in opinions among the debators is mind boggling. Apart from TensorFlow being derided for being “not the best framework”, there’s a lot of love being shown to PyTorch (which isn’t all that surprising if you’ve used PyTorch).

It all started when Francois Chollet posted his thoughts on GitHub and lit a (metaphorical) fire under the machine learning community.

Reinforcement Learning with Prediction-Based Rewards

Another OpenAI entry in this post – and yet another huge breakthrough by them. The title might not leap out of the page as anything special but it’s important to understand what the OpenAI team have conjured up here. As one of the Redditors pointed out, this takes us one step closer to machines mimicking human behavior.

It took around a year of total experience to beat the Montezuma’s Revenge game at a super human level – pretty impressive!

Landing that First Data Scientist Job

This one is for all the aspiring data scientists reading the article. The author of the thread expounds on how he landed the coveted job, his background, where he studied data science from, etc. After answering these standard questions, he has actually written a very nice post on what others in a similar position can do to further their ambitions.

There are some helpful comments as well if you scroll down a little bit. And of course, you can post your own question(s) to the author there.

End Notes

Quite a collection this month. I found the GAN Dissection repository quite absorbing. I’m currently in the process of trying to replicate it on my own machine – should be quite the ride. I’m also keeping an eye on the ‘Reverse Engineering a Massive Neural Network’ thread as the ideas spawning there could be really helpful in case I ever find myself in that situation.

Which GitHub repository and Reddit thread stood out for you? Which one will you tackle first? Let me know in the comments section below!

Senior Editor at Analytics Vidhya.Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.

Login to continue reading and enjoy expert-curated content.

Free Courses

Exploratory Data Analysis with Python & GenAI

Learn EDA with Python: Transform data into insights using PandasAI & more.

Data Science Course

Build a powerful 2026-ready data science resume using AI tools.

Understanding the working of Neural Networks

Learn the neural network basics, concepts, layers, and activation functions.

No Code Predictive Analytics with Orange

No-code AI course for business pros with real-world ML use cases.

Adaptive Email Agents with DSPy

Build adaptive email agents with DSPy using context and smart learning.

Responses From Readers

Kouassi Konan Jean-Claude

Thanks for sharing Pranav Dar. They are all new interesting repositories which will be useful to me too.

123 1
Pranav Dar

I'm glad you found it useful, Kouassi!

123 456

Flagship Programs

GenAI Pinnacle Program| GenAI Pinnacle Plus Program| AI/ML BlackBelt Program| Agentic AI Pioneer Program

Free Courses

Generative AI| DeepSeek| OpenAI Agent SDK| LLM Applications using Prompt Engineering| DeepSeek from Scratch| Stability.AI| SSM & MAMBA| RAG Systems using LlamaIndex| Building LLMs for Code| Python| Microsoft Excel| Machine Learning| Deep Learning| Mastering Multimodal RAG| Introduction to Transformer Model| Bagging & Boosting| Loan Prediction| Time Series Forecasting| Tableau| Business Analytics| Vibe Coding in Windsurf| Model Deployment using FastAPI| Building Data Analyst AI Agent| Getting started with OpenAI o3-mini| Introduction to Transformers and Attention Mechanisms

Popular Categories

AI Agents| Generative AI| Prompt Engineering| Generative AI Application| News| Technical Guides| AI Tools| Interview Preparation| Research Papers| Success Stories| Quiz| Use Cases| Listicles

Generative AI Tools and Techniques

GANs| VAEs| Transformers| StyleGAN| Pix2Pix| Autoencoders| GPT| BERT| Word2Vec| LSTM| Attention Mechanisms| Diffusion Models| LLMs| SLMs| Encoder Decoder Models| Prompt Engineering| LangChain| LlamaIndex| RAG| Fine-tuning| LangChain AI Agent| Multimodal Models| RNNs| DCGAN| ProGAN| Text-to-Image Models| DDPM| Document Question Answering| Imagen| T5 (Text-to-Text Transfer Transformer)| Seq2seq Models| WaveNet| Attention Is All You Need (Transformer Architecture) | WindSurf| Cursor

Popular GenAI Models

Llama 4| Llama 3.1| GPT 4.5| GPT 4.1| GPT 4o| o3-mini| Sora| DeepSeek R1| DeepSeek V3| Janus Pro| Veo 2| Gemini 2.5 Pro| Gemini 2.0| Gemma 3| Claude Sonnet 3.7| Claude 3.5 Sonnet| Phi 4| Phi 3.5| Mistral Small 3.1| Mistral NeMo| Mistral-7b| Bedrock| Vertex AI| Qwen QwQ 32B| Qwen 2| Qwen 2.5 VL| Qwen Chat| Grok 3

AI Development Frameworks

n8n| LangChain| Agent SDK| A2A by Google| SmolAgents| LangGraph| CrewAI| Agno| LangFlow| AutoGen| LlamaIndex| Swarm| AutoGPT

Data Science Tools and Techniques

Python| R| SQL| Jupyter Notebooks| TensorFlow| Scikit-learn| PyTorch| Tableau| Apache Spark| Matplotlib| Seaborn| Pandas| Hadoop| Docker| Git| Keras| Apache Kafka| AWS| NLP| Random Forest| Computer Vision| Data Visualization| Data Exploration| Big Data| Common Machine Learning Algorithms| Machine Learning| Google Data Science Agent
👁 Av Logo White

Continue your learning for FREE

Forgot your password?
👁 Av Logo White

Enter OTP sent to

Edit

Wrong OTP.

Enter the OTP

Resend OTP

Resend OTP in 45s

👁 Popup Banner
👁 AI Popup Banner