4.9 out of 5
15 reviews on Udemy

Deep Reinforcement Learning 2.0

The smartest combination of Deep Q-Learning, Policy Gradient, Actor Critic, and DDPG
Instructor: Hadelin de Ponteves
474 students enrolled
English [Auto-generated]
Q-Learning
Deep Q-Learning
Policy Gradient
Actor Critic
Deep Deterministic Policy Gradient (DDPG)
Twin-Delayed DDPG (TD3)
The Foundation Techniques of Deep Reinforcement Learning
How to implement a state-of-the-art AI model that excels at the most challenging virtual applications

Welcome to Deep Reinforcement Learning 2.0!

In this course, we will learn and implement a new, powerful AI model called the Twin-Delayed DDPG (TD3), which combines state-of-the-art techniques in Artificial Intelligence, including continuous Double Deep Q-Learning, Policy Gradient, and Actor-Critic. The model is strong enough that, for the first time in our courses, we are able to solve the most challenging virtual AI applications (training an ant/spider and a half humanoid to walk and run across a field).

To approach this model the right way, we structured the course in three parts:

  • Part 1: Fundamentals
    In this part, we will study the fundamentals of Artificial Intelligence that you need to understand and master the AI of this course. These include Q-Learning, Deep Q-Learning, Policy Gradient, Actor-Critic and more.

  • Part 2: The Twin-Delayed DDPG Theory
    We will study the whole theory behind the model in depth. You will see the whole construction and training process of the AI through a series of clear visualization slides. Not only will you learn the theory in detail, but you will also build a strong intuition of how the AI learns and works. The fundamentals in Part 1, combined with the very detailed theory of Part 2, will make this highly advanced model accessible to you, and you will eventually be one of the very few people who can master this model.

  • Part 3: The Twin-Delayed DDPG Implementation
    We will implement the model from scratch, step by step, through interactive sessions, a new feature of this course that has you practice on many coding exercises while we implement the model. By doing them, you will follow the course actively rather than passively, which will effectively improve your skills. And last but not least, we will do the whole implementation on Colaboratory, or Google Colab, a free hosted notebook platform that lets you code and train AIs without installing any packages on your machine. In other words, you can be confident that when you press the execute button, the AI will start to train, and you will get the videos of the spider and humanoid running at the end.
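
To give a feel for what "Twin" means in the model named above, here is a minimal sketch of how a Twin-Delayed DDPG training target is commonly computed. It is not the course's own code: the function name `td3_target`, the toy actor and critics, and the default values for `gamma`, `policy_noise` and `noise_clip` are all assumptions chosen for illustration, and plain NumPy functions stand in for neural networks. The "Delayed" part, updating the actor less often than the critics, is not shown.

```python
import numpy as np


def td3_target(reward, next_state, done, actor_target, critic1_target,
               critic2_target, max_action=1.0, gamma=0.99,
               policy_noise=0.2, noise_clip=0.5, rng=None):
    """Return the Bellman target used to train both critics (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = actor_target(next_state)
    noise = rng.normal(0.0, policy_noise, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Twin critics: keep the smaller of the two Q-value estimates
    # to limit overestimation.
    q1 = critic1_target(next_state, next_action)
    q2 = critic2_target(next_state, next_action)
    min_q = np.minimum(q1, q2)

    # No bootstrapping from terminal transitions (done == 1).
    return reward + gamma * (1.0 - done) * min_q


# Hypothetical stand-ins for the target networks, for demonstration only.
def toy_actor(state):
    return np.tanh(state.sum(axis=1, keepdims=True))


def toy_critic_a(state, action):
    return state.sum(axis=1, keepdims=True) + action


def toy_critic_b(state, action):
    return state.sum(axis=1, keepdims=True) - action


next_states = np.ones((2, 3))          # batch of 2 states, 3 features each
rewards = np.array([[1.0], [1.0]])
dones = np.array([[0.0], [1.0]])       # second transition is terminal

targets = td3_target(rewards, next_states, dones,
                     toy_actor, toy_critic_a, toy_critic_b,
                     rng=np.random.default_rng(0))
print(targets.shape)
```

Taking the minimum of two critics is a deliberately pessimistic choice: a single critic tends to overestimate Q-values, and the actor would then learn to exploit those errors.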

Part 1 - Fundamentals

1. Welcome
2. Some resources before we start
3. Q-Learning
4. Deep Q-Learning
5. Policy Gradient
6. Actor-Critic
7. Taxonomy of AI models

Part 2 - Twin Delayed DDPG Theory

1. Introduction and Initialization
2. The Q-Learning part
3. The Policy Learning part
4. The whole training process

Part 3 - Twin Delayed DDPG Implementation

1. The whole code folder of the course with all the implementations
2. Beginning
3. Implementation - Step 1
4. Implementation - Step 2
5. Implementation - Step 3
6. Implementation - Step 4
7. Implementation - Step 5
8. Implementation - Step 6
9. Implementation - Step 7
10. Implementation - Step 8
11. Implementation - Step 9
12. Implementation - Step 10
13. Implementation - Step 11
14. Implementation - Step 12
15. Implementation - Step 13
16. Implementation - Step 14
17. Implementation - Step 15
18. Implementation - Step 16
19. Implementation - Step 17
20. Implementation - Step 18
21. Implementation - Step 19
22. Implementation - Step 20

The Final Demo!

1. Demo - Training
2. Demo - Inference

Annex 1 - Artificial Neural Networks

1. Plan of Attack
2. The Neuron
3. The Activation Function
4. How do Neural Networks Work?
5. How do Neural Networks Learn?
6. Gradient Descent
7. Stochastic Gradient Descent
8. Backpropagation

Annex 2 - Q-Learning

1. Plan of Attack
2. What is Reinforcement Learning?
3. The Bellman Equation
4. The Plan
5. Markov Decision Process
6. Policy vs Plan
7. Living Penalty
8. Q-Learning Intuition
9. Temporal Difference
10. Q-Learning Visualization

Annex 3 - Deep Q-Learning

1. Plan of Attack
2. Deep Q-Learning Intuition - Step 1
3. Deep Q-Learning Intuition - Step 2
4. Experience Replay
5. Action Selection Policies
You can view and review the lecture materials indefinitely, like an on-demand channel.
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don't have an internet connection, some instructors also let their students download course lectures. That's up to the instructor though, so make sure you get on their good side!
4.9 out of 5
15 Ratings

Detailed Rating

5 stars: 13
4 stars: 2
3 stars: 0
2 stars: 0
1 star: 0
30-Day Money-Back Guarantee

Includes

10 hours on-demand video
2 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion