Reinforcement Learning. Instructor: Prof. Balaraman Ravindran, Department of Computer Science and Engineering, IIT Madras.

Created by
OpenCoursa

September 26, 2023

UK

FREE

This course includes:

Unlimited Duration

Badge on Completion

Certificate of completion

Last updated: September 26, 2023

Duration: Unlimited

Description

Reinforcement learning is a paradigm that aims to model the trial-and-error learning process that is needed in many problem situations where explicit instructive signals are not available. It has roots in operations research, behavioral psychology and AI. The goal of the course is to introduce the basic mathematical foundations of reinforcement learning, as well as highlight some of the recent directions of research. (from nptel.ac.in)

Course Curriculum

    • Lecture 01 – Probability Basics 1
    • Lecture 02 – Probability Basics 2
    • Lecture 03 – Linear Algebra 1
    • Lecture 04 – Linear Algebra 2
    • Lecture 05 – Introduction to RL
    • Lecture 06 – RL Framework and Applications
    • Lecture 07 – Introduction to Immediate RL
    • Lecture 08 – Bandit Optimalities
    • Lecture 09 – Value Function based Methods
    • Lecture 10 – Upper Confidence Bound 1 (UCB 1)
    • Lecture 11 – Concentration Bounds
    • Lecture 12 – UCB 1 Theorem
    • Lecture 13 – Probably Approximately Correct (PAC) Bounds
    • Lecture 14 – Median Elimination
    • Lecture 15 – Thompson Sampling
    • Lecture 16 – Policy Search
    • Lecture 17 – REINFORCE
    • Lecture 18 – Contextual Bandits
    • Lecture 19 – Full RL Introduction
    • Lecture 20 – Returns, Value Functions and Markov Decision Processes (MDPs)
    • Lecture 21 – MDP Modelling
    • Lecture 22 – Bellman Equation
    • Lecture 23 – Bellman Optimality Equation
    • Lecture 24 – Cauchy Sequence and Green’s Equation
    • Lecture 25 – Banach Fixed Point Theorem
    • Lecture 26 – Convergence Proof
    • Lecture 27 – Lpi Convergence
    • Lecture 28 – Value Iteration
    • Lecture 29 – Policy Iteration
    • Lecture 30 – Dynamic Programming
    • Lecture 31 – Monte Carlo
    • Lecture 32 – Control in Monte Carlo
    • Lecture 33 – Off Policy MC
    • Lecture 34 – UCT (Upper Confidence Bound 1 applied to Trees)
    • Lecture 35 – TD (0)
    • Lecture 36 – TD (0) Control
    • Lecture 37 – Q-Learning
    • Lecture 38 – Afterstate
    • Lecture 39 – Eligibility Traces
    • Lecture 40 – Backward View of Eligibility Traces
    • Lecture 41 – Eligibility Trace Control
    • Lecture 42 – Thompson Sampling Recap
    • Lecture 43 – Function Approximation
    • Lecture 44 – Linear Parameterization
    • Lecture 45 – State Aggregation Methods
    • Lecture 46 – Function Approximation and Eligibility Traces
    • Lecture 47 – Least-Squares Temporal Difference (LSTD) and LSTDQ
    • Lecture 48 – LSPI and Fitted Q
    • Lecture 49 – DQN and Fitted Q-Iteration
    • Lecture 50 – Policy Gradient Approach
    • Lecture 51 – Actor Critic and REINFORCE
    • Lecture 52 – REINFORCE (cont.)
    • Lecture 53 – Policy Gradient with Function Approximation
    • Lecture 54 – Hierarchical Reinforcement Learning
    • Lecture 55 – Types of Optimality
    • Lecture 56 – Semi Markov Decision Processes
    • Lecture 57 – Options
    • Lecture 58 – Learning with Options
    • Lecture 59 – Hierarchical Abstract Machines
    • Lecture 60 – MAXQ
    • Lecture 61 – MAXQ Value Function Decomposition
    • Lecture 62 – Option Discovery
    • Lecture 63 – POMDP Introduction
    • Lecture 64 – Solving POMDP

About the instructor

Instructor Rating: 5

Reviews: 6

Courses: 4637

Students: 24151

OpenCoursa
We are an educational and skills marketplace that supports skills enhancement and free, equal education for millions of people across the globe. We add new courses and trainings for our users every day. We welcome everyone, of all ages and all backgrounds, to learn. There is so much available to learn and to share with people.