Seminars


Stochastic Gradient and Stochastic Approximation: Applied to Q-learning

The project is motivated by the goal of proving the convergence of Q-learning, an algorithm applied to finite, discrete-time Markov Decision Processes in which the model is not fully known. The algorithm seeks to solve the optimality equations (or Bellman equations). With this purpose in mind, the project discusses four main topics:

Discrete-time finite Markov Decision Processes, which are the model we are interested in from the beginning.
Stochastic approximation (SA), which serves as the general framework for many algorithms, including Q-learning. Under some assumptions, we establish the convergence of SA.
Stochastic gradient descent, which is the main tool for establishing the convergence of the SA algorithm (and of many Machine Learning algorithms).
Reinforcement Learning, which is the branch to which the Q-learning algorithm belongs. We treat this algorithm as a particular case of SA.
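The view of Q-learning as a particular case of stochastic approximation can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the seminar: the two-state, two-action MDP (the tables P and R), the discount factor GAMMA, the uniformly random behavior policy, and the step size n^(-0.7), chosen because it satisfies the usual Robbins-Monro conditions (the step sizes sum to infinity while their squares have a finite sum). The learner only sees sampled transitions; the function q_star uses the model directly, purely as a reference solution of the Bellman optimality equations.

```python
import random

# Hypothetical MDP with 2 states and 2 actions (illustrative values only).
# P[s][a] is the probability of moving to state 1; otherwise the next state is 0.
P = [[0.2, 0.8], [0.5, 0.1]]
# R[s][a] is the deterministic reward for taking action a in state s.
R = [[0.0, 1.0], [0.5, 0.2]]
GAMMA = 0.5  # discount factor (assumed)


def q_star(iters=200):
    """Reference solution: iterate the Bellman optimality operator on Q."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(iters):
        q = [[R[s][a] + GAMMA * ((1 - P[s][a]) * max(q[0]) + P[s][a] * max(q[1]))
              for a in range(2)] for s in range(2)]
    return q


def q_learning(steps=100_000, seed=0):
    """Tabular Q-learning written as a stochastic approximation iteration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    visits = [[0, 0], [0, 0]]
    s = 0
    for _ in range(steps):
        a = rng.randrange(2)  # uniformly random behavior policy
        s_next = 1 if rng.random() < P[s][a] else 0  # sampled transition
        visits[s][a] += 1
        alpha = visits[s][a] ** -0.7  # Robbins-Monro step size
        # Noisy sample of the Bellman optimality operator at (s, a).
        target = R[s][a] + GAMMA * max(q[s_next])
        # SA update: move the estimate toward the noisy target.
        q[s][a] += alpha * (target - q[s][a])
        s = s_next
    return q


if __name__ == "__main__":
    q_hat, q_ref = q_learning(), q_star()
    err = max(abs(q_hat[s][a] - q_ref[s][a]) for s in range(2) for a in range(2))
    print("max |Q_learned - Q*| =", err)
```

The update line has the generic SA form "estimate plus step size times (noisy observation minus estimate)", which is why convergence results for SA carry over to Q-learning under suitable assumptions.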

The project also covers applications to complete Markov Decision Processes and methods for finding optimal strategies in games of chance.

Details:

Speaker:

José Sebastian Ñungo Manrrique

Date:

August 27, 2020
