Seminars


Policy evaluation under Markovian noise using the Online Bootstrap Inference algorithm

Policy evaluation in Reinforcement Learning is studied in high-dimensional settings or under uncertainty. In these settings, the value function of the policy being evaluated is approximated linearly, and the approximation is computed by linear stochastic approximation with Markovian noise. The classical methods, Temporal Difference (TD) and Gradient Temporal Difference (GTD) learning, are inefficient at estimating the value function. We therefore study the alternative offered by the Online Bootstrap Inference algorithm, which promises to improve on the existing methods.
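
As a point of reference for the classical baseline mentioned above, the following is a minimal sketch of TD(0) with a linear value approximation, driven by a single trajectory so that successive samples are correlated (Markovian noise). It is not the Online Bootstrap Inference algorithm from the talk. The 3-state chain, rewards, discount factor, one-hot features, step size, and tail averaging are all hypothetical choices made only for illustration.

```python
import random

# Hypothetical 3-state Markov reward process induced by a fixed policy
# (all numbers below are illustrative assumptions, not from the seminar).
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]          # transition probabilities under the policy
R = [0.0, 1.0, 2.0]            # reward received in each state
GAMMA = 0.8                    # discount factor
# One-hot features: the simplest linear approximation, V(s) = phi(s) . theta
PHI = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def next_state(s, rng):
    """Sample the successor of state s from row s of P."""
    u, acc = rng.random(), 0.0
    for s_next, p in enumerate(P[s]):
        acc += p
        if u < acc:
            return s_next
    return len(P) - 1


def td0_linear(n_steps=200_000, alpha=0.01, seed=0):
    """TD(0) along one trajectory; returns the average of the last half of the iterates."""
    rng = random.Random(seed)
    d = len(PHI[0])
    theta = [0.0] * d
    theta_avg = [0.0] * d
    n_avg = 0
    s = 0
    for t in range(n_steps):
        s_next = next_state(s, rng)
        # Temporal-difference error for the observed transition (s, r, s_next)
        delta = R[s] + GAMMA * dot(PHI[s_next], theta) - dot(PHI[s], theta)
        theta = [th + alpha * delta * f for th, f in zip(theta, PHI[s])]
        if t >= n_steps // 2:
            # Running mean of the iterates over the second half of the run
            n_avg += 1
            theta_avg = [a + (th - a) / n_avg for a, th in zip(theta_avg, theta)]
        s = s_next
    return theta_avg


def true_values(n_iter=500):
    """Reference values from iterating V = R + GAMMA * P V."""
    v = [0.0] * len(P)
    for _ in range(n_iter):
        v = [R[s] + GAMMA * dot(P[s], v) for s in range(len(P))]
    return v


if __name__ == "__main__":
    theta = td0_linear()
    for s, v in enumerate(true_values()):
        print(s, round(dot(PHI[s], theta), 3), round(v, 3))
```

Because the features here are one-hot, the true value function is exactly representable and the estimate can be compared against it directly; with fewer features than states, TD(0) would instead approach the best fit in a projected sense.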

Details:

Speaker:

Ana Maria Patron Piñerez

Date:

August 10, 2023
