Center for Social Information Sciences (CSIS) Seminar
Baxter 125
Finite Time Bounds for Robust Reinforcement Learning with Linear Function Approximation
Yashaswini Murthy,
Postdoctoral Scholar Research Associate in Computing & Mathematical Sciences,
Caltech
Abstract: Robust reinforcement learning (RL) focuses on designing optimal policies from data for Markov decision processes (MDPs) with model uncertainties. Existing convergence guarantees for robust RL are either limited to tabular settings or rely on restrictive assumptions in the function approximation setting. We will present an RL algorithm for learning the optimal policy from data in the function approximation setting and provide finite-time sample complexity bounds without requiring generative access to the underlying MDP model. Our algorithm combines ideas from distributionally robust optimization (DRO), two-time-scale stochastic approximation, and traditional (non-robust) fitted value iteration and Q-learning.
For more information, please contact Letty Diaz by phone at 626-395-1255 or by email at [email protected].