GERAD seminar
Regularization and Robustness in Reinforcement Learning
Esther Derman – MILA, Canada

Robust Markov decision processes (MDPs) aim to handle changing or partially known system dynamics. Solving them typically requires robust optimization methods, which significantly increase computational complexity and limit scalability in both learning and planning. Regularized MDPs, by contrast, yield more stable policy learning without increasing time complexity, yet they generally do not account for uncertainty in the model dynamics. In this talk, I will show how robust MDPs can be learned using proper regularization, so that planning and learning in robust MDPs reduce to planning and learning in regularized MDPs.
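To make the kind of reduction the abstract refers to concrete, here is a minimal numerical sketch; it is an illustration of the general robustness-regularization equivalence, not material from the talk. Assuming (illustratively) that reward uncertainty for a single state is an L2 ball of radius alpha around nominal action rewards r0, the worst-case expected reward under a policy pi equals the nominal expected reward minus an L2 penalty on pi, so the robust backup coincides with a regularized one. All variable names and the choice of uncertainty set below are assumptions made for the example.

import numpy as np

# Illustrative assumption: reward uncertainty is an L2 ball of radius
# alpha around nominal rewards r0, for one state with four actions.
# Support-function identity: min_{||u||_2 <= alpha} <pi, u> = -alpha * ||pi||_2,
# so the worst-case (robust) backup equals a norm-regularized backup.
rng = np.random.default_rng(0)
r0 = rng.normal(size=4)                 # nominal action rewards
pi = rng.dirichlet(np.ones(4))          # stochastic policy over actions
alpha = 0.3                             # radius of the uncertainty ball

# Regularized form: nominal value minus an L2 penalty on the policy.
regularized = pi @ r0 - alpha * np.linalg.norm(pi)

# Robust form: the worst case over the ball is attained at
# u* = -alpha * pi / ||pi||_2.
u_star = -alpha * pi / np.linalg.norm(pi)
robust = pi @ (r0 + u_star)

print(regularized, robust)  # agree up to floating-point error

The same identity applied state by state turns a worst-case optimization over an uncertainty set into a closed-form regularization term, which is what removes the extra computational burden of robust planning.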
 
            
Organizers: Erick Delage and Pierre-Luc Bacon
Location
Hybrid activity at GERAD
Zoom and room 4488
Pavillon André-Aisenstadt
Campus de l'Université de Montréal
2920, chemin de la Tour
Montréal Québec H3T 1J4
Canada