Deep Reinforcement Learning Fused to Model Predictive Control

Description
In reinforcement learning (RL), model-free methods train an agent to behave appropriately in its environment through rewards and penalties based on its actions. When the RL algorithm uses a deep neural network to represent the value function and/or the action-value function, it is referred to as deep reinforcement learning (DRL). Model predictive control (MPC), on the other hand, solves an optimization problem over a prediction horizon subject to a set of constraints, which can be used to select the best action for the agent in anticipation of future events. By integrating the two into a single framework, the agent can, for example, learn the best (action-)value function while minimizing the time or path length needed to reach a goal. In addition, integrating the dynamical model of the agent into the DRL through the MPC allows the framework to include constraints on the agent's states and actions. The expected application is motion and path planning for an autonomous car that avoids obstacles while respecting the constraints of the vehicle.
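The idea of selecting an action over a prediction horizon with a learned value function can be illustrated with a minimal sketch. It is not the project's implementation: the 1-D integrator model, the reward, the state bounds, the discrete action set, and the hand-written `value_fn` (a stand-in for a trained deep network) are all assumptions made for illustration, and the optimization is a brute-force search rather than a real MPC solver.

```python
import itertools

# Hypothetical 1-D example: the state is a position, actions are velocity commands.
ACTIONS = (-1.0, 0.0, 1.0)
GOAL = 5.0
X_MIN, X_MAX = 0.0, 6.0  # assumed state constraints


def dynamics(x, u, dt=1.0):
    """Assumed known model of the agent: a simple integrator."""
    return x + dt * u


def stage_reward(x_next, u):
    """Assumed reward: penalize distance to the goal and control effort."""
    return -abs(x_next - GOAL) - 0.1 * abs(u)


def value_fn(x):
    """Stand-in for a learned value function (a deep network in DRL)."""
    return -abs(x - GOAL)


def mpc_action(x0, horizon=3, gamma=0.95):
    """Return the first action of the best feasible action sequence.

    Every action sequence of length `horizon` is rolled out through the
    model; sequences that violate the state constraints are discarded, and
    the value function scores the state reached at the end of the horizon.
    """
    best_return, best_first = float("-inf"), None
    for seq in itertools.product(ACTIONS, repeat=horizon):
        x, total, feasible = x0, 0.0, True
        for k, u in enumerate(seq):
            x = dynamics(x, u)
            if not (X_MIN <= x <= X_MAX):
                feasible = False  # constraint violated, drop this sequence
                break
            total += gamma**k * stage_reward(x, u)
        if feasible:
            total += gamma**horizon * value_fn(x)  # terminal value estimate
            if total > best_return:
                best_return, best_first = total, seq[0]
    return best_first
```

In this toy setup, `mpc_action(0.0)` returns `1.0` (move toward the goal) and `mpc_action(6.0)` returns `-1.0`, since moving further right would leave the allowed state range. Only the first action of the best sequence is applied, and the search is repeated at the next step, which is the receding-horizon principle of MPC.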
Goals
- To integrate a deep reinforcement learning algorithm, e.g. DQN, with a model predictive controller to select the best action based on a prediction horizon.
- To study efficient heuristics for the optimization of non-linear cost functions, such as those represented by deep neural networks.
- To investigate the exploration vs. exploitation trade-off, possibly replacing the ε-greedy policy with the MPC.
- To develop convergence proofs for the integrated framework.
- To extend the DRL-MPC to include constraints for the states and actions of the agent.
- To implement an application to motion planning and path planning for autonomous driving systems.
References
- Roza, F. S.; Azizpour, M.; Bajcinca, N. End-to-End Autonomous Driving Controller Using Semantic Segmentation and Variational Autoencoder. International Conference on Control, Decision and Information Technologies (CoDIT 20), Prague, July 2020.
Keywords
Reinforcement learning
Deep learning
Model Predictive Control
Autonomous driving
Contact
M.Sc. Matheus Pedrosa
Gottlieb-Daimler-Str. 42
67663, Kaiserslautern
matheus.pedrosa(at)mv.uni-kl.de
Funding
State of Rhineland-Palatinate
Time span
Since 2018