Zhuo Li (李卓), Postdoctoral Fellow: RL-based Trajectory Planning for Efficient Exploration of Spatio-Temporal Fields
Academy of Mathematics and Systems Science, CAS Colloquia & Seminars
Speaker:
Zhuo Li (李卓), Postdoctoral Fellow, Beijing Institute of Technology
Inviter:
Title:
RL-based Trajectory Planning for Efficient Exploration of Spatio-Temporal Fields
Time & Venue:
2022.11.08 20:30-21:00, Tencent Meeting: 470-224-494
Abstract:
This talk concerns trajectory planning for an autonomous vehicle exploring a field, where the vehicle must accumulate sufficient information on a spatio-temporal field and reach a desired position as fast as possible. Because the cumulative-information requirement is a functional constraint, general optimization algorithms cannot be applied directly. To solve the problem, this work proposes a reinforcement learning (RL)-based method and trains a continuous planning policy that does not depend on any prior knowledge of the field. We adopt the observability constant as the information measure to express the constraint of sufficient information along the vehicle's trajectory, and prove the existence of an optimal solution of the constrained minimum-time trajectory planning problem under mild conditions. The problem is then reformulated as a Markov decision process (MDP). An auxiliary term, built from approximations of the field, is added to the reward function; it increases the reward density and accelerates the learning of an optimal policy. The MDP is solved with the soft actor-critic algorithm. Simulations validate the effectiveness of the proposed method.
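As a rough illustration of the reward design described in the abstract, the following is a minimal sketch and not the speaker's formulation. It assumes a per-step time penalty for the minimum-time objective, a terminal bonus granted only when the goal is reached with enough accumulated information, and an auxiliary dense term proportional to an approximate information gain. All names (`RewardConfig`, `step_reward`, `approx_info_gain`) and all numeric values are hypothetical, and the information gain is passed in as a plain number rather than computed from the observability constant.

```python
import math
from dataclasses import dataclass


@dataclass
class RewardConfig:
    # All values below are illustrative assumptions, not taken from the talk.
    dt: float = 0.1             # duration of one planning step
    info_required: float = 1.0  # cumulative information the vehicle must collect
    goal_radius: float = 0.5    # distance at which the goal counts as reached
    goal_bonus: float = 10.0    # terminal reward for a valid arrival
    aux_weight: float = 0.5     # weight of the auxiliary (dense) reward term


def step_reward(pos, goal, info_so_far, approx_info_gain, cfg):
    """Return (reward, done) for one step of a hypothetical exploration MDP."""
    # Time penalty: every step costs dt, which encourages fast arrival.
    reward = -cfg.dt

    # Auxiliary term: reward estimated information gain (from some field
    # approximation) only while the information constraint is still unmet.
    if info_so_far < cfg.info_required:
        reward += cfg.aux_weight * approx_info_gain

    # The episode ends successfully only if the vehicle is at the goal
    # and the cumulative-information constraint is satisfied.
    at_goal = math.dist(pos, goal) <= cfg.goal_radius
    done = at_goal and info_so_far >= cfg.info_required
    if done:
        reward += cfg.goal_bonus
    return reward, done
```

Under these assumptions, the auxiliary term gives the agent a learning signal at every step instead of only at the end of an episode, which is one common way to raise reward density. A function of this shape could be called from the step method of an environment that is then trained with any soft actor-critic implementation.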