《计算机应用研究》|Application Research of Computers

基于强化学习的电动车路径优化研究

Research on electric vehicle routing problem based on reinforcement learning

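Authors 胡尚民 (Hu Shangmin), 沈惠璋 (Shen Huizhang)
Affiliation Antai College of Economics & Management, Shanghai Jiao Tong University, Shanghai 200030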
Article ID 1001-3695(2020)11-006-3232-04
DOI 10.19734/j.issn.1001-3695.2019.07.0260
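Abstract This paper addresses the electric vehicle routing problem (EVRP) with constraints on total route duration, load capacity, and battery capacity, in which vehicles may stop at charging stations en route. It builds a mathematical model whose objective is to minimize total route length and proposes RL-EVRP, a solution algorithm based on reinforcement learning. The algorithm generates training data from a given distribution and trains the model with a policy gradient method; the only requirement during training is that the routes remain feasible. The trained model can then be applied to other instances drawn from the same distribution without retraining. Simulation experiments and comparisons with other algorithms show that RL-EVRP produces a shorter total route length with fewer vehicles, and that reinforcement learning can be applied successfully to fairly complex combinatorial optimization problems.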
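The abstract's requirement that routes stay feasible during training is commonly met by masking infeasible next nodes before the policy samples an action. The following is only a minimal sketch of such a mask, not the paper's actual rules. It assumes (hypothetically) that node 0 is the depot, that energy use and travel time both equal Euclidean distance, that charging is instantaneous and restores a full battery, and that all names (`feasible_mask`, `is_station`, `time_left`, and so on) are illustrative.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def feasible_mask(pos, coords, demand, served, is_station,
                  load, battery, time_left):
    """Return mask[j] == True if node j may be visited next from node pos.

    Assumptions (not from the paper): node 0 is the depot, energy use and
    travel time both equal Euclidean distance, charging takes no time.
    """
    n = len(coords)
    # Nodes where the battery can be refilled: the depot and every station.
    refill = [k for k in range(n) if k == 0 or is_station[k]]
    mask = []
    for j in range(n):
        d = dist(coords[pos], coords[j])
        # Time check: reach j, then still be able to drive straight home.
        in_time = d + dist(coords[j], coords[0]) <= time_left
        if j == pos:
            ok = False  # staying in place is never a valid move
        elif j == 0 or is_station[j]:
            # Depot or station: only need enough charge to get there.
            ok = d <= battery and in_time
        else:
            # Customer: must be unserved, fit the remaining load, and leave
            # enough charge to reach some refill point afterwards.
            to_refill = min(dist(coords[j], coords[k]) for k in refill)
            ok = (not served[j]
                  and demand[j] <= load
                  and d + to_refill <= battery
                  and in_time)
        mask.append(ok)
    return mask


# Tiny instance: depot, two customers, one charging station on a line.
coords = [(0, 0), (1, 0), (5, 0), (2, 0)]
demand = [0, 1, 1, 0]
is_station = [False, False, False, True]
served = [False, False, False, False]
print(feasible_mask(0, coords, demand, served, is_station,
                    load=1, battery=4, time_left=20))
```

In a policy gradient setup, a mask like this would typically be applied to the policy's scores (for example by setting masked entries to negative infinity before the softmax), so that only feasible nodes can be sampled. Here the far customer at (5, 0) is masked because reaching it and then a refill point would exceed the assumed battery range.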
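Keywords vehicle routing problem; electric vehicle; multi-constraint; reinforcement learning; policy gradient method; combinatorial optimization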
Article URL http://www.arocmag.com/article/01-2020-11-006.html
English title Research on electric vehicle routing problem based on reinforcement learning
Authors (English) Hu Shangmin, Shen Huizhang
Affiliation (English) Antai College of Economics & Management, Shanghai Jiao Tong University, Shanghai 200030, China
English abstract This paper studies the electric vehicle routing problem (EVRP) with constraints on time, load, and battery capacity. It considers the need to recharge in transit, constructs a mathematical model that minimizes the total route length, and proposes RL-EVRP, an algorithm based on reinforcement learning. The algorithm generates training instances sampled from a given distribution and trains a model with a policy gradient method while keeping the routes feasible. The trained model can solve other instances from the same distribution without retraining. Simulation results show that RL-EVRP obtains a shorter total route length with fewer vehicles, and that reinforcement learning can be applied successfully to complicated combinatorial optimization problems.
English keywords VRP; electric vehicle; multi-constraint; reinforcement learning; policy gradient method; combinatorial optimization
Received 2019/7/13
Revised 2019/9/4
Pages 3232-3235
CLC number TP399
Document code A