Navigating Electric Vehicles Along a Signalized Corridor via Reinforcement Learning: Toward Adaptive Eco-Driving Control

Transportation Research Record, Volume 2676, Issue 8, pp. 657-669, 2022
Jian Zhang1,2,3,4, Xia Jiang1,2,4, Suping Cui3, Can Yang3, Bin Ran5,2
1Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing, Jiangsu, China
2Research Center for Internet of Mobility, Southeast University, Nanjing, Jiangsu, China
3School of Engineering, Tibet University, Lhasa, Tibet, China
4School of Transportation, Southeast University, Nanjing, Jiangsu, China
5Department of Civil and Environmental Engineering, University of Wisconsin–Madison, Madison, WI

Abstract

One challenge in operating electric vehicles (EVs) is their limited battery capacity, which constrains driving range; growing electricity consumption also adds to the economic and environmental costs of vehicle operation. To save energy, this paper proposes an adaptive eco-driving method for signalized corridors. The adaptive, real-time control framework is implemented with reinforcement learning. First, the operation of EVs in the proximity of intersections is formulated as a Markov decision process (MDP) so that the twin delayed deep deterministic policy gradient (TD3) algorithm can be applied to handle the continuous action space; the vehicle's speed can therefore be adjusted continuously. Second, a comprehensive reward function is designed for the MDP that accounts for safety, traffic mobility, energy consumption, and comfort. Third, the simulation study uses Aoti Street in Nanjing, a corridor with several consecutive signalized intersections, as the research road network, and the MDP state representation incorporates information from consecutive downstream traffic signals. After parameter tuning, simulations are carried out for three typical eco-driving scenarios: free flow, car following, and congested flow. Compared with the default car-following behavior in the simulation platform SUMO and several state-of-the-art deep reinforcement learning algorithms, the proposed strategy shows balanced and stable performance.
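A reward of the kind described in the abstract typically combines weighted terms for each objective. The sketch below is purely illustrative, not the paper's actual formulation: the weights, the time-to-collision threshold, and the per-term definitions are all assumptions chosen for clarity.

```python
# Hypothetical composite eco-driving reward combining safety, mobility,
# energy consumption, and comfort, as the abstract describes.
# All weights and term definitions are illustrative assumptions,
# not the authors' actual design.

def eco_reward(ttc, speed, speed_limit, energy_kwh, jerk,
               w_safe=1.0, w_mob=0.5, w_energy=0.8, w_comfort=0.3):
    """Return a scalar reward for one control step."""
    # Safety: penalize a short time-to-collision (TTC) with the leader.
    r_safe = -1.0 if ttc < 3.0 else 0.0
    # Mobility: reward speeds close to the posted limit.
    r_mob = speed / speed_limit
    # Energy: penalize electricity consumed during this step.
    r_energy = -energy_kwh
    # Comfort: penalize large jerk (rate of change of acceleration).
    r_comfort = -abs(jerk)
    return (w_safe * r_safe + w_mob * r_mob
            + w_energy * r_energy + w_comfort * r_comfort)

# Example step: safe gap, half the speed limit, small energy use,
# moderate jerk -> 0.5*0.5 - 0.8*0.01 - 0.3*0.5 = 0.092
print(round(eco_reward(ttc=5.0, speed=10.0, speed_limit=20.0,
                       energy_kwh=0.01, jerk=0.5), 3))
```

In a TD3 setup such a scalar would be returned by the environment at every simulation step, and the weights would be tuned so that no single objective dominates the learned policy.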
