Max-Policy Sharing for Multi-Agent Reinforcement Learning in Autonomous Mobility on Demand


Authors

Ebtehal T. Alotaibi¹,² and Michael Herrmann¹; ¹University of Edinburgh, Edinburgh, United Kingdom; ²Imam Mohammad Ibn Saud Islamic University, Saudi Arabia

Abstract

Autonomous Mobility on Demand (AMoD) systems can revolutionize urban transportation by providing mobility as a service without car ownership. However, optimizing the performance of AMoD systems is challenging because of the competing objectives of reducing customer wait times and increasing system utilization while minimizing empty miles. To address this challenge, this study uses reinforcement learning to compare the performance of max-policy sharing agents and independent learners in an AMoD system. The results demonstrate the advantages of the max-policy sharing approach in improving Quality of Service (QoS) indicators such as completed orders, empty miles, customers lost to competition, and out-of-charge events. The study highlights the importance of striking a balance between competition and cooperation among individual autonomous vehicles, and of tuning the frequency of policy sharing to avoid suboptimal policies. The findings suggest that the max-policy sharing approach has the potential to accelerate learning in multi-agent reinforcement learning systems, particularly under conditions of low exploration.

Keywords

Multi-Agents, Reinforcement Learning, Consensus Learner, Max-Policy Sharing, Autonomous Mobility on Demand.

CS&IT Conference Proceedings
