Max-Policy Sharing for Multi-Agent Reinforcement Learning in Autonomous Mobility on Demand

Authors

Ebtehal T. Alotaibi (1, 2) and Michael Herrmann (1); (1) University of Edinburgh, Edinburgh, United Kingdom; (2) Imam Mohammad Ibn Saud Islamic University, Saudi Arabia

Abstract

Autonomous-Mobility-on-Demand (AMoD) systems can revolutionize urban transportation by providing mobility as a service without car ownership. However, optimizing the performance of AMoD systems is challenging because its objectives compete: reducing customer wait times and increasing system utilization while minimizing empty miles. To address this challenge, this study uses reinforcement learning to compare the performance of max-policy sharing agents and independent learners in an AMoD system. The results demonstrate the advantages of the max-policy sharing approach in improving Quality of Service (QoS) indicators such as completed orders, empty miles, lost customers due to competition, and out-of-charge events. The study identifies the importance of striking a balance between competition and cooperation among individual autonomous vehicles and of tuning the frequency of policy sharing to avoid suboptimal policies. The findings suggest that the max-policy sharing approach has the potential to accelerate learning in multi-agent reinforcement learning systems, particularly under conditions of low exploration.

Keywords

Multi-Agents, Reinforcement Learning, Consensus Learner, Max-Policy Sharing, Autonomous Mobility on Demand.

Full Text  Volume 13, Number 13