Authors
Xuemei Li, Oakland University, USA
Abstract
Perception is a critical and challenging task in autonomous driving. Current approaches rely on sophisticated pipelines built from multiple deep learning models, each addressing a distinct sub-problem, which results in a large overall model size. The proposed approach decomposes an entire driving scene into two distinct elements: the driving background, which is not drivable by vehicles, and road segments, which are drivable. It performs pointwise fusion of the disparity image and the RGB image, and it uses pointwise and depthwise convolutions to reduce the number of multiplications. It integrates the image segmentation network DeepLab V3 as its backbone and significantly reduces the model size using ResNet-18. The efficacy of the proposed neural network is validated on the Cityscapes dataset, yielding 0.979 accuracy, 0.948 precision, and a 0.947 F1-score. Furthermore, it trains five times faster than conventional UNet-based models, and its model size is eight times smaller.
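As a minimal sketch of the multiplication savings mentioned above, the following PyTorch block shows a generic depthwise-separable convolution (depthwise followed by pointwise). This is an illustration of the general technique, not the paper's exact layer design; the class name and channel sizes are hypothetical.

```python
# Illustrative only: generic depthwise-separable convolution, not the
# paper's exact architecture.
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        # Depthwise: one k x k filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        # Pointwise: 1 x 1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Multiplications per output position (example sizes are assumptions):
#   standard conv:         k*k * in_ch * out_ch
#   depthwise + pointwise: k*k * in_ch + in_ch * out_ch
k, in_ch, out_ch = 3, 64, 128
print(k * k * in_ch * out_ch)          # 73728
print(k * k * in_ch + in_ch * out_ch)  # 8768, roughly 8x fewer
```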
Keywords
Image Segmentation, Image Perception, Object Detection, UNet, DeepLab V3, ResNet-18