Authors
Hyun Woo Jung, Hankuk Academy of Foreign Studies, Republic of Korea
Abstract
Deep learning has facilitated major advancements in various fields, including image detection. This paper is an exploratory study on improving the performance of Convolutional Neural Network (CNN) models in environments with limited computing resources, such as the Raspberry Pi. YOLO (“You-Only-Look-Once”), a pretrained state-of-the-art CNN model for near-real-time object detection in videos, was selected to evaluate strategies for optimizing runtime performance. Performance analysis tools provided by the Linux kernel were used to measure CPU time and memory footprint. Our results show that loop parallelization, static compilation of weights, and flattening of convolution layers reduce total runtime by 85% and memory footprint by 53% on a Raspberry Pi 3 device. These findings suggest that the methodological improvements proposed in this work can reduce the computational cost of running CNN models on devices with limited computing resources.
Keywords
Deep Learning, Convolutional Neural Networks, Raspberry Pi, Real-Time Object Detection