Authors
Sanaz Hadipour Abkenar and Alireza Ahmadyfard, Shahrood University of Technology, Iran
Abstract
Maps convey valuable information by relating names to their positions. In this paper we present a new method for extracting text from raster maps using color space quantization. Most previous research in this field has focused on Latin text, and the results for Persian or Arabic text have been poor. In the proposed method we first apply the Mean-Shift algorithm with suitably adjusted parameters, and then apply a color transformation to prepare the map for the K-Means algorithm, which quantizes the map colors to six levels. By comparison against a threshold, the candidate text layers are then limited to three, from which the user can choose the best layer. The method is independent of the font size, orientation, and color of the text, and can find both Latin and Persian/Arabic text in maps. Experimental results show a significant improvement in Persian text extraction.
Keywords
Color space conversion, K-Means clustering, Mean-Shift algorithm, Quantization, Text extraction.
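The quantization and candidate-selection steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Mean-Shift smoothing and the color transformation are omitted (the input pixels are assumed to be already preprocessed), the function names `kmeans_quantize` and `candidate_layers` are hypothetical, and the thresholding rule used here (keep the layers that cover the smallest share of pixels, below an assumed `max_share`) is only a stand-in, since the abstract does not state which quantity is compared to the threshold.

```python
import random


def _nearest(pixel, centers):
    """Index of the center closest to pixel (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, centers[c])),
    )


def kmeans_quantize(pixels, k=6, iters=20, seed=0):
    """Quantize a list of color tuples to k levels with plain K-Means.

    Requires at least k distinct colors in `pixels`.
    Returns (labels, centers): one cluster index per pixel, one color per cluster.
    """
    rng = random.Random(seed)
    # Start from k distinct colors drawn from the image.
    centers = rng.sample(sorted(set(pixels)), k)
    for _ in range(iters):
        labels = [_nearest(p, centers) for p in pixels]
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # leave an empty cluster's center unchanged
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    # Final assignment so that labels match the returned centers.
    labels = [_nearest(p, centers) for p in pixels]
    return labels, centers


def candidate_layers(labels, k=6, max_share=0.3, keep=3):
    """Hypothetical filter: keep up to `keep` layers with the smallest pixel share.

    Assumption (not from the paper): text covers a small fraction of the map,
    so layers whose share exceeds `max_share` are discarded.
    """
    total = len(labels)
    shares = {c: labels.count(c) / total for c in range(k)}
    small = [c for c in range(k) if 0 < shares[c] <= max_share]
    return sorted(small, key=lambda c: shares[c])[:keep]
```

Given a flat list of color tuples, `kmeans_quantize(pixels)` returns one label per pixel, and `candidate_layers(labels)` returns the indices of at most three layers that a user could then inspect to pick the text layer.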