Authors
Sheetal Dabra1, Sunil Agrawal2 and Rama Krishna Challa1, 1N.I.T.T.T.R, India and 2Punjab University, India
Abstract
The growing need for handwritten Hindi character recognition in Indian offices, such as passport and railway offices, has made it a vital area of research. Similarly shaped characters are more prone to misclassification. In this paper, four Machine Learning (ML) algorithms, namely Bayesian Network, Radial Basis Function Network (RBFN), Multilayer Perceptron (MLP), and C4.5 Decision Tree, are used for recognition of Similar Shaped Handwritten Hindi Characters (SSHHC), and their performance is compared. A novel feature set of 85 features is generated on the basis of character geometry. Because a feature vector of this dimensionality can make the classifiers computationally complex, it is reduced to 11 and 4 features using Correlation-Based (CFS) and Consistency-Based (CON) feature selection techniques, respectively. Experimental results show that Bayesian Network is the better choice when used with CFS features, while C4.5 gives better performance with CON features.
Keywords
Character Recognition, Feature Extraction, Machine Learning, Feature Selection.