Authors
Sara Al-Osimi¹ and Ghada Badr², ¹Shaqra University, KSA and ²The City of Scientific Research and Technological Applications, Egypt
Abstract
Many studies use different data mining techniques to analyze mass spectrometry data and extract useful knowledge about biomarkers. These biomarkers allow medical experts to determine whether an individual has a disease or not. Some of these studies have proposed models that obtain high accuracy. However, the black-box nature and complexity of the proposed models pose significant issues. Thus, to address this problem and build an accurate model, we combine a genetic algorithm for feature selection with a rule-based classifier, namely the Genetic Rule-Based Classifier algorithm for Mass Spectra data (GRC-MS). According to the literature, rule-based classifiers provide understandable rules but are not accurate. In addition, genetic algorithms have achieved excellent results when used with different classifiers for feature selection. Experiments are conducted on a real dataset, and the proposed classifier GRC-MS achieves 99.7% accuracy. In addition, the generated rules are more understandable than those of other classifier models.
Keywords
Mass spectrometry, data mining, biomarkers, rule-based classifier, genetic algorithm.