Arabic Tweets Categorization Based on Rough Set Theory

Authors

Mohammed Bekkali and Abdelmonaime Lachkar, University Sidi Mohamed Ben Abdellah (USMBA), Morocco

Abstract

Twitter is a popular microblogging service where users create status messages (called “tweets”). These tweets sometimes express opinions about different topics and are presented to the user in chronological order. This format is useful because the latest tweets are rich in recent news, which is generally more interesting than tweets about an event that occurred a long time ago. However, presenting tweets only in chronological order may be overwhelming for the user, especially if he has many followers. Therefore, there is a need to separate the tweets into different categories and then present the categories to the user. Text Categorization (TC) is becoming increasingly significant, especially for the Arabic language, which is one of the most complex languages. In this paper, in order to improve the accuracy of tweet categorization, a system based on Rough Set Theory is proposed for enriching the document’s representation. The effectiveness of our system was evaluated and compared in terms of the F-measure of the Naïve Bayesian classifier and the Support Vector Machine classifier.

Keywords

Arabic Language, Text Categorization, Rough Set Theory, Twitter, Tweets.

Full Text  Volume 4, Number 11