Synthetical Enlargement of MFCC Based Training Sets for Emotion Recognition

Authors

Inma Mohino-Herranz¹, Roberto Gil-Pita¹, Sagrario Alonso-Diaz² and Manuel Rosa-Zurera¹, ¹University of Alcala, Spain and ²Technological Institute "La Marañosa", Spain

Abstract

Emotional state recognition from speech is currently a very active research topic. Using the subliminal information carried by speech, it is possible to recognize the emotional state of the speaker. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This makes the learning process more difficult, because of the generalization problems that arise under these conditions. In this work we propose a solution to this problem that consists of enlarging the training set through the creation of new virtual patterns. In emotional speech, most of the emotional information is carried by speed and pitch variations, so a change in the average pitch that modifies neither the speed nor the pitch variations does not affect the expressed emotion. We use this prior information to create new patterns by applying a pitch shift modification in the feature extraction process of the classification system. For this purpose, we propose a frequency scaling modification of the Mel Frequency Cepstral Coefficients (MFCCs) used to classify the emotion. The proposed process allows us to synthetically increase the number of available patterns in the training set, thus increasing the generalization capability of the system and reducing the test error.
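
The frequency scaling idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the code below assumes that the scaling is applied by multiplying the edge frequencies of a triangular mel filterbank by a factor `alpha`. The function names, the filterbank parameters (`n_filters`, `n_coeffs`, `f_max`) and the set of factors in `alphas` are all hypothetical choices made for illustration. Under this assumption, a filterbank scaled by `alpha` approximates the MFCCs of a signal whose frequency axis was scaled by `1/alpha`, so each frame yields several virtual patterns.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, but dependency-free)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append(re * re + im * im)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate, f_max, alpha=1.0):
    """Triangular mel filters; edge frequencies (in Hz) are multiplied by alpha.

    Assumption: alpha * f_max stays below the Nyquist frequency.
    """
    top = hz_to_mel(f_max)
    edges = [alpha * mel_to_hz(top * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bin_hz = sample_rate / n_fft
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12, f_max=4000.0, alpha=1.0):
    """MFCCs of one frame: power spectrum, mel filterbank, log, DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate, f_max, alpha)
    # Small floor avoids log(0) for filters that capture no energy.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

def enlarge(frame, sample_rate, alphas=(0.9, 1.0, 1.1)):
    """Return one MFCC vector per scaling factor (hypothetical factors)."""
    return [mfcc(frame, sample_rate, alpha=a) for a in alphas]
```

For example, `enlarge(frame, 16000)` on a 256-sample frame returns three 12-coefficient vectors, one of which (`alpha = 1.0`) is the unmodified pattern. All virtual patterns derived from a frame would keep the emotion label of the original utterance, which is the prior knowledge the abstract relies on.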

Keywords

Enlarged training set, MFCC, emotion recognition, pitch analysis

Full Text  Volume 4, Number 1