Comparative Performance Analysis of Metaheuristic Feature Selection Methods for Speech Emotion Recognition
Turgut Ozseven, Mustafa Arpacioglu- Instrumentation
- Biomedical Engineering
- Control and Systems Engineering
Abstract
Emotion recognition systems from speech signals are realized with the help of acoustic or spectral features. Acoustic analysis is the extraction of digital features from speech files using digital signal processing methods. Another method is the analysis of time-frequency images of speech using image processing. The size of the features obtained by acoustic analysis is in the thousands. Therefore, classification complexity increases and causes variation in classification accuracy. In feature selection, features unrelated to emotions are extracted from the feature space and are expected to contribute to the classifier performance. Traditional feature selection methods are mostly based on statistical analysis. Another feature selection method is the use of metaheuristic algorithms to detect and remove irrelevant features from the feature set. In this study, we compare the performance of metaheuristic feature selection algorithms for speech emotion recognition. For this purpose, a comparative analysis was performed on four different datasets, eight metaheuristics and three different classifiers. The results of the analysis show that the classification accuracy increases when the feature size is reduced. For all datasets, the highest accuracy was achieved with the support vector machine. The highest accuracy for the EMO-DB, EMOVA, eNTERFACE’05 and SAVEE datasets is 88.1%, 73.8%, 73.3% and 75.7%, respectively.