Empowering Communication: The Evolution and Potential of Speech Recognition System

Authors
Tiwari, Vaibhav
Issue Date
2024
Degree
MSc Data Analytics
Publisher
Dublin Business School
Rights
Abstract
This project explores the classification of emotions in speech data using neural network models. It aims to build robust models that correctly identify emotions such as happiness, sadness, anger, fear, and disgust from audio clips. The process involves data collection, data cleaning, feature extraction using Mel-frequency Cepstral Coefficients (MFCCs), and the creation and assessment of three distinct neural network architectures. The first model, a sequential model with dense layers, serves as a baseline against which the subsequent models are compared. The second model, which uses an upgraded dataset, performed better than the first. The third model, which adds an LSTM layer, produced mixed results compared with the models that use only dense layers. The results underline the importance of varied data in improving the accuracy of emotion identification from speech. This research contributes to the field of neural network-based emotion classification and lays a foundation for applications in areas such as mental health monitoring and human-computer interaction.
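
As a rough illustration of the dense-layer approach described in the abstract, the following minimal sketch passes one MFCC feature vector through a hidden dense layer and a softmax output over the five emotions named above. It is not the author's implementation: the layer sizes, the choice of 13 coefficients, the random untrained weights, the synthetic input vector, and all function names are assumptions made only for this example. In practice the features would come from an audio library and the weights from training.

```python
import math
import random

EMOTIONS = ["happiness", "sadness", "anger", "fear", "disgust"]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero biases (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def classify(features, hidden, output):
    """Forward pass: dense + ReLU, then dense + softmax over the emotions."""
    activations = relu(dense(features, *hidden))
    probs = softmax(dense(activations, *output))
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs


rng = random.Random(0)
N_MFCC = 13  # assumed number of MFCC coefficients per clip
hidden_layer = init_layer(N_MFCC, 8, rng)  # assumed hidden size of 8
output_layer = init_layer(8, len(EMOTIONS), rng)

# Synthetic stand-in for the time-averaged MFCCs of one audio clip.
clip_features = [rng.gauss(0.0, 1.0) for _ in range(N_MFCC)]

label, probabilities = classify(clip_features, hidden_layer, output_layer)
print(label, [round(p, 3) for p in probabilities])
```

Because the weights are random, the predicted label carries no meaning here; the sketch only shows how a fixed-length MFCC vector maps to a probability for each emotion class.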