Supervisor

Vikas Tomer

Programme

MSc in Data Analytics

Subject

Computer Science

Abstract

This research investigates Music Genre Classification (MGC) with Convolutional Neural Networks (CNNs) trained on several input datasets, including raw audio files (WAV) and extracted features such as Mel Spectrograms (MS), Mel-Frequency Cepstral Coefficients (MFCC), and Chroma Features (CF). The study follows an Explanatory Sequential Mixed Methods (ESMM) design, combining qualitative research with experimental analysis to compare model inputs and their performance. Several CNN-based models were tested, including 2D CNN, 2D CNN-LSTM, 1D CNN, and 1D CNN-LSTM. However, the models generally underperformed: most achieved an accuracy of 10% or lower, and the best model (a 1D CNN on raw audio) reached only 20%. The research discusses troubleshooting, model limitations, potential reasons for the low performance compared to related works, and recommendations for future work.

Date of Award

2024

Full Publication Date

2024

Access Rights

open access

Document Type

Capstone Project

Resource Type

thesis
