Abstract:
"Seamless access to online entertainment platforms has been immensely improved over the
past decade with the advancement of information technology. Also, people now have access to online entertainment platforms at any time from everywhere, and they are used to
browsing and listening to music content from these platforms. However, searching for music content, based on their music preferences, is a hot topic nowadays. Listeners and content creators are struggling when using these platforms because of the abundance of music content are added day-by-day, most of the time people are stickler to a particular music genre category and sometimes they would like to hear different genres according to the situation they are at. And entertainment platforms should have efficient algorithms to label music content by analysing their features to tag them, to be searched by the platform users to provide a better experience. So, many researchers have come up with different approaches such as, analysing lyrics with natural language processing techniques, audio analysis using signal processing techniques, and even creating hybrid models using both audio and textual content of the music to classify their genres.
Multi-modal learning has also become a popular topic, because it overcomes the difficulty of training different types of data that cannot be handled by a single deep learning model, and because different data types contribute different features that can improve the performance of deep learning models. This study therefore develops a multi-modal, multi-class classifier that processes the audio content of songs with audio processing techniques and the lyrics with a Deep Neural Network that uses contextual embeddings to represent the text, in order to produce more accurate results. The ultimate aim is a system that can classify music by genre more accurately using both textual and audio features.
"