Abstract:
"In today's digital age, music streaming has become a common form of entertainment. However,
current recommendation systems frequently overlook users' real-time emotional states,
resulting in a less personalized experience. This project solves this issue by creating an
advanced music recommendation system that uses Vision Transformers (ViTs) to evaluate
users' emotions through visuals and interfaces with Spotify's API to generate playlists based
on the user's current mood.
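As a minimal sketch of the playlist-generation step, the snippet below uses the spotipy client library to request tracks whose audio features match a detected mood. The emotion-to-audio-feature mapping (MOOD_TARGETS) and the seed genre are illustrative assumptions, not the mapping used in this project; credentials are read from the standard SPOTIPY_* environment variables.

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Hypothetical mapping from detected emotion to Spotify audio-feature targets.
MOOD_TARGETS = {
    "happy":   {"target_valence": 0.9, "target_energy": 0.8},
    "sad":     {"target_valence": 0.2, "target_energy": 0.3},
    "angry":   {"target_valence": 0.3, "target_energy": 0.9},
    "neutral": {"target_valence": 0.5, "target_energy": 0.5},
}

def recommend_tracks(emotion: str, limit: int = 20) -> list[str]:
    """Return track names recommended for the detected emotion."""
    sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())
    params = MOOD_TARGETS.get(emotion, MOOD_TARGETS["neutral"])
    results = sp.recommendations(seed_genres=["pop"], limit=limit, **params)
    return [track["name"] for track in results["tracks"]]
```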
The technical core of this work is a novel application of ViTs, which are more commonly employed in general image-classification tasks, to emotion recognition. The approach treats the facial expression in a user-provided photo as a sequence of image patches and uses the transformer's self-attention mechanism to model the contextual relationships between those patches. Trained on a large dataset of labeled facial expressions, the ViT produces accurate predictions of a user's emotional state.
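A minimal sketch of this inference step, assuming a Hugging Face transformers ViT checkpoint fine-tuned on a facial-expression dataset; the checkpoint name below is a publicly available placeholder, not the model trained in this project.

```python
from PIL import Image
import torch
from transformers import ViTForImageClassification, ViTImageProcessor

# Placeholder checkpoint; the project's own fine-tuned weights would go here.
CHECKPOINT = "trpakov/vit-face-expression"

processor = ViTImageProcessor.from_pretrained(CHECKPOINT)
model = ViTForImageClassification.from_pretrained(CHECKPOINT)

def predict_emotion(image_path: str) -> str:
    """Split the face image into patches and classify the overall expression."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")  # patchify + normalize
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[logits.argmax(-1).item()]
```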
Following deployment, the emotion-detection model underwent extensive testing to assess its efficacy. Evaluated with standard classification metrics, the system achieved an accuracy of 86%. This result demonstrates the model's ability to discern emotions and highlights the promise of Vision Transformers for tailored music recommendation. The confusion matrix, a key tool in this evaluation, provided further insight into the model's precision and recall across the individual emotion classes, confirming its reliability. This testing phase was critical for demonstrating the system's potential to transform music recommendation services by offering a more dynamic, intuitive, and individualized way of matching music to listeners' emotional states.
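As a hedged illustration of how accuracy, the confusion matrix, and per-class precision/recall might be computed, the sketch below uses scikit-learn; the label set and the true/predicted arrays are illustrative assumptions, not the project's test data.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Illustrative labels and predictions; the project's actual test set is not shown here.
LABELS = ["happy", "sad", "angry", "neutral"]
y_true = ["happy", "sad", "angry", "neutral", "happy", "sad"]
y_pred = ["happy", "sad", "angry", "happy",   "happy", "sad"]

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2%}")
print(confusion_matrix(y_true, y_pred, labels=LABELS))   # rows = true, cols = predicted
print(classification_report(y_true, y_pred, labels=LABELS))  # per-class precision/recall/F1
```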