Harmony Ai: Automated System for Solving Class Imbalance in Tabular Datasets

Sandanayaka, Oshani

URI: http://dlib.iit.ac.lk/xmlui/handle/123456789/3066

Date: 2025

Abstract:

Class imbalance is a major challenge in machine learning, especially in tabular datasets where critical minority classes are often underrepresented. This imbalance can produce biased models that favor the majority classes, compromising predictive accuracy and reliability, particularly where accurate minority class detection is essential. To address this issue, this thesis presents Harmony AI, an automated and adaptive system that identifies class imbalance patterns and recommends the most suitable resampling technique based on dataset characteristics. Using a trained meta-learning model and domain-agnostic feature engineering, Harmony AI dynamically selects among techniques such as SMOTE and random undersampling to optimize performance. The system is designed for ease of use, enabling users with little or no machine learning expertise to upload, process, and export balanced datasets through a user-friendly interface. Extensive testing across diverse tabular datasets demonstrates that Harmony AI consistently improves recall while maintaining high precision, outperforming manual resampling methods in most cases. The proposed framework contributes to automated machine learning (AutoML) by offering a general, efficient, and interpretable solution to class imbalance in real-world data.
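The general idea of choosing a resampling technique from dataset characteristics can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`imbalance_ratio`, `recommend_strategy`, `resample`), the thresholds, and the hand-written rule that stands in for the trained meta-learning model are all assumptions made for illustration, and plain random oversampling is used in place of SMOTE to keep the example dependency-free.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return majority class count divided by minority class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def recommend_strategy(labels, ratio_threshold=1.5, small_dataset=1000):
    """Hypothetical rule standing in for a trained meta-learning model.

    Both thresholds are illustrative assumptions, not values from the thesis.
    """
    if imbalance_ratio(labels) < ratio_threshold:
        return "none"
    # Small datasets cannot afford to discard rows, so duplicate instead.
    return "oversample" if len(labels) < small_dataset else "undersample"


def resample(rows, labels, strategy, seed=0):
    """Balance classes by random oversampling or random undersampling."""
    if strategy == "none":
        return list(rows), list(labels)
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    sizes = [len(members) for members in by_class.values()]
    target = max(sizes) if strategy == "oversample" else min(sizes)
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Draw without replacement (keeps or shrinks the class).
            chosen = rng.sample(members, target)
        else:
            # Keep every original row and add random duplicates.
            chosen = members + rng.choices(members, k=target - len(members))
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


if __name__ == "__main__":
    labels = [0] * 90 + [1] * 10
    rows = [[i] for i in range(100)]
    strategy = recommend_strategy(labels)
    new_rows, new_labels = resample(rows, labels, strategy)
    print(strategy, Counter(new_labels))
```

In a system like the one described, the hand-written rule would be replaced by a model trained on meta-features of many datasets, and the balanced output would be evaluated on recall and precision rather than on class counts alone.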
