Abstract:
"This study examines Environmental Sound Classification (ESC), a fast-developing field with important implications for urban planning and environmental monitoring. Its core contribution is the introduction and evaluation of a novel wavelet-based audio augmentation technique for the ESC domain. The technique improves the accuracy and robustness of ESC models in the presence of noise and provides a foundation for future research.
Furthermore, the study explores robust feature combinations and demonstrates their potential to improve model resilience against background noise. It finds that wavelet-based features show promising gains in the accuracy of classic machine learning models compared with the same models trained without augmented data. These results suggest that the proposed wavelet-based data augmentation process could also increase the accuracy of deep learning models, addressing both the data scarcity and noise robustness challenges that are central to advancing ESC.
In addition, the study establishes a simple yet accurate classic benchmark model that can serve as a reference for future research, especially work that aims to refine pre-processing, augmentation, and feature extraction techniques in ESC."