Abstract:
The increasing demand for toddler-oriented online education has exposed a shortage of web-based platforms that integrate interactive learning with speech recognition. Most existing solutions are mobile applications, which often lack the engagement and accessibility that web-based platforms can provide. This project presents an end-to-end web-based e-learning system designed specifically for toddlers. It supports the learning of basic concepts such as numbers, colours, letters, and shapes, and offers progress-tracking features for parents and teachers.
To enable accurate speech recognition for toddler users, a Convolutional Neural Network (CNN) was designed and trained on a personalised dataset of toddler speech samples. Audio data were pre-processed by extracting Mel-Frequency Cepstral Coefficients (MFCCs), which serve as the spectral features fed to the network. After multiple experimental iterations, the final CNN architecture comprised convolutional and pooling layers followed by fully connected layers with dropout for regularisation. The model was optimised to classify speech inputs into predefined learning categories, enabling real-time interactive learning experiences.
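The MFCC pre-processing step mentioned above can be illustrated with a minimal, standard-library-only sketch that computes the coefficients for a single audio frame. It is a didactic approximation rather than the project's actual pipeline: the sample rate (16 kHz), frame length (256 samples), number of mel filters (20), number of coefficients (13), and all function names are assumptions made for illustration, and a real system would normally rely on an optimised audio library.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (adequate for one short frame)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate=16000, n_filters=20, n_coeffs=13):
    """MFCCs for one frame: power spectrum -> mel energies -> log -> DCT-II."""
    power = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies so the logarithm is always defined.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]


# Illustrative input: a synthetic 440 Hz tone, not real toddler speech.
frame = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(256)]
coeffs = mfcc_frame(frame)
```

Stacking such coefficient vectors over successive frames yields a two-dimensional time-by-coefficient array, which is the kind of input a convolutional network can treat like an image.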
The evaluation results show a classification accuracy of 90.02% and an AUC-ROC score of 0.89, indicating strong performance in toddler speech recognition. Confusion matrix analysis revealed high sensitivity and specificity, with precision and recall values consistently exceeding 88%. These results support the effectiveness of the proposed model for speech-based learning and highlight its potential for integration into intelligent, interactive, web-based educational platforms for early childhood learning.
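For readers unfamiliar with how such figures are derived, the sketch below shows one common way to obtain accuracy and per-class precision, recall (sensitivity), and specificity from a multi-class confusion matrix using a one-vs-rest view. The 3x3 matrix and the function names are purely hypothetical and do not reproduce the study's data or results.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (rows = true class, columns = predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_metrics(matrix, cls):
    """One-vs-rest precision, recall and specificity for class index `cls`."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[cls][cls]
    fn = sum(matrix[cls]) - tp                   # true class missed
    fp = sum(row[cls] for row in matrix) - tp    # other classes mislabelled as cls
    tn = total - tp - fn - fp
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Hypothetical confusion matrix for three classes (illustrative values only).
example = [
    [9, 1, 0],
    [1, 8, 1],
    [0, 1, 9],
]
accuracy = overall_accuracy(example)
class0 = per_class_metrics(example, 0)
```

Averaging the per-class values across all classes gives macro-averaged precision and recall, one plausible way to summarise a multi-category classifier with single figures.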