Abstract:
Infancy is the first year of life after birth; during this period, a baby is called an infant. Although infancy is the period of most rapid growth, children are born with certain abilities already developed, the most prominent of which is the ability to communicate.
Although babies can indicate that they are hungry, uncomfortable, sleepy, and so forth in their own non-adult way, their parents or caretakers may face difficulties in understanding what their infant is trying to tell them.
This research project aims to introduce a novel implementation that recognizes the moods, emotions, and communication intentions of infants and conveys them to their respective caretakers, using the infants' vocal and facial expressions. This is achieved by harnessing deep learning, combined with computer vision and audio processing, in the form of convolutional neural networks (CNNs) to identify the child's mood and emotion.
This study proposes a new architecture that facilitates multi-modality in the form of video and audio, with the resulting inferences delivered to the user in real time. The research concludes that the system can assist caregivers and parents in deciding how to calm their crying child.
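The multimodal video-and-audio architecture described above could be organized as a two-branch CNN with late fusion. The following is a minimal, hypothetical sketch in PyTorch; the class name, layer sizes, input resolutions, and number of emotion classes are illustrative assumptions, not the paper's actual model.

```python
import torch
import torch.nn as nn

class MultiModalInfantNet(nn.Module):
    """Hypothetical two-branch CNN with late fusion: one branch
    processes video frames (facial expressions), the other audio
    spectrograms (vocalizations). All sizes are assumptions."""

    def __init__(self, num_classes=5):
        super().__init__()
        # Video branch: 3-channel face frame -> 32-dim embedding
        self.video_branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Audio branch: 1-channel mel spectrogram -> 32-dim embedding
        self.audio_branch = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Late fusion: concatenate the two embeddings, then classify
        self.classifier = nn.Linear(32 + 32, num_classes)

    def forward(self, frames, spectrograms):
        v = self.video_branch(frames)
        a = self.audio_branch(spectrograms)
        return self.classifier(torch.cat([v, a], dim=1))

model = MultiModalInfantNet(num_classes=5)
frames = torch.randn(2, 3, 64, 64)        # batch of 2 face frames
spectrograms = torch.randn(2, 1, 64, 64)  # batch of 2 mel spectrograms
logits = model(frames, spectrograms)      # one score per emotion class
```

Late fusion is only one design choice here; an alternative would be to fuse earlier feature maps, which the abstract does not specify.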