Abstract:
"Without a custom thumbnail, a video is likely to rank lower on video hosting sites, where it competes with millions of other videos. Finding thumbnail inspiration can be tedious for a content creator, since it involves watching the entire video to get an idea of what it is about.
This project focuses on generating thumbnails from the transcript of a video: KeyBERT extracts keywords that summarize the video, and these keywords are then matched, using k-nearest neighbors (k-NN), with the output of image-based object detection (Darknet YOLOv4). The project achieves good accuracy, in that the predicted thumbnails are similar to the actual thumbnails."
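
The matching step described above can be illustrated with a minimal sketch. It is not the project's implementation: the abstract does not specify the feature representation or distance metric, so this example assumes a bag-of-words representation with cosine similarity. The keyword list stands in for KeyBERT output and the per-image label lists stand in for YOLOv4 detections; the function names, image identifiers, and sample data are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_images(keywords, image_labels, k=2):
    """Return the ids of the k candidate images whose detected labels
    are most similar to the transcript keywords (a simple k-NN query)."""
    query = Counter(w.lower() for w in keywords)
    scored = [
        (cosine(query, Counter(label.lower() for label in labels)), image_id)
        for image_id, labels in image_labels.items()
    ]
    # Highest similarity first; ties are broken by image id for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [image_id for _, image_id in scored[:k]]


# Hypothetical stand-ins: keywords as a keyword extractor might return them,
# and object labels as a detector might report them for each candidate image.
keywords = ["dog", "frisbee", "park"]
image_labels = {
    "image_010": ["person", "car"],
    "image_120": ["dog", "frisbee", "person"],
    "image_300": ["dog", "bench"],
}

top = nearest_images(keywords, image_labels, k=2)
print(top)
```

In this sketch the image whose detected objects overlap most with the keywords ranks first. A real pipeline would likely compare learned embeddings rather than exact label matches, so that a keyword such as "puppy" could still match a detected "dog".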