GameAudioLDM – Finetuned Text-To-Audio Generation for Sound Effects in the Game Development Industry

Pallemulla, Asith

URI: http://dlib.iit.ac.lk/xmlui/handle/123456789/2638

Date: 2024

Abstract:

"Sound effects have become an extremely important measure of a game’s quality in today’s game development industry and are integral to players’ reception of a game product. Despite this industry standard, sourcing high quality and creatively accurate sound effects requires expensive audio engineers, or much time spent consolidating free assets. For smaller developers with low budgets that make up the majority of the industry, these are not viable options, and will often have to sacrifice game quality by settling for sound effect assets that do not match their creative vision. This presents a need for a Text-to-Audio system that can generate custom game sound effect assets matching the developer’s exact request. Existing AI solutions for generating audio assets are unsuitable for the game development industry. They are too slow to rapidly regenerate assets for creative testing, often give unusable outputs, and require manual audio editing in the best of cases. They have not been adopted into the industry due to these reasons, even by small developers who require such a solution. The proposed project will overcome this by using an existing Text-to-Audio generative model as a base and adapting the output to meet the common needs of the game development industry using audio manipulation techniques within a new audio post-processing module. The new system must be simpler to use, have much faster batch outputs for repeated testing, increase audio quality, ensure generated assets are atomic and do not require manual editing to be used in games, and fulfill game developers’ other auditory goals. The author believes these improvements will allow the generative model to be used effectively in the industry and surpass existing solutions. Initial test results have been positive, notably showing a marked increase in the perceived speed, yield, quality, and suitability. Subjective metrics including OVL and REL have also been positive. "
