Digital Repository

GameAudioLDM – Finetuned Text-To-Audio Generation for Sound Effects in the Game Development Industry

dc.contributor.author Pallemulla, Asith
dc.date.accessioned 2025-06-18T05:18:54Z
dc.date.available 2025-06-18T05:18:54Z
dc.date.issued 2024
dc.identifier.citation Pallemulla, Asith (2024) GameAudioLDM – Finetuned Text-To-Audio Generation for Sound Effects in the Game Development Industry. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20200853
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2638
dc.description.abstract Sound effects have become an extremely important measure of a game’s quality in today’s game development industry and are integral to players’ reception of a game product. Despite this industry standard, sourcing high-quality and creatively accurate sound effects requires either expensive audio engineers or considerable time spent consolidating free assets. For the smaller, low-budget developers that make up the majority of the industry, neither option is viable, and they often have to sacrifice game quality by settling for sound effect assets that do not match their creative vision. This presents a need for a Text-to-Audio system that can generate custom game sound effect assets matching the developer’s exact request. Existing AI solutions for generating audio assets are unsuitable for the game development industry: they are too slow to rapidly regenerate assets for creative testing, often give unusable outputs, and require manual audio editing even in the best cases. For these reasons they have not been adopted by the industry, even by small developers who need such a solution. The proposed project will overcome this by using an existing Text-to-Audio generative model as a base and adapting its output to the common needs of the game development industry using audio manipulation techniques within a new audio post-processing module. The new system must be simpler to use, produce batch outputs much faster for repeated testing, increase audio quality, ensure generated assets are atomic and do not require manual editing to be used in games, and fulfill game developers’ other auditory goals. The author believes these improvements will allow the generative model to be used effectively in the industry and surpass existing solutions. Initial test results have been positive, notably showing a marked increase in perceived speed, yield, quality, and suitability. Results on subjective metrics, including OVL and REL, have also been positive. en_US
dc.language.iso en en_US
dc.subject Audio synthesis en_US
dc.subject Audio processing en_US
dc.subject Game development en_US
dc.title GameAudioLDM – Finetuned Text-To-Audio Generation for Sound Effects in the Game Development Industry en_US
dc.type Thesis en_US

