Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks
Title: | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks |
---|---|
Authors: | Marra, Felipe; Ferreira, Lucas N. |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Multimedia, Computer Science - Neural and Evolutionary Computing, Electrical Engineering and Systems Science - Audio and Speech Processing |
More Details: | This paper investigates the capabilities of text-to-audio music generation models in producing long-form music with prompts that change over time, focusing on soundtrack generation for Tabletop Role-Playing Games (TRPGs). We introduce Babel Bardo, a system that uses Large Language Models (LLMs) to transform speech transcriptions into music descriptions for controlling a text-to-music model. Four versions of Babel Bardo were compared in two TRPG campaigns: a baseline using direct speech transcriptions, and three LLM-based versions with varying approaches to music description generation. Evaluations considered audio quality, story alignment, and transition smoothness. Results indicate that detailed music descriptions improve audio quality, while maintaining consistency across consecutive descriptions enhances story alignment and transition smoothness. |
Comment: | Paper accepted at the LAMIR 2024 workshop |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2411.03948 |
Accession Number: | edsarx.2411.03948 |
Database: | arXiv |
Publication Date: | 2024-11-06 |
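
The abstract describes a pipeline in which speech transcriptions are turned into music descriptions that then control a text-to-music model. The following is a minimal sketch of the prompt-building half of such a pipeline, not the authors' implementation. Every name in it (`baseline_prompt`, `make_llm_prompt`, `build_prompts`, `fake_llm`) is hypothetical, the request wording is invented, and passing the previous description back to the LLM is only one assumed way of keeping consecutive descriptions consistent. The text-to-music step itself is omitted because the record names no concrete model or API.

```python
from typing import Callable, List

# All names here are hypothetical stand-ins; the record names no concrete API.
# A describe function maps (transcript, previous_description) -> prompt text.
DescribeFn = Callable[[str, str], str]


def baseline_prompt(transcript: str, previous: str) -> str:
    """Baseline from the abstract: use the speech transcription directly."""
    return transcript


def make_llm_prompt(llm: Callable[[str], str], keep_context: bool) -> DescribeFn:
    """Wrap an LLM callable so it turns a transcript into a music description."""
    def describe(transcript: str, previous: str) -> str:
        request = f"Describe background music for this scene: {transcript}"
        if keep_context and previous:
            # Assumption: consistency is encouraged by showing the prior description.
            request += f"\nStay consistent with the previous description: {previous}"
        return llm(request)
    return describe


def build_prompts(transcripts: List[str], describe: DescribeFn) -> List[str]:
    """Produce one text-to-music prompt per transcript segment, in order."""
    prompts: List[str] = []
    previous = ""
    for segment in transcripts:
        previous = describe(segment, previous)
        prompts.append(previous)
    return prompts


def fake_llm(request: str) -> str:
    """Placeholder LLM used only for demonstration; not a real model call."""
    return "tense orchestral strings" if "dragon" in request else "calm lute melody"


segments = ["The party rests at the inn.", "A dragon attacks the village!"]
baseline = build_prompts(segments, baseline_prompt)
described = build_prompts(segments, make_llm_prompt(fake_llm, keep_context=True))
```

In this sketch `baseline` is simply the list of transcripts, while `described` holds one LLM-written description per segment. Each prompt would then be passed, in order, to whatever text-to-music model is in use.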