Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks

Bibliographic Details
Title: Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks
Authors: Marra, Felipe; Ferreira, Lucas N.
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Multimedia, Computer Science - Neural and Evolutionary Computing, Electrical Engineering and Systems Science - Audio and Speech Processing
More Details: This paper investigates the capabilities of text-to-audio music generation models in producing long-form music with prompts that change over time, focusing on soundtrack generation for Tabletop Role-Playing Games (TRPGs). We introduce Babel Bardo, a system that uses Large Language Models (LLMs) to transform speech transcriptions into music descriptions for controlling a text-to-music model. Four versions of Babel Bardo were compared in two TRPG campaigns: a baseline using direct speech transcriptions, and three LLM-based versions with varying approaches to music description generation. Evaluations considered audio quality, story alignment, and transition smoothness. Results indicate that detailed music descriptions improve audio quality, while maintaining consistency across consecutive descriptions enhances story alignment and transition smoothness.
Comment: Paper accepted at the LAMIR 2024 workshop
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2411.03948
Accession Number: edsarx.2411.03948
Database: arXiv
Publication Date: 2024-11-06