Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework

Bibliographic Details
Title: Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework
Authors: Zu, Weiqin, Song, Wenbin, Chen, Ruiqing, Guo, Ze, Sun, Fanglei, Tian, Zheng, Pan, Wei, Wang, Jun
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Robotics
More Details: The socially-aware navigation system has evolved to adeptly avoid various obstacles while performing multiple tasks, such as point-to-point navigation, human-following, and -guiding. However, a prominent gap persists: in Human-Robot Interaction (HRI), the procedure of communicating commands to robots demands intricate mathematical formulations. Furthermore, the transition between tasks does not quite possess the intuitive control and user-centric interactivity that one would desire. In this work, we propose an LLM-driven interactive multimodal multitask robot navigation framework, termed LIM2N, to solve the above new challenge in the navigation field. We achieve this by first introducing a multimodal interaction framework where language and hand-drawn inputs can serve as navigation constraints and control objectives. Next, a reinforcement learning agent is built to handle multiple tasks with the received information. Crucially, LIM2N creates smooth cooperation among the reasoning of multimodal input, multitask planning, and adaptation and processing of the intelligent sensing modules in the complicated system. Extensive experiments are conducted in both simulation and the real world demonstrating that LIM2N has superior user needs understanding, alongside an enhanced interactive experience.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2311.08244
Accession Number: edsarx.2311.08244
Database: arXiv
RecordInfo BibRecord:
  BibEntity:
    Subjects:
      – SubjectFull: Computer Science - Robotics
        Type: general
    Titles:
      – TitleFull: Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Zu, Weiqin
      – PersonEntity:
          Name:
            NameFull: Song, Wenbin
      – PersonEntity:
          Name:
            NameFull: Chen, Ruiqing
      – PersonEntity:
          Name:
            NameFull: Guo, Ze
      – PersonEntity:
          Name:
            NameFull: Sun, Fanglei
      – PersonEntity:
          Name:
            NameFull: Tian, Zheng
      – PersonEntity:
          Name:
            NameFull: Pan, Wei
      – PersonEntity:
          Name:
            NameFull: Wang, Jun
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 14
              M: 11
              Type: published
              Y: 2023