Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework
Title: | Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework |
---|---|
Authors: | Zu, Weiqin; Song, Wenbin; Chen, Ruiqing; Guo, Ze; Sun, Fanglei; Tian, Zheng; Pan, Wei; Wang, Jun |
Publication Year: | 2023 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Robotics |
More Details: | The socially-aware navigation system has evolved to adeptly avoid various obstacles while performing multiple tasks, such as point-to-point navigation, human-following, and -guiding. However, a prominent gap persists: in Human-Robot Interaction (HRI), the procedure of communicating commands to robots demands intricate mathematical formulations. Furthermore, the transition between tasks does not quite possess the intuitive control and user-centric interactivity that one would desire. In this work, we propose an LLM-driven interactive multimodal multitask robot navigation framework, termed LIM2N, to solve the above new challenge in the navigation field. We achieve this by first introducing a multimodal interaction framework where language and hand-drawn inputs can serve as navigation constraints and control objectives. Next, a reinforcement learning agent is built to handle multiple tasks with the received information. Crucially, LIM2N creates smooth cooperation among the reasoning of multimodal input, multitask planning, and adaptation and processing of the intelligent sensing modules in the complicated system. Extensive experiments are conducted in both simulation and the real world demonstrating that LIM2N has superior user needs understanding, alongside an enhanced interactive experience. |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2311.08244 |
Accession Number: | edsarx.2311.08244 |
Database: | arXiv |
Publication Date: | 2023-11-14 |