Bibliographic Details
Title: Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework
Authors: Yang, Yuming; Zhong, Jiang; Jin, Li; Huang, Jingwang; Gao, Jingpeng; Liu, Qing; Bai, Yang; Zhang, Jingyuan; Jiang, Rui; Wei, Kaiwen
Publication Year: 2025
Collection: Computer Science
Subject Terms: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition
More Details: Multimodal Retrieval-Augmented Generation (MRAG) enhances reasoning capabilities by integrating external knowledge. However, existing benchmarks primarily focus on simple image-text interactions, overlooking complex visual formats such as charts, which are prevalent in real-world applications. In this work, we introduce a novel task, Chart-based MRAG, to address this limitation. To semi-automatically generate high-quality evaluation samples, we propose CHARt-based document question-answering GEneration (CHARGE), a framework that produces evaluation data through structured keypoint extraction, cross-modal verification, and keypoint-based generation. By combining CHARGE with expert validation, we construct Chart-MRAG Bench, a comprehensive benchmark for chart-based MRAG evaluation, featuring 4,738 question-answering pairs across 8 domains from real-world documents. Our evaluation reveals three critical limitations in current approaches: (1) unified multimodal embedding retrieval methods struggle in chart-based scenarios; (2) even with ground-truth retrieval, state-of-the-art multimodal large language models (MLLMs) achieve only 58.19% Correctness and 73.87% Coverage scores; and (3) MLLMs demonstrate a consistent text-over-visual modality bias during Chart-based MRAG reasoning. CHARGE and Chart-MRAG Bench are released at https://github.com/Nomothings/CHARGE.git.
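Note: As a rough illustration of the three-stage pipeline the abstract describes (structured keypoint extraction, cross-modal verification, keypoint-based generation), a minimal Python sketch follows. Every name and helper here is a hypothetical placeholder; this is not the CHARGE API from the linked repository, and the string-matching verification stands in for the model-based checks the paper describes.

```python
from dataclasses import dataclass

# Hypothetical sketch of a CHARGE-style generation pipeline. All identifiers
# are invented for illustration and do not come from the actual repository.

@dataclass
class Keypoint:
    text: str    # a candidate factual statement extracted from the document
    source: str  # the modality it came from: "chart" or "text"

def extract_keypoints(chart_caption: str, passage: str) -> list[Keypoint]:
    """Stage 1: pull candidate keypoints from both modalities
    (naive sentence split here; the paper uses structured extraction)."""
    points = [Keypoint(s.strip(), "text") for s in passage.split(".") if s.strip()]
    points.append(Keypoint(chart_caption.strip(), "chart"))
    return points

def verify_keypoint(kp: Keypoint, passage: str, chart_caption: str) -> bool:
    """Stage 2: cross-modal verification -- keep a keypoint only if the
    other modality also supports it (toy substring check here)."""
    other = chart_caption if kp.source == "text" else passage
    return any(tok in other.lower() for tok in kp.text.lower().split()[:3])

def generate_qa(kp: Keypoint) -> dict:
    """Stage 3: turn a verified keypoint into a QA pair (template-based here;
    the paper pairs model-based generation with expert validation)."""
    return {"question": f"According to the document, {kp.text.rstrip('.')}?",
            "answer": kp.text,
            "modality": kp.source}

if __name__ == "__main__":
    passage = "Revenue grew 12% in 2024. The chart shows revenue by quarter."
    caption = "Quarterly revenue chart, 2024: revenue grew steadily, up 12% year over year."
    qa_pairs = [generate_qa(kp)
                for kp in extract_keypoints(caption, passage)
                if verify_keypoint(kp, passage, caption)]
    for qa in qa_pairs:
        print(qa)
```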
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2502.14864
Accession Number: edsarx.2502.14864
Database: arXiv