DiagramQG: Concept-Focused Diagram Question Generation via Hierarchical Knowledge Integration

Bibliographic Details
Title:	DiagramQG: Concept-Focused Diagram Question Generation via Hierarchical Knowledge Integration
Authors:	Zhang, Xinyu; Zhang, Lingling; Wu, Yanrui; Huang, Muye; Wu, Wenjun; Li, Bo; Wang, Shaowei; Fernando, Basura; Liu, Jun
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Computer Vision and Pattern Recognition
More Details:	Visual Question Generation (VQG) has gained significant attention due to its potential in educational applications. However, VQG research mainly focuses on natural images, largely neglecting diagrams in educational materials used to assess students' conceptual understanding. To address this gap, we construct DiagramQG, a dataset containing 8,372 diagrams and 19,475 questions across various subjects. DiagramQG introduces concept and target text constraints, guiding the model to generate concept-focused questions for educational purposes. Meanwhile, we present the Hierarchical Knowledge Integration framework for Diagram Question Generation (HKI-DQG) as a strong baseline. This framework obtains multi-scale patches of diagrams and acquires knowledge using a visual language model with frozen parameters. It then integrates knowledge, text constraints, and patches to generate concept-focused questions. We evaluate the performance of existing VQG models, open-source and closed-source vision-language models, and HKI-DQG on the DiagramQG dataset. Our novel HKI-DQG consistently outperforms existing methods, demonstrating that it serves as a strong baseline. Furthermore, we apply HKI-DQG to four other VQG datasets of natural images, namely VQG-COCO, K-VQG, OK-VQA, and A-OKVQA, achieving state-of-the-art performance.
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2411.17771
Accession Number:	edsarx.2411.17771
Database:	arXiv
