V-CEM: Bridging Performance and Intervenability in Concept-based Models

Bibliographic Details
Title: V-CEM: Bridging Performance and Intervenability in Concept-based Models
Authors: De Santis, Francesco; Ciravegna, Gabriele; Bich, Philippe; Giordano, Danilo; Cerquitelli, Tania
Publication Year: 2025
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
More Details: Concept-based eXplainable AI (C-XAI) is a rapidly growing research field that improves AI model interpretability by leveraging intermediate, human-understandable concepts. This approach not only increases model transparency but also enables human intervention, allowing users to interact with these concepts to refine and improve the model's performance. Concept Bottleneck Models (CBMs) explicitly predict concepts before making final decisions, enabling interventions to correct misclassified concepts. While CBMs remain effective in Out-Of-Distribution (OOD) settings with intervention, they struggle to match the performance of black-box models. Concept Embedding Models (CEMs) address this by learning concept embeddings from both concept predictions and input data, which improves In-Distribution (ID) accuracy but reduces the effectiveness of interventions, especially in OOD scenarios. In this work, we propose the Variational Concept Embedding Model (V-CEM), which leverages variational inference to improve intervention responsiveness in CEMs. We evaluated our model on various textual and visual datasets in terms of ID performance, intervention responsiveness in both ID and OOD settings, and Concept Representation Cohesiveness (CRC), a metric we propose to assess the quality of the concept embedding representations. The results demonstrate that V-CEM retains CEM-level ID performance while achieving intervention effectiveness similar to CBM in OOD settings, effectively reducing the gap between interpretability (intervention) and generalization (performance).
Comment: Paper accepted at: The 3rd World Conference on Explainable Artificial Intelligence
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2504.03978
Accession Number: edsarx.2504.03978
Database: arXiv
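
Illustrative sketch: the abstract describes how a Concept Bottleneck Model predicts concepts first, then predicts the label from those concepts, so that a human can correct a mispredicted concept and change the final decision. The toy example below is a minimal sketch of that generic intervention idea only. It is not the V-CEM method and is not taken from the paper; the function names, the hand-picked weights, and the input are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_concepts(x, concept_weights):
    """Map input features to one probability per concept."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)))
            for row in concept_weights]

def intervene(concepts, corrections):
    """Overwrite selected concept predictions with human-provided values."""
    fixed = list(concepts)
    for idx, value in corrections.items():
        fixed[idx] = float(value)
    return fixed

def predict_label(concepts, task_weights):
    """Predict the class index from the concepts alone (the bottleneck)."""
    scores = [sum(w * c for w, c in zip(row, concepts))
              for row in task_weights]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical toy weights: 2 input features, 2 concepts, 2 classes.
CONCEPT_WEIGHTS = [[2.0, 0.0], [0.0, 2.0]]
TASK_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]

x = [-1.0, -2.0]                            # hypothetical input
c_hat = predict_concepts(x, CONCEPT_WEIGHTS)
y_before = predict_label(c_hat, TASK_WEIGHTS)

# A human states that concept 1 is actually present.
c_fixed = intervene(c_hat, {1: 1.0})
y_after = predict_label(c_fixed, TASK_WEIGHTS)

print("concepts:", c_hat, "label before:", y_before, "label after:", y_after)
```

Because the label depends only on the concept vector in this sketch, correcting one concept is enough to flip the prediction. In an embedding-based model the label also depends on learned embeddings, which is the setting where the abstract reports that interventions become less effective.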