More Details: |
Source-Free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to an unlabeled target domain without access to the source data. Inspired by the success of large Vision-Language (ViL) models in many applications, the latest research has validated the benefit of ViL models for SFDA by using their predictions as pseudo supervision. However, we observe that ViL supervision can be noisy and inaccurate at an unknown rate, introducing additional negative effects during adaptation. To address this thus-far ignored challenge, we introduce a novel Proxy Denoising (ProDe) approach. The key idea is to leverage the ViL model as a proxy to facilitate the adaptation process towards the latent domain-invariant space. We design a proxy denoising mechanism to correct the ViL model's predictions, grounded on a proxy confidence theory that models the dynamic effect of the proxy's divergence against the domain-invariant space during adaptation. To capitalize on the corrected proxy, we derive a mutual knowledge distilling regularization. Extensive experiments show that ProDe significantly outperforms current state-of-the-art alternatives under the conventional closed-set setting and the more challenging open-set, partial-set, generalized SFDA, multi-target, multi-source, and test-time settings. Our code and data are available at https://github.com/tntek/source-free-domain-adaptation. Comment: This paper is accepted by ICLR 2025 (Oral, Top 1.8%) |