Weak-Annotation of HAR Datasets using Vision Foundation Models

Bibliographic Details
Title: Weak-Annotation of HAR Datasets using Vision Foundation Models
Authors: Bock, Marius; Van Laerhoven, Kristof; Moeller, Michael
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Human-Computer Interaction, Computer Science - Computer Vision and Pattern Recognition
More Details: Wearable-based data annotation remains a tedious, time-consuming task, so benchmark datasets within the field of Human Activity Recognition (HAR) lack richness and size compared to datasets available within related fields. Recently, vision foundation models such as CLIP have gained significant attention, helping the vision community advance in finding robust, generalizable feature representations. Since the majority of researchers within the wearable community rely on vision modalities to overcome the limited expressiveness of wearable data and to accurately label their to-be-released benchmark datasets offline, we propose a novel, clustering-based annotation pipeline that significantly reduces the amount of data a human annotator needs to label. We show that, using our approach, the annotation of centroid clips suffices to achieve average labelling accuracies close to 90% across three publicly available HAR benchmark datasets. Using the weakly annotated datasets, we further demonstrate that we can match the accuracy scores of fully-supervised deep learning classifiers across all three benchmark datasets. Code as well as supplementary figures and results are publicly available at github.com/mariusbock/weak_har.
Comment: 8 pages, 3 figures, accepted at ISWC'24: International Symposium on Wearable Computers, Oct, 2024
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2408.05169
Accession Number: edsarx.2408.05169
Database: arXiv
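
Illustrative sketch: The centroid-based annotation idea summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the clustering algorithm or describe how clip features are extracted, so plain k-means with a deterministic farthest-first seeding is swapped in here, and random 2-D points stand in for clip embeddings that would in practice come from a vision foundation model such as CLIP. All function names (farthest_first_init, assign_clusters, kmeans, weak_annotate) and the ask_annotator callback are hypothetical.

```python
import numpy as np


def farthest_first_init(features, k):
    """Deterministic seeding (an assumption of this sketch): start at clip 0,
    then repeatedly add the clip farthest from all seeds chosen so far."""
    chosen = [0]
    min_dist = np.linalg.norm(features - features[0], axis=1)
    for _ in range(k - 1):
        nxt = int(min_dist.argmax())
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return features[chosen].astype(float)


def assign_clusters(features, centroids):
    """Return the nearest-centroid index per clip and the full distance matrix."""
    diffs = features[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.argmin(axis=1), dists


def kmeans(features, k, n_iter=20):
    """Plain Lloyd iterations; a stand-in for whatever clustering is actually used."""
    centroids = farthest_first_init(features, k)
    for _ in range(n_iter):
        assign, _ = assign_clusters(features, centroids)
        for j in range(k):
            members = features[assign == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids


def weak_annotate(features, k, ask_annotator):
    """Cluster clips, query a label only for the clip closest to each centroid,
    and propagate that label to every clip in the same cluster."""
    centroids = kmeans(features, k)
    assign, dists = assign_clusters(features, centroids)
    labels = np.full(len(features), -1, dtype=int)
    queried = []
    for j in range(k):
        members = np.flatnonzero(assign == j)
        if members.size == 0:
            continue  # empty cluster: nothing to annotate
        centroid_clip = int(members[dists[members, j].argmin()])
        labels[members] = ask_annotator(centroid_clip)
        queried.append(centroid_clip)
    return labels, queried


# Synthetic stand-in for clip embeddings: three tight, well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
true_labels = np.repeat(np.arange(3), 50)
features = centers[true_labels] + rng.uniform(-0.5, 0.5, size=(150, 2))

# The "annotator" here simply looks up the ground-truth label of the queried clip.
weak_labels, queried = weak_annotate(
    features, k=3, ask_annotator=lambda i: int(true_labels[i])
)
accuracy = float((weak_labels == true_labels).mean())
print(f"annotated {len(queried)} of {len(features)} clips, accuracy {accuracy:.2f}")
```

On real data the clusters would not be perfectly pure, so the propagated labels would contain errors; the abstract reports average labelling accuracies close to 90% for the authors' own pipeline, which this toy example does not attempt to reproduce.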