Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature

Bibliographic Details
Title: Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature
Authors: Vrba, Jan; Steinbach, Jakub; Jirsa, Tomáš; Verde, Laura; De Fazio, Roberta; Zeng, Yuwen; Ichiji, Kei; Hájek, Lukáš; Sedláková, Zuzana; Urbániová, Zuzana; Chovanec, Martin; Mareš, Jan; Homma, Noriyasu
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
More Details: Purpose: We introduce a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database (SVD) and a robust feature set that combines commonly used handcrafted acoustic features with two novel ones: pitch difference (relative variation in fundamental frequency) and the NaN feature (failed fundamental frequency estimation). Methods: We evaluate six machine learning (ML) algorithms (support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost) using grid search over feasible hyperparameters and 20480 different feature subsets. The top 1000 combinations of classification model and feature subset for each ML algorithm are validated with repeated stratified cross-validation. To address class imbalance, we apply K-Means SMOTE to augment the training data. Results: Our approach achieves 85.61%, 84.69%, and 85.22% unweighted average recall (UAR) for females, males, and combined results, respectively. We intentionally omit accuracy, as it is a highly biased metric for imbalanced data. Conclusion: Our study demonstrates that, with the proposed methodology and feature engineering, ML models show potential for detecting various voice pathologies from the simplest vocal task, a sustained utterance of the vowel /a:/. To enable easier use of our methodology and to support our claims, we provide a publicly available GitHub repository with DOI 10.5281/zenodo.13771573. Finally, we provide a REFORMS checklist to enhance the readability, reproducibility, and justification of our approach.
Comment: Code repository: https://github.com/aailab-uct/Automated-Robust-and-Reproducible-Voice-Pathology-Detection, Supplementary materials: https://doi.org/10.5281/zenodo.14793017
Document Type: Working Paper
DOI: 10.1016/j.jvoice.2025.03.028
Access URL: http://arxiv.org/abs/2410.10537
Accession Number: edsarx.2410.10537
Database: arXiv
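Note on the reported metric: the abstract reports unweighted average recall (UAR) rather than accuracy. The sketch below is not taken from the authors' repository; it is a minimal illustration, on hypothetical toy labels, of why the two metrics diverge on imbalanced data. The function names and the 90/10 class split are assumptions made only for this example.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, with every class weighted equally."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced toy set: 90 pathological (1) and 10 healthy (0) samples.
y_true = [1] * 90 + [0] * 10
# A trivial classifier that labels every recording as pathological.
y_pred = [1] * 100

acc = accuracy(y_true, y_pred)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}, UAR = {uar:.2f}")
```

On this toy set the trivial classifier reaches high accuracy while its UAR stays at chance level for two classes, which is the kind of bias the abstract refers to when it omits accuracy.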