Urban and rural disparities in stroke prediction using machine learning among Chinese older adults

Bibliographic Details
Title: Urban and rural disparities in stroke prediction using machine learning among Chinese older adults
Authors: Jingjing Zhu, Luotao Lin, Lei Si, Hailei Zhao, Hualing Song, Xianglong Xu
Source: Scientific Reports, Vol 15, Iss 1, Pp 1-9 (2025)
Publisher Information: Nature Portfolio, 2025.
Publication Year: 2025
Collection: LCC:Medicine; LCC:Science
Subject Terms: Stroke, Prediction, Machine learning, Urban and rural disparities, Middle-aged and elderly adults, Medicine, Science
Abstract: Stroke is a significant health concern in China. Differences in stroke risk between rural and urban areas have been highlighted in prior research. However, there is a scarcity of studies on urban-rural differences in predicting stroke. This study aimed to develop stroke prediction models, and urban-rural subgroup analyses were conducted to explore disparities in determinants among middle-aged and older adults. We employed nine machine learning algorithms, namely logistic regression (LR), adaptive boosting classifier, support vector machines, extreme gradient boosting, random forest, Gaussian naive Bayes (GNB), gradient boosting machine, light gradient boosting decision machine, and K Nearest Neighbours, using data derived from 9,413 individuals aged 45 years and above obtained from the China Health and Retirement Longitudinal Study (CHARLS) conducted in 2011 to build stroke prediction models and analyze urban-rural subgroups. In the total population, GNB (AUC = 0.76) was the best model for predicting strokes, and the ten most important variables were the time taken for repeated chair stands, the chair height from floor to seat, knee height, creatinine, complete repeated chair stands, mean corpuscular volume, platelet, uric acid, body mass index, and white blood cell. In the rural subgroup, LR and GNB (AUC = 0.76) were the best, and the ten most important variables were the time taken for repeated chair stands, creatinine, platelet, the chair height from floor to seat, knee height, complete repeated chair stands, pulse, white blood cell, maintaining semi-tandem balance statically, and uric acid. In the urban subgroup, LR (AUC = 0.67) was the best, and the ten most important variables were the time taken for repeated chair stands, mean corpuscular volume, maintaining semi-tandem balance statically, uric acid, right-hand grip strength, age, blood urea nitrogen, use of trunk, arms, legs for semi-tandem balance, number of marriages, and night sleep duration. The time taken for repeated chair stands was more critical in the stroke risk model for rural individuals. Uric acid and maintaining semi-tandem balance statically were more critical in the stroke risk model for urban individuals. Our results revealed the importance of knee height and physical function predictors for stroke and highlighted the differences in determinants between urban and rural individuals, proposing targeted stroke prevention and control strategies in different populations in terms of physical function.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2045-2322
Relation: https://doaj.org/toc/2045-2322
DOI: 10.1038/s41598-025-91157-y
Access URL: https://doaj.org/article/efc40e9819ab453e86d240501a5f4458
Accession Number: edsdoj.fc40e9819ab453e86d240501a5f4458
Database: Directory of Open Access Journals