TY - JOUR
T1 - Sociodemographic indicators of health status using a machine learning approach and data from the English Longitudinal Study of Aging (ELSA)
AU - Engchuan, Worrawat
AU - Dimopoulos, Alexandros C.
AU - Tyrovolas, Stefanos
AU - Caballero, Francisco Félix
AU - Sanchez-Niubo, Albert
AU - Arndt, Holger
AU - Ayuso-Mateos, Jose Luis
AU - Haro, Josep Maria
AU - Chatterji, Somnath
AU - Panagiotakos, Demosthenes B.
N1 - Funding Information:
The ATHLOS project has received funding from the European Union Horizon 2020 Research and Innovation Program under grant agreement No. 635316 (EU HORIZON2020-PHC-635316).
Publisher Copyright:
© Med Sci Monit, 2019.
PY - 2019
Y1 - 2019
N2 - Background: Studies on the effects of sociodemographic factors on health in aging now include the use of statistical models and machine learning. The aim of this study was to evaluate the determinants of health in aging using machine learning methods and to compare their accuracy with that of traditional methods. Material/Methods: The health status of 6,209 adults, aged <65 years (n=1,585), 65–79 years (n=3,267), and >80 years (n=1,357), was measured using an established health metric (0–100) that incorporated physical function and activities of daily living (ADL). Data from the English Longitudinal Study of Ageing (ELSA) included socio-economic and sociodemographic characteristics and history of falls. Health-trend and personal-fitted variables were generated and used as predictors of the health metric in three machine learning methods: random forest (RF), deep learning (DL), and the linear model (LM). The percentage increase in mean square error (%IncMSE) when a given predictive variable was removed from the model was calculated as a measure of that variable's importance. Results: Health-trend, physical activity, and personal-fitted variables were the main predictors of health, with %IncMSE values of 85.76%, 63.40%, and 46.71%, respectively. Age, employment status, alcohol consumption, and household income had %IncMSE values of 20.40%, 20.10%, 16.94%, and 13.61%, respectively. Performance of the RF method was similar to that of the traditional LM (p=0.7), but RF significantly outperformed DL (p=0.006). Conclusions: Machine learning methods can be used to evaluate multidimensional longitudinal health data and may provide accurate results with fewer requirements than traditional statistical modeling.
KW - Artificial intelligence
KW - Data interpretation, statistical
KW - Decision support techniques
KW - Socioeconomic factors
UR - http://www.scopus.com/inward/record.url?scp=85063258613&partnerID=8YFLogxK
U2 - 10.12659/MSM.913283
DO - 10.12659/MSM.913283
M3 - Journal article
C2 - 30879019
AN - SCOPUS:85063258613
SN - 1234-1010
VL - 25
SP - 1994
EP - 2001
JO - Medical Science Monitor
JF - Medical Science Monitor
ER -