Abstract: Telehealth is a promising avenue for the prevention, remote diagnosis, and monitoring of diseases. However, in non-clinical contexts sensors may fail, producing incomplete data in which features are not missing at random, which can jeopardize subsequent analyses. In this work, we investigate whether imputation schemes can improve the analysis of relationships between blood pressure (BP) and features collected by a stand-alone telehealth kiosk. We analyze 253 samples of 26 features, corresponding to indicators of BP, oximetry, and body composition. We use Multiple Imputation using Denoising Autoencoders (MIDA) to impute missing values, training with masks that imitate the patterns of failing sensors. In terms of mean absolute error, MIDA performs as well as mean imputation, but only MIDA captures and preserves the distribution of the data. Principal Component Analysis performed on the imputed dataset suggests that body weight, muscle mass, and energy requirements explain 20% of the data's variability, and that arterial stiffness and pulse amplitude explain 15%. On the incomplete data, Partial Least Squares regression draws attention to the role of arterial stiffness, oxygen saturation, and pulse amplitude in predicting BP; after imputation, it also highlights the importance of body weight and muscle mass, thus improving the analysis.
Keywords: Telehealth; Health Kiosk; Missing Values; Blood Pressure
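The evaluation idea in the abstract (hide values in sensor-shaped blocks, impute them, then score the mean absolute error on the hidden entries) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the grouping of the 26 features into three sensors in `SENSOR_BLOCKS` is hypothetical, the failure probability is arbitrary, and only the mean-imputation baseline is shown, not MIDA itself.

```python
import numpy as np

# Hypothetical grouping of the 26 features into three sensors; the actual
# kiosk layout is not given in the abstract.
SENSOR_BLOCKS = {
    "blood_pressure": range(0, 8),
    "oximetry": range(8, 12),
    "body_composition": range(12, 26),
}


def sensor_failure_mask(n_samples, blocks, fail_prob, rng):
    """Boolean mask (True = missing): a failing sensor drops all of its features."""
    n_features = sum(len(cols) for cols in blocks.values())
    mask = np.zeros((n_samples, n_features), dtype=bool)
    for cols in blocks.values():
        failed = rng.random(n_samples) < fail_prob  # rows where this sensor fails
        mask[np.ix_(failed, list(cols))] = True
    return mask


def mean_impute(x, mask):
    """Replace masked entries with the column mean of the observed entries."""
    observed = np.where(mask, np.nan, x)
    col_means = np.nanmean(observed, axis=0)
    return np.where(mask, col_means, x)


def masked_mae(x_true, x_imputed, mask):
    """Mean absolute error computed only on the entries that were hidden."""
    return float(np.abs(x_true - x_imputed)[mask].mean())


rng = np.random.default_rng(0)
x = rng.normal(size=(253, 26))  # synthetic stand-in for the kiosk data
mask = sensor_failure_mask(253, SENSOR_BLOCKS, fail_prob=0.2, rng=rng)
x_hat = mean_impute(x, mask)
print("MAE of mean imputation on masked entries:", masked_mae(x, x_hat, mask))
```

Any other imputer could be scored on the same mask by passing its output to `masked_mae`, which gives a like-for-like comparison under the same simulated failure pattern.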
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6), 417.
Gondara, L. and Wang, K. (2018). MIDA: Multiple imputation using denoising autoencoders. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 260-272). Springer, Cham.
Wold, S., Sjöström, M., and Eriksson, L. (2001). PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58(2), 109-130.
Attached file | Size |
---|---|
Hava Chaptoukaevmercredi29juin2022.pdf | 944.4 KB |