Cardiovascular diseases prediction by integrated risk factors assessment by means of machine learning

DOI: https://doi.org/10.29296/25877305-2020-05-08
Issue: 5
Year: 2020

D. Gavrilov(1); L. Serova(1), Candidate of Engineering Sciences; I. Korsakov(1), Candidate of Physico-Mathematical Sciences; A. Gusev(1), Candidate of Engineering Sciences; R. Novitsky(1); T. Kuznetsova(2), MD
(1) K-SkAI, Petrozavodsk; (2) Petrozavodsk State University

Aim. To develop a machine learning model for predicting the risk of cardiovascular diseases (CVD) and to validate it on Russian medical data. Materials and methods. The data set was obtained from the Framingham study and comprised 4,363 patients without CVD, 852 (19.5%) of whom died of myocardial infarction or stroke within 10 years of observation. Input features of the model: gender, age, systolic blood pressure, cholesterol, smoking, body mass index, and heart rate. The original data set was divided into two parts: a training data set (80% of the records) and a validation data set (the remaining 20%). In addition, the model was evaluated on an external data set of 411 depersonalized records of Russian patients. Results. The WML.CVD.Score model was built as a sequential neural network with one input layer, two hidden layers, and one output layer. Results on the training data set: accuracy 81.15%, AUC 0.80. On the validation data set: accuracy 81.1%, AUC 0.76. On the test data set: accuracy 79.07%, AUC 0.86. On the Russian test data, the AUC was 0.81 for the SCORE scale versus 0.86 for the developed model, which supports the use of machine learning to improve predictive performance. Conclusion. The developed model demonstrated high accuracy in predicting CVD in both internal and external validation. The model can be used in medical practice for patients in Russia.
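
The evaluation protocol described in the abstract (an 80/20 split followed by accuracy and AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the random seed, and the toy labels and scores are all assumptions introduced for illustration, and the AUC is computed with the standard pairwise-ranking definition rather than any particular library.

```python
import random


def train_validation_split(records, train_fraction=0.8, seed=0):
    """Shuffle records and split them into training and validation parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(labels, scores, threshold=0.5):
    """Share of records whose thresholded score matches the 0/1 label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC requires at least one record of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = fatal event within 10 years) and model risk scores.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

# Splitting 4,363 record indices the way the abstract describes (80% / 20%).
train_ids, validation_ids = train_validation_split(range(4363))

print(len(train_ids), len(validation_ids))
print(round(accuracy(toy_labels, toy_scores), 3))
print(round(roc_auc(toy_labels, toy_scores), 3))
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the model ranks patients, which is why the two figures reported for the same data set can move in different directions.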

Keywords: 
cardiology
cardiovascular diseases
risk factors
risk modeling
machine learning



