Abstract
Early detection of people at risk of heart disease is vital for preventing disease progression. Machine learning methods have recently been applied successfully to heart disease risk prediction, and research has shown that the input data tends to influence the performance of classification algorithms. This paper proposes a multistage deep learning approach to improve the prediction of heart disease risk. First, an enhanced stacked sparse autoencoder is developed to learn features efficiently and obtain a relevant representation of the data for classification. Second, an optimized softmax classifier is applied to the learned features. Because the multilayer architecture of autoencoders usually leads to internal covariate shift, a problem that affects the generalization ability of the network, batch normalization is introduced to mitigate it. Experimental results show that the proposed method predicts heart disease risk effectively, achieving classification accuracies of 0.927 and 0.916 on the Framingham and Cleveland heart disease datasets, respectively, and outperforming other machine learning methods and the results reported in similar studies.
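
For illustration only, the three building blocks named above (batch normalization, a sparsity penalty on hidden activations, and softmax classification) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the implementation used in the paper: the sigmoid activation, the KL-divergence form of the sparsity penalty, the target sparsity `rho = 0.05`, the layer sizes, and all weights and toy data are hypothetical. Training and the learnable scale and shift parameters of batch normalization are omitted.

```python
import math


def dense(batch, weights, biases):
    """Affine layer; `weights` holds one weight vector per output unit."""
    return [[sum(w * x for w, x in zip(unit_w, row)) + b
             for unit_w, b in zip(weights, biases)]
            for row in batch]


def batch_norm(batch, eps=1e-5):
    """Normalize each feature (column) of a mini-batch to zero mean and
    (approximately) unit variance. Learnable scale/shift are omitted."""
    n = len(batch)
    cols = list(zip(*batch))
    means = [sum(c) / n for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / n
                 for c, m in zip(cols, means)]
    return [[(v - m) / math.sqrt(var + eps)
             for v, m, var in zip(row, means, variances)]
            for row in batch]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def encode(batch, weights, biases):
    """One encoder layer (assumed ordering): affine -> batch norm -> sigmoid."""
    z = batch_norm(dense(batch, weights, biases))
    return [[sigmoid(v) for v in row] for row in z]


def kl_sparsity_penalty(activations, rho=0.05):
    """Sum over hidden units of KL(rho || mean activation of that unit)."""
    n = len(activations)
    penalty = 0.0
    for col in zip(*activations):
        # Clamp the mean activation to keep the logarithms finite.
        rho_hat = min(max(sum(col) / n, 1e-8), 1.0 - 1e-8)
        penalty += (rho * math.log(rho / rho_hat)
                    + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))
    return penalty


def softmax(logits):
    """Numerically stable softmax over one vector of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: 4 samples, 3 input features, 2 hidden units, 2 classes.
toy_batch = [[0.2, 0.7, 0.1],
             [0.9, 0.3, 0.5],
             [0.4, 0.8, 0.6],
             [0.1, 0.2, 0.9]]
enc_w, enc_b = [[0.5, -0.3, 0.8], [-0.6, 0.4, 0.1]], [0.0, 0.1]
clf_w, clf_b = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]

features = encode(toy_batch, enc_w, enc_b)    # learned representation
penalty = kl_sparsity_penalty(features)       # term added to the autoencoder loss
probs = [softmax(row) for row in dense(features, clf_w, clf_b)]  # class probabilities
```

In a stacked configuration, the output of one `encode` call would serve as the input to the next encoder layer, and the final representation would be passed to the softmax classifier.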