Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

8-2007

Abstract

Coronary artery disease (CAD) is one of the main causes of death worldwide. Finding cost-effective methods to predict CAD is a major challenge in public health. In this paper, we investigate the combined effects of genetic polymorphisms and non-genetic factors on predicting the risk of CAD by applying well-known classification methods, such as Bayesian networks, naïve Bayes, support vector machines, k-nearest neighbors, neural networks and decision trees. Our experiments show that all these classifiers are comparable in terms of accuracy, while Bayesian networks have the additional advantage of providing insights into the relationships among the variables. We observe that the learned Bayesian networks identify many important dependency relationships among genetic variables, which can be verified with domain knowledge. Consistent with current domain understanding, our results indicate that related diseases (e.g., diabetes and hypertension), age and smoking status are the most important factors for CAD prediction, while the genetic polymorphisms entail more complicated influences. © 2007 The authors. All rights reserved.

Keywords

Bayesian networks, Coronary artery disease, Data mining, Machine learning, Single nucleotide polymorphisms

Discipline

Computer Sciences | Health Information Technology

Research Areas

Intelligent Systems and Optimization

Publication

MEDINFO 2007: Proceedings of the 12th World Congress on Health (Medical) Informatics

Volume

129

First Page

1219

Last Page

1224

ISBN

9781586037741

Publisher

IOS Press

City or Country

Amsterdam

Copyright Owner and License

Authors

Additional URL

https://www.ncbi.nlm.nih.gov/pubmed/17911909
