Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
4-2011
Abstract
The presence of noise or errors in the stated feature values of biomedical data can lead to incorrect prediction. We introduce a Bayesian Network-based Noise Correction framework named BN-NC. After data preprocessing, a Bayesian Network (BN) is learned to capture the feature dependencies. Using the BN to predict each feature in turn, BN-NC estimates a feature's error rate as the deviation between its predicted and stated values in the training data, and allocates the appropriate uncertainty to its subsequent findings during prediction. BN-NC automatically generates a probabilistic rule to explain the BN's prediction on the class variable using the feature values in its Markov blanket, and this is reapplied as necessary to explain the noise correction on those features. Using three real-life benchmark biomedical data sets (on HIV-1 drug resistance prediction and leukemia subtype classification), we demonstrate that BN-NC (1) accurately detects the errors in biomedical feature values, (2) automatically corrects for the errors to maintain higher prediction accuracy than competing methods including Decision Trees, Naive Bayes and Support Vector Machines, and (3) generates probabilistic rules that concisely explain the prediction and noise correction decisions. In addition to achieving more robust biomedical prediction in the presence of feature noise, BN-NC highlights erroneous features and explains their corrections, giving medical researchers high-utility insights into biomedical data that other methods do not offer.
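The error-rate estimate described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: instead of a learned Bayesian Network it assumes a hand-specified parent set per feature and predicts each feature as the most frequent value seen for its parents' configuration, and it measures "deviation" as the plain mismatch fraction. The function name `estimate_error_rates`, the `parents` mapping, and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict


def estimate_error_rates(records, parents):
    """Estimate a per-feature error rate on training records.

    Assumption: each feature is predicted from its parent features by
    taking the most frequent value observed for that parent
    configuration (a stand-in for inference in a learned BN). The error
    rate is the fraction of records whose stated value disagrees with
    that prediction.
    """
    rates = {}
    for feature, parent_names in parents.items():
        # Count stated values of the feature per parent configuration.
        counts = defaultdict(Counter)
        for rec in records:
            key = tuple(rec[p] for p in parent_names)
            counts[key][rec[feature]] += 1

        # Compare the predicted value with the stated value in each record.
        mismatches = 0
        for rec in records:
            key = tuple(rec[p] for p in parent_names)
            predicted = counts[key].most_common(1)[0][0]
            if predicted != rec[feature]:
                mismatches += 1
        rates[feature] = mismatches / len(records)
    return rates


# Toy data (hypothetical): feature "b" usually copies feature "a",
# except for one record that plays the role of a noisy stated value.
records = (
    [{"a": 0, "b": 0}] * 4
    + [{"a": 0, "b": 1}]
    + [{"a": 1, "b": 1}] * 5
)
parents = {"b": ["a"]}
rates = estimate_error_rates(records, parents)
print(rates)
```

In this toy setting one of the ten records disagrees with the value predicted from its parent, so the estimated error rate for `b` is 0.1. A rate of this kind could then be used to discount the stated value of `b` at prediction time, which is the role the abstract assigns to the estimated uncertainty.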
Discipline
Biomedical Engineering and Bioengineering | Databases and Information Systems
Publication
11th SIAM International Conference on Data Mining 2011: Mesa, Arizona, USA, 28-30 April 2011: Proceedings
First Page
71
Last Page
82
ISBN
9781617829802
Identifier
10.1137/1.9781611972818.7
Publisher
SIAM
City or Country
Philadelphia, PA
Embargo Period
7-10-2017
Citation
YAP, Ghim-Eng; TAN, Ah-Hwee; and PANG, Hwee Hwa.
Learning feature dependencies for noise correction in biomedical prediction. (2011). 11th SIAM International Conference on Data Mining 2011: Mesa, Arizona, USA, 28-30 April 2011: Proceedings. 71-82.
Available at: https://ink.library.smu.edu.sg/sis_research/3661
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
http://doi.org/10.1137/1.9781611972818.7
Included in
Biomedical Engineering and Bioengineering Commons, Databases and Information Systems Commons