ECTracker: an efficient algorithm for haplotype analysis and classification

Lin L.
Wong L.
Tze-Yun LEONG, Singapore Management University
Lai P.


This work aims to discover the genetic variations of hemophilia A patients by examining the combinations of molecular haplotypes present in hemophilia A and normal local populations using data mining methods. The methods explored are those that can both extract understandable, expressive patterns and make predictions based on inferences drawn from those patterns. An algorithm known as ECTracker is proposed, and its performance is compared with common data mining methods such as artificial neural networks, support vector machines, naive Bayesian classifiers, and decision trees (C4.5). Experimental studies and analyses show that ECTracker achieves comparatively good predictive accuracy in classification relative to methods that can only perform classification. At the same time, ECTracker produces easily comprehensible and expressive patterns that experts can use for analysis.