Publication Type

Journal Article

Publication Date

4-2005

Abstract

Gene expression data generated by DNA microarray experiments have provided a vast resource for medical diagnosis and disease understanding. Most prior work in analyzing gene expression data, however, focuses on predictive performance rather than on deriving human-understandable knowledge. This paper presents a systematic approach for learning and extracting rule-based knowledge from gene expression data. Gene expression data are modelled with a class of predictive self-organizing networks known as Adaptive Resonance Associative Map (ARAM), whose learned knowledge can be transformed into a set of symbolic IF-THEN rules for interpretation. For dimensionality reduction, we illustrate how the system can work with a variety of feature selection methods. Benchmark experiments conducted on two gene expression data sets from acute leukemia and colon tumor patients show that the proposed system consistently produces predictive performance comparable, if not superior, to all previously published results. More importantly, very simple rules can be discovered that have extremely high diagnostic power. The proposed methodology, consisting of dimensionality reduction, predictive modelling, and rule extraction, provides a promising approach to gene expression analysis and disease understanding.

Keywords

Knowledge discovery, Gene expression analysis, Predictive modelling, Rule extraction, Feature selection

Discipline

Databases and Information Systems | OS and Networks

Research Areas

Data Science and Engineering

Publication

Neural Networks

Volume

18

Issue

3

First Page

297

Last Page

306

ISSN

0893-6080

Identifier

10.1016/j.neunet.2005.01.003

Publisher

Elsevier

Additional URL

https://doi.org/10.1016/j.neunet.2005.01.003
