Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2005

Abstract

Traditional text mining systems employ shallow parsing techniques and focus on concept extraction and taxonomic relation extraction. This paper presents a novel system called CRCTOL for mining rich semantic knowledge in the form of an ontology from domain-specific text documents. By using a full text parsing technique and incorporating both statistical and lexico-syntactic methods, our system extracts knowledge that is more concise and contains richer semantics than that of alternative systems. We conduct a case study wherein CRCTOL extracts ontological knowledge, specifically key concepts and semantic relations, from a terrorism domain text collection. Quantitative evaluation against a state-of-the-art ontology learning system known as Text-To-Onto shows that CRCTOL achieves much better precision and recall for both concept and relation extraction, especially on sentences with complex structures.

Discipline

Databases and Information Systems

Research Areas

Data Science and Engineering

Publication

Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), Houston, Texas, USA, Nov 27-30

First Page

4

Last Page

7

Identifier

10.1109/ICDM.2005.97

Publisher

Institute of Electrical and Electronics Engineers Inc.

City or Country

New York

Additional URL

http://www.scopus.com/inward/record.url?eid=2-s2.0-34548548969&partnerID=MN8TOARS
