Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
11-2005
Abstract
Traditional text mining systems employ shallow parsing techniques and focus on concept extraction and taxonomic relation extraction. This paper presents a novel system called CRCTOL for mining rich semantic knowledge in the form of an ontology from domain-specific text documents. By using a full text parsing technique and incorporating both statistical and lexico-syntactic methods, our system extracts knowledge that is more concise and semantically richer than that of alternative systems. We conduct a case study in which CRCTOL extracts ontological knowledge, specifically key concepts and semantic relations, from a terrorism domain text collection. A quantitative comparison with Text-To-Onto, a state-of-the-art ontology learning system, shows that CRCTOL achieves much better precision and recall for both concept and relation extraction, especially on sentences with complex structures.
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), Houston, Texas, USA, Nov 27-30
First Page
4
Last Page
7
Identifier
10.1109/ICDM.2005.97
Publisher
Institute of Electrical and Electronics Engineers Inc.
City or Country
New York
Citation
JIANG, Xing and TAN, Ah-hwee.
Mining ontological knowledge from domain-specific text documents. (2005). Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), Houston, Texas, USA, Nov 27-30. 4-7.
Available at: https://ink.library.smu.edu.sg/sis_research/6666
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
http://www.scopus.com/inward/record.url?eid=2-s2.0-34548548969&partnerID=MN8TOARS