Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

10-2005

Abstract

Informal language is actively used in network-mediated communication, e.g. chat rooms, BBS, email and text messages. We refer to the anomalous terms used in such contexts as network informal language (NIL) expressions. For example, “偶(ou3)” is used to replace “我(wo3)” in Chinese ICQ. Without unconventional resources, knowledge and techniques, existing natural language processing approaches are less effective in dealing with NIL text. We propose to study NIL expressions with a NIL corpus and to investigate techniques for processing NIL expressions. Two methods for Chinese NIL expression recognition are designed in the NILER system. The experimental results show that the pattern matching method produces higher precision and the support vector machine method a higher F-1 measure. These results are encouraging and justify our future research effort in NIL processing.

Discipline

Theory and Algorithms

Research Areas

Data Science and Engineering

Publication

Proceedings of the 4th SIGHAN workshop on Chinese Language Processing

City or Country

Jeju Island, Korea

Additional URL

http://aclweb.org/anthology/I05-3013
