Legal topic classification: A comparative study of text classifiers on Singapore Supreme Court judgments

Jerrold Tsin Howe SOH, Singapore Management University
How Khang LIM, Singapore Management University
Ian Ernst CHAI, Attorney-General's Chambers, Singapore

Abstract

Duplicate record, see https://ink.library.smu.edu.sg/sol_research/2956. This paper presents a comparative study of the performance of various machine learning approaches for classifying judgments into legal areas. Using a novel dataset of 6,227 Singapore Supreme Court judgments, we investigate how state-of-the-art NLP methods compare against traditional statistical models when applied to a legal corpus comprising few but lengthy documents. All approaches tested, including topic model, word embedding, and language model-based classifiers, performed well even with only a few hundred judgments. However, more work is needed to optimize state-of-the-art methods for the legal domain.