Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2021

Abstract

Stock return prediction has long been an active topic in both research and industry, given its potential for large financial gain. The return signal, apart from its inherent volatility and complexity, is often accompanied by many sources of noise, such as other stocks’ performance, macroeconomic factors, and financial news. To better characterize these factors, we propose a new model that consists of two levels of sequence: an NLP-based module to capture the sequential nature of words and sentences in financial news, and a time-series-based module to exploit the sequential nature of adjacent observations in the stock price. In the proposed framework, we employ Hierarchical Attention Networks (HAN) in the text mining module, which can effectively model the financial news and extract important signals at both the word and sentence levels. For the time series module, the established Long Short-Term Memory (LSTM) network is used to model the complex serial dependence in the time series data. We compare our model with benchmark models that use either module alone, as well as with alternatives that use the traditional Bag of Words (BOW) approach, on the Dow Jones Industrial Average (DJIA) dataset. Experimental results show that our proposed method performs better on several classification metrics for both positive and negative stock returns.
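
The two-level attention idea named in the abstract (weights over words within a sentence, then weights over sentences within a news item) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the context vectors, and the toy inputs are all hypothetical, and the recurrent encoders and learned projections that a full HAN places before each attention step are omitted. The LSTM price module and the way the two modules are combined are not shown, since the abstract does not specify them.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, context):
    # Score each vector by its dot product with a context vector
    # (hypothetical stand-in for a learned parameter), then return
    # the attention-weighted average and the weights themselves.
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

def encode_document(sentences, word_context, sentence_context):
    # sentences: list of sentences, each a list of word vectors.
    # Level 1: pool the words of each sentence into a sentence vector.
    sentence_vectors = [attention_pool(words, word_context)[0] for words in sentences]
    # Level 2: pool the sentence vectors into one document vector.
    return attention_pool(sentence_vectors, sentence_context)

# Toy input: two sentences with 2-dimensional word vectors (made up).
toy_news = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
doc_vector, sentence_weights = encode_document(toy_news, [0.0, 0.0], [0.0, 0.0])
```

With all-zero context vectors every word and sentence receives equal weight, so the sketch reduces to plain averaging; non-zero context vectors shift weight toward the words and sentences that align with them, which is the sense in which attention can "extract important signals" at each level.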

Keywords

stock price prediction, text classification, natural language processing, hierarchical attention networks (HAN), long short-term memory (LSTM)

Discipline

Finance | Finance and Financial Management

Research Areas

Finance

Publication

Proceedings of the 2021 International Conference on Signal Processing and Machine Learning (CONF-SPML), Stanford, California, November 14

First Page

133

Last Page

138

ISBN

9781665417341

Identifier

10.1109/CONF-SPML54095.2021.00034

Publisher

IEEE

City or Country

Stanford, CA
