Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

12-2024

Abstract

Maritime risk research is crucial yet challenging for improving safety, efficiency, and sustainability in maritime operations. This paper presents an innovative method for automating the collection and identification of risk data related to global maritime risks from news sources, addressing the limitations of traditional manual methods. To evaluate the proposed method, different learning-based models, including conventional machine learning approaches and advanced Large Language Models (LLMs) such as GPT-4 and LLaMA-3.1, are comprehensively studied for comparison. In addition, not only do we use popular evaluation metrics to assess the proposed method, but we also introduce a new evaluation metric, called the "Ratio of Valid Categories (RVC)," to evaluate model reliability. The merits of the proposed method are demonstrated across different evaluation metrics. The research results show that the proposed LLM-based methods, particularly the GPT-4-based method, consistently outperform traditional models, significantly improving both the efficiency and accuracy of maritime risk data collection and identification. Our findings contribute to the expanding literature on LLM applications in risk management, demonstrating their potential to transform data collection and identification practices.

Keywords

Maritime Risk, Automated Data Collection, Risk Identification, Large Language Models, Traditional Machine Learning, GPT-4o, Llama-3.1, Ratio of Valid Categories (RVC)

Discipline

Artificial Intelligence and Robotics

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the 2024 IEEE International Conference on Data Mining

Identifier

10.1109/ICDMW65004.2024.00061

Publisher

IEEE

City or Country

Piscataway, NJ, USA

Additional URL

https://doi.org/10.1109/ICDMW65004.2024.00061
