Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2024
Abstract
Maritime risk research is crucial yet challenging for improving safety, efficiency, and sustainability in maritime operations. This paper presents an innovative method for automating the collection and identification of risk data related to global maritime risks from news sources, addressing the limitations of traditional manual methods. To evaluate the proposed method, different learning-based models, including conventional machine learning approaches and advanced Large Language Models (LLMs) such as GPT-4 and LLaMA-3.1, are comprehensively studied for comparison. In addition, not only do we use popular evaluation metrics to assess the proposed method, but we also introduce a new evaluation metric, called the "Ratio of Valid Categories (RVC)," to evaluate model reliability. The merits of the proposed method are demonstrated across different evaluation metrics. The research results show that the proposed LLM-based methods, particularly the GPT-4-based method, consistently outperform traditional models, significantly improving both the efficiency and accuracy of maritime risk data collection and identification. Our findings contribute to the expanding literature on LLM applications in risk management, demonstrating their potential to transform data collection and identification practices.
Keywords
Maritime Risk, Automated Data Collection, Risk Identification, Large Language Models, Traditional Machine Learning, GPT-4o, Llama-3.1, Ratio of Valid Categories (RVC)
Discipline
Artificial Intelligence and Robotics
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 2024 IEEE International Conference on Data Mining
Identifier
10.1109/ICDMW65004.2024.00061
Publisher
IEEE
City or Country
Piscataway, NJ, USA
Citation
HUANG, Donghao; FU, Xiuju; YIN, Xiaofeng; PEN, Haibo; and WANG, Zhaoxia.
Automating maritime risk data collection and identification leveraging large language models. (2024). Proceedings of the 2024 IEEE International Conference on Data Mining.
Available at: https://ink.library.smu.edu.sg/sis_research/10663
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/ICDMW65004.2024.00061