Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
5-2023
Abstract
Logs, being run-time information automatically generated by software, record system events and activities with their timestamps. Before obtaining more insights into the run-time status of the software, a fundamental step of log analysis, called log parsing, is employed to extract structured templates and parameters from the semi-structured raw log messages. However, current log parsers are all syntax-based and regard each message as a character string, ignoring the semantic information included in parameters and templates.Thus, we propose the first semantic-based parser SemParser to unlock the critical bottleneck of mining semantics from log messages. It contains two steps, an end-to-end semantics miner and a joint parser. Specifically, the first step aims to identify explicit semantics inside a single log, and the second step is responsible for jointly inferring implicit semantics and computing structural outputs according to the contextual knowledge base of the logs. To analyze the effectiveness of our semantic parser, we first demonstrate that it can derive rich semantics from log messages collected from six widely-applied systems with an average F1 score of 0.985. Then, we conduct two representative downstream tasks, showing that current downstream models improve their performance with appropriately extracted semantics by 1.2%-11.7% and 8.65% on two anomaly detection datasets and a failure identification dataset, respectively. We believe these findings provide insights into semantically understanding log messages for the log analysis community.
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Areas of Excellence
Digital transformation
Publication
ICSE '23: Proceedings of the 45th International Conference on Software Engineering, Melbourne, Australia, May 14-20
First Page
881
Last Page
893
Identifier
10.1109/ICSE48619.2023.00082
Publisher
ACM
City or Country
New York
Citation
HUO, Yintong; SU, Yuxin; LEE, Cheryl; and LYU, R. Michael.
SemParser: A semantic parser for log analytics. (2023). ICSE '23: Proceedings of the 45th International Conference on Software Engineering, Melbourne, Australia, May 14-20. 881-893.
Available at: https://ink.library.smu.edu.sg/sis_research/10673
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/ICSE48619.2023.00082