ZTE Communications ›› 2023, Vol. 21 ›› Issue (3): 70-76. DOI: 10.12142/ZTECOM.202303010
• Research Papers •
JI Yuhe1, HAN Jing2, ZHAO Yongxin1, ZHANG Shenglin1, GONG Zican2
Received: 2022-12-08
Online: 2023-09-21
Published: 2023-03-22
About author:
JI Yuhe received his bachelor’s degree in software engineering from the College of Software, Nankai University, China in 2022. He is now pursuing his master’s degree at the School of Software, Nankai University. His research interests include anomaly detection and natural language processing.
HAN Jing (
JI Yuhe, HAN Jing, ZHAO Yongxin, ZHANG Shenglin, GONG Zican. Log Anomaly Detection Through GPT-2 for Large Scale Systems [J]. ZTE Communications, 2023, 21(3): 70-76.
URL: http://zte.magtechjournal.com/EN/10.12142/ZTECOM.202303010
Templates | Euclidean Distance |
---|---|
httprequest except <*> permission denied | - |
httprequest except <*> <*> permission denied | 0.147 629 340 284 133 4 |
httprequest except <*> no such file or directory | 0.595 852 332 701 891 4 |
httprequest except <*> | 0.621 201 472 867 456 3 |
httprequest except EoF occurred in violation of protocol | 0.838 852 193 154 771 3 |
httprequest except <*> connection reset by peer | 0.880 359 580 380 884 6 |
Table 1 Euclidean distance of sentence vectors of similar semantic templates
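As a rough illustration of the measure named in Table 1, the sketch below computes the Euclidean distance between vectors and ranks candidates by closeness to a reference. This page does not say how the sentence vectors are produced, so the three-dimensional vectors and the names `reference`, `candidates`, and `ranked` are made up for the example and are not taken from the paper.

```python
import math


def euclidean_distance(u, v):
    """Return the Euclidean (L2) distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy "sentence vectors"; real embeddings would have many more
# dimensions and would come from an embedding model.
reference = [0.10, 0.40, 0.20]
candidates = {
    "near": [0.10, 0.50, 0.20],
    "far": [0.90, 0.10, 0.70],
}

# Rank candidates from most to least similar (smallest distance first).
ranked = sorted(
    candidates,
    key=lambda name: euclidean_distance(reference, candidates[name]),
)

for name in ranked:
    print(name, euclidean_distance(reference, candidates[name]))
```

A smaller distance indicates closer vectors, which matches the ordering of the rows in Table 1 from most to least similar template.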
Dataset | Training Data | Number of Templates | Test Dataset (Normal) | Test Dataset (Anomalous) |
---|---|---|---|---|
Ada | 6 626 865 | 599 | 7 911 944 | 2 648 |
Bob | 7 021 577 | 84 | 1 067 850 | 904 |
Table 2 Statistics of evaluation datasets
Approach | Ada Precision | Ada Recall | Ada F1-score | Bob Precision | Bob Recall | Bob F1-score |
---|---|---|---|---|---|---|
LogAnomaly | 0.394 | 0.190 | 0.256 | 0.353 | 0.332 | 0.342 |
NeuralLog | 0.297 | 0.354 | 0.323 | 0.638 | 0.872 | 0.736 |
Our method | 0.738 | 1.00 | 0.850 | 0.857 | 1.00 | 0.923 |
Table 3 Evaluation results of our method vs the other two methods
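The F1 values in Table 3 are consistent with the usual definition of the F1 score as the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall figures listed in the table; the function name `f1_score` and the dictionary `table3` are illustrative, and because the inputs are already rounded to three decimals, the recomputed values can differ from the reported ones in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 3.
table3 = {
    ("LogAnomaly", "Ada"): (0.394, 0.190, 0.256),
    ("LogAnomaly", "Bob"): (0.353, 0.332, 0.342),
    ("NeuralLog", "Ada"): (0.297, 0.354, 0.323),
    ("NeuralLog", "Bob"): (0.638, 0.872, 0.736),
    ("Our method", "Ada"): (0.738, 1.00, 0.850),
    ("Our method", "Bob"): (0.857, 1.00, 0.923),
}

# Largest gap between the recomputed and the reported F1 across all rows.
max_gap = 0.0
for (approach, dataset), (p, r, reported) in table3.items():
    recomputed = f1_score(p, r)
    max_gap = max(max_gap, abs(recomputed - reported))
    print(f"{approach:<11} {dataset}: recomputed {recomputed:.3f}, reported {reported:.3f}")
```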
Approach | Ada Precision | Ada Recall | Ada F1-score | Bob Precision | Bob Recall | Bob F1-score |
---|---|---|---|---|---|---|
OM w/o SV & AS | 0.128 | 0.835 | 0.222 | 0.510 | 0.940 | 0.661 |
OM w/o AS | 0.427 | 1.00 | 0.598 | 0.718 | 1.00 | 0.836 |
OM w/o SV | 0.627 | 0.807 | 0.705 | 0.833 | 0.940 | 0.883 |
OM | 0.738 | 1.00 | 0.850 | 0.857 | 1.00 | 0.923 |
Table 4 Experimental results