ZTE Communications, 2026, Vol. 24, Issue 1: 34-44. DOI: 10.12142/ZTECOM.202601006
• Special Topic •
Zhao Jianchao1, Geng Zhaosen1, Li Zeyi2, Wang Pan3
Received: 2025-01-11
Online: 2026-03-25
Published: 2026-03-17
About the author: Zhao Jianchao is with the Cable Products Business Department, ZTE Corporation. He is engaged in the development and delivery of wireline and cable network solutions, with a focus on product implementation, service integration, and system deployment. His work involves coordinating product requirements with network operations and supporting the deployment of large-scale wireline network services. His professional interests include wireline network solutions, service delivery optimization, and operational support for carrier networks.

Cite this article as: Zhao Jianchao, Geng Zhaosen, Li Zeyi, Wang Pan. Key Technologies for AI-Driven Network Traffic Classification Workflow and Data Distribution Shift [J]. ZTE Communications, 2026, 24(1): 34-44.
URL: https://zte.magtechjournal.com/EN/10.12142/ZTECOM.202601006
| Aspect | Academia | Industry |
|---|---|---|
| Classification requirements | Modeling on training datasets for optimal performance | Comprehensive study of training/classification costs, efficiency, continuous operation, credibility, etc. |
| End-to-end TC | Focus on the modeling process before the TC model is deployed | Increased focus on monitoring optimization of TC models after deployment |
| Training data | Static/obsolete, well labeled, noise known | Continuous/changing, no/wrong labeling, noise unknown |
| Costs | Mostly unconcerned | Great concern |
| Data distribution shift | Mostly unconcerned | Great concern |
| Continuous learning | Mostly unconcerned | Great concern |
| Interpretability | Mostly unconcerned | Consideration |
| Computational complexity | Focus more on training fast | More concerned with reasoning fast |
Table 1 Comparison between academia and industry in AI-TC
| Flow Feature | Category | Description | Feature Calculation Method |
|---|---|---|---|
| Flow 5-tuple | Flow index | src/sp/dst/dp/protocol | Serialized regular preprocessing |
| TCP sliding window | TCP window | TCP flow-control parameters | Serialized regular preprocessing |
| TLS handshake packet information | TLS fingerprint | Handshake types, cipher suites, content types, key length, etc. | Serialized regular preprocessing |
| Packet length sequence | Packet-related | A sequence of packet lengths in the flow, which may include upstream, downstream, and bidirectional sequences as needed. | Packet length statistics (max/min/avg/std) |
| Packet arrival time sequence | Packet-related | A sequence of packet arrival times in the flow, which may include upstream, downstream, and bidirectional sequences as needed. | Packet time statistics (max/min/avg/std) |
| Flow length | Flow-related | Total number of flow bytes per unit of time, which may include upstream, downstream, and bidirectional counts as needed. | Multi-flow length statistics (max/min/avg/std) |
| Flow duration | Flow-related | TCP flow duration; UDP flow duration can be added if NP resources are sufficient. | Multi-flow duration statistics (max/min/avg/std) |
Table 2 A typical collection of network traffic features (partial)
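The per-flow statistics that Table 2 lists as feature calculation methods (max/min/avg/std over packet-length and arrival-time sequences) can be sketched as follows. This is a minimal stdlib illustration; the function names and the sample flow values are assumptions, not taken from the paper's implementation.

```python
import statistics

def sequence_features(values):
    """Max/min/avg/std summary over a numeric sequence, matching the
    (max/min/avg/std) feature calculation listed in Table 2."""
    vals = [float(v) for v in values]
    return {
        "max": max(vals),
        "min": min(vals),
        "avg": statistics.fmean(vals),
        "std": statistics.pstdev(vals),  # population standard deviation
    }

def inter_arrival_features(timestamps):
    """Same statistics over the gaps between consecutive packet arrivals."""
    ts = sorted(float(t) for t in timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    return sequence_features(gaps)

# Hypothetical flow: packet lengths in bytes, arrival times in seconds.
print(sequence_features([60, 1500, 1500, 52, 980]))
print(inter_arrival_features([0.00, 0.01, 0.03, 0.04, 0.10]))
```

In a production pipeline these statistics would be computed per direction (upstream, downstream, bidirectional) as the table notes, rather than over a single merged sequence.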
| App (Category) | Flows |
|---|---|
| QQ Music (Music) | 39 465 |
| LOL: Wild Rift (Game) | 19 841 |
| Naruto (Game) | 27 240 |
| Zhihu (Social) | 16 643 |
| Bilibili (Video) | 20 014 |
| Teamfight Tactics (Game) | 23 718 |
| iQIYI (Video) | 36 740 |
| TikTok (Video) | 16 640 |
| Honor of Kings (Game) | 42 734 |
| Background (Log) | 30 000 |
Table 3 Dataset and the corresponding number of flows
| App | Precision (Time Point 1) | Recall (Time Point 1) | F1-score (Time Point 1) | Support (Time Point 1) | Precision (Time Point 2) | Recall (Time Point 2) | F1-score (Time Point 2) | Support (Time Point 2) |
|---|---|---|---|---|---|---|---|---|
| QQ Music | 0.802 | 0.826 | 0.814 | 7 978 | 0.829 | 0.787 | 0.807 | 7 978 |
| Background | 0.716 | 0.698 | 0.707 | 6 106 | 0.709 | 0.698 | 0.704 | 6 106 |
| Bilibili | 0.984 | 0.951 | 0.968 | 3 979 | 0.985 | 0.950 | 0.967 | 3 979 |
| TikTok | 0.736 | 0.638 | 0.683 | 3 291 | 0.749 | 0.656 | 0.700 | 3 291 |
| Naruto | 0.732 | 0.880 | 0.799 | 5 472 | 0.688 | 0.897 | 0.779 | 5 472 |
| IQiyi | 0.859 | 0.871 | 0.865 | 7 300 | 0.861 | 0.843 | 0.852 | 7 300 |
| Honor of Kings | 0.865 | 0.812 | 0.838 | 8 653 | 0.841 | 0.819 | 0.830 | 8 653 |
| Zhihu | 0.800 | 0.791 | 0.795 | 3 250 | 0.841 | 0.767 | 0.802 | 3 250 |
| LOL: Wild Rift | 0.842 | 0.897 | 0.869 | 3 977 | 0.817 | 0.904 | 0.859 | 3 977 |
| Teamfight Tactics | 0.988 | 0.899 | 0.942 | 4 601 | 0.984 | 0.901 | 0.941 | 4 601 |
| Accuracy | | | 0.828 | 54 607 | | | 0.822 | 54 607 |
| Macro avg | 0.832 | 0.826 | 0.828 | 54 607 | 0.830 | 0.822 | 0.824 | 54 607 |
| Weighted avg | 0.831 | 0.828 | 0.828 | 54 607 | 0.827 | 0.822 | 0.822 | 54 607 |
Table 4 Classification results of network traffic at different times
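The per-class precision, recall, F1-score, and support figures in Table 4 follow the standard definitions (precision = TP/predicted positives, recall = TP/actual positives, F1 = their harmonic mean, support = actual positives). A minimal stdlib computation is sketched below; the toy labels are illustrative and are not the paper's data.

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Per-class precision, recall, F1, and support, as tabulated in Table 4."""
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    pred_n, true_n = Counter(y_pred), Counter(y_true)
    metrics = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        precision = tp[cls] / pred_n[cls] if pred_n[cls] else 0.0
        recall = tp[cls] / true_n[cls] if true_n[cls] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[cls] = {"precision": precision, "recall": recall,
                        "f1-score": f1, "support": true_n[cls]}
    return metrics

# Toy labels for three of the apps in Table 3 (illustrative only).
y_true = ["QQ Music", "Bilibili", "TikTok", "QQ Music", "Bilibili", "TikTok"]
y_pred = ["QQ Music", "Bilibili", "QQ Music", "QQ Music", "Bilibili", "TikTok"]
print(per_class_metrics(y_true, y_pred))
```

In practice the same numbers come from scikit-learn's `classification_report`, whose output layout (per-class rows plus accuracy, macro avg, and weighted avg) matches Table 4.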
| Data | Methods | Drift Occurs | Distance (unitless) | Execution Time/s |
|---|---|---|---|---|
| One-month interval | Kolmogorov-Smirnov | √ | / | 0.062 |
| | Maximum mean discrepancy | √ | 0.192 245 841 | 1.385 |
| | Chi-squared | √ | / | 0.232 |
| | Cramér-von Mises | √ | / | 0.030 |
| | Least-squares density difference | √ | 0.239 282 097 | 0.398 |
| | Spot-the-diff | × | 0.551 293 545 | 1.293 |
| | Mixed-type tabular data | √ | / | 0.069 |
| One-day interval | Kolmogorov-Smirnov | × | / | 0.064 |
| | Maximum mean discrepancy | × | 0.000 526 399 | 1.449 |
| | Chi-squared | × | / | 0.145 |
| | Cramér-von Mises | × | / | 0.025 |
| | Least-squares density difference | √ | 0.003 137 094 | 0.391 |
| | Spot-the-diff | × | 0.054 495 389 | 1.286 |
| | Mixed-type tabular data | × | / | 0.069 |
Table 5 Experimental results of data drift detection for different time intervals
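Several of the detectors compared in Table 5 are two-sample tests; the Kolmogorov-Smirnov entry, for instance, measures the largest gap between the empirical CDFs of a reference window and a current window. A minimal stdlib sketch of that statistic follows; the threshold value and the sample data are illustrative assumptions, not the paper's configuration.

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the ECDF gap can only peak at an observed value
        f_a = bisect.bisect_right(a, x) / len(a)
        f_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(f_a - f_b))
    return gap

def drift_detected(reference, current, threshold=0.1):
    """Flag drift when the KS distance exceeds a threshold (illustrative value)."""
    return ks_statistic(reference, current) > threshold

# Toy packet-length samples from two capture windows (illustrative values).
reference = [60, 120, 540, 1500, 1500, 980]
current = [60, 64, 70, 72, 80, 90]
print(ks_statistic(reference, current), drift_detected(reference, current))
```

A production detector would use the test's p-value (e.g., SciPy's `ks_2samp`) rather than a fixed distance threshold, which is what allows the one-month and one-day comparisons in Table 5 to give different drift verdicts.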