• Quick Search
  • Citation Search
  • Advanced Search
ZTE
  • Home
  • About Journal
    • Aims and Scopes
    • Publishing and Copyright Information
    • Honors
    • Peer Review
  • Announcement
    • Publication Ethics and Malpractice Statement
    • Declaration of Interests
    • Open Access
    • Copyright Transfer Agreement
  • Editorial Board
  • Guide for Authors
    • Submission Guidelines
    • Manuscript Template
    • Author Fees
  • Archives
  • Contact Us
  • 中文
CfP: AI-Agent Communication Network (ACN): Ar...
Special Topics of ZTE Communications 2026
CfP: Reconfigurable Antenna Systems for Next...
ZTE Communications has been indexed in Scopus
Special Topics of ZTE Communications 2025
Previous Next
  • Recommended Articles
  • Current Issue
  • Archive
11 September 2025, Volume 23 Issue 3
Previous Issue   
Download the whole issue (PDF)
The whole issue of ZTE Communications September 2025, Vol. 23 No.3
2025, 23(3):  0. 
Asbtract ( )   PDF (20152KB) ( )  
Related Articles | Metrics
Special Topic
Special Topic on Security of Large Models
2025, 23(3):  1-2.  doi:10.12142/ZTECOM.202503001
Asbtract ( )   HTML ( )   PDF (343KB) ( )  
References | Related Articles | Metrics
Poison-Only and Targeted Backdoor Attack Against Visual Object Tracking
GU Wei, SHAO Shuo, ZHOU Lingtao, QIN Zhan, REN Kui
2025, 23(3):  3-14.  doi:10.12142/ZTECOM.202503002
Asbtract ( )   HTML ( )   PDF (1597KB) ( )  
Figures and Tables | References | Related Articles | Metrics

Visual object tracking (VOT), aiming to track a target object in a continuous video, is a fundamental and critical task in computer vision. However, the reliance on third-party resources (e.g., dataset) for training poses concealed threats to the security of VOT models. In this paper, we reveal that VOT models are vulnerable to a poison-only and targeted backdoor attack, where the adversary can achieve arbitrary tracking predictions by manipulating only part of the training data. Specifically, we first define and formulate three different variants of the targeted attacks: size-manipulation, trajectory-manipulation, and hybrid attacks. To implement these, we introduce Random Video Poisoning (RVP), a novel poison-only strategy that exploits temporal correlations within video data by poisoning entire video sequences. Extensive experiments demonstrate that RVP effectively injects controllable backdoors, enabling precise manipulation of tracking behavior upon trigger activation, while maintaining high performance on benign data, thus ensuring stealth. Our findings not only expose significant vulnerabilities but also highlight that the underlying principles could be adapted for beneficial uses, such as dataset watermarking for copyright protection.

VOTI: Jailbreaking Vision-Language Models via Visual Obfuscation and Task Induction
ZHU Yifan, CHU Zhixuan, REN Kui
2025, 23(3):  15-26.  doi:10.12142/ZTECOM.202503003
Asbtract ( )   HTML ( )   PDF (6551KB) ( )  
Figures and Tables | References | Supplementary Material | Related Articles | Metrics

In recent years, large vision-language models (VLMs) have achieved significant breakthroughs in cross-modal understanding and generation. However, the safety issues arising from their multimodal interactions become prominent. VLMs are vulnerable to jailbreak attacks, where attackers craft carefully designed prompts to bypass safety mechanisms, leading them to generate harmful content. To address this, we investigate the alignment between visual inputs and task execution, uncovering locality defects and attention biases in VLMs. Based on these findings, we propose VOTI, a novel jailbreak framework leveraging visual obfuscation and task induction. VOTI subtly embeds malicious keywords within neutral image layouts to evade detection, and breaks down harmful queries into a sequence of subtasks. This approach disperses malicious intent across modalities, exploiting VLMs’ over-reliance on local visual cues and their fragility in multi-step reasoning to bypass global safety mechanisms. Implemented as an automated framework, VOTI integrates large language models as red-team assistants to generate and iteratively optimize jailbreak strategies. Extensive experiments across seven mainstream VLMs demonstrate VOTI’s effectiveness, achieving a 73.46% attack success rate on GPT-4o-mini. These results reveal critical vulnerabilities in VLMs, highlighting the urgent need for improving robust defenses and multimodal alignment.

From Function Calls to MCPs for Securing AI Agent Systems: Architecture, Challenges and Countermeasures
WANG Wei, LI Shaofeng, DONG Tian, MENG Yan, ZHU Haojin
2025, 23(3):  27-37.  doi:10.12142/ZTECOM.202503004
Asbtract ( )   HTML ( )   PDF (1290KB) ( )  
Figures and Tables | References | Related Articles | Metrics

With the widespread deployment of large language models (LLMs) in complex and multimodal scenarios, there is a growing demand for secure and standardized integration of external tools and data sources. The Model Context Protocol (MCP), proposed by Anthropic in late 2024, has emerged as a promising framework. Designed to standardize the interaction between LLMs and their external environments, it serves as a “USB-C interface for AI”. While MCP has been rapidly adopted in the industry, systematic academic studies on its security implications remain scarce. This paper presents a comprehensive review of MCP from a security perspective. We begin by analyzing the architecture and workflow of MCP and identify potential security vulnerabilities across key stages including input processing, decision-making, client invocation, server response, and response generation. We then categorize and assess existing defense mechanisms. In addition, we design a real-world attack experiment to demonstrate the feasibility of tool description injection within an actual MCP environment. Based on the experimental results, we further highlight underexplored threat surfaces and propose future directions for securing AI agent systems powered by MCP. This paper aims to provide a structured reference framework for researchers and developers seeking to balance functionality and security in MCP-based systems.

Dataset Copyright Auditing for Large Models: Fundamentals, Open Problems, and Future Directions
DU Linkang, SU Zhou, YU Xinyi
2025, 23(3):  38-47.  doi:10.12142/ZTECOM.202503005
Asbtract ( )   HTML ( )   PDF (511KB) ( )  
Figures and Tables | References | Related Articles | Metrics

The unprecedented scale of large models, such as large language models (LLMs) and text-to-image diffusion models, has raised critical concerns about the unauthorized use of copyrighted data during model training. These concerns have spurred a growing demand for dataset copyright auditing techniques, which aim to detect and verify potential infringements in the training data of commercial AI systems. This paper presents a survey of existing auditing solutions, categorizing them across key dimensions: data modality, model training stage, data overlap scenarios, and model access levels. We highlight major trends, including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training. Through an in-depth analysis of 12 representative works, we extract four key observations that reveal the limitations of current methods. Furthermore, we identify three open challenges and propose future directions for robust, multimodal, and scalable auditing solutions. Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings.

StegoAgent: A Generative Steganography Framework Based on GUI Agents
SHEN Qiuhong, YANG Zijin, JIANG Jun, ZHANG Weiming, CHEN Kejiang
2025, 23(3):  48-58.  doi:10.12142/ZTECOM.202503006
Asbtract ( )   HTML ( )   PDF (1144KB) ( )  
Figures and Tables | References | Related Articles | Metrics

Steganography is a technology that discreetly embeds secret information into the redundant space of a carrier, enabling covert communication. As generative models continue to advance, steganography has evolved from traditional modification-based methods to generative steganography, which includes generative linguistic and image based forms. However, while large model agents are rapidly emerging, no method has exploited the stable redundant space in their action processes. Inspired by this insightful observation, we propose a steganographic method leveraging large model agents, employing their actions to conceal secret messages. In this paper, we introduce StegoAgent, a generative steganography framework based on graphical user interface (GUI) agents, which effectively demonstrates the remarkable potential and effectiveness of large model agent-based steganographic methods.

Review
Analysis of Feasible Solutions for Railway 5G Network Security Assessment
XU Hang, SUN Bin, DING Jianwen, WANG Wei
2025, 23(3):  59-70.  doi:10.12142/ZTECOM.202503007
Asbtract ( )   HTML ( )   PDF (502KB) ( )  
Figures and Tables | References | Related Articles | Metrics

The Fifth Generation of Mobile Communications for Railways (5G-R) brings significant opportunities for the rail industry. However, alongside the potential and benefits of the railway 5G network are complex security challenges. Ensuring the security and reliability of railway 5G networks is therefore essential. This paper presents a detailed examination of security assessment techniques for railway 5G networks, focusing on addressing the unique security challenges in this field. In this paper, various security requirements in railway 5G networks are analyzed, and specific processes and methods for conducting comprehensive security risk assessments are presented. This study provides a framework for securing railway 5G network development and ensuring its long-term sustainability.

Key Techniques and Challenges in NeRF-Based Dynamic 3D Reconstruction
LU Ping, FENG Daquan, SHI Wenzhe, LI Wan, LIN Jiaxin
2025, 23(3):  71-80.  doi:10.12142/ZTECOM.202503008
Asbtract ( )   HTML ( )   PDF (675KB) ( )  
Figures and Tables | References | Related Articles | Metrics

This paper explores the key techniques and challenges in dynamic scene reconstruction with neural radiance fields (NeRF). As an emerging computer vision method, the NeRF has wide application potential, especially in excelling at 3D reconstruction. We first introduce the basic principles and working mechanisms of NeRFs, followed by an in-depth discussion of the technical challenges faced by 3D reconstruction in dynamic scenes, including problems in perspective and illumination changes of moving objects, recognition and modeling of dynamic objects, real-time requirements, data acquisition and calibration, motion estimation, and evaluation mechanisms. We also summarize current state-of-the-art approaches to address these challenges, as well as future research trends. The goal is to provide researchers with an in-depth understanding of the application of NeRFs in dynamic scene reconstruction, as well as insights into the key issues faced and future directions.

Research Papers
Real-Time 7-Core SDM Transmission System Using Commercial 400 Gbit/s OTN Transceivers and Network Management System
CUI Jian, GU Ninglun, CHANG Cheng, SHI Hu, YAN Baoluo
2025, 23(3):  81-88.  doi:10.12142/ZTECOM.202503009
Asbtract ( )   HTML ( )   PDF (3531KB) ( )  
Figures and Tables | References | Related Articles | Metrics

Space-division multiplexing (SDM) utilizing uncoupled multi-core fibers (MCF) is considered a promising candidate for next-generation high-speed optical transmission systems due to its huge capacity and low inter-core crosstalk. In this paper, we demonstrate a real-time high-speed SDM transmission system over a field-deployed 7-core MCF cable using commercial 400 Gbit/s backbone optical transport network (OTN) transceivers and a network management system. The transceivers employ a high noise-tolerant quadrature phase shift keying (QPSK) modulation format with a 130 Gbaud rate, enabled by optoelectronic multi-chip module (OE-MCM) packaging. The network management system can effectively manage and monitor the performance of the 7-core SDM OTN system and promptly report failure events through alarms. Our field trial demonstrates the compatibility of uncoupled MCF with high-speed OTN transmission equipment and network management systems, supporting its future deployment in next-generation high-speed terrestrial cable transmission networks.

Antenna Parameter Calibration for Mobile Communication Base Station via Laser Tracker
LI Junqiang, CHEN Shijun, FENG Yujie, FAN Jiancun, CHEN Qiang
2025, 23(3):  89-95.  doi:10.12142/ZTECOM.202503010
Asbtract ( )   HTML ( )   PDF (1386KB) ( )  
Figures and Tables | References | Related Articles | Metrics

In the field of antenna engineering parameter calibration for indoor communication base stations, traditional methods suffer from issues such as low efficiency, poor accuracy, and limited applicability to indoor scenarios. To address these problems, a high-precision and high-efficiency indoor base station parameter calibration method based on laser measurement is proposed. We use a high-precision laser tracker to measure and determine the coordinate system transformation relationship, and further obtain the coordinates and attitude of the base station. In addition, we propose a simple calibration method based on point cloud fitting for specific scenes. Simulation results show that using common commercial laser trackers, we can achieve a coordinate correction accuracy of 1 cm and an angle correction accuracy of 0.25°, which is sufficient to meet the needs of wireless positioning.

M+MNet: A Mixed-Precision Multibranch Network for Image Aesthetics Assessment
HE Shuai, LIU Limin, WANG Zhanli, LI Jinliang, MAO Xiaojun, MING Anlong
2025, 23(3):  96-110.  doi:10.12142/ZTECOM.202503011
Asbtract ( )   HTML ( )   PDF (4795KB) ( )  
Figures and Tables | References | Related Articles | Metrics

We propose Mixed-Precision Multibranch Network (M+MNet) to compensate for the neglect of background information in image aesthetics assessment (IAA) while providing strategies for overcoming the dilemma between training costs and performance. First, two exponentially weighted pooling methods are used to selectively boost the extraction of background and salient information during downsampling. Second, we propose Corner Grid, an unsupervised data augmentation method that leverages the diffusive characteristics of convolution to force the network to seek more relevant background information. Third, we perform mixed-precision training by switching the precision format, thus significantly reducing the time and memory consumption of data representation and transmission. Most of our methods specifically designed for IAA tasks have demonstrated generalizability to other IAA works. For performance verification, we develop a large-scale benchmark (the most comprehensive thus far) by comparing 17 methods with M+MNet on two representative datasets: the Aesthetic Visual Analysis (AVA) dataset and FLICKR-Aesthetic Evaluation Subset (FLICKR-AES). M+MNet achieves state-of-the-art performance on all tasks.

Current Issue

Volume 23 Issue 3
Announcement
More>>
  • Call for Paper: AI-Agent Communication Network: Architecture, Protocols and Key Technologies (2025-08-22)
  • Special Topics of ZTE Communications 2026 (2025-08-22)
  • Special Topics of ZTE Communications 2025 (2024-07-01)
  • References of "ZTE Communications" (in IEEE format) (2024-01-02)
 
Most Download
More>>
Most Read
More>>
Links
More>>
  • Links
Total visitors: Today:
Share: Facebook  Twitter  Wechat  Weibo  Email
Copyright © ZTE Communications
ISSN 1673-5188
CN 34-1294/TN
Powered by Beijing Magtech Co., Ltd.