Modern Graphics APIs: Design Principles, A Use Case, and New Perspectives
LU Ping, SUN Qi, WANG Chen, GUO Jie, GUO Yanwen, SHI Wenzhe
ZTE Communications    2026, 24 (1): 97-106.   DOI: 10.12142/ZTECOM.202601013

In this paper, we provide a comprehensive examination of the evolution of graphics Application Programming Interfaces (APIs). We begin by reviewing traditional graphics APIs, elucidating their distinct features and inherent challenges. This sets the stage for a detailed exploration of modern graphics APIs, with a focus on four critical design principles, which are further analyzed through specific case studies and categorical examinations. The paper then introduces MoerEngine, a bespoke rendering engine, as a practical case that demonstrates how these modern principles are applied in real-world software engineering. In conclusion, the study offers insights into the potential future trajectory of graphics APIs, spotlighting emerging design patterns and technological innovations, and predicts the development trends and capabilities of next-generation graphics APIs.

AED-NeRF: Audio-Driven and Emotion-Editing Dynamic Neural Radiance Fields for Expressive Talking Face Avatar
LU Ping, SONG Li, SHI Wenzhe, LIN Zonghao, LING Jun
ZTE Communications    2026, 24 (1): 72-80.   DOI: 10.12142/ZTECOM.202601010

While neural radiance field (NeRF) methods have shown promising results in generating talking faces, existing studies primarily focus on the correlation between avatars and driving sources. However, these studies often overlook emotion modeling, resulting in the generation of emotionless or unnatural facial animations. In response, this paper introduces an audio-driven and emotion-editing dynamic NeRF (AED-NeRF) approach, designed for the real-time generation of expressive talking face avatars driven by audio inputs. Specifically, we integrate audio features into a grid-based NeRF to compensate for the lack of a deformation channel, successfully capturing lip dynamics and enabling end-to-end generation from audio-driven sources to talking face avatars. Emotion labels, comprising emotion categories and intensity levels, guide the proposed NeRF framework to implicitly model visual emotions, allowing for explicit control and editing of facial expressions. Extensive qualitative and quantitative experiments validate the effectiveness and advantages of our proposed method, demonstrating its ability to achieve real-time, photo-realistic talking face avatar generation across different audio and emotion scenarios.

Key Techniques and Challenges in NeRF-Based Dynamic 3D Reconstruction
LU Ping, FENG Daquan, SHI Wenzhe, LI Wan, LIN Jiaxin
ZTE Communications    2025, 23 (3): 71-80.   DOI: 10.12142/ZTECOM.202503008

This paper explores the key techniques and challenges in dynamic scene reconstruction with neural radiance fields (NeRF). As an emerging computer vision method, NeRF has wide application potential and excels particularly at 3D reconstruction. We first introduce the basic principles and working mechanisms of NeRFs, followed by an in-depth discussion of the technical challenges faced by 3D reconstruction in dynamic scenes, including viewpoint and illumination changes of moving objects, recognition and modeling of dynamic objects, real-time requirements, data acquisition and calibration, motion estimation, and evaluation mechanisms. We also summarize current state-of-the-art approaches to these challenges, as well as future research trends. The goal is to provide researchers with an in-depth understanding of the application of NeRFs to dynamic scene reconstruction, along with insights into the key open issues and future directions.
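The basic NeRF principle the abstract refers to is volume rendering: a scene is represented as a field of densities and colors that are alpha-composited along each camera ray. As a rough illustration (not code from the paper; the function name and interface are mine), the standard per-ray quadrature can be sketched in plain Python:

```python
import math

def composite_ray(sigmas, colors, deltas):
    """Alpha-composite density/color samples along one ray (NeRF quadrature).

    sigmas: per-sample densities, colors: per-sample RGB triples,
    deltas: lengths of the ray segments between samples.
    Returns the accumulated RGB and the remaining transmittance.
    """
    T = 1.0                  # transmittance: light surviving so far
    rgb = [0.0, 0.0, 0.0]
    for sigma, c, d in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * d)   # opacity of this segment
        w = T * alpha                        # compositing weight
        rgb = [r + w * ci for r, ci in zip(rgb, c)]
        T *= 1.0 - alpha                     # attenuate for later samples
    return rgb, T
```

A nearly opaque sample (large `sigma`) dominates the result and drives the remaining transmittance toward zero, which is why NeRF training can recover surfaces from photometric loss alone.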

Multi-View Image-Based 3D Reconstruction in Indoor Scenes: A Survey
LU Ping, SHI Wenzhe, QIAO Xiuquan
ZTE Communications    2024, 22 (3): 91-98.   DOI: 10.12142/ZTECOM.202403011

Three-dimensional reconstruction technology plays an important role in indoor scenes by converting objects and structures in indoor environments into accurate 3D models using multi-view RGB images. It offers a wide range of applications in fields such as virtual reality, augmented reality, indoor navigation, and game development. Existing methods based on multi-view RGB images have made significant progress in 3D reconstruction. These image-based reconstruction methods not only possess good expressive power and generalization performance, but also handle complex geometric shapes and textures effectively. Although indoor scenes pose challenges such as lighting variations, occlusion, and texture loss, these can be effectively addressed through deep neural networks, neural implicit surface representations, and other techniques. Indoor 3D reconstruction based on multi-view RGB images has a promising future: it not only provides immersive and interactive virtual experiences but also brings convenience and innovation to indoor navigation, interior design, and virtual tours. As the technology evolves, these image-based reconstruction methods will be further improved to provide higher-quality and more accurate solutions for indoor scene reconstruction.

Local Scenario Perception and Web AR Navigation
SHI Wenzhe, LIU Yanbin, ZHOU Qinfen
ZTE Communications    2023, 21 (4): 54-59.   DOI: 10.12142/ZTECOM.202304007

This paper proposes a Web augmented reality (AR) indoor navigation system based on a local point cloud map. By delivering the local point cloud map to the web front end for positioning, real-time positioning can be implemented with the computing power of the web front end alone. To achieve short processing time and accurate positioning, an optimization scheme for the local point cloud map is proposed, including descriptor de-duplication and outlier removal, which improves the quality of the point cloud. Interpolation and smoothing are introduced for local map positioning, enhancing the anchoring effect and the smoothness of the user experience. In small-scale indoor scenarios, the positioning frequency on an iPhone 13 reaches 30 fps, and the positioning precision is within 50 cm. Compared with mainstream visual positioning methods for AR navigation, the proposed system does not rely on any additional sensor or cloud computing device, thereby greatly saving computing resources; its short processing time meets real-time requirements and provides users with a smooth positioning experience.
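The abstract names outlier removal as one of the point cloud optimization measures but does not specify the algorithm. A common filter of this kind is statistical outlier removal, which drops points whose mean distance to their nearest neighbours is anomalously large; the sketch below illustrates that idea under that assumption (function name, parameters, and the O(n²) brute-force neighbour search are illustrative, not from the paper):

```python
import statistics

def remove_outliers(points, k=3, std_ratio=1.0):
    """Statistical outlier removal (illustrative sketch).

    Drops points whose mean distance to their k nearest neighbours
    exceeds mean + std_ratio * stddev over the whole cloud.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Mean distance from each point to its k nearest neighbours.
    mean_knn = []
    for i, p in enumerate(points):
        ds = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(sum(ds[:k]) / k)

    mu = statistics.mean(mean_knn)
    sd = statistics.pstdev(mean_knn)
    thresh = mu + std_ratio * sd
    return [p for p, d in zip(points, mean_knn) if d <= thresh]
```

A production system would use a spatial index (e.g., a k-d tree) for the neighbour search rather than the brute-force loop shown here.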

Scene Visual Perception and AR Navigation Applications
LU Ping, SHENG Bin, SHI Wenzhe
ZTE Communications    2023, 21 (1): 81-88.   DOI: 10.12142/ZTECOM.202301010

With the rapid popularization of mobile devices and the wide application of various sensors, scene perception methods on mobile devices occupy an important position in location-based services such as navigation and augmented reality (AR). The development of deep learning technologies has greatly improved machines' ability to visually perceive scenes. This paper introduces the basic framework of scene visual perception, related technologies, and the specific process of applying them to AR navigation, and proposes directions for future technology development. An application (APP) is designed to improve the practical effect of AR navigation. The APP includes three modules: navigation map generation, a cloud navigation algorithm, and the client design. The navigation map generation tool works offline; the cloud stores the navigation map and provides navigation algorithms for the terminal; and the terminal performs local real-time positioning and AR path rendering.
