ZTE Communications, 2024, Vol. 22, Issue (4): 9-17. DOI: 10.12142/ZTECOM.202404003

• Special Topic •

Research on High-Precision Stochastic Computing VLSI Structures for Deep Neural Network Accelerators

WU Jingguo1, ZHU Jingwei1, XIONG Xiankui2,3, YAO Haidong2,3, WANG Chengchen2,3, CHEN Yun1

  1. Fudan University, Shanghai 200433, China
    2. State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen 518055, China
    3. ZTE Corporation, Shenzhen 518057, China
  • Received: 2024-08-04 Online: 2024-12-20 Published: 2024-12-03
  • About author:WU Jingguo received his BS degree in microelectronics science and engineering from Xi’an Jiaotong University, China in 2021. He is currently pursuing a master’s degree at Fudan University, China. His main research interest is stochastic computing.
    ZHU Jingwei is a PhD candidate in the School of Microelectronics, Fudan University, China. He received his MASc degree from Fudan University. His current research interests include neuromorphic computing, spiking neural networks, and networks-on-chip (NoC).
    XIONG Xiankui graduated from the University of Electronic Science and Technology of China. He is the chief architect of the Wireless Department and the leader of the Prospect Team of the Smart Computing Technical Committee, ZTE Corporation. He has long been engaged in research on computing systems and architectures, advanced computing paradigms, and heterogeneous computing accelerators. He has led the system architecture design of the ZTE ATCA advanced telecom computing platform, server storage platform, smart NIC, and AI accelerator.
    YAO Haidong graduated from the University of Science and Technology of China and is a senior expert in the field of wireless communications at ZTE Corporation. His work focuses on the research and design of deep learning, large language model network architectures, and compilation and conversion technologies.
    WANG Chengchen graduated from the Department of Precision Instrument, Tsinghua University, China, and now works at ZTE Corporation. His research directions include in-memory computing, optical computing, and probabilistic computing.
    CHEN Yun (chenyun@fudan.edu.cn) received her BSc degree from the University of Science and Technology of China in 2000 and her PhD degree from Fudan University, China in 2007. She joined Fudan University in the same year, where she has been a professor with the State Key Laboratory of ASIC and Systems. She has published more than 60 articles in renowned international journals and conference proceedings and applied for more than 20 patents. Her research interests include baseband processing technologies for wireless communication and ultralow-power FEC IC design. Prof. CHEN is a member of the Steering Committee of SIPS and the ASICON Technical Committee. She serves as a TPC member for ASSCC. She also serves as the Chair Secretary of the Shanghai Chapter of IEEE SSCS and the Co-Chair of the Circuit System Division of the Chinese Institute of Electronics. She is a senior member of IEEE.

Abstract:

Deep neural networks (DNNs) are widely used in image recognition, image classification, and other fields. However, as model sizes grow, DNN hardware accelerators face the challenge of increasing area overhead and energy consumption. In recent years, stochastic computing (SC) has been considered a way to realize deep neural networks while reducing hardware consumption. In this paper, a probabilistic compensation algorithm is proposed to address the accuracy problem of stochastic computing, and a fully parallel neural network accelerator based on a deterministic method is designed. Software simulation results show that the probabilistic compensation algorithm achieves an accuracy of 95.32% on the CIFAR-10 dataset, which is 14.98% higher than that of the traditional SC algorithm. The deterministic algorithm achieves an accuracy of 95.06% on the CIFAR-10 dataset, which is 14.72% higher than that of the traditional SC algorithm. Very large-scale integration (VLSI) hardware test results show that the normalized energy efficiency of the fully parallel neural network accelerator based on the deterministic method is 31% higher than that of a circuit based on binary computing.
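For intuition, the following is a minimal Python/NumPy sketch of the two SC ideas the abstract contrasts: conventional stochastic multiplication with random bitstreams, whose accuracy improves with stream length, and a deterministic scheme that is exact up to quantization. It is an illustration of the general techniques under stated assumptions, not the paper's circuit or algorithm, and all function names and parameters are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

def encode(x, length):
    # Unipolar SC encoding: x in [0, 1] becomes a bitstream whose
    # fraction of 1s is, on average, x.
    return (rng.random(length) < x).astype(np.uint8)

def sc_multiply(x, y, length=1024):
    # Conventional SC multiplication: bitwise AND of two independent
    # random streams; the fraction of 1s in the output estimates x * y.
    return (encode(x, length) & encode(y, length)).mean()

def det_multiply(x, y, n=20):
    # Deterministic SC multiplication: unary streams, one repeated and
    # the other held, so every bit pair meets exactly once. The result
    # equals round(x*n) * round(y*n) / n**2, i.e., exact up to
    # quantization, at the cost of an n**2-bit output stream.
    a = (np.arange(n) < round(x * n)).astype(np.uint8)
    b = (np.arange(n) < round(y * n)).astype(np.uint8)
    return (np.tile(a, n) & np.repeat(b, n)).mean()

for length in (64, 1024, 16384):
    print(length, sc_multiply(0.8, 0.6, length))  # converges to 0.48
print(det_multiply(0.8, 0.6))  # exactly 0.48, since 0.8 and 0.6 are multiples of 1/20

The random estimate fluctuates around the exact product 0.48 and tightens as the stream lengthens, while the deterministic variant returns 0.48 exactly; this accuracy-versus-stream-length trade-off is what compensation schemes and deterministic methods such as those studied in the paper aim to improve.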

Key words: stochastic computing, hardware accelerator, deep neural network