ZTE Communications ›› 2022, Vol. 20 ›› Issue (3): 17-26.DOI: 10.12142/ZTECOM.202203003

• Special Topic •

A Survey of Federated Learning on Non-IID Data

HAN Xuming1, GAO Minghan2, WANG Limin3, HE Zaobo1, WANG Yanze1

  1. Jinan University, Guangzhou 510632, China
  2. Changchun University of Technology, Changchun 130012, China
  3. Guangdong University of Finance & Economics, Guangzhou 510320, China
  • Received: 2022-06-10 Online: 2022-09-13 Published: 2022-09-14
  • About author:
    HAN Xuming received his PhD degree from Jilin University, China. He is now a professor and PhD supervisor at Jinan University, China. He has been in charge of about 10 important scientific research projects, has authored 80 journal and conference papers, and has published four academic monographs. His research interests include artificial intelligence, federated learning, and machine learning.
    GAO Minghan is currently a graduate student at Changchun University of Technology, China. His research interests include federated learning, multitasking optimization, and clustering.
    WANG Limin (20211016@gdufe.edu.cn) received her master's and PhD degrees in computer science and technology from Jilin University, China in 2004 and 2007, respectively. She is now a professor with the Guangdong University of Finance & Economics, China. Her current research interests include big data analysis, evolutionary algorithms, and intelligent decision optimization. She is a member of the China Computer Federation. She has published more than 90 research papers in international and domestic journals and at international conferences.
    HE Zaobo received his PhD degree from Georgia State University, USA, his MS degree from Shaanxi Normal University, China, and his BS degree from Yan’an University, China, all in the Department of Computer Science. Dr. HE is currently a professor in the Department of Computer Science at Jinan University, China. His research focuses on data privacy and the Internet of Things.
    WANG Yanze is currently a graduate student at Jinan University, China. His research interests include federated learning, pattern recognition, and computer vision.

Abstract:

Federated learning (FL) is a machine learning paradigm for data silos and privacy protection, which aims to organize multiple clients to train a global machine learning model without exposing any client's data to other parties. However, when dealing with non-independent and identically distributed (non-IID) client data, FL falls short of centrally trained machine learning and may even fail to match the accuracy of the local model that a client obtains by training alone. To analyze and address these issues, we survey the state-of-the-art methods in the literature on FL with non-IID data. On this basis, we propose a motivation-based taxonomy that classifies these methods into two categories: heterogeneity-reducing strategies and adaptability-enhancing strategies. Moreover, we analyze the core ideas and main challenges of these methods. Finally, we envision several promising research directions that have not been thoroughly studied, in the hope of promoting research in related fields.

Key words: data heterogeneity, federated learning, non-IID data