As more medical data become digitized, machine learning is regarded as a promising tool for constructing medical decision support systems. Even with vast volumes of medical data, machine learning has not yet reached its full potential because the data usually sit in data silos, and privacy and security regulations restrict access to and use of the data. To address these issues, we built a secure and explainable machine learning framework, called explainable federated XGBoost (EXPERTS), which can share valuable information among different medical institutions to improve the learning results without sharing the patients’ data. It also reveals how the machine makes a decision through eigenvalues to offer a more insightful answer to medical professionals. We evaluate our approach on real-world datasets, where it outperforms the benchmark algorithms under both federated learning and non-federated learning frameworks.
Federated learning (FL) is a machine learning paradigm for data silos and privacy protection, which aims to organize multiple clients for training global machine learning models without exposing data to all parties. However, when dealing with non-independent and identically distributed (non-IID) client data, FL cannot obtain more satisfactory results than centrally trained machine learning and can even fail to match the accuracy of the local model that a client obtains by training alone. To analyze and address these issues, we survey the state-of-the-art methods in the literature related to FL on non-IID data. On this basis, we propose a motivation-based taxonomy that classifies these methods into two categories: heterogeneity-reducing strategies and adaptability-enhancing strategies. Moreover, the core ideas and main challenges of these methods are analyzed. Finally, we envision several promising research directions that have not been thoroughly studied, in the hope of promoting research in related fields.
Decentralized machine learning frameworks, e.g., federated learning, are emerging to facilitate learning with medical data under privacy protection. It is widely agreed that establishing an accurate and robust medical learning model requires a large amount of continuous, synchronous patient monitoring data from various types of monitoring facilities. However, clinical monitoring data are usually sparse and imbalanced, with errors and irregular timing, leading to inaccurate risk prediction results. To address this issue, this paper designs a medical data resampling and balancing scheme for federated learning to eliminate model biases caused by sample imbalance and provide accurate disease risk prediction on multi-center medical data. Experimental results on the real-world clinical database MIMIC-IV demonstrate that the proposed method can improve AUC (the area under the receiver operating characteristic curve) from 50.1% to 62.8% and accuracy from 76.8% to 82.2%, compared to a vanilla federated learning artificial neural network (ANN). Moreover, we increase the model’s tolerance for missing data from 20% to 50% compared with a stand-alone baseline model.
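To make the balancing idea in the abstract above concrete, the following is a minimal sketch of plain random oversampling applied to one client's local data before federated training. It is not the paper's resampling scheme, whose details the abstract does not give; the function name `oversample` and the toy data are assumptions made only for illustration.

```python
import random
from collections import Counter


def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until all classes are equal in size.

    Hypothetical helper: a generic balancing step, not the method from the paper.
    """
    rng = random.Random(seed)
    # Group the samples by their class label.
    by_label = {}
    for x, y in zip(samples, labels):
        by_label.setdefault(y, []).append(x)
    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_label.values())
    out_x, out_y = [], []
    for y, xs in by_label.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy client dataset: 8 negative cases and 2 positive cases (assumed values).
labels = [0] * 8 + [1] * 2
samples = list(range(10))
bx, by = oversample(samples, labels)
counts = Counter(by)
```

In a federated setting, each client would run such a step locally, so no raw records leave the institution.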
Recent years have witnessed rapid progress in federated learning, which can coordinate model training across multiple participants while protecting their data privacy. However, low communication efficiency is a bottleneck when deploying federated learning to edge computing and IoT devices, because a huge number of parameters must be transmitted during co-training. In this paper, we verify that the outputs of the last hidden layer can record the characteristics of the training data. Accordingly, we propose a communication-efficient strategy based on model splitting and representation aggregation. Specifically, each client uploads the outputs of the last hidden layer instead of all model parameters when participating in the aggregation, and the server distributes gradients according to the global information to revise the local models. Experiments verify that our method can complete training by uploading less than one-tenth of the model parameters, while preserving the usability of the model.
Federated learning (FL) has developed rapidly in recent years as a privacy-preserving machine learning method, and it has been gradually applied to key areas involving privacy and security such as finance, medical care, and government affairs. However, current FL solutions rarely consider the problem of migration from centralized learning to federated learning, resulting in a high practical threshold for federated learning and low usability. Therefore, we introduce a reliable, efficient, and easy-to-use federated learning framework named Neursafe-FL. Based on a unified application program interface (API), the framework is not only compatible with mainstream machine learning frameworks, such as TensorFlow and PyTorch, but also supports further extensions, which preserves the programming style of the original framework and lowers the threshold of FL. At the same time, its componentized and modular design with standardized interfaces makes the framework highly extensible, which meets the needs of customization and of future FL evolution. Neursafe-FL is already available on GitHub as an open-source project.
Low-cost, flexible and intelligent optical performance monitoring and management is a key enabling technology for network quality guarantee, especially in the era of explosive growth of communication capacity and network scale. However, to the best of our knowledge, it is extremely challenging to implement real-time performance monitoring and operations, administration and maintenance (OAM) in a highly complex dynamic network. In this paper, we propose an innovative optical identification (OID) scheme that can realize both performance monitoring and some advanced OAM sub-functions. The basic concepts, applications, challenges and evolution directions of this OID tool are also discussed.
We consider spectrum sensing problems in the orthogonal frequency division multiple access (OFDMA) cognitive radio scenario, where a secondary user with multiple antennas detects several consecutive subcarriers of an entire OFDM symbol occupied by multiple primary users. Specifically, an OFDM multicarrier covariance matrix convolutional neural network (CNN)-based approach is proposed for simultaneously detecting the occupancy of all OFDM subcarriers, where the multicarrier sample covariance matrix array is used as the input of the CNN. The proposed approach can efficiently learn the energy information and the correlation information between antennas and between subcarriers to significantly improve the spectrum sensing performance. Numerical results demonstrate that the proposed method has a substantial performance advantage over the state-of-the-art spectrum sensing methods in an OFDMA scenario under the 5G new radio network.
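As a rough illustration of the CNN input described in the abstract above, the sketch below computes one sample covariance matrix per subcarrier, R = (1/N) Σ x xᴴ, from multi-antenna snapshots and stacks the results into an array. The data layout (`received[k][n]` is the antenna vector on subcarrier k at snapshot n), the function names, and the toy values are assumptions; the abstract does not specify how the paper arranges or normalizes the array.

```python
def sample_covariance(snapshots):
    """Sample covariance (1/N) * sum_n x_n x_n^H of complex antenna vectors."""
    n = len(snapshots)
    m = len(snapshots[0])
    return [
        [sum(x[i] * x[j].conjugate() for x in snapshots) / n for j in range(m)]
        for i in range(m)
    ]


def covariance_array(received):
    """Stack one covariance matrix per subcarrier (assumed layout, see lead-in)."""
    return [sample_covariance(snaps) for snaps in received]


# Toy input: 2 subcarriers, 2 snapshots each, 2 antennas (assumed values).
received = [
    [[1 + 1j, 1 - 1j], [1j, 1 + 0j]],  # subcarrier 0
    [[2 + 0j, 0j], [0j, 2 + 0j]],      # subcarrier 1
]
cov = covariance_array(received)
```

The diagonal entries carry per-antenna energy and the off-diagonal entries carry inter-antenna correlation, which is the kind of information the abstract says the network learns from.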
A new optimization method is proposed to realize the synthesis of duplexers. The traditional optimization method takes all the variables of the duplexer into account, resulting in too many variables to be optimized when the order of the duplexer is high, so the optimization easily falls into a local solution. To solve this problem, a new optimization strategy is proposed in this paper: the two channel filters are optimized separately, which reduces the number of optimization variables and greatly reduces the probability of the results falling into local solutions. The optimization method combines the self-adaptive differential evolution algorithm (SADE) with the Levenberg-Marquardt (LM) algorithm to reach a global solution more easily and to accelerate the optimization. To verify its practical value, we design a 5G duplexer based on the proposed method. The duplexer has a large external coupling, and how to achieve a feed structure with a large coupling bandwidth at the source is also discussed. The experimental results show that the proposed optimization method can realize the synthesis of higher-order duplexers than the traditional methods can.
Locating the root causes of faults is always a challenge in a distributed information network with a complex structure. In this paper, we propose a novel root cause analysis (RCA) method based on a random walk on a weighted fault propagation graph. Different from other RCA methods, it mines effective feature information related to root causes from offline alarms. This information is combined with online alarms and the graph relationships of the network structure to construct a weighted graph. Thus, the approach does not require operational experience and can be widely applied in different distributed networks. The proposed method can also be used in cases with multiple fault locations. Experimental results show that the proposed approach achieves at least 6% higher precision for root fault location than three baseline methods. In addition, we explain how the value of the optimal parameter in the random walk algorithm influences the RCA results.
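The random-walk ranking mentioned in the abstract above can be sketched generically as a random walk with restart over a weighted directed graph. This is only an illustrative sketch: the edge weights, the restart probability, the node names, and the function `random_walk_scores` are all assumptions, and the abstract does not state how the paper derives weights from alarms or which walk variant it uses.

```python
def random_walk_scores(weights, start, restart=0.15, iters=100):
    """Score nodes by a random walk with restart on a weighted directed graph.

    weights: dict mapping node -> {neighbor: non-negative edge weight}.
    start:   node where the walk starts and restarts (e.g. the alarming node).
    """
    nodes = sorted(set(weights) | {v for nbrs in weights.values() for v in nbrs})
    score = {n: 1.0 if n == start else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            nbrs = weights.get(u, {})
            total = sum(nbrs.values())
            if total == 0:
                # Dead end: all probability mass returns to the start node.
                nxt[start] += score[u]
                continue
            # With probability `restart`, jump back to the start node.
            nxt[start] += restart * score[u]
            # Otherwise follow an outgoing edge in proportion to its weight.
            for v, w in nbrs.items():
                nxt[v] += (1 - restart) * score[u] * w / total
        score = nxt
    return score


# Hypothetical fault propagation graph; larger weights mean stronger suspicion.
graph = {
    "frontend": {"api": 1.0},
    "api": {"db": 3.0, "cache": 1.0},
}
scores = random_walk_scores(graph, "frontend")
```

Nodes with higher scores are visited more often by the walk and would be ranked higher as root-cause candidates; here the heavier `api` to `db` edge makes `db` outrank `cache`.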
Microservices have become popular in enterprises because of their excellent scalability and timely update capabilities. However, while fine-grained modularity and service orientation decrease the complexity of system development, they greatly increase the complexity of system operation and maintenance. Multiple types of system failures occur frequently, and it is hard to detect and diagnose failures in time. Furthermore, microservices are updated frequently. Existing anomaly detection models depend on offline training and cannot adapt to the frequent updates of microservices. This paper proposes an anomaly detection approach for microservice systems with multi-source data streams. This approach realizes online model construction and online anomaly detection, and is capable of self-updating and self-adapting. Experimental results show that this approach can correctly identify 78.85% of faults of different types.
Symbiotic radio (SR) is an emerging green technology for the Internet of Things (IoT). One key challenge of SR systems is to design efficient and low-complexity detectors, which is the focus of this paper. We first derive the mathematical expression of the optimal maximum-likelihood (ML) detector, and then propose a suboptimal iterative detector with low complexity. Finally, we show through numerical results that our proposed detector can obtain near-optimal bit error rate (BER) performance at a low computational cost.