Academic Paper

Thoughts on Non-IID Data Impact in Healthcare with Federated Learning Medical Blockchain
Document Type
Conference
Source
2022 IEEE 4th International Conference on Cognitive Machine Intelligence (CogMI), pp. 20-26, Dec. 2022
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Robotics and Control Systems
Measurement
Federated learning
Hospitals
Sociology
Training data
Data aggregation
Data models
Medical blockchain
non-iid
health data privacy
medical data set aggregation
distributed federated medical data lake
AI
Language
Abstract
It is commonly believed that the more good-quality training data are aggregated, the better the resulting Artificial Intelligence (AI) model performs. In the medical domain, however, this belief does not generally hold, because healthcare data sets sourced from different hospitals are often not independent and identically distributed (Non-IID). This poses severe technical challenges for effectively aggregating individual hospital data sets. In this vision paper, rather than offering complete solutions, we raise questions and food for thought aimed at aiding effective data aggregation and improving the performance of federated learning (FL) AI models: (1) how to benchmark and measure the Non-IID degree of medical data sets; (2) how to include Non-IID degree metrics in the FL data aggregation mechanism; (3) how to search for the optimal global model creation strategy among a group of many medical data sets; and (4) whether FL can achieve better performance than centralized learning. We discuss these questions by outlining a visionary approach for exploring a medical blockchain FL mechanism that effectively aggregates medical data across multiple healthcare systems to serve large populations with broad demographics.
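Illustrative note on question (1): the abstract does not say how the Non-IID degree would be measured, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that label-distribution skew is the kind of heterogeneity of interest and scores each hospital by the Jensen-Shannon divergence between its class distribution and the pooled distribution. The function names, site names, and label counts are all hypothetical.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Empirical class-probability vector for one site's (non-empty) labels."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in classes]


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL sum.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def non_iid_scores(site_labels, classes):
    """Score each site by its divergence from the pooled label distribution.

    Assumption: 0 means the site matches the pooled distribution;
    larger values mean stronger label skew.
    """
    pooled = [y for labels in site_labels.values() for y in labels]
    global_dist = label_distribution(pooled, classes)
    return {
        site: js_divergence(label_distribution(labels, classes), global_dist)
        for site, labels in site_labels.items()
    }


# Hypothetical binary-label data for three sites (synthetic, not from the paper).
sites = {
    "hospital_a": [0] * 50 + [1] * 50,
    "hospital_b": [0] * 90 + [1] * 10,
    "hospital_c": [0] * 10 + [1] * 90,
}
scores = non_iid_scores(sites, classes=[0, 1])
for site, score in scores.items():
    print(f"{site}: {score:.4f}")
```

A score of this kind could, in principle, feed into question (2) as a per-site weight during aggregation, but label skew is only one form of Non-IID data; feature or quantity skew would need different metrics.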