Academic Paper

A federated AI-empowered platform for disease management across a Pan-European data driven hub
Document Type
Conference
Source
2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), pp. 1-4, Sep. 2022
Subject
Bioengineering
Computing and Processing
Signal Processing and Analysis
Training
Sensitivity
Terminology
Semantics
Ecosystems
Europe
Distributed databases
data interoperability
data curation
data harmonization
federated AI
synthetic data
Language
ISSN
2641-3604
Abstract
There is a pressing need to move towards a universal health data ecosystem by breaking down data silos. Despite the wealth of dispersed health data, critical open issues and unmet needs still stand in the way, ranging from secure data sharing to data quality and heterogeneity. Considering these challenges, we propose a novel federated platform to unlock the full potential of data from health data intermediaries through the secure sharing, curation, and Natural Language Processing (NLP)-based harmonization of dispersed and complex clinical data structures. The platform was deployed to establish a first Pan-European data hub on rare autoimmune and chronic diseases, with 7551 harmonized patient records across 21 European countries and a 90% terminology overlap. An advanced data-driven imputer was built to predict missing records in the real patient data based on high-quality synthetic data profiles (with Kullback-Leibler divergence less than 0.01), with a reduced fault detection rate (less than 2%) compared to conventional imputers such as the kNN imputer. Customized and explainable federated AI algorithms were trained on top of the established data hub for lymphomagenesis modeling, achieving 0.87 sensitivity and 0.74 specificity, along with a set of validated biomarkers for disease onset and progression.
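The abstract reports two kinds of metrics: a Kullback-Leibler divergence between real and synthetic data profiles, and sensitivity and specificity for the trained classifier. The abstract does not say how these were computed, so the following is only a minimal sketch of the standard definitions. The function names, the histogram-style inputs, the use of natural logarithms, and the example counts are all illustrative assumptions, not the authors' implementation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats for two discrete distributions.

    p and q are equal-length sequences of non-negative counts or
    probabilities (for example, histogram bins of one feature in the
    real data and in the synthetic data). Both are normalized here.
    eps is an assumed floor that avoids division by zero when q has
    an empty bin.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    p_total, q_total = sum(p), sum(q)
    kl = 0.0
    for p_i, q_i in zip(p, q):
        p_i = p_i / p_total
        q_i = max(q_i / q_total, eps)
        if p_i > 0:  # bins with zero mass in P contribute nothing
            kl += p_i * math.log(p_i / q_i)
    return kl


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical histogram counts, chosen only for illustration.
    real_bins = [30, 50, 20]
    synthetic_bins = [29, 51, 20]
    print(kl_divergence(real_bins, synthetic_bins))

    # Hypothetical confusion-matrix counts, chosen only for illustration.
    print(sensitivity_specificity(tp=87, fn=13, tn=74, fp=26))
```

A divergence close to zero means the synthetic histogram closely follows the real one; identical distributions give exactly zero.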