Academic Paper

Missing value imputation in proximity extension assay-based targeted proteomics data.
Document Type
Academic Journal
Author
Lenz M; Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany.; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Schulz A; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Koeck T; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, Mainz, Germany.; Rapp S; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, Mainz, Germany.; Nagler M; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Sauer M; Disease Genomics, Bayer AG, Wuppertal, Germany.; Eggebrecht L; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Center for Thrombosis and Hemostasis, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Ten Cate V; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Center for Thrombosis and Hemostasis, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Panova-Noeva M; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; German Center for Cardiovascular Research (DZHK), Partner Site Rhine 
Main, Mainz, Germany.; Center for Thrombosis and Hemostasis, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Prochaska JH; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, Mainz, Germany.; Center for Thrombosis and Hemostasis, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Lackner KJ; Institute of Clinical Chemistry and Laboratory Medicine, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Münzel T; German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, Mainz, Germany.; Center for Cardiology, Cardiology I, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Leineweber K; Disease Genomics, Bayer AG, Wuppertal, Germany.; Wild PS; Preventive Cardiology and Preventive Medicine-Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, Mainz, Germany.; Center for Thrombosis and Hemostasis, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.; Andrade-Navarro MA; Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany.
Source
Publisher: Public Library of Science; Country of Publication: United States; NLM ID: 101285081; Publication Model: eCollection; Cited Medium: Internet; ISSN: 1932-6203 (Electronic); Linking ISSN: 19326203; NLM ISO Abbreviation: PLoS One; Subsets: MEDLINE
Language
English
Abstract
Targeted proteomics using antibody-based proximity extension assays provides sensitive and highly specific quantification of plasma protein levels. Multivariate analysis of these data is hampered by frequent missing values (random or left-censored), calling for imputation approaches. While appropriate missing-value imputation methods exist, benchmarks of their performance in targeted proteomics data are lacking. Here, we assessed the performance of two methods for imputation of values missing completely at random: the previously top-benchmarked 'missForest' and the recently published 'GSimp' method. Both were evaluated by comparing imputed with remeasured relative concentrations of 91 inflammation-related circulating proteins in 86 samples from a cohort of 645 patients with venous thromboembolism. The median Pearson correlation between imputed and remeasured protein expression values was 69.0% for missForest and 71.6% for GSimp (p = 5.8e-4). Imputation with missForest resulted in a stronger reduction of variance than GSimp (median relative variance of 25.3% vs. 68.6%, p = 2.4e-16) and an undesirably larger bias in downstream analyses. Irrespective of the imputation method used, the 91 imputed proteins showed large variations in imputation accuracy, driven by differences in signal-to-noise ratio and information overlap between proteins. In summary, GSimp outperformed missForest, while both methods showed good overall imputation accuracy with large variations between proteins.
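The two summary metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes that "relative variance" means the variance of the imputed values divided by the variance of the remeasured values for the same protein, and the function names, the dictionary layout, and the synthetic data are all hypothetical.

```python
import random
from statistics import mean, median, variance


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def imputation_metrics(imputed, remeasured):
    """Per-protein correlation and relative variance.

    Both arguments map a protein name to a list of values, with samples
    in the same order. Relative variance is taken here (an assumption)
    as var(imputed) / var(remeasured).
    """
    corr = {p: pearson(imputed[p], remeasured[p]) for p in remeasured}
    rel_var = {p: variance(imputed[p]) / variance(remeasured[p])
               for p in remeasured}
    return corr, rel_var


# Synthetic stand-in for 91 proteins in 86 samples (not the study data).
random.seed(0)
remeasured = {f"protein_{j}": [random.gauss(0, 1) for _ in range(86)]
              for j in range(91)}
# A shrunken, noisy copy mimics imputation that reduces variance.
imputed = {p: [0.5 * v + random.gauss(0, 0.3) for v in vals]
           for p, vals in remeasured.items()}

corr, rel_var = imputation_metrics(imputed, remeasured)
median_corr = median(corr.values())
median_rel_var = median(rel_var.values())
print(f"median correlation: {median_corr:.3f}")
print(f"median relative variance: {median_rel_var:.3f}")
```

Under this definition, a median relative variance well below 1 indicates that imputed values are less spread out than the remeasured ones, which is the variance reduction the abstract reports for both methods.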
Competing Interests: Michael Lenz, Andreas Schulz, Thomas Koeck, Steffen Rapp, Markus Nagler, Lisa Eggebrecht, Vincent Ten Cate, Karl J. Lackner, Thomas Münzel and Miguel A. Andrade-Navarro declare no conflict of interest. Madeleine Sauer and Kirsten Leineweber are employees of Bayer AG. Marina Panova-Noeva, Jürgen H. Prochaska, and Philipp S. Wild received funding from the Center for Thrombosis and Hemostasis Mainz. Philipp S. Wild reports grants from Bayer AG and from the German Federal Ministry of Education and Research, during the conduct of the study; grants and personal fees from Boehringer Ingelheim, grants from Philips Medical Systems, grants and personal fees from Sanofi-Aventis, grants and personal fees from Bayer Vital, grants from Daiichi Sankyo Europe, personal fees from Bayer Health Care, personal fees from Astra Zeneca, personal fees and non-financial support from Diasorin and non-financial support from I.E.M., outside the submitted work. This does not alter our adherence to PLOS ONE policies on sharing data and materials.