Academic Article

Improved integration of single-cell transcriptome and surface protein expression by LinQ-View
Document Type
article
Source
Cell Reports: Methods, Vol 1, Iss 4, Pp 100056- (2021)
Subject
scRNA-seq
CITE-seq
multimodal method
integrated model
purity metric
computational method
Biotechnology
TP248.13-248.65
Biochemistry
QD415-436
Science
Language
English
ISSN
2667-2375
Abstract
Summary: Multimodal advances in single-cell sequencing have enabled the simultaneous quantification of cell surface protein expression alongside unbiased transcriptional profiling. Here, we present LinQ-View, a toolkit designed for multimodal single-cell data visualization and analysis. LinQ-View integrates transcriptional and cell surface protein expression profiling data to reveal cell heterogeneity more accurately and proposes a quantitative metric for cluster purity assessment. Through comparison with existing multimodal methods on multiple public CITE-seq datasets, we demonstrate that LinQ-View efficiently generates accurate cell clusters, especially in CITE-seq data with routine numbers of surface protein features, by preventing variations in a single surface protein feature from affecting results. Finally, we utilized this method to integrate single-cell transcriptional and protein expression data from SARS-CoV-2-infected patients, revealing antigen-specific B cell subsets after infection. Our results suggest LinQ-View could be helpful for multimodal analysis and purity assessment of CITE-seq datasets that target specific cell populations (e.g., B cells).
Motivation: Multimodal single-cell sequencing provides multiple perspectives for characterizing the dynamics of cell states and developmental processes. Properly integrating information from multiple modalities is a crucial step for interpreting cell heterogeneity. Here, we present LinQ-View, a computational workflow that provides an effective solution for integrating multiple modalities of CITE-seq data for downstream interpretation. LinQ-View balances information from multiple modalities to achieve accurate clustering results and specializes in handling CITE-seq data with routine numbers of surface protein features.