Clustering of single-cell multi-omics data with a multimodal deep learning method

Tags
Single-cell modeling
Affiliation
University of Pennsylvania
Article type
Research
Date
2023/10/18
Journal
Nature Comm.
Published Year
2022
Keywords
Multi-omics integration
scMDC
xianglin226
230113_scClustering.pptx

Goal

cell representation learning

Task (in ML perspective)

clustering

Challenges

Dropout events
Embedding for clustering

Methods

Dataset
Multiple modalities
CITE-seq: mRNA + ADT (antibody-derived tags)
SMAGE-seq: mRNA + ATAC
scATAC-seq
scRNA-seq
Method
input : CITE-seq or SMAGE-seq dataset
output : low-dimensional cell embedding and cluster assignments
Detailed method
Reconstruction loss: ZINB (zero-inflated negative binomial) loss
KL loss
K-means Clustering loss
Final loss
$$
\begin{aligned}
\underset{\mathbf{w},\,\mathbf{w}'_a,\,\mathbf{w}'_r,\,\mathbf{U}}{\operatorname{argmin}}\; L_{total}(\mathbf{X}^a, \mathbf{X}^r \mid \mathbf{w}, \mathbf{w}'_a, \mathbf{w}'_r, \mathbf{U})
&= L_{mRNA}(\mathbf{X}^r \mid \mathbf{w}, \mathbf{w}'_r) + L_{ATAC}(\mathbf{X}^a \mid \mathbf{w}, \mathbf{w}'_a) \\
&\quad + \gamma \, L_c(\mathbf{X}^r, \mathbf{X}^a \mid \mathbf{w}, \mathbf{U}) + \varphi \, L_{kl}(\mathbf{X}^r, \mathbf{X}^a \mid \mathbf{w})
\end{aligned}
$$
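To make the loss terms concrete, here is a minimal NumPy sketch of a ZINB negative log-likelihood, a k-means style clustering loss, and the weighted sum from the final loss. It is an illustration under assumptions, not the scMDC implementation: the function names, the `mu` / `theta` / `pi` parameterization (mean, dispersion, dropout probability), and the hard-assignment form of the clustering loss are all hypothetical choices, and the KL term is passed in as a precomputed number because these notes do not define it.

```python
import numpy as np
from scipy.special import gammaln


def zinb_log_prob(x, mu, theta, pi, eps=1e-10):
    """Elementwise log-probability of counts x under a ZINB model.

    Assumed parameterization: mu = NB mean, theta = NB dispersion,
    pi = probability of an excess (dropout) zero.
    """
    x, mu, theta, pi = (np.asarray(a, dtype=float) for a in (x, mu, theta, pi))
    log_theta_ratio = np.log(theta + eps) - np.log(theta + mu + eps)
    log_mu_ratio = np.log(mu + eps) - np.log(theta + mu + eps)
    # Log of the negative binomial pmf.
    log_nb = (gammaln(x + theta) - gammaln(theta) - gammaln(x + 1.0)
              + theta * log_theta_ratio + x * log_mu_ratio)
    # A zero can come from dropout (pi) or from the NB itself.
    log_zero = np.log(pi + (1.0 - pi) * np.exp(theta * log_theta_ratio) + eps)
    # A non-zero count can only come from the NB component.
    log_nonzero = np.log(1.0 - pi + eps) + log_nb
    return np.where(x < 0.5, log_zero, log_nonzero)


def zinb_nll(x, mu, theta, pi):
    """Mean ZINB negative log-likelihood, usable as a reconstruction loss."""
    return -np.mean(zinb_log_prob(x, mu, theta, pi))


def kmeans_loss(z, centroids):
    """Mean squared distance from each embedding to its nearest centroid.

    Hard-assignment form, chosen here for simplicity (an assumption).
    """
    z = np.asarray(z, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    sq_dist = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return sq_dist.min(axis=1).mean()


def total_loss(l_mrna, l_second, l_c, l_kl, gamma, phi):
    """Weighted sum: L_mRNA + L_ATAC (or ADT) + gamma * L_c + phi * L_kl."""
    return l_mrna + l_second + gamma * l_c + phi * l_kl
```

The zero branch is what lets the reconstruction loss absorb dropout events: raising `pi` makes observed zeros cheaper without forcing `mu` toward zero.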

Results

1. [CITE-seq dataset] Comparison with other clustering methods

Evaluation metrics: ARI (adjusted Rand index), AMI (adjusted mutual information), NMI (normalized mutual information)
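As a quick illustration of how these three metrics are computed, the sketch below uses scikit-learn. The helper name `clustering_scores` and the toy labels are hypothetical; the notes do not say which library the paper used.

```python
from sklearn.metrics import (
    adjusted_mutual_info_score,
    adjusted_rand_score,
    normalized_mutual_info_score,
)


def clustering_scores(true_labels, pred_labels):
    """Compare predicted cluster labels against reference cell-type labels."""
    return {
        "ARI": adjusted_rand_score(true_labels, pred_labels),
        "AMI": adjusted_mutual_info_score(true_labels, pred_labels),
        "NMI": normalized_mutual_info_score(true_labels, pred_labels),
    }


# Toy example: same partition, different label names.
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [1, 1, 2, 2, 0, 0]
scores = clustering_scores(true_labels, pred_labels)
```

All three metrics depend only on the partition, not on the label names, so predicted cluster IDs do not need to be matched to cell-type IDs before scoring.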

2. [SMAGE-seq dataset] Comparison with other clustering methods

3. [Simulation dataset] Comparison with other clustering methods

4-5. Low-dimensional representation

6. The advantages of using multimodal data

7. Downstream analysis: DE, GSEA