Fig. 4 | Genome Biology

Fig. 4

From: GTM-decon: guided-topic modeling of single-cell transcriptomes enables sub-cell-type and disease-subtype deconvolution of bulk transcriptomes

Deconvolution of bulk RNA-seq samples for pancreatic cancer from TCGA-PAAD. GTM-decon was first trained on an scRNA-seq dataset from individuals with pancreatic cancer. The trained GTM-decon model was then used to infer the cell-type proportions of the 174 TCGA-PAAD bulk RNA-seq profiles. a The average inferred cell-type proportion across the TCGA-PAAD tumor samples. We summed the inferred cell-type proportions over all samples and then normalized the totals. The pie chart displays the resulting percentage for each cell type. b Inferred cell-type proportions of individual TCGA-PAAD tumor samples. To complement the inferred proportions of known cell types, we also ran an unguided topic model (i.e., LDA) directly on the TCGA-PAAD bulk RNA-seq profiles to detect de novo topics that are not present in the scRNA-seq reference. The heatmap visualizes the combined deconvolution results based on the 10 pancreatic cell types and the 10 de novo topics (i.e., columns). Each of the 174 rows represents a subject. Three demographic or clinical phenotypes are shown in the legend to aid interpretation: “days to death,” cancer subtype, and whether the cancer type is PNET or not. The regions in the highlighted boxes are discussed in more detail in the main text. c Survival analysis of the CTS and de novo topics using inferred cell-type proportions. For each topic, the 174 subjects were divided into two groups by K*-means clustering with K* set to 2 (not to be confused with the K cell types or topics). Kaplan–Meier curves were generated for these groups and compared using the log-rank test. The plot shows the −log10(p-value) from the log-rank test for all the CTS and de novo topics, in decreasing order of significance. d Kaplan–Meier curve for endocrine cell-type proportion. The curve and shaded area represent the mean and standard deviation of the cell-type proportions in the two groups, respectively. The number of subjects in each cluster is indicated in the bottom panel.
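The computations behind panels a and c can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names (`average_proportions`, `two_means_1d`, `logrank_test`) and the toy inputs are hypothetical, the clustering is a simple one-dimensional 2-means initialized at the minimum and maximum value, and the p-value assumes the standard chi-square approximation with one degree of freedom.

```python
import math

def average_proportions(props):
    """Panel a: sum proportions over samples, then normalize to sum to 1.
    `props` is a list of per-sample lists, one entry per cell type."""
    n_types = len(props[0])
    totals = [sum(row[k] for row in props) for k in range(n_types)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]

def two_means_1d(values, n_iter=100):
    """Panel c, step 1: 1-D K-means with K* = 2 (assumed initialization:
    centroids start at the min and max). Label 1 = higher-centroid group."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        g0 = [v for v, g in zip(values, labels) if g == 0]
        g1 = [v for v, g in zip(values, labels) if g == 1]
        new_lo = sum(g0) / len(g0) if g0 else lo
        new_hi = sum(g1) / len(g1) if g1 else hi
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return labels

def logrank_test(times, events, groups):
    """Panel c, step 2: two-group log-rank test.
    events: 1 = death observed, 0 = censored. Returns (chi2, p_value)."""
    obs_minus_exp, variance = 0.0, 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        n = sum(1 for x in times if x >= t)                       # at risk, all
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance of d1
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival, 1 d.f.
    return chi2, p_value

# Toy usage with made-up numbers (not TCGA-PAAD data).
endocrine = [0.01, 0.02, 0.03, 0.50, 0.60, 0.55]   # inferred proportions
days = [1, 2, 3, 10, 11, 12]                        # days to death
died = [1, 1, 1, 1, 1, 1]                           # all events observed
groups = two_means_1d(endocrine)
chi2, p = logrank_test(days, died, groups)
score = -math.log10(p)                              # value plotted in panel c
```

Repeating the last three lines for every CTS and de novo topic, then sorting the topics by `score`, would give a ranking of the kind shown in panel c. In practice, established implementations such as a survival-analysis library's log-rank test are preferable to this hand-rolled version.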
