Fig. 1 | Genome Biology

Fig. 1

From: Structure-primed embedding on the transcription factor manifold enables transparent model architectures for gene regulatory network and latent activity inference

Outline of the SupirFactor framework. The SupirFactor model is constructed like an autoencoder in which gene expression data are embedded on the transcription factor manifold. Two architectures are explored: the hierarchical architecture (A) and the shallow architecture (B). The output of the first layer defines the latent features, marked as TFs (transcription factors), and the activation \(\varvec{\phi }\) is the transcription factor activity (TFA). The prior \(\varvec{P}\) connects the evidence for each TF to a set of informative downstream genes, with learnable weights \(\varvec{W}\). In A, \(\varvec{\Pi }\) connects the TFs to the latent features, here called meta TFs (mTFs), and \(\varvec{\Theta }\) weights the mTF activity (mTFA) to predict genome-wide gene expression profiles. In B, \(\varvec{\Theta }\) directly weights the TF-to-gene influence. C To make the model completely interpretable and transparent, we use explained relative variance (ERV), \(\xi ^{2}\). ERV estimates the importance of each latent factor's influence on the model output features and is used to evaluate the model and its performance. The GRN is cross-validated: gene-to-TF connections are held out from the input \(\varvec{W}\) and predicted in the GRN, which is \(\varvec{\Theta }\) for the shallow model and \(\check{\varvec{\Theta }}\), the indirect effect of the TFs on the output features, for the hierarchical model. The measured recovery of these links gives insight into the stability and biological relevance of the GRN, with parameters ranked by their predictability as measured by \(\xi ^{2}\). D Gene regulatory network extracted as indirect TF-gene interactions in hierarchical SupirFactor and direct TF-gene interactions in shallow SupirFactor. E Multi-task learning is implemented in SupirFactor through a joint representation learning (JRL) architecture in which biologically distinct contexts are independently weighted into a joint GRN representation. F The architecture pruning and sparsity procedure in SupirFactor stabilizes the model and eliminates over-parameterization by removing non-predictive model parameters, facilitated by ERV
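
The shallow architecture (B) and an ERV-style score (C) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The caption does not give the formula for \(\xi ^{2}\), so the sketch assumes a leave-one-TF-out form, \(\xi ^{2} = 1 - \mathrm{MSE}_{\text{full}} / \mathrm{MSE}_{\text{without TF}}\), and it assumes a ReLU activation for the TFA layer. The function names `shallow_forward` and `erv` and all array shapes are hypothetical.

```python
import numpy as np

def shallow_forward(X, W, Theta):
    """Shallow model: genes -> TF activity (phi) -> reconstructed genes.

    X:     samples x genes expression matrix
    W:     genes x TFs encoder weights, already masked by the prior P
    Theta: TFs x genes decoder weights (read as the GRN in the shallow model)
    """
    phi = np.maximum(X @ W, 0.0)  # TFA; the ReLU is an assumption
    return phi, phi @ Theta

def erv(X, W, Theta):
    """Assumed ERV: 1 - MSE_full / MSE_without_TF_k, per (TF, gene) pair."""
    phi, full = shallow_forward(X, W, Theta)
    mse_full = ((X - full) ** 2).mean(axis=0)  # one value per gene
    xi2 = np.zeros((W.shape[1], X.shape[1]))
    for k in range(W.shape[1]):
        phi_k = phi.copy()
        phi_k[:, k] = 0.0  # silence latent TF k
        mse_k = ((X - phi_k @ Theta) ** 2).mean(axis=0)
        # If the ablated error is zero, treat the ratio as 1 (score 0).
        ratio = np.divide(mse_full, mse_k,
                          out=np.ones_like(mse_full), where=mse_k > 0)
        xi2[k] = 1.0 - ratio
    return xi2

# Toy data: 20 samples, 5 genes, 2 TFs (all values are illustrative).
rng = np.random.default_rng(0)
X = rng.random((20, 5))
P = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]], dtype=float)
W = P * rng.random((5, 2))   # the prior P masks which genes inform each TF
Theta = rng.random((2, 5))
Theta[0, 3] = 0.0            # TF 0 has no edge to gene 3
xi2 = erv(X, W, Theta)
```

Under this assumed definition, a TF with no decoder weight to a gene scores zero for that gene, and larger scores mark TF-gene links whose removal degrades the prediction most, which is the ranking idea the caption describes.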
