AROBAS Seminar – A comparative evaluation of biologically interpretable VAEs applied to single-cell RNA-seq data
Charlotte JOB, PhD student, AROBAS team.
Abstract
Variational Autoencoders (VAEs) are powerful representation-learning tools that are widely used to generate informative low-dimensional representations, i.e. embeddings, from high-dimensional and complex single-cell RNA-seq data. Such embeddings can be used in several downstream tasks, including cell type annotation. However, the lack of human-understandable explanations limits expert trust and, consequently, hinders biological discovery. Existing interpretable VAEs constrain representation learning with prior biological knowledge. In particular, these VAEs are trained to map the expression levels of gene subsets onto biological concepts. In this work, we review three core approaches: (i) Vega, a VAE implementing this mapping with a simple linear decoder, (ii) OntoVAE, a VAE implementing a hierarchical linear decoder that reflects the Gene Ontology, and (iii) pmVAE, a VAE in which both the encoder and the decoder are constrained to embed biological knowledge. We observed that interpretability evaluation metrics remain underdeveloped. To address this, we propose a novel framework that enables a deeper evaluation of interpretability. Our findings show that interpretability constraints do not significantly degrade the models' performance in terms of reconstruction, clustering quality, or cell type prediction. Directly encoding the Gene Ontology with OntoVAE yielded limited interpretability because skip connections hinder the capture of biological signals. Vega and pmVAE succeeded in capturing several biological aspects by mapping biological pathways to specific neurons. However, our metrics reveal that further refinement is needed to achieve a true correspondence between neurons and pathways.
- Date: 4 June 2026, 13:30-14:00
- Location: IBISC, IBGBI site, 3rd floor, meeting room
- Organizer: Salma MAKBOUL (MCF, Univ. Évry; IBISC, AROBAS team)