PhD Proposal: Discovering Latent Structures in Data through Learning Disentangled Representations

Talk
Vaishnavi Patil
Time: 09.10.2025, 15:30 to 17:00
Location: IRB-4107

The field of unsupervised learning in machine learning concerns itself with organizing and understanding patterns in data without the use of an oracle capable of providing ground-truth labels. A primary objective of unsupervised learning is not merely to perform density estimation or generate realistic samples, but to uncover and characterize the latent structure inherent in observational data. Central to this objective is the challenge of modeling high-dimensional dependencies in a manner that admits interpretable or functionally meaningful representations. This task of factoring data into meaningful representations is known as disentanglement. In this context, unsupervised disentanglement has emerged as a critical problem: the task of isolating distinct generative factors of variation, corresponding to semantic concepts, directly from data, and encoding them into statistically and structurally orthogonal latent subspaces, all without recourse to supervision. Despite significant progress, it has been established that, in the absence of additional constraints, generative models may replicate the observed distribution yet fail to yield disentangled representations, a limitation fundamentally tied to identifiability issues analogous to those in nonlinear independent component analysis. We first discuss these theoretical foundations and the role of inductive biases, which we investigate in this thesis as a necessary mechanism for achieving disentanglement in unsupervised settings.
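To make the role of such regularization-based inductive biases concrete, the following is a minimal sketch of a beta-VAE-style objective in PyTorch, a standard baseline from this literature rather than the thesis's own method; the function name and the value of beta are illustrative.

```python
# Minimal sketch (PyTorch): a beta-VAE-style objective, one standard way to
# encode an inductive bias toward disentanglement via regularization.
# Illustrative only; not the specific method discussed in this talk.
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Negative ELBO with the KL term up-weighted by beta > 1, pressuring
    the approximate posterior toward the factorized prior N(0, I)."""
    recon = F.mse_loss(x_recon, x, reduction="sum")               # data-fit term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL(q(z|x) || N(0, I))
    return recon + beta * kl
```

Setting beta = 1 recovers the ordinary VAE objective, which can match the data distribution but, by the identifiability argument above, need not disentangle; the extra pressure on the KL term is one simple way of biasing the latent space toward factorized structure.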
After this theoretical background, we discuss specific methodologies for embedding these inductive biases within generative neural network architectures, with particular emphasis on regularization strategies that systematically influence the structural formation of latent representations. In our first work, we introduce a novel approach that imposes the inductive bias of local isometry by leveraging self-supervised metric learning, thereby encouraging the preservation of local geometric structure within the learned representations. Building upon foundational concepts from sparse coding and discretization frameworks, in our second work we discuss a principled probabilistic approach that employs structured nonparametric priors to strengthen these inductive biases, thereby enhancing interpretability.
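A hedged sketch of what a local-isometry regularizer of this flavor could look like in PyTorch is given below; the perturbation-based construction of nearby pairs and the names (local_isometry_penalty, eps, lam) are assumptions for illustration, not the formulation presented in the talk.

```python
# Hedged sketch (PyTorch): penalize mismatch between input-space and
# latent-space distances for nearby pairs, so the encoder is locally
# distance-preserving. Illustrative assumptions throughout.
import torch

def local_isometry_penalty(encoder, x, eps=0.1, lam=1.0):
    """Perturb each input slightly and compare the norm of the latent
    displacement to the norm of the input displacement."""
    x_pert = x + eps * torch.randn_like(x)        # synthetic nearby neighbor
    dz = encoder(x) - encoder(x_pert)             # latent displacement
    dx = (x - x_pert).flatten(1)                  # input displacement
    mismatch = (dz.norm(dim=1) - dx.norm(dim=1)).pow(2)
    return lam * mismatch.mean()                  # add to the main training loss
```

In practice the nearby pairs could instead come from data augmentations or nearest neighbors, which is where the self-supervised metric-learning component would enter; the random perturbation above is only the simplest stand-in.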
Further, we propose strategies that systematically relax restrictive assumptions commonly imposed in prior frameworks, particularly those constraining representational flexibility and scalability, thereby rendering disentangled representation learning a more tractable and broadly applicable paradigm. Specifically, we leverage disentangled representations to achieve compositional generalization, enabling systematic extrapolation to novel combinations of learned latent factors and enhancing sample efficiency alongside robust generalization to out-of-distribution scenarios. Finally, we propose the adoption of tailored nonparametric prior structures that facilitate the continual incorporation of novel semantic factors within the representational framework, thereby substantially enhancing the scalability of the overall approach.
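One classical nonparametric construction with this open-ended character is Dirichlet-process stick-breaking; the NumPy sketch below shows only the weight-generating mechanism under a finite truncation, with alpha and K as illustrative choices rather than the specific prior proposed in the thesis.

```python
# Hedged sketch (NumPy): truncated stick-breaking weights for a
# Dirichlet-process-style prior over latent factors. Illustrative only.
import numpy as np

def stick_breaking_weights(alpha=1.0, K=20, rng=None):
    """pi_k = v_k * prod_{j<k} (1 - v_j) with v_k ~ Beta(1, alpha);
    sticks beyond those currently in use carry vanishing mass."""
    if rng is None:
        rng = np.random.default_rng()
    v = rng.beta(1.0, alpha, size=K)
    pi = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return pi

print(stick_breaking_weights())  # weights over a growing set of factors
```

Because the leftover stick mass decays geometrically, new semantic factors can in principle be activated as they appear in the data without disturbing factors already learned, which is the scalability property alluded to above.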