Report - Variational Sparse Coding

Unsupervised learning in high-dimensional data poses significant challenges, particularly in discovering interpretable features and enabling controllable generation.

Variational Autoencoders (VAEs) provide a probabilistic framework for mapping complex data to lower-dimensional latent spaces, but they often fail to disentangle underlying factors of variation, especially when the number of true sources is unknown or when observations exhibit diverse attribute combinations.

The “Variational Sparse Coding” (VSC) paper by Francesco Tonolini et al. proposes a novel extension of VAEs, incorporating sparsity in the latent space via a Spike and Slab prior to address these issues.

This report for a statistical course explores the VSC model, detailing its theoretical foundations, implementation, and empirical validation.

0.1 Datasets

MNIST :
- A dataset of 28×28 grayscale images of handwritten digits (0–9), consisting of 60,000 training samples and 10,000 test samples. It is widely used as a benchmark for image classification and serves as a foundational dataset for testing generative models and neural network architectures.
Fashion-MNIST:
- A collection of 28x28 grayscale images of clothing items (e.g., T-shirts, trousers), comprising 60,000 training and 10,000 test samples. It serves as a drop-in replacement for MNIST, offering richer variability for testing generative models.
Smiley (Used in the original paper) :
- A dataset of 28×28 grayscale images depicting simple smiley faces with varying expressions. Designed as a toy dataset, it is useful for evaluating generative and classification models on abstract, low-complexity visual data.
UCI HAR (Used in the original paper):
- A dataset of accelerometer and gyroscope signals from smartphones, capturing six human activities (e.g., walking, sitting) across 30 subjects, with preprocessed segments of 128 time steps.

0.2 Report Structure

Autoencoders: Introduces the basics of autoencoders and their role in high-dimensional data analysis.
Variational Autoencoders: Explains VAEs, focusing on the Evidence Lower Bound (ELBO) and latent space regularization.
Variational Sparse Coding: Details the VSC model, including the Spike and Slab prior and warm-up training strategy.
Experiments: Summarizes empirical results, comparing VSC to β-VAEs across multiple tasks.
Conclusion: Highlights VSC’s contributions and future directions.