Exploring Representational Alignment with Human Perception Using Identically Represented Inputs

Surveys for human invariance


We contribute to the study of the quality of learned representations. In many domains, an important evaluation criterion for safe and trustworthy deep learning is how well the invariances captured by representations of deep neural networks (DNNs) are shared with humans. We identify challenges in measuring these invariances. Prior works used gradient-based methods to generate identically represented inputs (IRIs), i.e., inputs that map to (nearly) identical representations at a given layer of a neural network. If these IRIs look similar to humans, then a neural network's learned invariances are said to align with human perception. However, we show that prior studies on the alignment of invariances between DNNs and humans are 'biased' by the specific loss function used to generate IRIs. We show how different loss functions can lead to different takeaways about a model's shared invariances with humans. We show that under an adversarial IRI generation process, all models appear to have very little shared invariance with humans. We conduct an in-depth investigation of how different components of the deep learning pipeline contribute to learning models that align well with humans' invariances. We find that architectures with residual connections trained using a self-supervised contrastive loss with ℓ_p-ball adversarial data augmentation tend to learn the most human-like invariances.
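The core idea behind IRI generation can be illustrated with a toy sketch: starting from a different input, run gradient descent on the distance between the two inputs' representations until they (nearly) coincide. This is not the paper's actual procedure or models; as an assumption for illustration, a random fixed tanh layer stands in for a DNN representation, and the gradient is computed analytically rather than via autodiff.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network layer": representation f(x) = tanh(W x), with W fixed at random.
# This is a stand-in for a DNN layer, not any model from the paper.
W = rng.normal(size=(16, 32)) / np.sqrt(32)

def rep(x):
    return np.tanh(W @ x)

# Reference input and its representation.
x_ref = rng.normal(size=32)
r_ref = rep(x_ref)

# Start the candidate IRI from a different random input and minimize the
# representation distance ||f(x) - f(x_ref)||^2 by gradient descent.
x = rng.normal(size=32)
for _ in range(2000):
    r = np.tanh(W @ x)
    # Chain rule through tanh: dL/dx = 2 W^T [(r - r_ref) * (1 - r^2)]
    grad = 2 * W.T @ ((r - r_ref) * (1 - r**2))
    x -= 0.1 * grad

rep_dist = np.linalg.norm(rep(x) - r_ref)   # near zero after optimization
inp_dist = np.linalg.norm(x - x_ref)        # can stay large: an "invariance"
print(rep_dist, inp_dist)
```

Because the layer maps 32 input dimensions to 16, many distinct inputs share one representation, so `inp_dist` typically stays large while `rep_dist` collapses. The paper's point is that the loss used in this optimization (and, e.g., adversarial variants of it) shapes which of these identically represented inputs you find, and hence the conclusions you draw about human alignment.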

Ayan Majumdar
PhD Student in Computer Science

My research interests broadly encompass applications of machine learning in decision-making and high-stakes scenarios while ensuring the fairness, explainability and robustness of such systems.