Deep Transformation-Invariant Clustering

Tom Monnier, Thibault Groueix, Mathieu Aubry

LIGM (UMR 8049) - Ecole des Ponts, UPE



Recent advances in image clustering typically focus on learning better deep representations. In contrast, we present an orthogonal approach that does not rely on abstract features but instead learns to predict image transformations and performs clustering directly in pixel space. This learning process fits naturally into the gradient-based training of K-means and Gaussian mixture models, without requiring any additional loss or hyper-parameters. It leads us to two new deep transformation-invariant clustering frameworks, which jointly learn prototypes and transformations. More specifically, we use deep learning modules that enable us to resolve invariance to spatial, color and morphological transformations. Our approach is conceptually simple and comes with several advantages, including the possibility to easily adapt the desired invariance to the task and strong interpretability of both cluster centers and cluster assignments. We demonstrate that our approach yields competitive results on standard image clustering benchmarks. Finally, we showcase its robustness and the advantages of its improved interpretability by visualizing clustering results on collections of real photographs.


DTI framework

Deep transformation module T_{f_k}


Given an image x_i and prototypes c_1 and c_2, standard clustering such as K-means assigns the sample to the closest prototype. Our DTI clustering first aligns prototypes to the sample using a family of parametric transformations - here rotations - then picks the prototype whose alignment yields the smallest distance.
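The assignment rule described above can be illustrated with a minimal sketch. It rests on assumptions that are not part of the method itself: rotations are restricted to multiples of 90 degrees (via `np.rot90`) as a stand-in for a continuous rotation family, the distance is the squared Euclidean distance in pixel space, and the function name `dti_assign` is hypothetical.

```python
import numpy as np

def dti_assign(x, prototypes, n_rot=4):
    """Assign x to the prototype that is closest after alignment.

    Assumption: the transformation family is the set of rotations by
    multiples of 90 degrees. With n_rot=1 no alignment is performed,
    which reduces to the standard K-means assignment.
    Returns (cluster index, rotation index, squared distance).
    """
    best = (None, None, np.inf)
    for k, c in enumerate(prototypes):
        for r in range(n_rot):
            aligned = np.rot90(c, r)  # rotate the prototype, not the sample
            d = float(np.sum((x - aligned) ** 2))
            if d < best[2]:
                best = (k, r, d)
    return best

# Toy example: x is prototype c1 rotated by 90 degrees.
c1 = np.array([[1, 0, 0], [1, 0, 0], [1, 0, 0]], dtype=float)
c2 = np.array([[0, 0, 0], [0, 0, 0], [0, 0, 1]], dtype=float)
x = np.rot90(c1, 1)

plain = dti_assign(x, [c1, c2], n_rot=1)  # no alignment: picks c2
dti = dti_assign(x, [c1, c2])             # with alignment: picks c1
```

In this toy case the unaligned assignment selects the second prototype, while the aligned assignment recovers the first prototype with zero distance. In the approach described here, the transformation parameters are predicted rather than found by exhaustive search.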

We predict alignment with deep learning. Given an image x_i, each deep parameter predictor f_k predicts parameters for a sequence of transformations - here affine, morphological and thin plate spline transformations - to align the prototype c_k to the query image x_i.
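Under the notation above, one way to write the resulting K-means-style objective is the following, where the number of samples N, the number of clusters K, and the choice of squared Euclidean distance are assumptions made for illustration:

```latex
\mathcal{L}(c_{1:K}, f_{1:K})
  = \sum_{i=1}^{N} \min_{k \in \{1, \dots, K\}}
    \left\| x_i - \mathcal{T}_{f_k(x_i)}(c_k) \right\|^2
```

Here \mathcal{T}_{f_k(x_i)} denotes the transformation whose parameters are predicted by f_k from the image x_i. Both the prototypes c_k and the predictors f_k are optimized jointly by gradient descent, so no additional loss term is needed.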


Standard image clustering benchmarks


MegaDepth locations


Instagram hashtags


How to cite?

If you find this work useful in your research, please consider citing:

  title={Deep Transformation-Invariant Clustering},
  author={Monnier, Tom and Groueix, Thibault and Aubry, Mathieu},