Internship proposal: Automatic exemplar-based correspondences discovery for Art History

Mathieu Aubry (ENPC)

Figure 1:  Example of the same detail used in multiple contexts, with several types of variation in Jan Brueghel's work [1].



While painters take their inspiration from nature or their surroundings, they often use existing studies, sketches, or paintings as a resource of visual elements that they can reuse. This is evident, for instance, in the works of artists of the Brueghel family and workshop, who repeatedly borrowed details from existing works. An example of a detail repeated across several works is shown in Figure 1. Sometimes the wagon and its horses are precisely duplicated; sometimes there are variations that still share a number of characteristic features. The network of related artworks created by such imitation and replication is of interest to art historians, as would be a detailed study of the processes of variation between individual instances. Unfortunately, art historians have no automatic tools to discover these relationships and must therefore still examine works of art manually to find connections.


We plan to learn features that are invariant to style and to associate them with exemplar-based detectors, similar to [3,6], which, once calibrated, could be used to construct a network of correspondences of sufficient quality to be useful to art historians. We will start by combining style transfer algorithms applied to real photographs [4,5], to learn style invariance, with recent techniques for cross-domain adaptation of convolutional neural network (CNN) features [6]. We will then learn entirely new CNN features, better adapted to our matching task than standard features trained for object category classification. To achieve this, we will build on recent work on self-supervised feature learning [2], which uses an auxiliary task that takes advantage of the co-occurrence of patterns to learn meaningful features.
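To make the exemplar-based matching step concrete, the following is a minimal sketch, not the proposed system: given a feature vector for one exemplar detail and feature vectors for candidate patches (here simulated with random vectors; in practice they would come from a CNN), candidates are ranked by cosine similarity, and raw scores are standardized against a pool of negative (non-matching) scores so that scores from different exemplars become comparable, in the spirit of the calibration used in [3,6]. The function names `cosine_scores` and `calibrate` are illustrative, not from the cited work.

```python
import numpy as np

def cosine_scores(exemplar_feat, candidate_feats):
    """Cosine similarity between one exemplar feature (D,)
    and N candidate patch features (N, D)."""
    e = exemplar_feat / np.linalg.norm(exemplar_feat)
    c = candidate_feats / np.linalg.norm(candidate_feats, axis=1, keepdims=True)
    return c @ e

def calibrate(scores, negative_scores):
    """Standardize detection scores against scores on negative
    (non-matching) patches, so thresholds transfer across exemplars."""
    mu = negative_scores.mean()
    sigma = negative_scores.std() + 1e-8
    return (scores - mu) / sigma

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    exemplar = rng.normal(size=64)          # stand-in for a CNN patch feature
    candidates = rng.normal(size=(20, 64))  # stand-in for candidate patches
    candidates[7] = 2.0 * exemplar          # a true repetition (same direction)

    s = cosine_scores(exemplar, candidates)
    negatives = cosine_scores(exemplar, rng.normal(size=(200, 64)))
    z = calibrate(s, negatives)
    print("best candidate:", int(z.argmax()))
```

Correspondences above a calibrated threshold would then become edges in the network of related artworks.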

We plan to focus on the Brueghel Family dataset [1], collected by Elizabeth Honig’s team at UC Berkeley. Indeed, the limited size of this dataset (1488 images) as well as the large number of relevant correspondences it contains make it an ideal test case for our work.



[2] Doersch, C., Gupta, A., & Efros, A. A. Unsupervised Visual Representation Learning by Context Prediction. ICCV 2015.

[3] Doersch, C., Singh, S., Gupta, A., Sivic, J., & Efros, A. A. What Makes Paris Look Like Paris? ACM Transactions on Graphics, 2012.

[4] Gatys, L. A., Ecker, A. S., & Bethge, M. A Neural Algorithm of Artistic Style. arXiv 2015.

[5] Hertzmann, A., Jacobs, C. E., Oliver, N., Curless, B., & Salesin, D. H. Image Analogies. ACM Conference on Computer Graphics and Interactive Techniques, 2001.

[6] Massa, F., Russell, B., & Aubry, M. Deep Exemplar 2D-3D Detection by Adapting from Real to Rendered Views. arXiv preprint 2015.