Deep Multi-view Stereo gone wild

François Darmon1,2   Bénédicte Bascle1   Jean-Clément Devaux1   Pascal Monasse2   Mathieu Aubry2

1Thales LAS France  |  2LIGM, École des Ponts, Univ Gustave Eiffel, CNRS, France

Paper |  Code

Abstract


Deep multi-view stereo (deep MVS) methods have been developed and extensively compared on simple datasets, where they now outperform classical approaches. In this paper, we ask whether the conclusions reached in controlled scenarios remain valid when working with Internet photo collections. We propose a methodology for evaluation and explore the influence of three aspects of deep MVS methods: network architecture, training data, and supervision. We make several key observations, which we extensively validate quantitatively and qualitatively, both for depth prediction and complete 3D reconstructions. First, we outline the promise of unsupervised techniques by introducing a simple approach that provides more complete reconstructions than supervised alternatives when using a simple network architecture. Second, we emphasize that not all multiscale architectures generalize to the unconstrained scenario, especially without supervision. Finally, we show the efficiency of noisy supervision from large-scale 3D reconstructions, which can even lead to networks that outperform classical methods in scenarios where very few images are available.

Bibtex


          @article{darmon2021deep,
            author    = {Darmon, Fran{\c{c}}ois and
                         Bascle, B{\'{e}}n{\'{e}}dicte and
                         Devaux, Jean{-}Cl{\'{e}}ment and
                         Monasse, Pascal and
                         Aubry, Mathieu},
            title     = {Deep Multi-View Stereo gone wild},
            journal   = {arXiv preprint arXiv:2104.15119},
            year      = {2021},
            url       = {https://arxiv.org/abs/2104.15119},
          }

Acknowledgements


This work was supported in part by ANR project EnHerit ANR-17-CE23-0008 and was granted access to the HPC resources of IDRIS under the allocation 2020-AD011011756 made by GENCI. We thank Tom Monnier, Michael Ramamonjisoa and Vincent Lepetit for valuable feedback.