EIDA: Editing and analysing historical astronomical diagrams with artificial intelligence

Syrine Kalleli1*, Scott Trigg2*, Ségolène Albouy2, Samuel Gessner3, Mathieu Husson2, Mathieu Aubry1

1LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, France
2SYRTE, Observatoire de Paris-PSL, CNRS, Paris, France
3CIUHCT, Faculdade de Ciências, Universidade de Lisboa, Portugal
*corresponding authors

Abstract

The EIDA project explores the historical use of astronomical diagrams across Asia, Africa, and Europe. We aim to develop automatic image analysis tools to analyze and edit these diagrams without human annotation, gaining a refined understanding of their role in shaping and transmitting astronomy. In this paper, we present a baseline method that detects lines and circles in historical diagrams, based on text removal, edge detection, and RANSAC. We compare this strong baseline to a deep learning approach based on LETR. This work contributes to historical diagram vectorization, enabling novel methods of comparison and clustering, and offering fresh insights into the vast corpus of astronomical diagrams.

Method

We introduce two diagram vectorization techniques. The first uses RANSAC (Fischler and Bolles, 1981) to detect lines and circles in edge images. Edges are extracted with Canny (Canny, 1986), and text is identified with TESTR (Zhang et al., 2022), a text-spotting transformer network, so that it can be removed. The second is a deep learning model based on LETR (Xu et al., 2021), a transformer line segment detector that we extend to also detect circles; it is trained on randomly generated synthetic data. We evaluate and compare both methods on a set of 15 annotated diagrams. In our experiments, the learning-based approach outperforms the RANSAC baseline.
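To make the RANSAC step concrete, here is a minimal Python sketch of circle detection on a set of 2D edge points. It is an illustration under stated assumptions, not the project's implementation: it assumes edge pixels have already been extracted (for example with Canny, after text regions were masked out), and the function names, the inlier tolerance `tol`, and the iteration count `n_iters` are hypothetical choices.

```python
import math
import random


def circle_from_three_points(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points, or None if they are collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # d is four times the signed area of the triangle; zero means collinear points.
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:
        return None
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    s3 = x3 * x3 + y3 * y3
    # Circumcenter of the triangle (p1, p2, p3).
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def ransac_circle(points, n_iters=500, tol=1.5, seed=0):
    """Fit one circle to 2D points with RANSAC.

    points: list of (x, y) edge coordinates (assumed input format).
    tol: maximum distance in pixels between a point and the circle
         for the point to count as an inlier (hypothetical default).
    Returns ((cx, cy, r), inliers), or (None, []) if no valid sample was drawn.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        # Minimal sample: three points define a candidate circle.
        model = circle_from_three_points(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        # Consensus set: points whose distance to the circle is within tol.
        inliers = [
            p for p in points
            if abs(math.hypot(p[0] - cx, p[1] - cy) - r) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Calling `ransac_circle(edge_points)` returns the candidate circle with the largest consensus set. Detecting several circles in one diagram would typically mean removing the inliers of each accepted circle and running the search again, and line detection follows the same pattern with two-point samples and a point-to-line distance.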

Table 1: Comparison of our RANSAC baseline (with and without text removal) with our LETR extension.

Acknowledgements

This work was supported by ANR (project EIDA ANR-22-CE38-0014). The work of Scott Trigg is supported by the European Research Council (ERC project NORIA, grant 724175). Mathieu Aubry and Syrine Kalleli were supported by ERC project DISCOVER funded by the European Union's Horizon Europe Research and Innovation program under grant agreement No. 101076028. Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.

References

  • Canny, J. (1986). A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), 679-698.
  • Xu, Y., Xu, W., Cheung, D. and Tu, Z. (2021). Line Segment Detection Using Transformers Without Edges. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4257-4266.
  • Fischler, M. A. and Bolles, R. C. (1981). Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381-395.
  • Jardine, B. and Jardine, N. (2010). Critical editing of early-modern astronomical diagrams. Journal for the History of Astronomy, 41(3), 393-414.
  • Monnier, T. and Aubry, M. (2020). docExtractor: An off-the-shelf historical document element extraction. In 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), 91-96. IEEE.
  • Zhang, X., Su, Y., Tripathi, S. and Tu, Z. (2022). Text spotting transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9519-9528.

BibTeX

@inproceedings{kalleli2023eida, 
            title={{EIDA: Editing and analysing historical astronomical diagrams with artificial intelligence}}, 
            author={Kalleli, Syrine and Trigg, Scott and Albouy, Ségolène and Gessner, Samuel and Husson, Mathieu and Aubry, Mathieu}, 
            booktitle={IAMAHA}, 
            year={2023}}