Synchronization in Computer Vision

Many computer vision problems involve processing multiple entities, be they objects, shapes, views, or scenes. In such cases, graphs (a.k.a. networks) are the key data structures for storing and organizing information. Yet the relationships between individual entities that are encoded in the edges often remain pairwise, or rather local. One of the most widely accepted ways of seeking global agreement is to enforce cycle-consistency, where local errors are distributed over the entire graph such that the composition of maps/transforms along each cycle is close to the identity map. This art of consistently recovering absolute quantities from a collection of ratios is known as synchronization. From training generative adversarial networks to geometric structure-from-motion algorithms, from temporal video understanding to image-to-image translation, this capability of imposing consistency benefits a wide variety of vision tasks. In this tutorial, we first introduce the fundamentals of cycle-consistency and review the broad range of studies that make use of it. Next, we cover different techniques for solving multiview synchronization problems in computer vision, or in other words, for achieving cycle-consistency. Several techniques will be addressed, including graph theory, combinatorial optimization, Riemannian geometry, spectral decomposition, (non-)convex optimization, and MAP inference. We also touch upon recent techniques that jointly optimize neural networks across multiple domains. Besides optimization techniques, we discuss the uncertainty and ambiguities inherent either in the data or in the model, and show how existing tools can be augmented to yield this valuable piece of information. Finally, we showcase applications of synchronizing linear/non-linear maps (e.g. functional maps) in multi-view geometry reconstruction (from RGB or RGB-D images), joint analysis of image collections, and 3D reconstruction and understanding across multiple domains. This project targets developing tools and methods that act as common techniques across several sub-fields of computer vision, such as multi-view structure-from-motion, 3D geometry reconstruction, unsupervised map/object discovery, joint learning of neural networks, and end-to-end multiview processing.
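
To make the idea of "recovering absolute quantities from a collection of ratios" concrete, here is a minimal sketch of spectral rotation synchronization in NumPy. It is a generic textbook-style illustration and not the method of any specific paper listed on this page. It assumes a fully connected graph, the convention R_ij ≈ R_i R_j^T, and synthetic Gaussian noise; all function names (`synchronize_rotations`, `run_demo`, and so on) are hypothetical and chosen only for this example.

```python
import numpy as np

def random_rotation(rng):
    # Orthogonal factor of a Gaussian matrix; flip one column to get det = +1.
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def project_to_so3(m):
    # Closest rotation to m in the Frobenius sense, via SVD.
    u, _, vt = np.linalg.svd(m)
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt

def synchronize_rotations(rel, n):
    """rel maps (i, j) with i < j to a noisy measurement of R_i R_j^T."""
    G = np.zeros((3 * n, 3 * n))
    for i in range(n):
        G[3*i:3*i+3, 3*i:3*i+3] = np.eye(3)
    for (i, j), R in rel.items():
        G[3*i:3*i+3, 3*j:3*j+3] = R
        G[3*j:3*j+3, 3*i:3*i+3] = R.T
    # Without noise, G = R R^T has rank 3; its top-3 eigenvectors span the
    # stacked absolute rotations up to one global 3x3 orthogonal matrix.
    _, vecs = np.linalg.eigh(G)
    V = vecs[:, -3:].copy()
    blocks = [V[3*i:3*i+3] for i in range(n)]
    if sum(np.linalg.det(b) for b in blocks) < 0:
        V[:, 0] = -V[:, 0]  # remove a global reflection
        blocks = [V[3*i:3*i+3] for i in range(n)]
    rots = [project_to_so3(b) for b in blocks]
    # Fix the gauge so that the first rotation is the identity.
    return [r @ rots[0].T for r in rots]

def cycle_error(get_rel, i, j, k):
    # Deviation of the composition along the cycle i -> j -> k -> i from identity.
    return np.linalg.norm(get_rel(i, j) @ get_rel(j, k) @ get_rel(k, i) - np.eye(3))

def run_demo(n=8, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    truth = [random_rotation(rng) for _ in range(n)]
    # Noisy pairwise measurements on every edge of the complete graph.
    rel = {(i, j): project_to_so3(truth[i] @ truth[j].T
                                  + sigma * rng.standard_normal((3, 3)))
           for i in range(n) for j in range(i + 1, n)}
    est = synchronize_rotations(rel, n)

    def noisy(i, j):
        return rel[(i, j)] if i < j else rel[(j, i)].T

    def synced(i, j):
        return est[i] @ est[j].T

    noisy_err = cycle_error(noisy, 0, 1, 2)
    sync_err = cycle_error(synced, 0, 1, 2)
    # Compare against the ground truth expressed in the same gauge (R_0 = I).
    est_err = max(np.linalg.norm(est[i] - truth[i] @ truth[0].T) for i in range(n))
    return noisy_err, sync_err, est_err

if __name__ == "__main__":
    print(run_demo())
```

The noisy pairwise measurements do not compose to the identity along a cycle, whereas relative rotations re-derived from the synchronized absolute rotations do so by construction. The absolute rotations are only recoverable up to a global rotation, which is why the sketch fixes the first one to the identity.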

2022

Multiway non-rigid point cloud registration via learned functional map synchronization

Jiahui Huang, Tolga Birdal, Zan Gojcic, and 2 more authors

IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2022

arXiv Code Website


2021

MultiBodySync: Multi-body segmentation and motion estimation via 3D scan synchronization

Jiahui Huang, He Wang, Tolga Birdal, and 4 more authors

In IEEE Conf. Computer Vision Pattern Recognition (CVPR), 2021

arXiv Video Code

2021

Quantum permutation synchronization

Tolga Birdal, Vladislav Golyanik, Christian Theobalt, and 1 more author

In IEEE Conf. Computer Vision Pattern Recognition (CVPR), 2021

arXiv Website

2020

Learning multiview 3D point cloud registration

Zan Gojcic, Caifa Zhou, Jan D Wegner, and 2 more authors

In IEEE Conf. Computer Vision Pattern Recognition (CVPR), 2020

arXiv Code

2020

Synchronizing probability measures on rotations via optimal transport

Tolga Birdal, Michael Arbel, Umut Simsekli, and 1 more author

In IEEE Conf. Computer Vision Pattern Recognition (CVPR), 2020

arXiv Code Website

2019

Probabilistic permutation synchronization using the Riemannian structure of the Birkhoff polytope

Tolga Birdal and Umut Simsekli

In IEEE Conf. Computer Vision Pattern Recognition (CVPR), 2019

arXiv Website

Best paper finalist

2018

Bayesian pose graph optimization via Bingham distributions and tempered geodesic MCMC

Tolga Birdal, Umut Şimşekli, M Onur Eken, and 1 more author

In Adv. Neural Inform. Process. Systems (NeurIPS), 2018

arXiv Website
