Artificial Intelligence Researchers Propose “GANgealing”: A GAN-Supervised Algorithm That Learns Transformations From Input Images To Bring Them To Better Joint Alignment

Visual alignment, also known as the correspondence or matching problem, is a challenge that computer vision algorithms must solve for many different applications.
It is a critical component of optical flow, 3D matching, and medical imaging, to name a few examples, and it also underpins tracking and augmented reality.

Current alignment research focuses mostly on pairwise alignment, while the joint alignment of an entire dataset has received less attention. Yet joint alignment, which brings all images into a common frame of reference, is important for tasks such as automatic keypoint annotation and augmented reality/editing. It also affects the quality of learned models: when trained on jointly aligned datasets such as FFHQ, AFHQ, and CelebA-HQ, generative models are more likely to produce high-quality results.

Researchers from UC Berkeley, Carnegie Mellon University, Adobe Research, and MIT CSAIL propose a new algorithm called ‘GANgealing’, a GAN-supervised algorithm that learns transformations of input images to bring them into better joint alignment. The research team introduces GAN-supervised learning, a framework in which a discriminative model and its GAN-generated training data are learned jointly, end-to-end.

Under this framework, a Spatial Transformer network and the target images it warps toward are trained jointly on data produced by a GAN. Although trained exclusively on GAN-generated images, the model generalizes to real-world data at test time. The researchers showed that their “GANgealing” algorithm can successfully align complex data and discover dense correspondences across eight datasets (LSUN Bicycles, Cats, Cars, Dogs, Horses, and TVs; In-The-Wild CelebA; and CUB).
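The core building block, the Spatial Transformer, warps an image according to a predicted geometric transform. As a rough illustration only (not the paper's implementation, which uses a differentiable similarity-plus-flow transformer trained by gradient descent), the sketch below shows how a 2x3 affine matrix in normalized coordinates resamples an image; the function name and nearest-neighbor sampling are simplifications chosen here for brevity.

```python
import numpy as np

def affine_warp(image, theta):
    # Warp a 2-D image with a 2x3 affine matrix `theta` expressed in
    # normalized [-1, 1] coordinates (the convention spatial transformers use).
    # Nearest-neighbor sampling for simplicity; real spatial transformers use
    # differentiable bilinear sampling so gradients can flow back into `theta`.
    H, W = image.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Homogeneous normalized output coordinates, shape (H, W, 3).
    coords = np.stack(
        [2 * xs / (W - 1) - 1, 2 * ys / (H - 1) - 1, np.ones_like(xs)], axis=-1
    )
    src = coords @ theta.T  # normalized source coordinates, shape (H, W, 2)
    # Map back to pixel indices and round to the nearest source pixel.
    sx = np.rint((src[..., 0] + 1) * (W - 1) / 2).astype(int)
    sy = np.rint((src[..., 1] + 1) * (H - 1) / 2).astype(int)
    out = np.zeros_like(image)
    valid = (sx >= 0) & (sx < W) & (sy >= 0) & (sy < H)
    out[valid] = image[sy[valid], sx[valid]]
    return out

# Identity transform leaves the image unchanged; a horizontal offset of
# 2/3 in normalized coordinates shifts a 4-pixel-wide image by one pixel.
img = np.arange(16, dtype=float).reshape(4, 4)
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
shift = np.array([[1.0, 0.0, 2.0 / 3.0], [0.0, 1.0, 0.0]])
```

In GANgealing, the transformer's parameters are optimized so that warped GAN samples match a learned target mode, which is how alignment emerges without manual correspondence labels.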

According to the researchers, the proposed “GANgealing” algorithm significantly outperforms past self-supervised correspondence algorithms and performs on par with state-of-the-art supervised correspondence methods. It achieves this without any correspondence supervision or data augmentation, even though it is trained exclusively on GAN-generated data.


James G. Williams