Artificial Intelligence Researchers from Huawei and Shanghai Jiao Tong University Introduce “CIPS-3D”: A 3D-Enabled GAN Generator

The StyleGAN architecture generates high-quality images, but it offers no precise control over camera pose. Recent NeRF-based generators have made progress toward 3D-aware synthesis, yet they still cannot produce photorealistic images.

Researchers from Huawei and Shanghai Jiao Tong University have developed CIPS-3D, an approach that synthesizes each pixel value independently, just like its 2D counterpart.

The proposed generator consists of a shallow 3D NeRF network, which keeps memory cost low, followed by a deep 2D INR (implicit neural representation) network that uses no spatial convolution or upsampling operations. This design conforms to the well-known hierarchical semantics of GANs: the early layers (the shallow NeRF network) determine the pose, while the later layers (the INR network) control mid- and high-level semantics. The NeRF network allows the research team to easily and explicitly control the camera pose.
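The two-stage design described above can be sketched in a few lines of NumPy. This is a minimal illustration of the idea only, not the authors' implementation: all network sizes, the pinhole-ray construction, and the function names are hypothetical, and the "NeRF" here is just a small MLP volume-rendered along each ray before a deep per-pixel INR head maps the resulting feature to RGB.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    # Random weights for a toy MLP; a real model would be trained adversarially.
    return [(rng.normal(0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    # Apply a stack of (W, b) layers with ReLU between them.
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

FEAT = 16
# Shallow NeRF: 3D sample point -> (feature, density); deliberately small.
nerf = init_mlp([3, 32, FEAT + 1])
# Deep INR head: per-pixel feature -> RGB, no convolutions, pixels independent.
inr = init_mlp([FEAT, 64, 64, 3])

def render_pixel(ray_origin, ray_dir, n_samples=8):
    """Volume-render one ray through the shallow NeRF into a feature vector."""
    ts = np.linspace(0.5, 2.0, n_samples)
    pts = ray_origin + ts[:, None] * ray_dir           # (n_samples, 3)
    out = mlp(pts, nerf)                               # (n_samples, FEAT+1)
    feats, sigma = out[:, :FEAT], np.maximum(out[:, FEAT], 0.0)
    alpha = 1.0 - np.exp(-sigma * (ts[1] - ts[0]))     # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    w = trans * alpha                                  # rendering weights
    return (w[:, None] * feats).sum(axis=0)            # (FEAT,)

def generate(camera_origin, H=4, W=4):
    """Each pixel is synthesized independently: ray -> feature -> INR -> RGB."""
    img = np.zeros((H, W, 3))
    for i in range(H):
        for j in range(W):
            d = np.array([(j - W / 2) / W, (i - H / 2) / H, 1.0])
            d /= np.linalg.norm(d)
            feat = render_pixel(camera_origin, d)
            img[i, j] = mlp(feat[None, :], inr)[0]
    return img

img = generate(np.array([0.0, 0.0, -1.0]))
print(img.shape)  # (4, 4, 3)
```

Because each pixel depends only on its own ray, changing `camera_origin` (or the ray directions) changes the pose explicitly, while the INR head is free to model appearance; this separation is what the hierarchical pose/semantics split in the paragraph above refers to.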

CIPS-3D suffers from a mirror-symmetry problem that also appears in other 3D-aware GANs such as GIRAFFE and StyleNeRF. The researchers explain why this happens instead of simply attributing it to dataset bias, and they solve it by adding an auxiliary discriminator to the network. They also propose partial gradient backpropagation as a training strategy for high-resolution CIPS-3D.
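Because pixels are synthesized independently, gradients can be backpropagated through only a random subset of them, which is what makes high-resolution training affordable. The snippet below is a hedged sketch of that bookkeeping only: the `ratio` value and variable names are illustrative, and in a real framework the two passes would be one forward with gradient tracking for the masked pixels and one under no-grad for the rest.

```python
import numpy as np

rng = np.random.default_rng(1)

H = W = 64
ratio = 0.25  # hypothetical fraction of pixels that receive gradients

# Flattened pixel coordinates for a full image.
coords = np.stack(
    np.meshgrid(np.arange(H), np.arange(W), indexing="ij"), axis=-1
).reshape(-1, 2)

# Pick a random subset; only these pixels would be generated with
# gradient tracking, the rest would be produced without it and merged.
k = int(len(coords) * ratio)
grad_idx = rng.choice(len(coords), size=k, replace=False)
grad_mask = np.zeros(len(coords), dtype=bool)
grad_mask[grad_idx] = True

# Merge the two passes back into one full image (stand-in values here),
# so the discriminator still sees every pixel exactly once.
img = np.empty((len(coords), 3))
img[grad_mask] = 1.0    # would come from the gradient-tracked pass
img[~grad_mask] = 0.0   # would come from the no-grad pass

print(grad_mask.sum(), len(coords))
```

The memory saving comes from activations being stored only for the `grad_mask` subset, while the discriminator still receives a complete image; averaged over iterations, every pixel eventually contributes gradients.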


Researchers validated the benefits of CIPS-3D on high-resolution face datasets, including FFHQ, MetFaces, BitmojiFaces, and CartoonFaces, as well as the AFHQ animal dataset. Details can be found in the research paper and on GitHub; the links are given below.



James G. Williams