Published in IEEE International Conference on Computer Vision and Pattern Recognition, June 2019

Synthesizing 3D Shapes from Silhouette Image Collections using Multi-Projection Generative Adversarial Networks

¹University of Science and Technology of China
²Microsoft Research Asia
³College of William & Mary

Results generated by VP-MP-GAN trained on the bird dataset

Abstract

We present a new weakly supervised learning-based method for generating novel category-specific 3D shapes from unoccluded image collections. Our method is weakly supervised and only requires silhouette annotations from unoccluded, category-specific objects. Our method does not require access to the object's 3D shape, multiple observations per object from different views, intra-image pixel-correspondences, or any view annotations. Key to our method is a novel multi-projection generative adversarial network (MP-GAN) that trains a 3D shape generator to be consistent with multiple 2D projections of the 3D shapes, and without direct access to these 3D shapes. This is achieved through multiple discriminators that encode the distribution of 2D projections of the 3D shapes seen from different views. Additionally, to determine the view information for each silhouette image, we also train a view prediction network on visualizations of 3D shapes synthesized by the generator. We iteratively alternate between training the generator and training the view prediction network. We validate our multi-projection GAN on both synthetic and real image datasets. Furthermore, we also show that multi-projection GANs can aid in learning other high-dimensional distributions from lower dimensional training datasets, such as material-class specific spatially varying reflectance properties from images.
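
To make the idea of "multiple 2D projections of the 3D shapes" concrete, the following is a minimal sketch of how a generated shape could be turned into one silhouette per view, with each silhouette then going to the discriminator for that view. It is not the paper's implementation: the cubic binary voxel grid, the three axis-aligned orthographic views, and the names `project_silhouette` and `multi_view_silhouettes` are all assumptions made for illustration.

```python
def project_silhouette(voxels, axis):
    """Orthographic silhouette of a cubic binary voxel grid along one axis.

    A silhouette pixel is 1 if any voxel along the viewing ray is occupied.
    Assumption: `voxels` is a nested list of shape n x n x n holding 0/1.
    """
    n = len(voxels)
    silhouette = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            for z in range(n):
                if voxels[x][y][z]:
                    # Drop the coordinate along the viewing axis.
                    u, v = [c for i, c in enumerate((x, y, z)) if i != axis]
                    silhouette[u][v] = 1
    return silhouette


def multi_view_silhouettes(voxels, axes=(0, 1, 2)):
    """One silhouette per view (hypothetical: one discriminator per entry)."""
    return {axis: project_silhouette(voxels, axis) for axis in axes}


# Toy shape: a 2 x 2 x 1 slab inside a 4 x 4 x 4 grid.
n = 4
voxels = [[[0] * n for _ in range(n)] for _ in range(n)]
for x in (1, 2):
    for y in (1, 2):
        voxels[x][y][0] = 1

silhouettes = multi_view_silhouettes(voxels)
for axis, sil in silhouettes.items():
    covered = sum(sum(row) for row in sil)
    print(f"view along axis {axis}: {covered} covered pixels")
```

The same slab covers a different number of pixels depending on the viewing axis, which is why a separate distribution of silhouettes per view carries information about the underlying 3D shape.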

Keywords

GAN, Generative Adversarial Networks, Multi-Projection

Paper and video

Paper .pdf | 6.2 MB

Trained model and code

GitHub Repo Code and model

Acknowledgements

We would like to thank the reviewers for their constructive feedback. We also thank Baining Guo for discussions and suggestions. Pieter Peers was partially supported by NSF grant IIS-1350323 and gifts from Google, Activision, and Nvidia.