3dgan-release's Introduction

Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

This repository contains pre-trained models and sampling code for the 3D Generative Adversarial Network (3D-GAN) presented at NIPS 2016.

http://3dgan.csail.mit.edu

Prerequisites

Torch

We use Torch 7 (http://torch.ch) for our implementation with these additional packages:

Visualization

  • Basic visualization: MATLAB (tested on R2016b)
  • Advanced visualization: Python 2.7 with the packages numpy, matplotlib, scipy, and vtk (version 5.10.1)

Note: for advanced visualization, vtk must be version 5.10.1, not a later one. It is available in the package lists of common Python distributions such as Anaconda.

Installation

Our current release has been tested on Ubuntu 14.04.

Cloning the repository

git clone git@github.com:zck119/3dgan-release.git
cd 3dgan-release

Downloading pretrained models

For CPU (947 MB):

./download_models_cpu.sh

For GPU (618 MB):

./download_models_gpu.sh

Downloading latent vector inputs for demo

./download_demo_inputs.sh

Guide

Synthesizing shapes (main.lua)

We show how to synthesize shapes with our pre-trained models. The script main.lua accepts the following options.

  • -gpu ID: GPU ID (starting from 1). Set to 0 to use CPU only.
  • -class CLASSNAME: synthesize shapes for the class CLASSNAME. We currently support five classes (car, chair, desk, gun, and sofa). Use all to generate shapes for each class.
  • -sample: if set, sample input latent vectors from an i.i.d. uniform distribution; otherwise, generate shapes from the demo vectors loaded from ./demo_inputs/CLASSNAME.mat
  • -bs BATCH_SIZE: use batch size of BATCH_SIZE during network forwarding
  • -ss SAMPLE_SIZE: set the number of generated shapes to SAMPLE_SIZE. This option is only available in -sample mode.

Usages include

  • Synthesize chairs with pre-sampled demo inputs and a CPU
th main.lua -gpu 0 -class chair 
  • Randomly sample 150 desks with GPU 1 and a batch size of 50
th main.lua -gpu 1 -class desk -bs 50 -sample -ss 150 
  • Randomly sample 150 shapes of each category with GPU 1 and a batch size of 50
th main.lua -gpu 1 -class all -bs 50 -sample -ss 150 
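
As a rough illustration of how -sample, -ss, and -bs interact, the following Python 3 sketch draws SAMPLE_SIZE latent vectors from an i.i.d. uniform distribution and splits them into batches of at most BATCH_SIZE. It is not the code in main.lua. The function name, the latent dimension of 200, and the range [0, 1) are assumptions made for this example.

```python
import numpy as np


def sample_latent_batches(sample_size, batch_size, latent_dim=200, seed=None):
    """Yield batches of i.i.d. uniform latent vectors, sample_size in total.

    latent_dim=200 and the range [0, 1) are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    for start in range(0, sample_size, batch_size):
        # The last batch is smaller when batch_size does not divide sample_size.
        n = min(batch_size, sample_size - start)
        yield rng.uniform(0.0, 1.0, size=(n, latent_dim))


# Mirrors "-bs 50 -sample -ss 150": three batches of 50 vectors each.
batches = list(sample_latent_batches(sample_size=150, batch_size=50, seed=0))
print(len(batches), batches[0].shape)
```

Each batch would then be passed through the generator in one forward pass, so a smaller batch size trades speed for lower memory use.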

The output is saved under the folder ./output: class_name_demo.mat holds shapes generated from the predetermined demo inputs (z in our paper), and class_name_sample.mat holds randomly sampled 3D shapes. In each .mat file, the variable inputs corresponds to the input latent representation, and the variable voxels corresponds to the 3D shapes generated by our network.
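
For readers who want to inspect the output outside MATLAB, here is a minimal Python sketch that reads the two variables named above. It assumes the file is stored in a MAT format that scipy.io.loadmat can read. The helper name load_shapes and the default threshold of 0.1 are choices made for this example, and the array layout is not assumed, so check the printed shapes.

```python
import numpy as np
from scipy.io import loadmat


def load_shapes(path, threshold=0.1):
    """Return (inputs, voxels, occupancy) from a generated .mat file.

    occupancy is a boolean array marking voxels whose confidence is at
    least `threshold`; the threshold value is an assumption of this sketch.
    """
    data = loadmat(path)
    inputs = np.asarray(data["inputs"])   # input latent representation
    voxels = np.asarray(data["voxels"])   # generated 3D shapes (confidences)
    occupancy = voxels >= threshold
    return inputs, voxels, occupancy


# Example (after running: th main.lua -gpu 0 -class chair):
#   inputs, voxels, occupancy = load_shapes("./output/chair_demo.mat")
#   print(inputs.shape, voxels.shape, occupancy.mean())
```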

Visualization

We offer two ways of visualizing results, one in MATLAB and the other in Python. We used the Python visualization in our paper. The MATLAB visualization is easier to install and run, but its output is of lower quality than the Python one.

MATLAB: Please use the function visualization/matlab/visualize.m for visualization. The MATLAB code allows users to either display rendered objects or save them as images. The script also supports downsampling and thresholding for faster rendering. The color of voxels represents the confidence value.

Options include

  • inputfile: the .mat file that saves the voxel matrices
  • indices: the indices of objects in the inputfile that should be rendered. The default value is 0, which stands for rendering all objects.
  • step (s): downsample objects via a max pooling of step s for efficiency. The default value is 4 (64 x 64 x 64 -> 16 x 16 x 16).
  • threshold: voxels with confidence lower than the threshold are not displayed
  • outputprefix:
    • when not specified, MATLAB shows figures directly.
    • when specified, MATLAB stores rendered images of objects at outputprefix_%i.bmp, where %i is the index of the object
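
The step option can be pictured with a short Python sketch of max pooling on a voxel grid. This is not the code in visualize.m. It assumes non-overlapping windows of size step along each axis, which matches the 64 x 64 x 64 -> 16 x 16 x 16 example for a step of 4. The function name max_pool_3d is hypothetical.

```python
import numpy as np


def max_pool_3d(voxels, step):
    """Downsample a 3D grid by taking the max over step^3 blocks.

    Assumes non-overlapping blocks and dimensions divisible by `step`.
    """
    d, h, w = voxels.shape
    if d % step or h % step or w % step:
        raise ValueError("each dimension must be divisible by step")
    # Split every axis into (number of blocks, step), then reduce the step axes.
    blocks = voxels.reshape(d // step, step, h // step, step, w // step, step)
    return blocks.max(axis=(1, 3, 5))


grid = np.random.rand(64, 64, 64)
small = max_pool_3d(grid, 4)
print(small.shape)  # (16, 16, 16)
```

Max pooling keeps a coarse voxel whenever any fine voxel inside it is confident, so thin structures such as chair legs tend to survive downsampling.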

Usage (after running th main.lua -gpu 0 -class chair, in MATLAB, in folder visualization/matlab):

visualize('../../output/chair_demo.mat', 0, 2, 0.1, 'chair')

The visualization might take a while. The obtained rendering (chair_1/3/4/5.bmp) should look as follows.

Python: Options for the Python visualization include

  • -t THRESHOLD: voxels with confidence lower than the threshold are not displayed. The default value is 0.1.
  • -i ID: the index of objects in the inputfile that should be rendered (one based). The default value is 1.
  • -df STEPSIZE: downsample objects via a max pooling of step STEPSIZE for efficiency. Currently supporting STEPSIZE 1, 2, and 4. The default value is 1 (i.e. no downsampling).
  • -dm METHOD: downsample method, where mean stands for average pooling and max for max pooling. The default is max pooling.
  • -u BLOCK_SIZE: set the size of the voxels to BLOCK_SIZE. The default value is 0.9.
  • -cm: whether to use a colormap to represent voxel occupancy, or to use a uniform color
  • -mc DISTANCE: whether to keep only the maximal connected component, where voxels of distance no larger than DISTANCE are considered connected. Set to 0 to disable this function. The default value is 3.

Usage:

python visualize.py chair_demo.mat -u 0.9 -t 0.1 -i 1 -mc 2
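
To make the -mc option more concrete, the sketch below keeps only the largest connected component of a thresholded voxel grid, treating two occupied voxels as connected when they lie within a given distance of each other. It is an independent illustration in Python 3, not the code in visualize.py. The use of Euclidean distance, the function name largest_component, and the graph-based approach are assumptions.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def largest_component(voxels, threshold=0.1, distance=3):
    """Zero out every voxel outside the largest connected component.

    Voxels below `threshold` are dropped first. Two remaining voxels are
    linked when their Euclidean distance is at most `distance` (the metric
    is an assumption). distance <= 0 disables the component filter.
    """
    occupied = voxels >= threshold
    if distance <= 0:
        return np.where(occupied, voxels, 0.0)
    coords = np.argwhere(occupied)
    n = len(coords)
    if n == 0:
        return np.zeros_like(voxels)
    # All index pairs (i, j) of occupied voxels within `distance` of each other.
    pairs = np.array(list(cKDTree(coords).query_pairs(r=distance)), dtype=int)
    pairs = pairs.reshape(-1, 2)
    graph = coo_matrix(
        (np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n)
    )
    _, labels = connected_components(graph, directed=False)
    keep = coords[labels == np.bincount(labels).argmax()]
    result = np.zeros_like(voxels)
    result[tuple(keep.T)] = voxels[tuple(keep.T)]
    return result
```

A larger distance bridges small gaps in the generated shape, while a smaller one removes more stray voxels at the risk of cutting off thin parts.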

Reference

@inproceedings{3dgan,
  title={{Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling}},
  author={Wu, Jiajun and Zhang, Chengkai and Xue, Tianfan and Freeman, William T and Tenenbaum, Joshua B},
  booktitle={Advances In Neural Information Processing Systems},
  pages={82--90},
  year={2016}
}

For any questions, please contact Jiajun Wu ([email protected]) and Chengkai Zhang ([email protected]).

3dgan-release's Issues

Simple question

Hello,

A naive question: what is th in the command? What software do I need to install?

th main.lua -gpu 0 -class chair

module 'nn' not found

I tried running the code, but it's showing this error:

> 	no field package.preload['nn']
	no file './nn.lua'
	no file '/usr/share/luajit-2.1.0-beta3/nn.lua'
	no file '/usr/local/share/lua/5.1/nn.lua'
	no file '/usr/local/share/lua/5.1/nn/init.lua'
	no file '/usr/share/lua/5.1/nn.lua'
	no file '/usr/share/lua/5.1/nn/init.lua'
	no file './nn.so'
	no file '/usr/local/lib/lua/5.1/nn.so'
	no file '/usr/lib/x86_64-linux-gnu/lua/5.1/nn.so'
	no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
	[C]: in function 'error'
	/usr/share/lua/5.1/trepl/init.lua:389: in function 'require'
	main.lua:1: in main chunk
	[C]: in function 'dofile'
	/usr/lib/torch-trepl/th:149: in main chunk
	[C]: at 0x561c0f12c1d0

I have pytorch installed.

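Some questions about using the model

Hello author, I am a student at the School of Computer and Information, and I would like to use your model for my thesis. I have some questions about it: my graduation project requires building a system, so I would like to ask whether you have a ready-made API that can be called, or an interface that supports Java Spring Boot projects. Alternatively, could you please add my contact, QQ: 1393858102? I would be very grateful for any reply.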

License?

Which license governs this code? xD

Latent Space Interpolation

Looking for clarification on latent space interpolation from your paper:

For WITHIN class interpolation (e.g. chair type1 <-> chair type2) you can use the generator network that was trained on that single object category (e.g. the chair generator network).

But for BETWEEN class latent space interpolation, am I right in assuming that you trained the network on the union of the category1 and category2 datasets (e.g. cars and boats), and then used that generator network's latent vectors and 3D generated objects? So to interpolate between two distinct categories, you must train a new network on the union of those datasets?
In your paper you used cars and boats. Was this choice because that pairing had better results than other pairings, and if so, do you think inter-class interpolation will work best when the 2 object categories are somewhat similar (e.g. the bounding boxes of cars and boats are basically shoe box shaped, so going between them is easier than going from a car to a chair, which typically has a more cube-shaped bounding box and so different relative dimensions)? Or was the choice because it is arguably more semantically meaningful to go between 2 vehicle types (car-boat) than e.g. an airplane to a cup?
