
ACORN: Adaptive Coordinate Networks for Neural Scene Representation
SIGGRAPH 2021

PyTorch implementation of ACORN.
ACORN: Adaptive Coordinate Networks for Neural Scene Representation
Julien N. P. Martel*, David B. Lindell*, Connor Z. Lin, Eric R. Chan, Marco Monteiro, Gordon Wetzstein
Stanford University
*denotes equal contribution
in SIGGRAPH 2021

Quickstart

To set up a conda environment, download the example training data, begin training, and launch TensorBoard, run the commands below. You will also need to register for and install an academic license for the Gurobi optimizer (free for academic use).

conda env create -f environment.yml
# before proceeding, install Gurobi optimizer license (see above web link)
conda activate acorn 
cd inside_mesh
python setup.py build_ext --inplace
cd ../experiment_scripts
python train_img.py --config ./config_img/config_pluto_acorn_1k.ini
tensorboard --logdir=../logs --port=6006

This example will fit a 1-megapixel image of Pluto. You can monitor training in your browser at localhost:6006.

Adaptive Coordinate Networks

An adaptive coordinate network learns an adaptive decomposition of the signal domain, allowing the network to fit signals faster and more accurately. We demonstrate using ACORN to fit large-scale images and detailed 3D occupancy fields.
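The decomposition can be pictured as a quadtree that keeps splitting blocks wherever the fitting error stays high. The sketch below is purely illustrative: `Block`, `refine`, and the threshold are hypothetical names, not ACORN's actual classes.

```python
# Illustrative sketch of an adaptive quadtree decomposition: blocks whose
# fitting error exceeds a threshold are split, so network capacity
# concentrates where the signal is complex. All names are hypothetical.

class Block:
    def __init__(self, x, y, size):
        self.x, self.y, self.size = x, y, size
        self.children = []

    def split(self):
        half = self.size / 2
        self.children = [Block(self.x + dx * half, self.y + dy * half, half)
                         for dx in (0, 1) for dy in (0, 1)]

def refine(block, error_fn, split_threshold, min_size):
    """Recursively split any block whose fitting error exceeds the threshold."""
    if error_fn(block) > split_threshold and block.size > min_size:
        block.split()
        for child in block.children:
            refine(child, error_fn, split_threshold, min_size)

def leaves(block):
    """Collect the leaf blocks, i.e. the current decomposition."""
    if not block.children:
        return [block]
    return [leaf for child in block.children for leaf in leaves(child)]
```

Running `refine` with an error function concentrated in one corner of the domain produces small blocks there and large blocks everywhere else, which is the qualitative behavior described above.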

Datasets

Image and 3D model datasets should be downloaded and placed in the data directory. The datasets used in the paper can be accessed as follows.

Training

To use ACORN, first set up the conda environment and build the Cython extension with

conda env create -f environment.yml
conda activate acorn 
cd inside_mesh
python setup.py build_ext --inplace

Then, download the datasets to the data folder.

We use Gurobi to solve the integer linear program used in the optimization. A free academic license can be installed from this link.

To train image representations, use the config files in the experiment_scripts/config_img folder. For example, to train on the Pluto image, run the following

python train_img.py --config ./config_img/config_pluto_1k.ini
tensorboard --logdir=../logs/ --port=6006

After the image representation has been trained, the decomposition and images can be exported using the following command.

python train_img.py --config ../logs/<experiment_name>/config.ini --resume ../logs/<experiment_name> <iteration #> --eval

Exported images will appear in the ../logs/<experiment_name>/eval folder, where <experiment_name> is the subdirectory in the log folder corresponding to the particular training run.

To train 3D models, download the datasets, and then use the corresponding config file in experiment_scripts/config_occupancy. For example, a small model representing the Lucy statue can be trained with

python train_occupancy.py --config ./config_occupancy/config_lucy_small_acorn.ini

Then a mesh of the final model can be exported with

python train_occupancy.py --config ../logs/<experiment_name>/config.ini --load ../logs/<experiment_name> --export

This will create a .dae mesh file in the ../logs/<experiment_name> folder.

Citation

@article{martel2021acorn,
  title={ACORN: {Adaptive} coordinate networks for neural scene representation},
  author={Julien N. P. Martel and David B. Lindell and Connor Z. Lin and Eric R. Chan and Marco Monteiro and Gordon Wetzstein},
  journal={ACM Trans. Graph. (SIGGRAPH)},
  volume={40},
  number={4},
  year={2021},
}

Acknowledgments

This repository includes the MIT-licensed inside_mesh code from Lars Mescheder, Michael Oechsle, Michael Niemeyer, Andreas Geiger, and Sebastian Nowozin, originally part of their Occupancy Networks repository.

J.N.P. Martel was supported by a Swiss National Foundation (SNF) Fellowship (P2EZP2 181817). C.Z. Lin was supported by a David Cheriton Stanford Graduate Fellowship. G.W. was supported by an Okawa Research Grant, a Sloan Fellowship, and a PECASE by the ARO. Other funding for the project was provided by NSF (award numbers 1553333 and 1839974).

Errata

  • The 3D shape fitting metrics were reported in the paper as calculated using the Chamfer-L1 distance. The metric should have been labeled Chamfer-L2, which is consistent with the implementation in this repository.
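For reference, the Chamfer-L2 metric named above can be sketched as follows. This is a minimal NumPy version; the normalization and number of sampled points used in the paper's evaluation may differ.

```python
# Sketch of a symmetric Chamfer distance with squared Euclidean (L2)
# point-to-point distances, matching the metric named in the erratum.
import numpy as np

def chamfer_l2(a, b):
    """a, b: arrays of shape (N, d) and (M, d). Returns the mean squared
    distance from each point to its nearest neighbor, in both directions."""
    # pairwise squared distances, shape (N, M)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```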


Issues

Flat output for the demo 2D representation

Hi, I am trying to run ACORN on the demo 2D representation. After several hours of training on pluto.jpg, I got the result below (left: ground truth; right: the prediction from the official ACORN code).
[image: ground truth vs. predicted output]

I also implemented my own version of ACORN from the paper, using a simple octree, and I hit the same issue: the output is just a flat image. The loss does not decrease either.
[screenshot: training loss curve, 2021-08-18]

Do you have any idea what might cause this? I also wonder how gradients propagate through an interpolation function such as grid_sample; it seems non-differentiable, since it is a function of position/coordinates.

features_out = torch.nn.functional.grid_sample(
    features_in, sample_coords,
    mode='bilinear',
    padding_mode='border',
    align_corners=True,
).reshape(b_size, n_channels, np.prod(self.patch_size))

Any suggestions or ideas for pinpointing the problem?
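On the differentiability question raised here: bilinear interpolation is piecewise-linear in the sampling coordinate, which is why torch.nn.functional.grid_sample can propagate gradients to the grid. A minimal 1-D sketch in plain Python (not ACORN code), with the analytic coordinate derivative checked by finite differences:

```python
# Linear (1-D analogue of bilinear) interpolation is piecewise-linear in
# the sampling coordinate, so gradients flow through it. The analytic
# derivative below is what autograd computes for grid_sample's grid input.

def lerp_sample(features, x):
    """Linearly interpolate a 1-D feature list at continuous position x."""
    i = int(x)                 # left cell index
    t = x - i                  # fractional offset in [0, 1)
    return (1 - t) * features[i] + t * features[i + 1]

def lerp_grad_x(features, x):
    """Analytic derivative of lerp_sample with respect to x."""
    i = int(x)
    return features[i + 1] - features[i]

feats = [0.0, 2.0, 3.0, 3.5]
eps = 1e-6
fd = (lerp_sample(feats, 1.25 + eps) - lerp_sample(feats, 1.25 - eps)) / (2 * eps)
# fd agrees with lerp_grad_x(feats, 1.25) up to finite-difference error
```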

Train occupancy on point cloud

Hi!

Is it possible to train the occupancy model on a point cloud? As far as I can see, training relies heavily on OccupancyDataset, which needs a mesh to evaluate the occupancy of query points. Is there a workaround for point clouds, or can this model only be used with meshes?

No attribute 'num_scales'

"dataio.py", line 493, in __getitem__
    scales = 2*scales / (self.num_scales-1) - 1
AttributeError: 'Block3DWrapperMultiscaleAdaptive' object has no attribute 'num_scales'

Tokyo/mars gigapixel too large, process killed

Hi,

I've tried setting up the ACORN environment to train on the Tokyo gigapixel image ("tokyo.tif"), but the process will not get past the data loader reaching 50%. Even after setting PIL.Image.MAX_IMAGE_PIXELS = None, the process is still killed due to high memory use. Have you encountered this issue, and how do you handle such a large image without running out of memory?

Thanks in advance.
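One generic workaround for images too large to decode into RAM (an assumption, not something this repository provides) is to keep the decoded pixels on disk and memory-map only the tiles a data loader needs. The sketch below uses a small stand-in array and a hypothetical file path:

```python
# Sketch: store decoded pixels as a raw on-disk array, then memory-map it
# so a data loader touches only the tile it needs. Dimensions, the file
# path, and the tile coordinates are illustrative stand-ins.
import os
import tempfile
import numpy as np

h, w, c = 1024, 1024, 3                       # stand-in for gigapixel dims
path = os.path.join(tempfile.gettempdir(), "acorn_demo_image.raw")

img = np.memmap(path, dtype=np.uint8, mode="w+", shape=(h, w, c))
img[:] = 0                                    # deterministic contents
img[100:110, 200:210] = 255                   # pretend pixel data
img.flush()

# Later, a loader maps the file read-only and copies out one tile:
view = np.memmap(path, dtype=np.uint8, mode="r", shape=(h, w, c))
tile = np.asarray(view[96:128, 192:224])      # 32x32 crop, copied into RAM
```

Only the pages backing the requested tile are paged in, so peak memory stays proportional to the tile size rather than the full image.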

Training speed

Hi, I'm training a 3D model (the engine) with your code, and I followed the steps in the README exactly. But the code runs far too slow (more than 1000 hours to finish). Where might the problem be?
(I used a GeForce RTX 2080 Ti)

Error of merging and splitting blocks

Thanks for your great work. The performance is solid and consistent in my own experiments, but I still have some confusion about how the "error of merging and splitting blocks" is calculated in your code and paper.

  1. In the paper, the error of merging blocks is Ns times the current error if the parent's error is unavailable. However, all of the siblings would contribute Ns times their error when merging, so the total merge error would be Ns^2 times the current error. How can the error then be consistent across scales?
  2. In the paper, the error of splitting blocks is the sum of the children's errors if they are available. But in your code, it seems to be Ns times the sum of the children's errors, as shown below:
     err_children = np.sum([child.err for child in self.children])
     err_split = area * err_children    # here, area is the parent's area
  3. Why is the block volume still block_size**2 in the OctTree code?

I hope you can correct my understanding of this work. Thank you very much!
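For concreteness, the two quantities discussed in this question can be sketched as follows. The names are illustrative, and the sketch mirrors the question's reading of the paper, not necessarily ACORN's exact implementation:

```python
# Sketch of the split/merge error bookkeeping under discussion: splitting
# is scored by the sum of the children's errors, merging by the parent's
# error if known, else N_s times the block's own error as a proxy.
# Node and NS are hypothetical names, not ACORN's code.

NS = 4  # children per block in a quadtree

class Node:
    def __init__(self, err, children=()):
        self.err = err
        self.children = list(children)

def err_split(block):
    """Predicted error after splitting: the sum of the children's errors."""
    return sum(child.err for child in block.children)

def err_merge(block, parent_err=None):
    """Predicted error after merging: the parent's error if available,
    otherwise N_s times the block's own error as a proxy."""
    return parent_err if parent_err is not None else NS * block.err
```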

How to choose res/patch size for new images

Hi,

Currently, I want to run ACORN on several new images. We encountered an issue with an image of resolution 3250 x 4333 (shown in the attached screenshot): since 3250 and 4333 share no common factor, I don't know how to choose a patch size. I tried resizing it with the attached config, but during training we encountered an unexpected error related to tensor shapes (also attached).

Any idea why this happens? Thanks in advance.
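One simple heuristic for such resolutions (an assumption, not a recommendation from the authors) is to round each dimension to the nearest multiple of the intended patch size before training, so the domain tiles evenly:

```python
# Round each image dimension to the nearest positive multiple of the
# patch size so the image tiles into whole patches. Values illustrative.

def nearest_divisible(dim, patch):
    """Round dim to the nearest positive multiple of patch."""
    return max(patch, patch * round(dim / patch))

h, w = 4333, 3250          # resolution from the issue above
patch = 32                 # hypothetical patch size
new_h = nearest_divisible(h, patch)   # 4320
new_w = nearest_divisible(w, patch)   # 3264
```

Resizing from 4333 x 3250 to 4320 x 3264 changes each dimension by well under 1%, which is usually negligible for fitting experiments.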

Metrics

Hi, thanks for your great work!
Could you please release your metric-computation code? In particular, how many points did you use to compute the Chamfer distance?

About the number of coords sampled within each block.

Hi, I have read your paper and code and noticed that your sampling strategy differs from SIREN's: you sample only n_l points within each block, but I could not find the exact value of n_l. Could you please explain this in more detail?
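The per-block sampling being asked about can be sketched generically as drawing a fixed number of uniform samples inside each block's extent. The value of n_l and the block layout below are illustrative, not taken from the repository:

```python
# Sketch: draw n_l uniform coordinate samples inside one square block
# given by its lower corner (x0, y0) and side length. Names illustrative.
import random

def sample_block(x0, y0, size, n_l, rng=None):
    """Draw n_l uniform (x, y) samples inside one square block."""
    rng = rng or random.Random(0)
    return [(x0 + rng.random() * size, y0 + rng.random() * size)
            for _ in range(n_l)]

coords = sample_block(0.5, 0.5, 0.25, n_l=8)
```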
