Neural Feature Fusion Fields (N3F): 3D Distillation of Self-Supervised 2D Image Representations

About

This repository contains the official implementation of the paper Neural Feature Fusion Fields: 3D Distillation of Self-Supervised 2D Image Representations by Vadim Tschernezki, Iro Laina, Diane Larlus and Andrea Vedaldi. Published at 3DV 2022 as an oral presentation.

We provide the code for the experiments in the NeRF-N3F setting. NOTE: The repository currently contains the settings for the flower dataset only (see flower video above). We will add the rendering settings for the other scenes in the coming days.

Abstract: We present Neural Feature Fusion Fields (N3F), a method that improves dense 2D image feature extractors when the latter are applied to the analysis of multiple images reconstructible as a 3D scene. Given an image feature extractor, for example pre-trained using self-supervision, N3F uses it as a teacher to learn a student network defined in 3D space. The 3D student network is similar to a neural radiance field that distills said features and can be trained with the usual differentiable rendering machinery. As a consequence, N3F is readily applicable to most neural rendering formulations, including vanilla NeRF and its extensions to complex dynamic scenes. We show that our method not only enables semantic understanding in the context of scene-specific neural fields without the use of manual labels, but also consistently improves over the self-supervised 2D baselines. This is demonstrated by considering various tasks, such as 2D object retrieval, 3D segmentation, and scene editing, in diverse sequences, including long egocentric videos in the EPIC-KITCHENS benchmark.

Getting started

Setting up the Environment

We suggest setting up the environment through conda and pip.

  1. Create and activate the specified conda environment.
  2. Install the required packages from requirements.txt.
conda create -n n3f python=3.8
conda activate n3f
pip install -r requirements.txt

Since we demonstrate the experiments through a Jupyter notebook, you'll have to install the Jupyter kernel:

conda install -c anaconda ipykernel
python -m ipykernel install --user --name=n3f

If you get the error "CUDA error: no kernel image is available for execution on the device", try installing a PyTorch build for a different CUDA version, e.g.:

pip install torch==1.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

Dataset and Pretrained Models

The dataset and pretrained models can be found on Google Drive.

Download both files logs.tar.gz and data.tar.gz and extract them into the main directory. The checkpoints are located in the logs directory. The data directory contains the flower scene and the features extracted with DINO for this scene and the remaining scenes shown in the paper. This allows you to train your own models if you have downloaded the NeRF checkpoints and datasets for the remaining scenes.

Reproducing results

We provide a notebook that contains the code to reproduce the results for the flower scene. The other scenes will be added in the coming days.

Decomposed rendering

First, you can visualise the selected patch and calculate a histogram for the query feature vector vs. the retrieval vectors. This allows you to select a threshold for the scene decomposition in the next step.
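As a rough illustration of this step (not the notebook's actual code), the sketch below computes distances between a query feature and a set of retrieval features, then builds a histogram from which a threshold can be read off. The function names, the Euclidean metric, the toy data and the threshold value are all assumptions made for this example.

```python
import numpy as np

def feature_distances(query, features):
    """Euclidean distance from one query vector (D,) to each row of features (N, D)."""
    return np.linalg.norm(features - query, axis=1)

def distance_histogram(distances, bins=50):
    """Histogram of distances; returns per-bin counts and the bin edges."""
    counts, edges = np.histogram(distances, bins=bins)
    return counts, edges

# Toy data (hypothetical): 200 "object" features close to the query
# and 800 "background" features far away from it.
rng = np.random.default_rng(0)
query = rng.normal(size=64)
object_feats = query + 0.1 * rng.normal(size=(200, 64))
background_feats = rng.normal(size=(800, 64)) + 3.0
features = np.concatenate([object_feats, background_feats])

distances = feature_distances(query, features)
counts, edges = distance_histogram(distances)

# With two well-separated modes in the histogram, any value in the gap
# between them works as a threshold. This value is picked by hand.
threshold = 5.0
selected = distances < threshold
```

On real features the two modes usually overlap, so the threshold is a trade-off that is best chosen by inspecting the histogram rather than by a fixed rule.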

After that, you can render the source view and the decomposed target view. The latter shows the complete image, a version that includes only the queried object, and a version that excludes it.
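One plausible way to obtain such a decomposition is to mask the volume density of ray samples according to their feature distance to the query before compositing. The sketch below shows this for a single ray with standard NeRF quadrature. It is a minimal illustration under that assumption, not the repository's implementation, and the function names and toy values are hypothetical.

```python
import numpy as np

def render_ray(sigma, rgb, deltas):
    """Composite S samples along one ray with the standard NeRF quadrature."""
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: product of (1 - alpha) over all earlier samples.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)

def decompose(sigma, feat_dist, threshold):
    """Split densities into an object-only part and a part without the object."""
    is_object = feat_dist < threshold
    sigma_object = np.where(is_object, sigma, 0.0)
    sigma_rest = np.where(is_object, 0.0, sigma)
    return sigma_object, sigma_rest

# Toy ray (hypothetical values): a red object in front of a blue background.
sigma = np.array([0.0, 50.0, 0.0, 50.0])
rgb = np.array([[0, 0, 0], [1, 0, 0], [0, 0, 0], [0, 0, 1]], dtype=float)
deltas = np.full(4, 0.1)
feat_dist = np.array([9.0, 0.2, 9.0, 8.0])  # distance of each sample to the query

sigma_object, sigma_rest = decompose(sigma, feat_dist, threshold=1.0)
full = render_ray(sigma, rgb, deltas)
only_object = render_ray(sigma_object, rgb, deltas)
without_object = render_ray(sigma_rest, rgb, deltas)
```

In this toy example the full render is dominated by the red object, while removing the object's density reveals the blue background behind it.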

Comparison with DINO

Finally, we can also compare the PCA-reduced features and feature distance maps of NeRF-N3F + DINO vs. vanilla DINO.
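For readers who want to produce similar visualisations for their own feature maps, the following sketch projects a dense feature map onto its first three principal components (viewable as an RGB image) and computes a per-pixel distance map to a query feature. It uses plain NumPy. The function names, the min-max scaling and the random input are assumptions for illustration and may differ from what the notebook does.

```python
import numpy as np

def pca_to_rgb(features):
    """Project an (H, W, D) feature map onto 3 PCA components scaled to [0, 1]."""
    h, w, d = features.shape
    flat = features.reshape(-1, d)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[:3].T
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    proj = (proj - lo) / np.maximum(hi - lo, 1e-8)
    return proj.reshape(h, w, 3)

def distance_map(features, query):
    """Per-pixel Euclidean distance between an (H, W, D) map and a (D,) query."""
    return np.linalg.norm(features - query, axis=-1)

# Toy feature map (hypothetical): 8x8 pixels with 16-dimensional features.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 8, 16))
query = feature_map[2, 3]  # use the feature at one pixel as the query

rgb_image = pca_to_rgb(feature_map)
dist = distance_map(feature_map, query)
```

Applying the same two functions to the rendered N3F features and to the raw DINO features of the same view gives a side-by-side comparison. For comparable colours, the PCA basis should be fitted once and shared between the two maps.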

Citation

If you found our code or paper useful, then please cite our work as follows.

@inproceedings{tschernezki22neural,
  author     = {Vadim Tschernezki and Iro Laina and
                Diane Larlus and Andrea Vedaldi},
  booktitle  = {Proceedings of the International Conference
                on {3D} Vision (3DV)},
  title      = {Neural Feature Fusion Fields: {3D} Distillation
                of Self-Supervised {2D} Image Representations},
  year       = {2022}
}

Concurrent work

We suggest checking out the concurrent work by Kobayashi et al. They propose to fuse features in the same manner and mainly differ in the example applications, including the use of multiple modalities, such as text, image patches and point-and-click seeds, to generate queries for segmentation and, in particular, scene editing.

Acknowledgements

Our implementation is based on this (unofficial pytorch-NeRF) repository.
