
materialpalette's Introduction

Material Palette: Extraction of Materials from a Single Image (CVPR 2024)

Ivan Lopes1   Fabio Pizzati2   Raoul de Charette1
1 Inria, 2 Oxford Uni.

Links: project page · paper (CVF) · dataset

TL;DR: Material Palette extracts a palette of PBR materials (albedo, normals, and roughness) from a single real-world image.

(teaser video: teaser.mp4)

🚨 Todo

  • 3D rendering script.

Overview

This is the official repository of Material Palette. In a nutshell, the method works in three stages: first, concepts are extracted from an input image based on a user-provided mask; then, those concepts are used to generate texture images; finally, the generations are decomposed into SVBRDF maps (albedo, normals, and roughness). Visit our project page or consult our paper for more details!

pipeline

Content: This repository allows the extraction of texture concepts from sets of images and region masks. It also supports generation at different resolutions. Finally, it provides a decomposition step based on our decomposition model, for which we share the pre-trained weights.

Tip

We provide a "Quick Start" section: before diving straight into the full pipeline, we share four pretrained concepts ⚡ so you can experiment with the texture generation step of the method right away (see "§ Generation"). You can then try out the full method on your own image and masks, i.e. concept learning + generation + decomposition (see "§ Complete Pipeline").

1. Installation

  1. Download the source code with git

    git clone https://github.com/astra-vision/MaterialPalette.git
    

    The repo can also be downloaded as a zip here.

  2. Create a conda environment with the dependencies.

    conda env create --verbose -f deps.yml
    

    This repo was tested with Python 3.10.8, PyTorch 1.13, diffusers 0.19.3, peft 0.5, and PyTorch Lightning 1.8.3; a quick version check is sketched after this list.

  3. Load the conda environment:

    conda activate matpal
    
  4. If you are looking to perform decomposition, download our pre-trained model and untar the archive:

    wget https://github.com/astra-vision/MaterialPalette/releases/download/weights/model.tar.gz
    tar -xzvf model.tar.gz
    

    This is not required if you are only looking to perform texture extraction.
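
Once the environment is active, you can optionally confirm the tested library versions from step 2. This is a small standalone check, not part of the official instructions; it assumes each package exposes a __version__ attribute:

    # Optional sanity check: print the versions of the main dependencies.
    # Run inside the activated matpal environment.
    import torch
    import diffusers
    import peft
    import pytorch_lightning as pl

    print(torch.__version__)       # tested with 1.13
    print(diffusers.__version__)   # tested with 0.19.3
    print(peft.__version__)        # tested with 0.5
    print(pl.__version__)          # tested with 1.8.3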

2. Quick start

Here are instructions to get you started with Material Palette. First, we provide some optimized concepts so you can experiment with the generation pipeline. We also show how to run the method on user-selected images and masks (concept learning + generation + decomposition).

§ Generation

(Table of four pretrained concepts, images omitted: each row shows the input image, its generations at 1K, 2K, 4K, and 8K resolution, and a download link for the corresponding LoRA checkpoint, ~8 Kb each.)

All preview generations in the table were downscaled due to memory constraints.

Go ahead and download one of the LoRA concept checkpoints above, for example "blue_tiles":

wget https://github.com/astra-vision/MaterialPalette/files/14601640/blue_tiles.zip;
unzip blue_tiles.zip

To generate from a checkpoint, point the concept module at the unzipped checkpoint directory (not an individual weights file), using either the command-line interface or the functional interface in Python:

  • python concept/infer.py path/to/LoRA/checkpoint
    
  • import concept
    concept.infer(path_to_LoRA_checkpoint)
    

Results will be placed in an outputs folder relative to the checkpoint directory.

You have control over the following parameters:

  • stitch_mode: concatenation, average, or weighted average (default);
  • resolution: the output resolution of the generated texture;
  • prompt: one of the four prompt templates:
    • "p1": "top view realistic texture of S*",
    • "p2": "top view realistic S* texture",
    • "p3": "high resolution realistic S* texture in top view",
    • "p4": "realistic S* texture in top view";
  • seed: inference seed when sampling noise;
  • renorm: whether or not to renormalize the generated samples based on the input image (this option can only be used when called from inside the pipeline, i.e. when the input image is available);
  • num_inference_steps: number of denoising steps.

A complete list of parameters can be viewed with python concept/infer.py --help.
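
For reference, here is a sketch of how these options can be combined through the functional interface. It assumes concept.infer forwards keyword arguments matching the parameter names above (the exact signature may differ; check the --help output):

    from pathlib import Path
    import concept

    # Hypothetical call combining the parameters listed above (exact signature may differ).
    concept.infer(
        Path('path/to/LoRA/checkpoint'),  # unzipped checkpoint directory, e.g. blue_tiles/
        stitch_mode='wmean',              # weighted average stitching (the default)
        resolution=1024,                  # output resolution of the generated texture
        prompt='p1',                      # "top view realistic texture of S*"
        seed=1,                           # inference seed when sampling noise
        num_inference_steps=50,           # number of denoising steps
    )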

§ Complete Pipeline

We provide an example (the input image with user masks used for the pipeline figure). You can download it here: mansion.zip (photograph credit: Max Rahubovskiy).

To help you get started with your own images, follow this simple data structure: one folder per image to invert, containing the input image (.jpg, .jpeg, or .png) and a subdirectory named masks with the different region masks as .png files (these must all have the same aspect ratio as the RGB image). Here is an overview of our mansion example:

├── masks/
│   ├── wood.png
│   ├── grass.png
│   └── stone.png
└── mansion.jpg
(Results table, images omitted: for each region mask, marked by its overlay color #6C8EBF, #EDB01A, or #AA4A44, the table shows the mask, its overlay on the input image, the generated texture, and the decomposed albedo, normals, and roughness maps.)
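
Before running the pipeline on your own data, you can optionally check that every mask shares the aspect ratio of the RGB image, as required above. This is a small standalone sketch (not part of the repository), using the mansion example paths:

    from pathlib import Path
    from PIL import Image

    # Verify that each region mask in masks/ matches the input image aspect ratio.
    root = Path('mansion')
    w, h = Image.open(root / 'mansion.jpg').size
    for mask in sorted((root / 'masks').glob('*.png')):
        mw, mh = Image.open(mask).size
        assert abs(mw / mh - w / h) < 1e-3, f'{mask.name} does not match the image aspect ratio'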

To invert and generate textures from a folder, use pipeline.py:

  • python pipeline.py path/to/folder
    

Under the hood, it uses two modules:

  1. concept, to extract and generate the texture (concept.crop, concept.invert, and concept.infer);
  2. capture, to perform the BRDF decomposition.

A minimal example is provided here:

    from pathlib import Path

    from pytorch_lightning import Trainer

    import capture
    import concept

    path = Path('path/to/folder')  # folder with the input image and its masks/ subdirectory

    ## Extract square crops from the image for each of the binary masks located in <path>/masks
    regions = concept.crop(path)

    ## Iterate through regions to invert the concept and generate texture views
    for region in regions.iterdir():
        lora = concept.invert(region)
        concept.infer(lora, renorm=True)

    ## Construct a dataset with all generations and load the pretrained decomposition model
    data = capture.get_data(predict_dir=path, predict_ds='sd')
    module = capture.get_inference_module(pt='model.ckpt')

    ## Proceed with inference on the decomposition model
    decomp = Trainer(default_root_dir=path, accelerator='gpu', devices=1, precision=16)
    decomp.predict(module, data)

To view options available for the concept learning, use PYTHONPATH=. python concept/invert.py --help

Important

By default, both train_text_encoder and gradient_checkpointing are set to True. Also, this implementation does not include the LPIPS filter/ranking of the generations. The code will only output a single sample per region. You may experiment with different prompts and parameters (see "Generation" section).

3. Project structure

The pipeline.py file is the entry point for running the whole pipeline on a folder containing the input image at its root and a masks/ sub-directory containing all user-defined masks. The train.py file is used to train the decomposition model. The most important files are shown here:

.
├── capture/          % Module for decomposition
│   ├── callbacks/    % Lightning trainer callbacks
│   ├── data/         % Dataset, subsets, Lightning datamodules
│   ├── render/       % 2D physics-based renderer
│   ├── utils/        % Utility functions
│   └── source/       % Network, loss, and LightningModule
│       └── routine.py  % Training loop
│
└── concept/          % Module for inversion and texture generation
    ├── crop.py       % Square crop extraction from image and masks
    ├── invert.py     % Optimization code to learn the concept S*
    └── infer.py      % Inference code to generate texture from S*

If you have any questions, post via the issues tracker or contact the corresponding author.

4. (optional) Training

We provide the pre-trained decomposition weights (see "Installation"). However, if you are looking to retrain the domain-adaptive model for your own purposes, we provide the code to do so. Our method relies on jointly training a multi-task network on labeled (real) and unlabeled (synthetic) images. In case you wish to retrain on the same datasets, you will have to download both the AmbientCG and TexSD datasets.

First download the PBR materials (source) dataset from AmbientCG:

python capture/data/download.py path/to/target/directory

To run the training script, use:

python train.py --config=path/to/yml/config

Additional options can be found with python train.py --help.

Note

The decomposition model allows estimating the pixel-wise BRDF maps from a single texture image input.
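
For instance, if you already have texture images and only need their BRDF maps, the decomposition stage can in principle be run on its own, reusing the calls from the minimal pipeline example above. This is a sketch under those assumptions, not a documented entry point; predict_dir should point to a folder laid out like the pipeline outputs:

    from pytorch_lightning import Trainer
    import capture

    # Build the prediction dataset and load the pre-trained decomposition weights
    # (model.ckpt downloaded in the "Installation" section).
    data = capture.get_data(predict_dir='path/to/folder', predict_ds='sd')
    module = capture.get_inference_module(pt='model.ckpt')

    # Each texture is decomposed into albedo, normals, and roughness maps.
    Trainer(accelerator='gpu', devices=1, precision=16).predict(module, data)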

Acknowledgments

This research project was mainly funded by the French Agence Nationale de la Recherche (ANR) as part of project SIGHT (ANR-20-CE23-0016). Fabio Pizzati was partially funded by KAUST (Grant DFR07910). Results were obtained using HPC resources from GENCI-IDRIS (Grant 2023-AD011014389).

The repository contains code taken from PEFT, SVBRDF-Estimation, and DenseMTL. For visualization, we used DeepBump and Blender. Credit to Runway for providing the stable-diffusion-v1-5 model weights. All images and 3D scenes used in this work have permissive licenses. Special credit to AmbientCG for their huge work.

The authors would also like to thank all members of the Astra-Vision team for their valuable feedback.

License

If you find this code useful, please cite our paper:

@inproceedings{lopes2024material,
    author = {Lopes, Ivan and Pizzati, Fabio and de Charette, Raoul},
    title = {Material Palette: Extraction of Materials from a Single Image},
    booktitle = {CVPR},
    year = {2024},
    project = {https://astra-vision.github.io/MaterialPalette/}
}

Material Palette is released under MIT License.



materialpalette's People

Contributors

rdecharette, wonjunior


materialpalette's Issues

AssertionError in the infer.py script from Quick Start

Facing this issue after downloading blue_tiles:

(matpal) issac@issac-KVM:~/MaterialPalette$ python concept/infer.py blue_tiles/unet/adapter_model.bin --outdir OUTPUT
Namespace(path=PosixPath('blue_tiles/unet/adapter_model.bin'), outdir=PosixPath('OUTPUT'), token=None, stitch_mode='wmean', resolution=1024, prompt='p1', seed=1, renorm=False, num_inference_steps=50)
Traceback (most recent call last):
File "/home/issac/MaterialPalette/concept/infer.py", line 412, in
main(args)
File "/home/issac/miniconda3/envs/matpal/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/issac/MaterialPalette/concept/infer.py", line 114, in main
assert args.path.is_dir()
AssertionError

Generated texture is not smooth

First of all, this is a great project. Thanks for sharing.

I tried the provided mansion.zip sample. However, for grass it generated a texture like the one attached: it is not smooth and contains four tile-like areas. Do you have any suggestions to improve this?
(attached generation: azertyuiop_1K_t50_wmean_top-view-realistic-texture-of-o_1)

Assertion Error when starting decomposition using Pipeline.py

I got the following error when trying the mansion.zip pipeline example provided in the readme.

Traceback (most recent call last):
File "G:\SAGNIK_Material_Palette\Material_Palette\MaterialPalette\pipeline.py", line 25, in
module = capture.get_inference_module(pt='model.ckpt')
File "G:\SAGNIK_Material_Palette\Material_Palette\MaterialPalette\capture\utils\model.py", line 41, in get_inference_module
assert Path(pt).exists()
AssertionError

I can't figure out what model.ckpt is supposed to be referencing because there is no .ckpt file in the capture directory itself.

While running pipeline.py - IndexError

"WhitePaint" cluster, q=39.57%
512x512 kept 0 patches -> /content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/my_images/crops/WhitePaint
256x256 kept 0 patches -> /content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/my_images/crops/WhitePaint
192x192 kept 0 patches -> /content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/my_images/crops/WhitePaint
128x128 kept 4 patches -> /content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/my_images/crops/WhitePaint
---- kept 2/3 crops.
04/18/2024 03:24:04 - INFO - concept.utils - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: fp16

trainable params: 589,824 || all params: 123,650,304 || trainable%: 0.4770097451600281
{'scaling_factor', 'force_upcast'} was not found in config. Values will be initialized to default values.
{'num_attention_heads', 'mid_block_only_cross_attention', 'dual_cross_attention', 'resnet_skip_time_act', 'addition_time_embed_dim', 'time_embedding_type', 'cross_attention_norm', 'class_embed_type', 'only_cross_attention', 'conv_out_kernel', 'time_embedding_dim', 'resnet_time_scale_shift', 'conv_in_kernel', 'transformer_layers_per_block', 'encoder_hid_dim_type', 'addition_embed_type', 'resnet_out_scale_factor', 'projection_class_embeddings_input_dim', 'timestep_post_act', 'time_cond_proj_dim', 'num_class_embeds', 'mid_block_type', 'encoder_hid_dim', 'class_embeddings_concat', 'time_embedding_act_fn', 'addition_embed_type_num_heads', 'upcast_attention', 'use_linear_projection'} was not found in config. Values will be initialized to default values.
trainable params: 1,594,368 || all params: 861,115,332 || trainable%: 0.18515150535027286
04/18/2024 03:24:08 - INFO - concept.utils - ***** Running training *****
04/18/2024 03:24:08 - INFO - concept.utils - Num examples = 12
04/18/2024 03:24:08 - INFO - concept.utils - Num batches each epoch = 12
04/18/2024 03:24:08 - INFO - concept.utils - Instantaneous batch size per device = 1
04/18/2024 03:24:08 - INFO - concept.utils - Total train batch size (w. parallel, distributed) = 1
04/18/2024 03:24:08 - INFO - concept.utils - Total optimization steps = 800
Steps: 100% 800/800 [05:04<00:00, 2.62it/s, loss=0.685, lr=0.0001]
loading LoRA with token azertyuiop
{'requires_safety_checker'} was not found in config. Values will be initialized to default values.
Loading pipeline components...: 0% 0/6 [00:00<?, ?it/s]Loaded feature_extractor as CLIPImageProcessor from feature_extractor subfolder of runwayml/stable-diffusion-v1-5.
Loaded text_encoder as CLIPTextModel from text_encoder subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 33% 2/6 [00:00<00:00, 7.29it/s]{'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
Loaded scheduler as PNDMScheduler from scheduler subfolder of runwayml/stable-diffusion-v1-5.
{'scaling_factor', 'force_upcast'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from vae subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 67% 4/6 [00:00<00:00, 8.02it/s]Loaded tokenizer as CLIPTokenizer from tokenizer subfolder of runwayml/stable-diffusion-v1-5.
{'num_attention_heads', 'mid_block_only_cross_attention', 'dual_cross_attention', 'resnet_skip_time_act', 'addition_time_embed_dim', 'time_embedding_type', 'cross_attention_norm', 'class_embed_type', 'only_cross_attention', 'conv_out_kernel', 'time_embedding_dim', 'resnet_time_scale_shift', 'conv_in_kernel', 'transformer_layers_per_block', 'encoder_hid_dim_type', 'addition_embed_type', 'resnet_out_scale_factor', 'projection_class_embeddings_input_dim', 'timestep_post_act', 'time_cond_proj_dim', 'num_class_embeds', 'mid_block_type', 'encoder_hid_dim', 'class_embeddings_concat', 'time_embedding_act_fn', 'addition_embed_type_num_heads', 'upcast_attention', 'use_linear_projection'} was not found in config. Values will be initialized to default values.
Loaded unet as UNet2DConditionModel from unet subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 100% 6/6 [00:01<00:00, 4.90it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
p1 => top view realistic texture of {}
ignoring args.outdir and using path /content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/my_images/weights/Pebbles/an_object_with_azertyuiop_texture/checkpoint-800/outputs
preparing for /content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/my_images/weights/Pebbles/an_object_with_azertyuiop_texture/checkpoint-800/outputs/azertyuiop_1K_t50_wmean_top-view-realistic-texture-of-o_1.png
100% 50/50 [00:10<00:00, 4.82it/s]
Traceback (most recent call last):
File "/content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/pipeline.py", line 21, in
concept.infer(lora, renorm=True)
File "/content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/concept/infer.py", line 398, in infer
return main(Namespace(
File "/usr/local/envs/matpal/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/concept/infer.py", line 393, in main
renorm(fname)
File "/content/gdrive/MyDrive/MIT/SEM02/ComputationDesignLab/MaterialPalette/concept/renorm.py", line 40, in renorm
low_threshold = sorted_pixels[exclude_count]
IndexError: index 0 is out of bounds for dimension 0 with size 0

good job

Hello! Can this model work well on portrait datasets?
GeoWizard can generate SOTA normal and depth maps.
Can your project also process portrait datasets well?

How can I test a single image using pipeline.py?

Hello, the following is my directory structure (screenshot attached):
I put only a single portrait image in the img directory.
I modified the code in pipeline.py; I only need to test your model.ckpt and decompose the albedo, normals, and roughness maps (screenshot attached).

I think I need your help; I am running into the following problem: (screenshot failed to upload)

Query on open-sourcing plans for your exceptional work

Hello, I'm a student from SJTU. I'm deeply impressed by your exceptional work on materials. I wonder whether you have any plans to release the source code as open source in the future; I am really looking forward to it.

Texture Size

I saw in your paper that you had several different resolutions the texture could be scaled to. What is the average size per texture (kb or mb) at each resolution of 1024, 2048, 4096, and 8192?

Data Directory missing from the Capture Directory

As explained in the Project Structure of your README, the data directory inside the capture directory holds essential files for running the complete pipeline. Since it is missing, attempts at running pipeline.py fail with the following error while importing modules:

python pipeline.py G:\SAGNIK_Material_Palette\Material_Palette\mansion
Traceback (most recent call last):
File "G:\SAGNIK_Material_Palette\Material_Palette\MaterialPalette\pipeline.py", line 7, in
import capture
File "G:\SAGNIK_Material_Palette\Material_Palette\MaterialPalette\capture_init_.py", line 2, in
from .utils.model import get_inference_module
File "G:\SAGNIK_Material_Palette\Material_Palette\MaterialPalette\capture\utils_init_.py", line 3, in
from .cli import get_args
File "G:\SAGNIK_Material_Palette\Material_Palette\MaterialPalette\capture\utils\cli.py", line 11, in
from ..data.module import DataModule
ModuleNotFoundError: No module named 'capture.data'

Is there anything on my end that can be done to solve this or is there an update slated to the repo itself that takes care of this?
