
cm-gan-inpainting's Introduction

CM-GAN for Image Inpainting (ECCV 2022)

arXiv|pdf paper|appendix|Project


This is the official repo for CM-GAN (Cascaded Modulation GAN) for Image Inpainting. We introduce a new cascaded modulation design that cascades global modulation with spatial adaptive modulation for better hole filling. We also introduce an object-aware training scheme to facilitate better object removal. CM-GAN significantly improves on existing state-of-the-art methods both qualitatively and quantitatively. The online demo will be released soon.

NEWS (07/20/2022): We plan to release the online demo and our dataset in the next few days.
NEWS (07/28/2022): The panoptic segmentation annotations on the Places2 challenge dataset are released. See here.
NEWS (07/28/2022): The evaluation results of CM-GAN are released, which contain the object-aware masks for evaluation and our results. See here.
NEWS (07/31/2022): The code for object-aware mask generation is released. See here.

Method

We propose cascaded modulation GAN (CM-GAN) with a new modulation design that cascades global modulation with spatial adaptive modulation. To enable this, we also design a new spatial modulation scheme that is compatible with state-of-the-art GANs that use weight demodulation (StyleGAN2 and StyleGAN3). We additionally propose an object-aware training scheme that generates more realistic masks to facilitate the real object removal use case. Please refer to our arXiv paper for more technical details.
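
For readers unfamiliar with weight demodulation, the StyleGAN2 formulation mentioned above can be summarized as follows. This is the standard global modulation from StyleGAN2, shown only as background; it is not the spatial modulation scheme introduced by CM-GAN, for which the paper remains the reference. Here $w_{ijk}$ is a convolution weight, $s_i$ is the style scale for input feature map $i$, $j$ indexes output feature maps, $k$ indexes the spatial footprint of the kernel, and $\epsilon$ is a small constant for numerical stability.

```latex
% Modulation: scale each input feature map of the convolution weights by its style.
w'_{ijk} = s_i \cdot w_{ijk}

% Demodulation: renormalize each output feature map to unit expected standard deviation.
w''_{ijk} = \frac{w'_{ijk}}{\sqrt{\sum_{i,k} {w'_{ijk}}^{2} + \epsilon}}
```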

Comparisons

CM-GAN reconstructs better textures,

better global structure,

and better object boundaries.

Results

CM-GAN achieves better FID, LPIPS, U-IDS and P-IDS scores.

Dataset

Panoptic Annotations

The panoptic segmentation annotations on Places2 are released. Please refer to the Dropbox folder places2_panoptic_annotation to download the panoptic segmentation annotations for the train, evaluation, and test sets ([data/test/val]_large_panoptic.tar) and the corresponding file lists ([data/test/val]_large_panoptic.txt). Images of the Places2 challenge dataset can be downloaded from the official Places2 website.

Format of Panoptic Annotation

The panoptic annotation of each image is represented by a PNG image and a JSON file. The PNG image stores the id of each segment, and the JSON file stores the category_id and isthing of each id. isthing indicates whether the segment is a thing or stuff. For more details about the data format, please run the following Python script

from detectron2.data import MetadataCatalog 
panoptic_metadata = MetadataCatalog.get('coco_2017_val_panoptic_separated')

and refer to the demo script, which provides a detailed example on how to generate object-aware masks from the panoptic annotations. The metadata panoptic_metadata is also saved at mask_generator/_panoptic_metadata.txt
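
As a rough illustration of how the two files fit together, the sketch below selects the "thing" segments of one annotation. It assumes the JSON holds a list of segment records with `id`, `category_id`, and `isthing` keys, and that, if the PNG is stored as RGB, ids follow the COCO panoptic convention `id = R + 256*G + 256^2*B`. Neither assumption is confirmed by this README, so check both against the demo script. The helper names and the toy values are hypothetical.

```python
# Sketch only: the record layout and the RGB encoding are assumptions.


def rgb_to_id(r, g, b):
    """Decode a COCO-panoptic-style RGB pixel into an integer segment id."""
    return r + 256 * g + 256 * 256 * b


def thing_ids(segments):
    """Return the ids of segments flagged as things (countable objects)."""
    return {seg["id"] for seg in segments if seg["isthing"]}


def thing_mask(id_map, segments):
    """Build a 0/1 mask (nested lists) marking pixels that belong to things."""
    ids = thing_ids(segments)
    return [[1 if pixel in ids else 0 for pixel in row] for row in id_map]


# Toy 2x3 id map with one stuff segment (id 1) and one thing segment (id 2).
segments = [
    {"id": 1, "category_id": 40, "isthing": False},
    {"id": 2, "category_id": 0, "isthing": True},
]
id_map = [[1, 1, 2],
          [1, 2, 2]]
mask = thing_mask(id_map, segments)
```

With real files, `id_map` would come from the PNG (for example via PIL and NumPy) and `segments` from `json.load`.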

Evaluation and CM-GAN Results

The evaluation set for inpainting is released. Please refer to the evaluation folder on Dropbox, which contains the Places evaluation set images at resolution 512x512 (image.tar), the object-aware masks for all evaluation images (mask.tar), and the results of CM-GAN (cmgan-perc64.tar).
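
To evaluate against these archives, each image has to be matched with its mask. The sketch below shows one way to do that using only the standard library. It assumes that an image and its mask share the same file name up to the extension, which this README does not state; the member names used in the example are made up, and both helper names are hypothetical.

```python
import os
import tarfile


def list_tar_files(tar_path):
    """Return the names of regular files stored in a tar archive."""
    with tarfile.open(tar_path) as tar:
        return [m.name for m in tar.getmembers() if m.isfile()]


def file_stem(path):
    """Strip the directory and the extension from a path."""
    return os.path.splitext(os.path.basename(path))[0]


def pair_by_stem(image_names, mask_names):
    """Pair image and mask paths whose file names match up to the extension."""
    masks = {file_stem(p): p for p in mask_names}
    return [
        (p, masks[file_stem(p)])
        for p in sorted(image_names)
        if file_stem(p) in masks
    ]


# Toy names; the real member names inside image.tar and mask.tar may differ.
pairs = pair_by_stem(["image/a.png", "image/b.png"], ["mask/a.png"])
```

In practice the two name lists would come from `list_tar_files("image.tar")` and `list_tar_files("mask.tar")`; images without a matching mask are skipped.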

Code for On-the-fly Object-aware Mask Generation

mask_generator/mask_generator.py contains the class and an example for on-the-fly object-aware mask generation. Please run

cd mask_generator
python mask_generator.py

to generate a random mask and the masked image, which are saved to mask_generator/output_mask.png and mask_generator/output_masked_image.png, respectively. Note that we use 4 object masks only for illustration; the full object mask dataset is from PriFill (ECCV'20).
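
The relation between the two outputs can be sketched as follows. The example assumes that a mask value of 1 marks a hole pixel and that holes are filled with zeros; the README does not say which convention `mask_generator.py` uses, so treat both as assumptions. The function name `apply_mask` is hypothetical.

```python
def apply_mask(image, mask, fill=0):
    """Replace pixels where mask == 1 with `fill`; keep the others unchanged."""
    return [
        [fill if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy single-channel 2x3 image and a mask covering three pixels.
image = [[10, 20, 30],
         [40, 50, 60]]
mask = [[0, 1, 1],
        [0, 0, 1]]
masked_image = apply_mask(image, mask)
```

On NumPy arrays with a zero fill, the same operation is `image * (1 - mask)`.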

Citation

Please consider citing our paper "CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training" (Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo) if you find this work useful for your research.

@article{zheng2022cmgan,
      title={CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training}, 
      author={Haitian Zheng and Zhe Lin and Jingwan Lu and Scott Cohen and Eli Shechtman and Connelly Barnes and Jianming Zhang and Ning Xu and Sohrab Amirghodsi and Jiebo Luo},
      journal={arXiv preprint arXiv:2203.11947},
      year={2022},
}

We also have another project on image manipulation. Please also feel free to cite this work if you find it interesting.

@article{zheng2020semantic,
  title={Semantic layout manipulation with high-resolution sparse attention},
  author={Zheng, Haitian and Lin, Zhe and Lu, Jingwan and Cohen, Scott and Zhang, Jianming and Xu, Ning and Luo, Jiebo},
  journal={arXiv preprint arXiv:2012.07288},
  year={2020}
}

cm-gan-inpainting's Issues

place dataset training

Hello, I would like to know how training is done on the Places dataset, which has so many categories. Are all the categories placed together in the same folder for training, or is each category trained separately?

Questions about the object masks

Hi! This is amazing work in image inpainting, and I'm really interested in the method for generating masks described in your paper. The problem is that when generating masks, object masks are sampled first. However, this repo only provides 4 object masks. Could you provide more samples and the method used to construct the object masks? Thank you!

Can we have access to the pre-trained model?

Hey, I loved the paper!

Three questions:

  • Can we have access to the pre-trained model?
  • What is the estimated time, in your opinion, to run the model on a single image on a mobile device?
  • What is the recommended image resolution input size?

Thanks!

What's the relationship of json and png?

I checked the pixel values in the png, but they are not consistent with the id or category_id in the json file, nor with the COCO dataset. Could you release a readme.txt for the panoptic segmentation annotations? Thanks!

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

<io.BufferedReader name='/home/zf/Desktop/CMGAN/mask_generator/Places365_test_00000002.jpg'>
Traceback (most recent call last):
  File "/home/zf/Desktop/CMGAN/mask_generator/mask_generator.py", line 310, in <module>
    image, mask, masked_image = mask_generator(image_fname, anno_seg_fname, anno_json_fname)
  File "/home/zf/Desktop/CMGAN/mask_generator/mask_generator.py", line 291, in __call__
    image_label_, image_label_anno_ = self._load_raw_image_label(anno_seg_fname, anno_json_fname, crop_config=crop_config, panoptic=self._panoptic)
  File "/home/zf/Desktop/CMGAN/mask_generator/mask_generator.py", line 175, in _load_raw_image_label
    anno = json.load(f)
  File "/home/zf/programs/miniconda3/envs/CMGAN/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/home/zf/programs/miniconda3/envs/CMGAN/lib/python3.7/json/__init__.py", line 343, in loads
    s = s.decode(detect_encoding(s), 'surrogatepass')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
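
The reader printed at the top of the trace is opened on a `.jpg` file, and JPEG data begins with the byte `0xff`, so one plausible cause is that an image path was passed where the JSON annotation path was expected. A minimal guard along these lines would surface that mistake with a clearer message. The helper name `load_annotation` is hypothetical and is not part of `mask_generator.py`.

```python
import json


def load_annotation(path):
    """Load a JSON annotation, rejecting files that look like JPEG images."""
    with open(path, "rb") as f:
        head = f.read(2)
        if head == b"\xff\xd8":  # JPEG start-of-image marker
            raise ValueError(f"{path} looks like a JPEG image, not a JSON file")
        f.seek(0)  # rewind so json.load sees the whole file
        return json.load(f)
```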

Lama-OT in Appendix: train from scratch or finetune from Big-Lama?

Dear authors. Thanks for the great work. Regarding the Lama-OT model in the Appendix-Section B, did you train it from scratch (so that it becomes aware of object mask right from the start), or is it possible to achieve the results in Fig.8 and Fig.9 (Appendix) by finetuning from Big-Lama?
Thank you!

training code

Hello, I have read your article and I am very interested in it. So when will the training code be released? Thanks

Use model for custom images

Hi ! Thanks a lot for this amazing work you're sharing. Is it possible to use your model on custom images and if so is there any inference code you would be able to give ? Thanks for your answer :)
