Diffusart-pytorch

1. About the project

The goal of this project is to develop a neural network that, given a sketch of an image and some partial colour information about it, generates a full-colour image.

We use a medium-sized dataset of around 40,000 images taken from various anime (see Dataset). The main source of inspiration for the architecture and approach was the paper Diffusart: Enhancing Line Art Colorization with Conditional Diffusion Models. The architecture is a direct copy of the implementation of the paper by Ho et al., through the code implementation by Niels Rogge and Kashif Rasul.

2. How to run the code

The main training loop is in train.py, and all configuration options are in config.yaml. The dataset is assumed to have three columns, where each row represents one training triplet. The columns are "full_colour", "sketch", and "sketch_and_scribbles_merged". See Dataset for an example training dataset.
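As a minimal sketch of the expected row layout, the check below confirms that a dataset row carries the three columns named above before it is passed to training. The helper name `validate_triplet` and the use of plain dictionaries are assumptions made for illustration; train.py may load and validate the data differently.

```python
# Column names are taken from the README; everything else is hypothetical.
REQUIRED_COLUMNS = ("full_colour", "sketch", "sketch_and_scribbles_merged")


def validate_triplet(row: dict) -> dict:
    """Return only the three training columns, or raise if one is missing."""
    missing = [name for name in REQUIRED_COLUMNS if name not in row]
    if missing:
        raise KeyError(f"row is missing required columns: {missing}")
    # Drop any extra columns so downstream code sees exactly one triplet.
    return {name: row[name] for name in REQUIRED_COLUMNS}


if __name__ == "__main__":
    example_row = {
        "full_colour": "0001_colour.png",
        "sketch": "0001_sketch.png",
        "sketch_and_scribbles_merged": "0001_merged.png",
    }
    print(sorted(validate_triplet(example_row)))
```

Running a check like this once over the dataset can catch a misnamed column before a long training run starts.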

3. Overview of the model

The model uses a U-Net architecture. The explicit conditional information is concatenated to the noisy input, and the implicit partial colour information is introduced via cross-attention.
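The two conditioning paths can be illustrated with a small NumPy sketch. It is not the repository's code: the function names, tensor shapes, and the omission of learned query/key/value projections are all simplifying assumptions, chosen only to show channel-wise concatenation and scaled dot-product cross-attention.

```python
import numpy as np


def concat_condition(noisy: np.ndarray, condition: np.ndarray) -> np.ndarray:
    """Stack the explicit condition onto the noisy input along the channel axis.

    noisy: (C, H, W), condition: (C_cond, H, W) -> (C + C_cond, H, W)
    """
    return np.concatenate([noisy, condition], axis=0)


def cross_attention(queries: np.ndarray, context: np.ndarray):
    """Scaled dot-product attention from image features to colour tokens.

    queries: (N, d) image features, context: (M, d) colour-hint tokens.
    Learned projections are omitted here (an assumption for brevity).
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over tokens
    return weights @ context, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = concat_condition(rng.standard_normal((3, 8, 8)), rng.standard_normal((1, 8, 8)))
    attended, weights = cross_attention(rng.standard_normal((64, 16)), rng.standard_normal((10, 16)))
    print(x.shape, attended.shape, weights.shape)
```

Concatenation gives the network pixel-aligned access to the condition, while cross-attention lets every spatial position draw on colour hints from anywhere in the image.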

4. Evaluation

[Video: output.mp4]

Training took around 80 hours on a single RTX 3090 GPU. The average LPIPS score on the test set (300 examples) was 0.1632 after sampling with 100 DDPM steps.
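For readers unfamiliar with DDPM sampling, the sketch below shows what a 100-step ancestral sampling loop looks like, following the update rule from Ho et al. The linear beta schedule and its endpoints, the `predict_noise` callback, and the function names are assumptions for illustration; the actual schedule and model interface are defined by the repository's code and config.yaml.

```python
import numpy as np


def linear_beta_schedule(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> np.ndarray:
    """Linearly spaced noise variances (endpoint values are assumed)."""
    return np.linspace(beta_start, beta_end, num_steps)


def ddpm_sample(predict_noise, shape, num_steps: int = 100, seed: int = 0) -> np.ndarray:
    """Run the DDPM reverse process starting from pure Gaussian noise.

    predict_noise(x, t) stands in for the trained U-Net (hypothetical interface).
    """
    rng = np.random.default_rng(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t)
        # Posterior mean: remove the predicted noise, then rescale.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added at the final step (t == 0).
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    # A dummy predictor that always returns zeros, used only to exercise the loop.
    sample = ddpm_sample(lambda x, t: np.zeros_like(x), shape=(3, 8, 8))
    print(sample.shape)
```

In the real model the noise predictor would also receive the sketch and colour hints at every step, which is what makes the sampling conditional.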

5. HuggingFace space

See the linked HuggingFace Space for a demo of the model.

6. Notes

This project was done as part of the "Theory And Practice of Deep Learning" undergraduate course at Yonsei University. It is still a rough implementation of the ideas in the original Diffusart paper and, as such, may contain bugs and errors. You are welcome to propose changes via GitHub.
