stylegan's People

Contributors

loparcog, mercuriumvi, roxanne470, rutheniumvi, u1timecia

stylegan's Issues

Initialize Data Code

Task Overview

This task aims to create the data-loading code, along with the code that pulls the example data. For an explanation of how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.

From the Notebook

From the notebook this task will need everything from the "Load in the Data" section (excluding the text and GCS_PATH cell)

From You

Replace every occurrence of str(GCS_PATH + '/<path>') with 'imgdata/<path>'
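A minimal sketch of that substitution, assuming the data has been unzipped into a local "imgdata" folder. The helper name `local_files` and the idea of passing a glob pattern are illustrative assumptions; the notebook cells themselves may look different.

```python
import glob
import os

DATA_DIR = "imgdata"  # local folder that takes the place of GCS_PATH


def local_files(pattern, data_dir=DATA_DIR):
    """Return the sorted local files matching data_dir/pattern.

    Hypothetical stand-in for calls that previously globbed
    str(GCS_PATH + '/<path>').
    """
    return sorted(glob.glob(os.path.join(data_dir, pattern)))
```

Any cell that built a path from GCS_PATH could then call `local_files('<path>')` with the same relative path it used before.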

Ideal Output

This can only be tested once the data files are available, so feel free to just submit when the code is done

Create Database Image Pulling Module

Task Overview

Create a module that can take a database as input and load the images to be used as input in training the GAN. This module is necessary for all machine learning code, and likewise can be found across the internet from GitHub repos to Kaggle notebooks to the code for the textbook I've been pulling readings from. For building the model, however, it would be smart to focus on this Kaggle notebook, which actually has everything we're doing in it.

From the Notebook

Nothing is needed from the notebook

From You

This notebook is in Kaggle, meaning it references and pulls files from Kaggle. Since we are doing this in our own repository, we would need to pull the files onto our machine. To do this, you will need to:

  • Make a new folder called "imgdata" (help here)
    • If the folder already exists, assume the data has been downloaded and stop the process
  • Download the images from the database into the new folder (the databases are going to be from Kaggle, so you will need to use the Kaggle API found here and download the database named gan-getting-started; help here)
  • Unzip the downloaded database folder (help here)

Ideal Output

By the end of the task, we should be able to download the database to a folder called 'imgdata' and unzip it in that folder
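The steps above can be sketched roughly as follows. This is one possible shape for the module, not a finished implementation: the function names are hypothetical, and the default downloader assumes the `kaggle` command-line tool is installed with an API token already configured. The download step is passed in as a function so the folder and unzip logic can be tried without network access.

```python
import os
import subprocess
import zipfile

COMPETITION = "gan-getting-started"


def download_with_kaggle_cli(target_dir):
    """Download the competition archive using the Kaggle command-line tool.

    Assumes `kaggle` is on the PATH and credentials are set up.
    """
    subprocess.run(
        ["kaggle", "competitions", "download", "-c", COMPETITION, "-p", target_dir],
        check=True,
    )


def fetch_images(target_dir="imgdata", download=download_with_kaggle_cli):
    """Download and unzip the data unless target_dir already exists.

    Returns True if data was downloaded, False if the folder was already there.
    """
    if os.path.isdir(target_dir):
        # Folder exists: assume the data has been downloaded and stop.
        return False
    os.makedirs(target_dir)
    download(target_dir)
    # Unzip every archive the download step left in the folder, then remove it.
    for name in os.listdir(target_dir):
        if name.endswith(".zip"):
            archive = os.path.join(target_dir, name)
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(target_dir)
            os.remove(archive)
    return True
```

Calling `fetch_images()` a second time returns False right away, which matches the "if the folder already exists, stop" rule.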

Wasserstein Training

What is it

Wasserstein GANs are essentially an addition to regular GANs, providing a new way to train models that improves stability and possibly image output. A good article on the topic can be found at the machinelearningmastery website.

What does it change

Implementing WGAN features into our base GAN would involve changing the loss function and possibly some training settings.
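As a rough illustration of the loss change, here is a framework-free sketch of the Wasserstein losses, operating on plain lists of critic scores. The function names are illustrative, and the clipping limit of 0.01 is the value commonly quoted for the original WGAN formulation rather than something decided for this project.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss: lower when real images score above fakes."""
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """Wasserstein generator loss: lower when the critic scores fakes highly."""
    return -mean(fake_scores)


def clip_weights(weights, limit=0.01):
    """Clamp each critic weight to [-limit, limit] (weight clipping)."""
    return [max(-limit, min(limit, w)) for w in weights]
```

Unlike the usual cross-entropy loss, the critic outputs unbounded scores instead of probabilities, so the discriminator's final activation would likely need to change as well.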

Goals to see

Hopefully this would produce better and more consistent/stable results when training, meaning the loss functions steadily decrease rather than bouncing or increasing for substantial amounts of time.

Model Editing

Any tasks for altering the base model to look for better solutions

Find Appropriate DB for Content

This would be a database of images that we would pass through the system as "content" images. The best case for this would be some real-life photography. This could be of landscapes or cityscapes, ideally images with a good number of subjects in them, so that we can identify the content in the images and check whether it is retained. A database of cat images would also work.

Convolutional Discriminator

What is it

We are already utilizing a deep convolutional neural network (CNN), described in depth in this article. It uses multiple convolutional layers to create a model which can identify images better than a basic model, like the one we are currently using. Many architectures are available, as mentioned at the end of the article, and which one to use is up to you (many of them are explained in this separate article, with links to papers).

What does it change

This would change the discriminator to be a higher-level image identifying model.

Goals to see

Hopefully the discriminator would become stronger, and would then be able to help create a better generator by being more strict on what it does and doesn't let pass as a real image.

Model Coding

Any initial project coding to do before training

Textbook Style Transfer Loss

What is it

The style transfer model given in the textbook example in our Teams chat uses a triple-section loss function, analyzing style, content, and "realness" of the generated image. This would be more complex than the current loss function we are using, and may produce better results.

What does it change

This would change training, specifically the loss functions, and maybe the discriminator, but I don't think it needs to.
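A minimal sketch of how a three-part loss could be combined, using plain Python lists in place of tensors. The Gram-matrix style loss is the common choice in neural style transfer; whether the textbook uses exactly this form is an assumption, and the weights `w_style`, `w_content`, and `w_real` are hypothetical tuning knobs.

```python
def mse(a, b):
    """Mean squared error between two equal-length flat lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def gram_matrix(features):
    """Channel-by-channel correlations; features is a list of flat channels."""
    n = len(features[0])
    return [[sum(x * y for x, y in zip(ci, cj)) / n for cj in features]
            for ci in features]


def flatten(rows):
    """Flatten a list of lists into a single list."""
    return [v for row in rows for v in row]


def style_loss(generated_features, style_features):
    """Compare feature correlations (texture) instead of raw activations."""
    return mse(flatten(gram_matrix(generated_features)),
               flatten(gram_matrix(style_features)))


def content_loss(generated_features, content_features):
    """Compare raw activations so the image's subject is preserved."""
    return mse(flatten(generated_features), flatten(content_features))


def total_loss(style, content, realness, w_style=1.0, w_content=1.0, w_real=1.0):
    """Weighted sum of the style, content, and realness terms."""
    return w_style * style + w_content * content + w_real * realness
```

The "realness" term here is whatever scalar the discriminator-based loss already produces, which is why the discriminator itself may not need to change.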

Goals to see

This would hopefully provide better feedback to the generator, allowing it to learn and grow faster/better than the base model

Initialize Loss Function and Training Code

Task Overview

This task will aim at making the loss function and training code. There are not many additional sections, so this should just be a simple copy and paste of the given sections. For an explanation on how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.

From the Notebook

From the notebook this task will need everything from the "Train the CycleGAN" and "Visualize our Monet-esque photos" sections

From You

Nothing, really. We will be modifying this in the future but for now we will just be using this code as our base

Ideal Output

By the end of the task, ideally we would be able to train our model, but this will only be possible once all other tasks are completed, so just having the code is alright

Initialize Generator Code

Task Overview

This task will aim at making the generator code and anything it may use, including the downsampling and upsampling processes. For an explanation on how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.
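To make the downsampling and upsampling steps concrete, the sketch below traces how the spatial size of an image changes through a stack of layers. It assumes stride-2 convolutions with 'same' padding and a 256-pixel input; the layer counts in the usage note are hypothetical, so check them against the notebook.

```python
import math


def downsample_size(size, stride=2):
    """Spatial size after a strided convolution with 'same' padding."""
    return math.ceil(size / stride)


def upsample_size(size, stride=2):
    """Spatial size after a strided transposed convolution with 'same' padding."""
    return size * stride


def trace_sizes(input_size, n_down, n_up):
    """List the spatial size after each downsampling, then each upsampling, layer."""
    sizes = [input_size]
    for _ in range(n_down):
        sizes.append(downsample_size(sizes[-1]))
    for _ in range(n_up):
        sizes.append(upsample_size(sizes[-1]))
    return sizes
```

For example, `trace_sizes(256, 3, 3)` gives `[256, 128, 64, 32, 64, 128, 256]`: the generator only returns an image of the original size when the number of upsampling steps matches the number of downsampling steps.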

From the Notebook

From the notebook this task will need everything from the "Build the Generator" section

From You

Nothing, really. We will be modifying this in the future but for now we will just be using this code as our base

Ideal Output

By the end of the task, all cells should run without error

Report document creation

Getting a LaTeX document which can be edited by anyone to report the findings of their task, contributing to the final report.

Initialize Main Jupyter Notebook

We will need a main notebook to work in, and will need the following sections (separated by text titles for each section):

  • Style GAN (Will be used to explain notebook and code in it)
  • Data Loading (Loading data from the database)
  • Model Initialization
  • Model Training
  • Results
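If it helps, the empty notebook can be generated instead of clicked together by hand. The sketch below writes a minimal notebook file with one markdown title cell per section listed above; the function names and any output file name are hypothetical, and it relies only on the notebook file format being JSON.

```python
import json

SECTIONS = [
    "Style GAN",
    "Data Loading",
    "Model Initialization",
    "Model Training",
    "Results",
]


def build_notebook(sections=SECTIONS):
    """Return a minimal notebook dict with one markdown title cell per section."""
    cells = [
        {"cell_type": "markdown", "metadata": {}, "source": ["# " + title]}
        for title in sections
    ]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_notebook(path, sections=SECTIONS):
    """Write the skeleton notebook to path as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_notebook(sections), f, indent=1)
```

Code cells for each task can then be added under the matching title as the other issues are completed.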

TPU Code Adjustment

Currently, the TPU available in Colab does not work with our code due to issues using local files. The example notebook uses a KaggleDataset package that seems native to Kaggle notebooks only, so some work will need to be done to make these local files accessible in the same way, and allow the TPU to parse and work with them.

Decide on Style Images

For the style transfer GAN, we will need to base it on a dataset of images with the same style. This could easily be done with databases of paintings by a certain artist (Van Gogh, Monet), but could also be done with more abstract image databases whose images all follow the same "style".

Replay Buffer Training

What is it

Replay buffers alter the training of a model to avoid overfitting it to the given data. They keep the most recently generated images from past training iterations and fit the model on both newly generated images and old ones.

What does it change

This would change training pretty majorly, implementing essentially a repeater to feed back data that has been generated to then let the model learn off of.
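A minimal sketch of such a buffer, assuming the common design where a full buffer returns a stored image half of the time and swaps the new one in. The class name, the capacity of 50, and the 0.5 probability are assumptions borrowed from typical CycleGAN-style implementations, not settings chosen for this project.

```python
import random


class ReplayBuffer:
    """Keep a pool of previously generated images to feed back into training."""

    def __init__(self, capacity=50, rng=None):
        self.capacity = capacity
        self.items = []
        self.rng = rng or random.Random()

    def query(self, image):
        """Store image and return either it or an older stored image."""
        if len(self.items) < self.capacity:
            # Buffer not full yet: keep the image and pass it through.
            self.items.append(image)
            return image
        if self.rng.random() < 0.5:
            # Swap: hand back a random old image and store the new one.
            index = self.rng.randrange(self.capacity)
            old = self.items[index]
            self.items[index] = image
            return old
        # Otherwise pass the new image through without storing it.
        return image
```

In training, each freshly generated image would go through `query` before being shown to the discriminator, so the discriminator sees a mix of new and older generator output.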

Goals to see

To be honest, I really want to try this to have something to add to the discussion, as seen in this post stating that it does not show any change, and honestly, it might not, but that's report material right there. Ideally though, this would help to avoid overfitting of the model.

Deep Convolutional GAN

What is it

A deep convolutional GAN is much like a regular GAN, but instead of the downsampling to a convolutional layer and upsampling we are currently using, it uses multiple convolutional layers to learn and alter the images. These layers could be on their own, or in place of the single convolutional layer we currently have. A code notebook can be seen in this TensorFlow notebook.

What does it change

This would change both the generator and the discriminator of our model. Note that the generator in the sample above only takes noise as input, so it would also need to be edited to take images as input.

Goals to see

This would hopefully give us better results, but it might be at the cost of training time.

Find Appropriate DB for Chosen Style (Monet Paintings)

Once a style has been chosen, we will need an appropriate dataset to train on for that style. This dataset should include only images matching the chosen style (for example, if we chose Van Gogh as our style, we would look for a database of all of Van Gogh's paintings).

Initialize Discriminator and Model Code

Task Overview

This task will aim at making the discriminator and model code. There are not many additional sections, so this should just be a simple copy and paste of the given sections. For an explanation on how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.

From the Notebook

From the notebook this task will need everything from the "Build the discriminator" and "Build the CycleGAN model" sections

From You

Nothing, really. We will be modifying this in the future but for now we will just be using this code as our base

Ideal Output

By the end of the task, all cells should run without error (IF the generator and image pulling code are already done; otherwise it will throw an error)
