mcmasterai / stylegan
Style transfer GAN project for 2020-2021
This task will aim at making the data code along with the example data pulling. For an explanation on how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.
From the notebook this task will need everything from the "Load in the Data" section (excluding the text and the GCS_PATH cell)
Replace any occurrences of str(GCS_PATH + '/<path>') with 'imgdata/<path>'
Will need to be done with data files, so feel free to just submit when done
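The path change described above can be sketched as follows. This is only a minimal sketch, assuming the data has been unpacked into a local `imgdata` folder; the helper name `local_files` and the glob pattern in the usage note are hypothetical and not taken from the notebook.

```python
from pathlib import Path

# Assumption: the dataset has already been unpacked into this local folder.
DATA_DIR = Path("imgdata")


def local_files(pattern, data_dir=DATA_DIR):
    """Return sorted file paths under data_dir that match a glob pattern.

    Stands in for building paths with str(GCS_PATH + '/<path>').
    """
    return sorted(str(p) for p in Path(data_dir).glob(pattern))
```

A call such as `local_files("some_subfolder/*.tfrec")` (hypothetical subfolder name) would then return the local files in place of the Kaggle-hosted ones.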
Create a module that can take a database as input and load the images to be used as input in training the GAN. This module is necessary for all machine learning code, and likewise can be found across the internet from GitHub repos to Kaggle notebooks to the code for the textbook I've been pulling readings from. For building the model, however, it would be smart to focus on this Kaggle notebook, which actually has everything we're doing in it.
Nothing is needed from the notebook
This notebook is in Kaggle, meaning it references and pulls files from Kaggle. Since we are doing this in our own repository, we would need to pull the files onto our machine. To do this, you will need to:
By the end of the task, we should be able to download the database code to a folder called 'imgdata' and unzip it in the folder
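The unzip step could look roughly like this. It is a sketch that assumes the archive has already been downloaded from Kaggle by some other means; the function name `unzip_dataset` and the archive file name in the usage note are placeholders.

```python
import zipfile
from pathlib import Path


def unzip_dataset(archive_path, target_dir="imgdata"):
    """Extract a downloaded dataset archive into target_dir and return that folder."""
    target = Path(target_dir)
    # Create the folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return target
```

For example, `unzip_dataset("dataset.zip")` (placeholder file name) would leave the extracted files under `imgdata/`.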
Wasserstein GANs are essentially an addition to regular GANs, providing a new way to train models that improves stability and possibly image output. A good article on the topic can be found at the machinelearningmastery website.
Implementing WGAN features into our base GAN would involve changing the loss function and possibly some training settings.
Hopefully this would produce better and more stable results when training, meaning the losses decrease steadily rather than bouncing around or increasing for long stretches.
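To make the loss change concrete, here is a minimal NumPy sketch of the standard Wasserstein losses with weight clipping. It assumes the discriminator (the "critic") outputs unbounded scores rather than probabilities; the function names are illustrative, and this is not code from our base notebook.

```python
import numpy as np


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss: lower when real images score higher than fakes."""
    return float(np.mean(fake_scores) - np.mean(real_scores))


def generator_loss(fake_scores):
    """Wasserstein generator loss: lower when the critic scores fakes highly."""
    return float(-np.mean(fake_scores))


def clip_weights(weights, clip_value=0.01):
    """Clip every weight array to [-clip_value, clip_value] after a critic update."""
    return [np.clip(w, -clip_value, clip_value) for w in weights]
```

In a training loop, the critic would typically be updated several times per generator update, which is one of the "training settings" that may need to change.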
This is an example for M4
Any tasks for altering the base model to look for better solutions
This would be a database of images that we would pass through the system as "content" images. The best case for this would be some real-life photography. This could be of landscapes or cityscapes, ideally images with a good number of subjects in them, so that we can identify the content in the images and check whether it is retained. A database of cat images would also work.
We are already utilizing a deep convolutional neural network (CNN), described in depth in this article. It uses multiple convolutional layers to create a model which can identify images better than a basic model, like the one we are currently using. Many architectures are available, as mentioned at the end of the article, and which one to use is up to you (many of them are explained in this separate article with links to papers).
This would change the discriminator to be a higher-level image identifying model.
Hopefully the discriminator would become stronger, and would then be able to help create a better generator by being more strict on what it does and doesn't let pass as a real image.
Any initial project coding to do before training
The style transfer model given in the textbook example in our teams chat uses a triple-section loss function, analyzing style, content, and "realness" of the generated image. This would be more complex than the current loss function we are using, and may produce better results.
This would change training, specifically the loss functions, and maybe the discriminator, but I don't think the discriminator needs to change.
This would hopefully provide better feedback to the generator, allowing it to learn and grow faster/better than the base model
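As a rough illustration of a three-part loss, the sketch below combines a content term, a style term, and a "realness" (adversarial) term as a weighted sum. It uses the Gram-matrix style loss familiar from neural style transfer, which is an assumption on my part; the textbook's exact formulation may differ, and the weights are placeholders.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of an (H, W, C) feature map."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)
    return flat.T @ flat / (h * w)


def content_loss(gen_features, content_features):
    """Mean squared difference between generated and content feature maps."""
    return float(np.mean((gen_features - content_features) ** 2))


def style_loss(gen_features, style_features):
    """Mean squared difference between the Gram matrices of the two feature maps."""
    diff = gram_matrix(gen_features) - gram_matrix(style_features)
    return float(np.mean(diff ** 2))


def total_loss(content, style, adversarial, w_content=1.0, w_style=1.0, w_adv=1.0):
    """Weighted sum of the three terms; the weights are hypothetical defaults."""
    return w_content * content + w_style * style + w_adv * adversarial
```

The adversarial term would come from the discriminator as it does now, so only the generator's loss would gain the two extra terms.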
This task will aim at making the loss function and training code. There are not many additional sections, so this should just be a simple copy and paste of the given sections. For an explanation on how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.
From the notebook this task will need everything from the "Train the CycleGAN" and "Visualize our Monet-esque photos" sections
Nothing, really. We will be modifying this in the future but for now we will just be using this code as our base
By the end of the task, ideally we would be able to train our model, but this will only be possible once all other tasks are completed, so just having the code is alright
This task will aim at making the generator code and anything it may use, including the downsampling and upsampling processes. For an explanation on how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.
From the notebook this task will need everything from the "Build the Generator" section
Nothing, really. We will be modifying this in the future but for now we will just be using this code as our base
By the end of the task, all cells should run without error
Getting a LaTeX document which can be edited by anyone to report the findings of their task, contributing to the final report.
We will need a main notebook to work in, and will need the following sections (separated by text titles for each section):
Currently, the TPU available in Colab does not work with our code due to issues using local files. The example notebook uses a KaggleDataset package that seems native to Kaggle notebooks only, so some work will need to be done to make these local files accessible in the same way, and allow the TPU to parse and work with them.
Look into how the model should be built and how we should look to train it
For the style transfer GAN, we will need to base it on a dataset of images with the same style. This could easily be found in databases of paintings by a certain artist (Van Gogh, Monet), but could also be done with more abstract image databases whose images all follow the same "style".
Replay buffers alter the training of a model to avoid overfitting to the given data: they keep the most recently generated images from past training iterations and fit the model on both newly generated images and older ones.
This would change training pretty majorly, implementing essentially a repeater to feed back data that has been generated to then let the model learn off of.
To be honest, I really want to try this to have something to add to the discussion, as seen in this post stating that it does not show any change, and honestly, it might not, but that's report material right there. Ideally though, this would help to avoid overfitting of the model.
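A replay buffer of this kind can be sketched in a few lines. This follows the image-pool idea used in CycleGAN-style training (the default of 50 stored images mirrors that setup); the class and method names are illustrative, and the 50/50 swap probability is an assumption that could be tuned.

```python
import random


class ReplayBuffer:
    """Fixed-size pool of previously generated images."""

    def __init__(self, max_size=50, rng=None):
        self.max_size = max_size
        self.images = []
        # An optional random.Random instance makes the behaviour reproducible.
        self.rng = rng or random.Random()

    def query(self, image):
        """Store `image` and return either it or an older stored image."""
        if len(self.images) < self.max_size:
            # Buffer not full yet: keep the new image and use it directly.
            self.images.append(image)
            return image
        if self.rng.random() < 0.5:
            # Swap: hand back an old image and store the new one in its place.
            idx = self.rng.randrange(self.max_size)
            old = self.images[idx]
            self.images[idx] = image
            return old
        return image
```

During training, each generated image would be passed through `query` before being shown to the discriminator, so the discriminator sees a mix of fresh and older fakes.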
A deep convolutional GAN is much like a regular GAN, but instead of the downsampling to a single convolutional layer and upsampling that we currently use, it uses multiple convolutional layers to learn and alter the images. These layers could stand on their own or replace the single convolutional layer we currently have. A code notebook can be seen in this TensorFlow notebook.
This would change both the generator and the discriminator of our model. Note that the generator in the sample above only takes noise as input, so that would also need to be edited to take images as input
This would hopefully give us better results, but it might be at the cost of training time.
Once a style has been chosen, we will need an appropriate dataset to train off of for that given style. This dataset should include all images matching the style chosen (so for example, if we chose Van Gogh as our style, we would look for a database of all of Van Gogh's paintings)
This task will aim at making the discriminator and model code. There are not many additional sections, so this should just be a simple copy and paste of the given sections. For an explanation on how this will work and run, please look at the "paint" textbook reading in the Teams chat, or at the main notebook we will be referencing for this project.
From the notebook this task will need everything from the "Build the discriminator" and "Build the CycleGAN model" sections
Nothing, really. We will be modifying this in the future but for now we will just be using this code as our base
By the end of the task, all cells should run without error (IF the generator and image pulling code is already done, otherwise it will throw an error)
Initial project issues for preparing for coding