GAN for trying different hairstyles
Replicating AttGAN arch: https://arxiv.org/pdf/1711.10678.pdf
CelebA dataset attributes:
We will try different hair colours
Our first approach is to use computer vision to color the hair directly by applying some noise.
Our second approach is to train a hair-color recognizer and build a GAN that takes a face image and converts the hair to the desired color. We verify the result with the recognizer; the loss is simply binary_crossentropy (checking that the colors match).
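The verification idea above can be sketched as a plain binary cross-entropy between the requested color vector and the recognizer's predicted probabilities. The function name and array format here are assumptions for illustration, not the project's actual code:

```python
import numpy as np

def hair_color_bce(requested, predicted, eps=1e-7):
    """Binary cross-entropy between the requested color (0/1 per channel)
    and the recognizer's predicted probabilities for the translated image."""
    predicted = np.clip(predicted, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(requested * np.log(predicted)
                          + (1 - requested) * np.log(1 - predicted)))
```

A translation whose recognized color matches the request gets a low loss; a mismatch gets a high one.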
The previous post I made was unsuccessful, mainly because CycleGAN (and Augmented CycleGAN) limits us to two domains. Instead, we focus on a single conditional GAN.
The implementation is not working... see the second implementation for reference.
So while we wait for the models to train, we will now focus on creating a web app/some medium to interact with model(s).
We want to run this model so that video output (whether a recording or real time) gives the translated form. Since we only trained on frontal face images, we do not have the flexibility to handle side or back views, which limits the robustness of our model.
Suppose we have an image of a person with blond hair. If our requested attributes have a -1 for blond hair, what does that mean for the translated image?
We have been interpreting -1 to say we should get rid of the blond hair, but this is ambiguous: we don't know what hair the translated result should have. Instead, the requested attributes should keep the blond hair (i.e. set it to 1) UNLESS another hair color was requested.
For example:
Bald image with requested attributes that change no hair color (keep Bald) => Bald image
Bald image with requested attributes that change the hair color => image with that hair color
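The rule above can be sketched as a small helper: keep the source's hair color set to 1 unless the request explicitly turns on a different hair color. The dict-based attribute format and `HAIR_COLORS` list are assumptions for illustration, not the actual training format:

```python
# Hair-related CelebA attribute names (Bald included, since it behaves
# like a hair "color" for this rule).
HAIR_COLORS = ["Black_Hair", "Blond_Hair", "Brown_Hair", "Gray_Hair", "Bald"]

def resolve_hair_request(source_attrs, requested_attrs):
    """Return requested attributes where a -1 on the source's hair color
    means 'keep it', unless another hair color was requested (+1)."""
    out = dict(requested_attrs)
    # Did the request turn on a hair color the source doesn't already have?
    wants_new_color = any(requested_attrs.get(c, -1) == 1
                          for c in HAIR_COLORS
                          if source_attrs.get(c, -1) != 1)
    if not wants_new_color:
        # No new color requested: keep whatever the source already has.
        for c in HAIR_COLORS:
            if source_attrs.get(c, -1) == 1:
                out[c] = 1
    return out
```

This makes the -1 case unambiguous: the generator is never asked to produce "not blond" with no replacement color.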
Revelation: the classifier should not be trained on generated attributes, since the generator could be mistaken too.
I really cannot tell what a person looks like without glasses. Make this a feature!
Performs many-to-many mappings between TWO domains. For example, the edges of a shoe image map to multiple differently colored shoes instead of just one (as in CycleGAN).
This will not work when we have multiple domains to consider!
Running the decoder produces low-resolution (128x128) predictions. We need to restore the original resolution.
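One low-cost way to restore resolution is plain nearest-neighbour upsampling of the 128x128 output back toward the original size; a real pipeline would more likely use bilinear resizing (e.g. `cv2.resize`) or blend the generated hair region back onto the full-resolution original. A minimal numpy sketch:

```python
import numpy as np

def upsample_nearest(img, factor):
    """Nearest-neighbour upsampling: repeat each pixel `factor` times
    along height and width. `img` has shape (H, W, C)."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)
```

For a non-integer target size, resize with interpolation instead; this sketch only covers integer factors.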
After training the model, it correctly predicts the value of Bald for original images, but always predicts Bald for translated images. That matches our goal, since we set the new attributes to always be Bald. However, the images do not look bald! This might be due to having too small a dataset, so the model identifies Bald by a different feature from what we understand as bald.
Observe in the bald row that there actually isn't much baldness, so this might in fact be what the model learned "bald" to be...
Lesson: verify that the output is what we expect based on the original model.
Problem: given 40 binary features, we have 2^40 (approx. 1 trillion) combinations... far too many types to train on.
Instead, we can train on one feature at a time and encode its label as the index at which the feature appears. For example, if blond hair is the 27th property, then we use the label 27.
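The index encoding above amounts to using an attribute's column position in the CelebA attribute list as its integer label. A minimal sketch (the names shown are the first few of the 40 CelebA attributes; the truncated list is an illustration):

```python
# First few CelebA attribute names, in their standard column order.
ATTRIBUTES = ["5_o_Clock_Shadow", "Arched_Eyebrows", "Attractive"]  # ... 40 total

def attribute_label(name, attributes=ATTRIBUTES):
    """Return the integer label for an attribute: its column index."""
    return attributes.index(name)
```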
For training, since we have multiple features per image, we can train on one feature at a time. For example, suppose we are only interested in turning any person's hair blond; then we can simply ignore the other features and look only at the blond-hair column for all the inputs. Once the generator becomes good at generating blond hair, we can move on to a different feature.
Problem 2: what happens if we have an image that doesn't have blond hair? Then we have a -1 property value, and what would the generator generate? With digits we could get away with this, since every input value is something we can generate, but "non-blond hair" isn't. At the very least, we should define it more clearly to be, for example, black hair as the default.
Since hair colors are mutually exclusive, every image in the dataset is true for one of those values. So we should find the corresponding index for the hair color the image actually has, so the generator trains on a real hair color (instead of potentially generating gibberish "non-blond" hair).
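The fix above can be sketched as a lookup: instead of conditioning on "-1 blond", find which mutually exclusive hair-color column is +1 for this image and use that column's index as the label. The indices below follow the standard CelebA attribute order, but treat them as an assumption and check against your copy of the attribute file:

```python
# Hair-color attribute names -> CelebA column indices (assumed standard order).
HAIR_COLOR_INDICES = {"Black_Hair": 8, "Blond_Hair": 9,
                      "Brown_Hair": 11, "Gray_Hair": 17}

def true_hair_label(attr_row):
    """Given one image's attribute dict of -1/+1 values, return the index
    of the hair color it actually has, or None if none is set."""
    for name, idx in HAIR_COLOR_INDICES.items():
        if attr_row.get(name, -1) == 1:
            return idx
    return None
```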
Problem 3: with digits, our cGAN simply took a latent vector and a label, fed them to the generator, and output (ideally) a synthetic representation of the label. Since we are working with images, however, what do we expect the generator to output exactly? If we feed it the label "blond hair", then I expect the model to output the input image with blond hair! So what is the discriminator outputting? Recall that the discriminator determines whether an output is real or fake (it is not interested in the actual value itself). With digits, we pass an image and its corresponding label to the discriminator and return 0 (fake) or 1 (real). The discriminator learns by being penalized for thinking the image is fake when in fact it is real, or vice versa.
Similarly, with face images, we pass the transformed (or original) face image and the corresponding label (property index) to the discriminator and return 0 (fake) or 1 (real). And similarly, the discriminator learns by being penalized for thinking the image is fake when in fact it is real, or when the image is transformed (say, a black-haired person is not a blond-haired person) and the discriminator thinks it is real.
So, when we pass real images, we expect the discriminator to output 1; when we pass fake images, we expect it to output 0. The generator learns by being penalized when it fails to create well-represented images, i.e. when the discriminator catches them as fake.
Summary of training:
input: face image + label (property index)
output: 0 or 1 for fake or real
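The training summary above can be sketched as batch construction for the conditional discriminator: real image + matching label targets 1, generated image + requested label targets 0, and (per the transformed-image case above) real image + wrong label also targets 0 so the discriminator checks attribute consistency. Function and argument names are illustrative:

```python
def discriminator_batch(real_pairs, fake_pairs, mismatched_pairs):
    """Each argument is a list of (image, label) tuples; returns
    (image, label, target) triples for one discriminator training step."""
    batch = []
    batch += [(img, lab, 1) for img, lab in real_pairs]        # real, matching label
    batch += [(img, lab, 0) for img, lab in fake_pairs]        # generated image
    batch += [(img, lab, 0) for img, lab in mismatched_pairs]  # real, wrong label
    return batch
```

In a real training loop these triples would be stacked into arrays and fed to the discriminator under a binary cross-entropy loss.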
One last problem... with digits we have mutual exclusiveness; now we don't. So the discriminator might initially learn to believe these classes are mutually exclusive (this might not be true, but I'm just considering it).