hairy_gan's People

Contributors: ttury

hairy_gan's Issues

custom hair colour

We will try different hair colours.

Our first approach is to use computer vision to colour the hair directly by injecting some noise.

Our second approach is to train a hair-colour recognizer and build a GAN that takes a hair image and converts it to the desired colour. We verify the result with the recognizer. The loss will simply be binary_crossentropy (checking that the colours match).
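The verification step above can be sketched in plain Python. This is a minimal, self-contained stand-in: the recognizer probabilities and the three-colour one-hot encoding are made up for illustration, not taken from the actual model.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between target and predicted colour labels."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical recognizer output for a translated image vs. the requested colour.
requested = [0, 1, 0]            # one-hot target, e.g. [black, blond, brown]
recognized = [0.10, 0.85, 0.05]  # recognizer probabilities (made up)
loss = binary_crossentropy(requested, recognized)
```

A low loss means the recognizer agrees the translated image has the requested colour; a framework implementation (e.g. Keras's `binary_crossentropy`) would replace this in practice.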

cgan for hair color only

The previous post I made was unsuccessful, mainly because CycleGAN (and Augmented CycleGAN) limits us to two domains. We focus instead on a single conditional GAN.

style_transfer

The implementation is not working; look for a second implementation to use as a reference.

black hair => blonde hair GAN

  1. separate images with black hair (A) and images with blond hair (B) from the training set
  • wrap the dataset info into a class object
  • split to get the training set
  • filter for mustaches; take the complement to get the no-mustache set
  2. recreate CycleGAN from scratch
  3. train CycleGAN on datasets A and B
  • consider data augmentation for the datasets
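The dataset-splitting step above can be sketched as follows. The attribute rows, file names, and column names are hypothetical CelebA-style examples (1 = attribute present, -1 = absent), not the actual dataset code.

```python
# Hypothetical CelebA-style attribute rows: 1 = attribute present, -1 = absent.
rows = [
    {"file": "001.jpg", "Black_Hair": 1,  "Blond_Hair": -1, "Mustache": -1},
    {"file": "002.jpg", "Black_Hair": -1, "Blond_Hair": 1,  "Mustache": -1},
    {"file": "003.jpg", "Black_Hair": 1,  "Blond_Hair": -1, "Mustache": 1},
]

def split_domains(rows):
    """Domain A: black hair, domain B: blond hair; both restricted to the
    no-mustache complement of the mustache filter."""
    no_mustache = [r for r in rows if r["Mustache"] == -1]
    domain_a = [r["file"] for r in no_mustache if r["Black_Hair"] == 1]
    domain_b = [r["file"] for r in no_mustache if r["Blond_Hair"] == 1]
    return domain_a, domain_b

domain_a, domain_b = split_domains(rows)
```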

Prototype Design

So while we wait for the models to train, we will focus on creating a web app or some other medium for interacting with the model(s).

Phase 2: How do we use the HairyGAN model?

We want to run this model so that video output (whether a recording or real time) gives the translated form. Since we only trained on frontal face images, we do not have the flexibility to handle side or back views. This limits the robustness of our model.

attribute ambiguity

Suppose we have an image of a person with blond hair. If our requested attributes have a -1 for blond hair, what does that mean for the translated image?

We have been interpreting -1 to mean we should get rid of the blond hair, but this is ambiguous: we don't know what hair the translated image should have. Instead, the requested attributes should keep the blond hair (i.e. set it to 1) UNLESS another hair color was requested.

For example:

  1. Bald image with requested attributes that include no hair change => Bald image

  2. Bald image with requested attributes that include a hair change => image with the requested hair color
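The rule above can be sketched as a small attribute-resolution function. The hair-colour column names and the `resolve_hair` helper are illustrative, not part of the actual codebase.

```python
HAIR_COLORS = ["Black_Hair", "Blond_Hair", "Brown_Hair", "Gray_Hair", "Bald"]

def resolve_hair(original, requested):
    """Keep the original hair colour (set to 1) unless the request
    explicitly asks for a different hair colour."""
    out = dict(requested)
    new_colour = any(
        requested.get(c) == 1 and original.get(c) != 1 for c in HAIR_COLORS
    )
    if not new_colour:
        for c in HAIR_COLORS:
            out[c] = original.get(c, -1)  # no colour change requested: keep as-is
    return out

bald = {"Bald": 1, "Blond_Hair": -1}
no_change = resolve_hair(bald, {"Bald": -1, "Blond_Hair": -1})  # stays bald
to_blond = resolve_hair(bald, {"Bald": -1, "Blond_Hair": 1})   # becomes blond
```

This makes the two example cases explicit: a request with no hair change preserves Bald, while a request with a new colour overrides it.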

no glasses feature

I really cannot tell what a person looks like without glasses. Make this a feature!

Augmented CycleGAN

Performs many-to-many mappings between TWO domains. For example, the edge map of a shoe yields multiple differently colored shoes instead of one (as in CycleGAN).

This will not work when we have more than two domains to consider!

restoring resolution

Running the decoder yields low-resolution (128x128) predictions. We need to restore the original resolution.
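As a minimal sketch of the restoration step, here is a nearest-neighbour upscale over a 2D pixel grid; the tiny grid stands in for a 128x128 prediction. A real pipeline would more likely use bicubic resizing (e.g. Pillow's `Image.resize`) or a super-resolution model.

```python
def upscale_nearest(img, factor):
    """Nearest-neighbour upscale of a 2D pixel grid by an integer factor."""
    h, w = len(img), len(img[0])
    return [[img[y // factor][x // factor] for x in range(w * factor)]
            for y in range(h * factor)]

small = [[1, 2], [3, 4]]              # stand-in for a 128x128 prediction
restored = upscale_nearest(small, 2)  # stand-in for restoring 256x256
```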

ML issues

  1. Tried to test my model on a smaller set of images and on only one feature: baldness.

After training, the model correctly predicts the value of Bald for original images, but always predicts Bald for translated images. This matches our goal, since we set the new attributes to always be Bald. However, the images do not look bald! This might be due to having too small a dataset, so the model identifies Bald by a different feature from what we understand as Bald.

(image: model_comparison)

Observe in the Bald row that we actually do not have a lot of baldness, so this might in fact be what the model has learned "bald" to be...

Lesson: verify whether the output is expected based on the original model.

cgan for hair colour

  1. Modify the existing GAN to use conditional generators and discriminators.
  2. Ideally we want a binary vector input representing a combination of attributes. We extract these labels from the property columns, keeping only whether they are 0 or 1 (in our case, -1 or 1).
  3. Training will use these binary vector inputs.

Problem: given 40 features, we have 2^40 (approx. 1.1 trillion) combinations... far too many to train on.

Instead, we can train on one feature at a time and encode each feature's label as the index at which it appears. For example, if blond hair is the 27th property, we use the label 27.

For training, since each image has multiple features, we train on one feature at a time. For example, suppose we are only interested in turning anyone's hair blond; then we simply ignore the other features and look only at the blond-hair column of the inputs. Once the generator becomes good at generating blond hair, we can move on to a different feature.
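The index-as-label encoding above can be sketched as follows. The shortened attribute list is hypothetical (the real attribute file has 40 columns); only the encoding scheme is the point.

```python
# Hypothetical (shortened) ordering of the attribute columns.
ATTRIBUTES = ["Bald", "Black_Hair", "Blond_Hair", "Brown_Hair", "Mustache"]

def attribute_label(name):
    """Encode a feature as the index at which its column appears."""
    return ATTRIBUTES.index(name)

combinations = 2 ** 40  # ~1.1 trillion joint on/off combinations over 40 features
blond_label = attribute_label("Blond_Hair")
```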

Problem 2: what happens if an image doesn't have blond hair? Then we have a -1 property value, and what would the generator generate? With digits, every input value was something we could generate; "non-blond hair" isn't. At the very least, we should define it more clearly, for example by making black hair the default.

Since hair colors are mutually exclusive, every image in the dataset is true for exactly one of them. So we should find the index corresponding to the hair color the image actually has, so the generator trains on a real hair color (instead of potentially returning gibberish "non-blond" hair).
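The lookup described above can be sketched as a small helper; the hair-colour column names and the `hair_color_label` function are illustrative, assuming the mutual-exclusivity claim holds for the dataset.

```python
# Hair-colour columns, assumed mutually exclusive in the dataset.
HAIR_COLORS = ["Black_Hair", "Blond_Hair", "Brown_Hair", "Gray_Hair"]

def hair_color_label(attrs):
    """Return the index of the hair colour the image actually has,
    so a -1-for-blond image still maps to a real colour label."""
    for i, colour in enumerate(HAIR_COLORS):
        if attrs.get(colour) == 1:
            return i
    raise ValueError("no hair colour attribute set")

label = hair_color_label({"Black_Hair": 1, "Blond_Hair": -1})
```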

Problem 3: with digits, our cGAN simply took a latent vector and a label, fed them to a generator, and output (ideally) a synthetic representation of the label. Since we are working with face images, what exactly do we expect the generator to output? If we feed it the label "blond hair", I expect the model to output a version of the input image with blond hair! So what does the discriminator output? Recall the discriminator determines whether an output is real or fake (it is not interested in the actual value itself). With digits, we pass an image and its corresponding label to the discriminator and it returns 0 (fake) or 1 (real). The discriminator learns by being penalized for judging a real image fake, or vice versa.

Similarly, with face images, we pass the transformed (or original) face image and the corresponding label (property index) to the discriminator, which returns 0 (fake) or 1 (real). And similarly, the discriminator learns by being penalized when it judges a real image fake, or when the image is transformed (say, a black-haired person presented as blond) and it judges it real.

So when we pass real images, we expect the discriminator to output 1; when we pass fake images, we expect it to output 0. The generator learns by being penalized when it fails to create convincing images.

Summary of training:

input: face image + label (property index)
output: 0 or 1 for fake or real
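The summary above can be sketched as label construction for one discriminator batch. The placeholder strings and the label value are illustrative stand-ins for face tensors and a property index.

```python
def discriminator_batch(real_pairs, fake_pairs):
    """Pair each (image, property-index) sample with its target:
    1 for real dataset images, 0 for generator outputs."""
    inputs = real_pairs + fake_pairs
    targets = [1] * len(real_pairs) + [0] * len(fake_pairs)
    return inputs, targets

# Placeholder strings stand in for face tensors; 2 = blond-hair index, say.
real = [("real_img_0", 2), ("real_img_1", 2)]
fake = [("fake_img_0", 2)]
inputs, targets = discriminator_batch(real, fake)
```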

One last problem: with digits we had mutual exclusiveness; now we don't. So the discriminator might initially learn to believe these classes are mutually exclusive (this might not be true, but I'm just considering it).
