Giter Club home page Giter Club logo

text-to-image-generator-gan's Introduction

Text to Image Synthesizer using Generative Adversarial Networks

The purpose of this project is to train a generative adversarial network (GAN) to generate images from textual description of the image. For this particular project, I have used flower images from the Oxford 102 Flower Dataset.

Data:

The data of the image description was obtained from here. The image caption data link was obtained from the following github.

The flowers dataset has 102 categories of flower images. Each category has 40-258 images. The total number of flower images-description pairs used in this project is 8100.

The image descriptions were processed into character level embeddings using the 300 D GloVe embeddings. You can obtain them from here. I used the embeddings created on Wikipedia 2015 and Gigaword-5 data.

The images of the flowers are as below: Flowers

The captions are as below: Captions

Architecture:

This is an experimental implementation of synthesizing images. The images are synthesized using the GAN-CLS Algorithm from the paper Generative Adversarial Text-to-Image Synthesis. However, we have not used Skip-Thoughts vectors, instead, we tried the implementation using the GloVe embeddings.

Model architecture

Image Source : Generative Adversarial Text-to-Image Synthesis Paper.

Note: We have tried to implement the GAN architecture as best as we could. It is not guaranteed to be the exact same as the one in the paper.

Implementation:

The iPython notebook has comments indicating what each cell does. You can then execute the cells in text_to_image_synthesizer_glove.ipynb to see how everything works.

The notebook has 5 parts:

  • Data Pre-processing:
    • Converting the images into numpy arrays based on the pre-set pixel size and storing them.
    • Converting the image description into embeddings and storing them.
    • Sotring the captions in a csv file.
  • Loading and Combining:
    • Loading all the images' numpy arrays and appending it to form the image data.
    • Loading the image description embedding numpy array.
  • Data modeling:
    • Creating the discrimintor and generator networks.
    • Functions for calculating the discriminator and generator loss.
    • Setting the optimizer and learning rate.
  • Training:
    • Function for the train step, that creates the images, calculates the loss and adjusts the gradients.
    • Function for the train, it fetches the batch data and passes it to train_step and collects all the metrics.
  • Results:
    • Function for testing the output from the generator given an input noise and a caption

Result:

Result1

Note: Addtional references in the iPython notebook.

text-to-image-generator-gan's People

Contributors

utsav-195 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

text-to-image-generator-gan's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.