dgai's Introduction

Demystify Generative AI

“Any sufficiently advanced technology is indistinguishable from magic.”

--Arthur C. Clarke (author of 2001: A Space Odyssey)

text, music, image, figure, and pattern generation in PyTorch

A 17-chapter series on creating images, text, music, figures, and patterns in PyTorch. The series shows how to:

  • Create a ChatGPT-style large language model from scratch to generate text that can pass as human-written
  • Generate images that are indistinguishable from real photos
  • Compose music that sounds real to any listener
  • Create patterns such as a sequence of odd numbers, multiples of five, ...
  • Generate data that mimic certain shapes: sine curve, cosine shape, hyperbola graph
  • Control the latent space to generate images with certain attributes: men with glasses, women with glasses, gradual transitions from men with glasses to men without glasses, or from women without glasses to women with glasses...
  • Style transfer: convert a horse image to a zebra...

Chapter 1: Introduction to PyTorch

Chapter 2: Deep Learning with PyTorch

Chapter 3: Generative Adversarial Networks (GANs)

Most of the generative models in this book belong to a framework called Generative Adversarial Networks (GANs). This chapter introduces the basic idea behind GANs and shows how to use the framework to generate data samples that form an inverted-U shape. By the end of the chapter, you'll be able to generate data that mimics other shapes as well: sine, cosine, quadratic, and so on.

[Image: invertedU]
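To make the inverted-U target concrete, here is a minimal sketch of how such training samples might be prepared before a GAN sees them. It uses plain Python rather than PyTorch so it runs without extra dependencies. The function name `make_inverted_u`, the curve `y = 1 - x**2`, and the x range are illustrative assumptions, not the settings used in the chapter.

```python
import random


def make_inverted_u(n_samples, x_min=-1.0, x_max=1.0, seed=0):
    """Return n_samples (x, y) pairs lying on the curve y = 1 - x**2.

    The curve and the x range are assumptions chosen for illustration.
    """
    rng = random.Random(seed)  # seeded so the samples are reproducible
    points = []
    for _ in range(n_samples):
        x = rng.uniform(x_min, x_max)
        points.append((x, 1.0 - x * x))
    return points


# "Real" samples that a discriminator would be asked to tell apart from fakes.
real_samples = make_inverted_u(1000)
print(real_samples[:3])
```

Swapping the formula inside the loop (for example, for `math.sin(x)`) would give training data for a different shape, which is the sense in which the same recipe carries over to sine, cosine, and quadratic curves.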

Chapter 4: Pattern Generation with GANs

You'll learn how to use a GAN to generate a sequence of numbers that follows a certain pattern. We'll generate multiples of five, but you can change the pattern to multiples of two, three, seven, or any other number. This is the output from a trained GAN:

tensor([25, 0, 30, 40, 25, 35, 10, 30, 10, 0], device='cuda:0')

All numbers are multiples of five!
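A quick way to confirm that claim, and to adapt the check when you change the pattern, is a small divisibility test. The sketch below copies the values from the tensor shown above into a plain Python list; the helper name `all_multiples_of` is hypothetical.

```python
# Values copied from the generated tensor shown above.
generated = [25, 0, 30, 40, 25, 35, 10, 30, 10, 0]


def all_multiples_of(values, k):
    """Return True when every value is divisible by k."""
    return all(v % k == 0 for v in values)


print(all_multiples_of(generated, 5))   # True
print(all_multiples_of(generated, 10))  # False, because 25 and 35 are not multiples of ten
```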

Chapter 5: Image Generation with GANs

Generate images without using convolutional layers:

[Image: imageGAN]

Chapter 6: High Resolution Image Generation with Deep Convolutional GANs

Use a deep convolutional GAN to generate color images:

[Image: anime]

You can also control attributes, for example by transitioning from red hair to black hair:

[Image: attribute]

Chapter 7: Conditional GAN and Wasserstein GAN

Use the Wasserstein distance to stabilize training, and add labels to generate specific types of images. For example, faces without glasses over the course of training: https://gattonweb.uky.edu/faculty/lium/ml/noglasses.gif
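As a rough illustration of the Wasserstein idea, the sketch below writes out the two loss terms commonly used in a Wasserstein GAN: the critic is trained to score real samples higher than generated ones, and the generator is trained to raise the critic's score on its output. It uses plain Python lists in place of PyTorch tensors, the scores are made-up numbers, and the constraint a real critic needs (weight clipping or a gradient penalty) is left out.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss.

    Minimizing it pushes scores on real samples up and scores on
    generated samples down.
    """
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """Wasserstein generator loss: minimizing it raises the critic's score on fakes."""
    return -mean(fake_scores)


# Hypothetical critic outputs for a batch of three real and three fake samples.
real_scores = [0.9, 1.1, 1.0]
fake_scores = [-0.5, -0.3, -0.4]

print(critic_loss(real_scores, fake_scores))
print(generator_loss(fake_scores))
```

Unlike the binary cross-entropy loss of a standard GAN, these scores are unbounded real numbers rather than probabilities, which is part of why the critic has to be constrained in practice.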

Chapter 8: CycleGAN

Convert horses to zebras:

[Image: horse-to-zebra conversion]

Chapter 9: Introduction to Variational Autoencoders

Chapter 10: Attribute-Control in Variational Autoencoders

Train a variational autoencoder (VAE) to generate color images of human faces. Control the encodings to generate images with certain attributes, for example images that gradually transition from faces with glasses to faces without glasses. If you take the encodings of men with glasses, subtract the encodings of men without glasses, and add the encodings of women without glasses, you'll generate images of women with glasses. The whole experience feels like something straight out of science fiction, hence the opening quote by the science fiction writer Arthur C. Clarke: “Any sufficiently advanced technology is indistinguishable from magic.”
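The encoding arithmetic and the gradual transition described above both come down to simple vector operations in the latent space. The sketch below shows them on toy three-dimensional vectors in plain Python. The numbers, the variable names, and the helpers `encoding_arithmetic` and `interpolate` are all hypothetical; in a trained VAE each encoding would typically be an average of encoder outputs over a group of images, and each resulting vector would be passed through the decoder to produce an image.

```python
def encoding_arithmetic(a, b, c):
    """Return a - b + c, computed element by element."""
    return [x - y + z for x, y, z in zip(a, b, c)]


def interpolate(start, end, steps):
    """Return `steps` vectors moving linearly from start to end (steps >= 2)."""
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 to 1.0
        path.append([(1 - t) * s + t * e for s, e in zip(start, end)])
    return path


# Toy encodings (made-up values): think of the second coordinate as "glasses".
men_glasses = [1.0, 1.0, 0.2]
men_no_glasses = [1.0, 0.0, 0.2]
women_no_glasses = [0.0, 0.0, 0.5]

# men with glasses - men without glasses + women without glasses
women_glasses = encoding_arithmetic(men_glasses, men_no_glasses, women_no_glasses)
print(women_glasses)  # [0.0, 1.0, 0.5]

# Five encodings that move gradually from "with glasses" to "without glasses".
for z in interpolate(women_glasses, women_no_glasses, 5):
    print(z)
```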

To give you an idea of what the chapter will accomplish, here is the transition from women with glasses to women without glasses:

[Image: Transition from women without glasses to men without glasses]

Two examples of encoding arithmetic:

[Images: two examples of encoding arithmetic]

Chapter 11: Text Generation with Character-Level LSTM

Chapter 12: Text Generation with Word-Level LSTM

Chapter 13: A Line-by-Line Implementation of Attention and Transformer

Chapter 14: Create A GPT from Scratch

Below is the text generated by the model with prompt "The city of Lexington in the state of Kentucky":

The city of Lexington in the state of Kentucky, is also offering a $300 award to "Owner" PANZER-KATZ FOR BEST LENGTH OF STREET CARS, or just "For the Best Street Car" in any of their three categories:

4WD (3.5 miles or less)

6WD (3.5 miles or more)

FWD (3.5 miles or more)

And this is for the BEST street car in the 4WD category:

The "Neato" (pronounced "Nice")

What is it with those 4WD cars and their "Neato" names? This is probably one of the most well-know 4WD names in the history of 4WD cars. It is so well known that there are a multitude of books dedicated to the design and specifications of "Nice" 4WD cars, such as this one from Michael B. Smith, which is a good read.

But as of right

Chapter 15: Train a ChatGPT-Style Transformer

Chapter 16: MuseGAN

Train a generative adversarial network (GAN) to produce music. Here is a sample of the generated music: https://gattonweb.uky.edu/faculty/lium/ml/MuseGAN_song.mp3

Chapter 17: Music Transformer

Train a ChatGPT-style transformer to generate music. Here is a sample of the generated music: https://gattonweb.uky.edu/faculty/lium/ml/musicTrans.mp3

dgai's People

Contributors

markhliu
