

TensorFlow CTGAN

TensorFlow 2.1 implementation of Conditional Tabular GAN.


TensorFlow 2.1 implementation of a Conditional Tabular Generative Adversarial Network. CTGAN is a GAN-based data synthesizer that can "generate synthetic tabular data with high fidelity".

This model was originally designed by the Data to AI Lab team at MIT and published in their NeurIPS paper, Modeling Tabular data using Conditional GAN.

For more information about this work, and to access the original PyTorch implementation provided by the authors, please refer to their GitHub repository and their documentation.

Install

Requirements

At the moment, CTGAN has only been tested on Python 3.7 and TensorFlow 2.2.

  • tensorflow (<2.3,>=2.1.0)
  • tensorflow-probability (<0.11.0,>=0.9.0)
  • scikit-learn (<0.23,>=0.21)
  • numpy (<2,>=1.17.4)
  • pandas (<1.0.2,>=1.0)
  • tqdm (<4.44,>=4.43)

Install

You can either install ctgan-tf through the PyPI package:

pip3 install ctgan-tf

Or clone this repository and either copy the ctgan folder into your project folder or simply run:

make install

Data Format

CTGAN expects the input data to be a table given as either a numpy.ndarray or a pandas.DataFrame object with two types of columns:

  • Continuous columns: columns that contain numerical values and can take any value.
  • Discrete columns: columns that contain only a finite number of possible values, whether or not these are strings.
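As an illustration of the two column types, here is a minimal sketch that builds a small pandas.DataFrame and guesses which columns are discrete from their dtypes. The helper infer_discrete_columns and its max_unique parameter are hypothetical and not part of ctgan-tf; the dtype rule is only a heuristic, so review the resulting list by hand before training.

```python
import pandas as pd


def infer_discrete_columns(df, max_unique=None):
    """Guess which columns are discrete (hypothetical helper, not a ctgan-tf API).

    Non-numeric columns (strings, categoricals, datetimes) and boolean columns
    are treated as discrete. If max_unique is given, numeric columns with at
    most that many distinct values are treated as discrete as well.
    """
    discrete = []
    for name in df.columns:
        col = df[name]
        # Booleans count as numeric in pandas, so they are checked separately.
        non_numeric = (not pd.api.types.is_numeric_dtype(col)
                       or pd.api.types.is_bool_dtype(col))
        few_values = max_unique is not None and col.nunique() <= max_unique
        if non_numeric or few_values:
            discrete.append(name)
    return discrete


# Toy table: two continuous columns followed by two discrete ones.
toy = pd.DataFrame({
    'age': [39.0, 50.0, 38.0, 53.0],
    'hours-per-week': [40.5, 13.0, 40.0, 45.25],
    'sex': ['Male', 'Male', 'Female', 'Female'],
    'income': ['<=50K', '<=50K', '>50K', '<=50K'],
})

toy_discrete = infer_discrete_columns(toy)
```

Integer-coded categories (for example a 0/1 flag) are numeric, so they are only picked up when max_unique is set; otherwise add them to the list manually.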

Quickstart

Before being able to use CTGAN you will need to prepare your data as specified above.

For this example, we will load some data using the ctgan.utils.load_demo function.

from ctgan.utils import load_demo

data, discrete_columns = load_demo()

The demo already returns the list of discrete columns. For your own data, you will need to provide, alongside the data itself, a list with the names of the discrete columns:

discrete_columns = [
    'workclass',
    'education',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'native-country',
    'income'
]

Once you have the data ready, import the CTGANSynthesizer class, create an instance, and call its train method, passing your data and the list of discrete columns.

from ctgan.synthesizer import CTGANSynthesizer

ctgan = CTGANSynthesizer()
ctgan.train(data, discrete_columns)

Once the process has finished, all you need to do is call the sample method of your CTGANSynthesizer instance indicating the number of rows that you want to generate.

samples = ctgan.sample(1000)

The output will be a table with the exact same format as the input and filled with the synthetic data generated by the model.
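One way to sanity-check this on your own data is to compare the synthetic table against the original. The sketch below assumes both tables are pandas.DataFrame objects; check_synthetic_format is a hypothetical helper, not part of ctgan-tf, and it only checks column names and the categories of discrete columns.

```python
import pandas as pd


def check_synthetic_format(real, synthetic, discrete_columns):
    """Return a list of problems found; an empty list means the formats match.

    Hypothetical helper, not a ctgan-tf API.
    """
    problems = []
    # The synthetic table should have the same columns in the same order.
    if list(real.columns) != list(synthetic.columns):
        problems.append('column names or order differ')
        return problems
    # Each discrete column should only contain categories seen in the real data.
    for name in discrete_columns:
        unseen = set(synthetic[name].unique()) - set(real[name].unique())
        if unseen:
            labels = sorted(str(value) for value in unseen)
            problems.append('{}: unseen categories {}'.format(name, labels))
    return problems
```

With the quickstart variables, this would be called as check_synthetic_format(data, samples, discrete_columns).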

For a more in-depth guide and API specification, check our documentation.
