
TFSnippet

Build: https://travis-ci.org/haowen-xu/tfsnippet | Coverage: https://coveralls.io/github/haowen-xu/tfsnippet | Docs: https://tfsnippet.readthedocs.io/

TFSnippet is a set of utilities for writing and testing TensorFlow models.

The design philosophy of TFSnippet is non-interference: it provides a set of useful utilities that can be used alongside any other TensorFlow libraries and frameworks.

Dependencies

TensorFlow >= 1.5

Installation

pip install git+https://github.com/thu-ml/zhusuan.git
pip install git+https://github.com/haowen-xu/tfsnippet.git
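
You can verify that both packages were installed correctly with a plain import check:

python -c "import zhusuan, tfsnippet"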

Documentation

The full documentation is hosted on ReadTheDocs: https://tfsnippet.readthedocs.io/

Quick Tutorial

Distributions

If you use tfsnippet.distributions to obtain random samples, you get enhanced tensor objects, from which you can compute the log-likelihood by simply calling log_prob().

from tfsnippet.distributions import Normal

normal = Normal(0., 1.)
# The type of `samples` is :class:`tfsnippet.stochastic.StochasticTensor`.
samples = normal.sample(n_samples=100)
# You may obtain the log-likelihood of `samples` under `normal` by:
log_prob = samples.log_prob()
# You may also obtain the distribution instance back from the samples,
# so you can safely "fire and forget" the distribution object:
distribution = samples.distribution
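
Both tensors can be evaluated like ordinary TensorFlow tensors.  A minimal sketch, assuming a standard TensorFlow 1.x session (the printed shapes are what one would expect for 100 scalar samples):

import tensorflow as tf

with tf.Session() as sess:
    # `samples` and `log_prob` from the snippet above are tensor-like,
    # so they can be fetched directly.
    samples_val, log_prob_val = sess.run([samples, log_prob])
    print(samples_val.shape, log_prob_val.shape)  # expect (100,) (100,)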

Distributions from ZhuSuan can be cast into a tfsnippet.distributions.Distribution, in case we haven't provided a wrapper for a particular ZhuSuan distribution:

import zhusuan
from tfsnippet.distributions import as_distribution

uniform = as_distribution(zhusuan.distributions.Uniform())
# The type of `samples` is :class:`tfsnippet.stochastic.StochasticTensor`.
samples = uniform.sample(n_samples=100)
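
Since the wrapped distribution also produces StochasticTensor samples, the log-likelihood is obtained exactly as before:

# Same interface as with the built-in distributions.
log_prob = samples.log_prob()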

Data Flows

It is common practice to iterate through a dataset in mini-batches. The tfsnippet.dataflow package provides a unified interface for assembling mini-batch iterators.

from tfsnippet.dataflow import DataFlow

# Obtain a shuffled, two-array data flow, with batch-size 64.
# Any batch with fewer than 64 samples will be discarded.
flow = DataFlow.arrays(
    [x, y], batch_size=64, shuffle=True, skip_incomplete=True)
for batch_x, batch_y in flow:
    ...  # Do something with batch_x and batch_y

# You may use a threaded data flow to prefetch the mini-batches
# in a background thread.  The threaded flow is a context object,
# where exiting the context would destroy the background thread.
with flow.threaded(prefetch=5) as threaded_flow:
    for batch_x, batch_y in threaded_flow:
        ...  # Do something with batch_x and batch_y

# If you use MLToolkit (https://github.com/haowen-xu/mltoolkit),
# you can even load data from MongoDB via a data flow.  Suppose you
# have stored all images from ImageNet into a GridFS (of MongoDB),
# along with the labels stored as ``metadata.y``.
# You may then iterate through ImageNet in batches by:
from mltoolkit.datafs import MongoFS

fs = MongoFS('mongodb://localhost', 'imagenet', 'train')
with fs.as_flow(batch_size=64, with_names=False, meta_keys=['y'],
                shuffle=True, skip_incomplete=True) as flow:
    for batch_x, batch_y in flow:
        ...  # Do something with batch_x and batch_y.  batch_x is the
             # raw content of images you stored into the GridFS.
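
To make the batching behavior concrete, here is a small self-contained run on toy NumPy arrays (the array sizes are chosen only for illustration):

import numpy as np
from tfsnippet.dataflow import DataFlow

x = np.arange(200)
y = np.arange(200) * 2
flow = DataFlow.arrays([x, y], batch_size=64, shuffle=True,
                       skip_incomplete=True)
for batch_x, batch_y in flow:
    # Prints (64,) three times; the remaining 8 samples are
    # discarded because of `skip_incomplete=True`.
    print(batch_x.shape)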

Training

After you've built the model and obtained the training operation, you can quickly run a training loop using utilities from tfsnippet.scaffold and tfsnippet.trainer.

from tfsnippet.dataflow import DataFlow
from tfsnippet.scaffold import TrainLoop
from tfsnippet.trainer import Trainer, Evaluator, AnnealingDynamicValue

input_x = ...  # the input x placeholder
input_y = ...  # the input y placeholder
loss = ...  # the training loss
params = tf.trainable_variables()  # the trainable parameters

# Use learning-rate annealing: start from 0.001 and multiply the
# learning rate by 0.99995 after every step.
learning_rate = tf.placeholder(shape=(), dtype=tf.float32)
learning_rate_var = AnnealingDynamicValue(0.001, 0.99995)

# Build the training operation by AdamOptimizer
optimizer = tf.train.AdamOptimizer(learning_rate)
train_op = optimizer.minimize(loss, var_list=params)

# Build the training data-flow
train_flow = DataFlow.arrays(
    [train_x, train_y], batch_size=64, shuffle=True, skip_incomplete=True)
# Build the validation data-flow
valid_flow = DataFlow.arrays([valid_x, valid_y], batch_size=256)

with TrainLoop(params, max_epoch=max_epoch, early_stopping=True) as loop:
    trainer = Trainer(loop, train_op, [input_x, input_y], train_flow,
                      metrics={'loss': loss})
    # Anneal the learning-rate after every step by 0.99995.
    trainer.anneal_after_steps(learning_rate_var, freq=1)
    # Do validation and apply early-stopping after every epoch.
    trainer.evaluate_after_epochs(
        Evaluator(loop, loss, [input_x, input_y], valid_flow),
        freq=1
    )
    # You may log the learning-rate after every epoch by adding a callback
    # hook.  Of course, you may add any other callbacks as well.
    trainer.after_epochs.add_hook(
        lambda: trainer.loop.collect_metrics(lr=learning_rate_var),
        freq=1
    )
    # Print training metrics after every epoch.
    trainer.log_after_epochs(freq=1)
    # Run all the training epochs and steps.
    trainer.run(feed_dict={learning_rate: learning_rate_var})
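
Note that feed_dict maps the learning-rate placeholder to the AnnealingDynamicValue object rather than to a constant, so the trainer picks up the freshly annealed value before every step.  Conceptually, the annealing boils down to the following (an illustrative sketch, not tfsnippet's actual internals):

class ConceptualAnnealingValue(object):
    """Illustration only: holds a scalar and shrinks it on demand."""

    def __init__(self, initial_value, ratio):
        self.value = initial_value
        self.ratio = ratio

    def anneal(self):
        # Invoked by the after-step hook registered above.
        self.value *= self.ratio

v = ConceptualAnnealingValue(0.001, 0.99995)
v.anneal()
print(v.value)  # 0.00099995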

Pre-trained Models

The tfsnippet.applications package provides useful utilities to load and use pre-trained models (most of which are third-party models).

import imageio

from tfsnippet.applications import InceptionV3

# Model from https://www.tensorflow.org/tutorials/image_recognition
inception_v3 = InceptionV3()
image_data = imageio.imread('path-to-image.jpg')
class_proba = inception_v3.predict_proba([image_data])[0]
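
predict_proba returns one probability vector per input image, so the most likely class index can be read off with NumPy (a small follow-up example; mapping the index to a human-readable label is not shown here):

import numpy as np

top_class = int(np.argmax(class_proba))
print(top_class, class_proba[top_class])  # index and probability of the top class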

Math Operations

The tfsnippet.nn package provides numerically stable implementations of many advanced neural-network math operations. It supports both NumPy and TensorFlow: you obtain the operation for a particular backend by passing tfsnippet.nn.npyops (for NumPy) or tfsnippet.nn.tfops (for TensorFlow) as the first argument ops of each math operation function.

from tfsnippet.nn import npyops, tfops
from tfsnippet.nn import log_sum_exp, log_softmax

# Compute :math:`\log \sum_{k=1}^K \exp(x_k)` by TensorFlow
log_sum_exp(tfops, x, axis=-1)
# Compute :math:`\log \frac{\exp(x_k)}{\sum_i \exp(x_i)}` by NumPy
log_softmax(npyops, logits)
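
To see why numerical stability matters, compare against the naive formula on large inputs (a small NumPy check; the expected value follows from the standard max-shift trick):

import numpy as np
from tfsnippet.nn import npyops, log_sum_exp

x = np.array([1000., 1000., 1000.])
# The naive formula overflows: np.log(np.sum(np.exp(x))) gives inf.
print(log_sum_exp(npyops, x, axis=-1))                # ~1001.0986 (= 1000 + log 3)
print(x.max() + np.log(np.sum(np.exp(x - x.max()))))  # same value, computed by hand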
