genome

Read the writeup on my website!

Binary neural networks have a low memory footprint and run crazy fast. Let's use that to speed up reinforcement learning.

For example, this network is 600 bytes and performs 500,000 evaluations/second on my laptop CPU:

[Image: Binary Cartpole]

Theory

The goal of this project is to train binary neural networks directly in a reinforcement learning environment using natural evolution strategies. Binary networks are a great fit for RL because:

Binary neural networks

The networks implemented here are a modification of XNOR-Nets. Each layer's weights, inputs, and outputs are constrained to be vectors of +/-1 values, which are encoded as bits, and each layer uses the sign function as its nonlinearity. A layer computes the function f(x; W, b) = \text{sign}(W^T x + b).
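As an unoptimized reference for the layer function above, here is a minimal pure-Python sketch. It is not the project's implementation: the names `sign` and `binary_layer` are hypothetical, the bias is assumed to be an integer, and mapping sign(0) to +1 is an assumption the text does not settle.

```python
from typing import List


def sign(v: int) -> int:
    """Sign nonlinearity onto +/-1. Mapping 0 to +1 is an assumption."""
    return 1 if v >= 0 else -1


def binary_layer(x: List[int], W: List[List[int]], b: List[int]) -> List[int]:
    """Compute sign(W^T x + b), where W[i][j] connects input i to output j."""
    n_in, n_out = len(W), len(W[0])
    assert len(x) == n_in and len(b) == n_out
    return [
        sign(sum(W[i][j] * x[i] for i in range(n_in)) + b[j])
        for j in range(n_out)
    ]


# Three +/-1 inputs, two outputs; the bias values are illustrative only.
x = [1, -1, 1]
W = [[1, -1], [1, 1], [-1, 1]]
b = [0, 2]
y = binary_layer(x, W, b)
```

Every input, weight, and output stays in {+1, -1}, so layers can be chained without any rescaling between them.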

In exchange for these constraints, the networks have an extremely fast forward pass: the dot product of two +/-1 vectors x and y, encoded as bits, is n_bits - 2 * popcount(x XOR y), which can be computed in just a few clock cycles. Baking the subtraction and the bias into the comparison for the sign function speeds up inference even more. Each activation / weight vector is stored as a uint64_t, so memory access is very fast and memory usage is extremely low.
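The bit trick can be sketched in plain Python, using integers in place of uint64_t words. Positions where the two vectors agree contribute +1 to the dot product and positions where they differ contribute -1, so with m = popcount(x XOR y) mismatches the dot product is n_bits - 2 * m. The helper names (`pack`, `binary_dot`, `bake_threshold`, `neuron_fires`) are hypothetical, and the encoding (bit set means +1, unused high bits zero) is an assumption, not necessarily the layout the project uses.

```python
def pack(values):
    """Pack a list of +/-1 values into an int; bit k is set when values[k] == +1."""
    word = 0
    for k, v in enumerate(values):
        if v == 1:
            word |= 1 << k
    return word


def popcount(word):
    """Number of set bits (a single instruction on most CPUs and GPUs)."""
    return bin(word).count("1")


def binary_dot(x_bits, w_bits, n_bits):
    """Dot product of two packed +/-1 vectors of length n_bits."""
    mismatches = popcount(x_bits ^ w_bits)
    return n_bits - 2 * mismatches


def bake_threshold(n_bits, bias):
    """Fold the subtraction and the bias into one integer threshold.

    n_bits - 2*m + bias >= 0  is equivalent to  m <= floor((n_bits + bias) / 2).
    """
    return (n_bits + bias) // 2


def neuron_fires(x_bits, w_bits, threshold):
    """True when sign(dot + bias) is +1 (treating 0 as +1, an assumption)."""
    return popcount(x_bits ^ w_bits) <= threshold


x = pack([1, -1, 1, 1])
w = pack([1, 1, -1, 1])
dot = binary_dot(x, w, 4)
fires = neuron_fires(x, w, bake_threshold(4, 1))
```

With the threshold precomputed per neuron, inference needs only an XOR, a popcount, and one integer comparison per output bit.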

Natural evolution strategies

Evolution strategies work by maintaining a search distribution over neural networks. At each step, we sample a new population of networks from the distribution, evaluate each, then update the search distribution towards the highest-performing samples. Natural evolution strategies do this by following the natural gradient to update the search distribution in the direction of highest expected reward.

To train binary networks with NES, I use a separable Bernoulli distribution, parameterized by a vector of logits. Each generation's update computes the closed-form natural gradient with respect to the bit probabilities, and then backpropagates through the sigmoid function to update the logits.
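One way to read the update above, as a minimal sketch and not the project's code: for a separable Bernoulli distribution with bit probabilities p, the score is (x - p) / (p(1 - p)) and the Fisher information is 1 / (p(1 - p)), so the natural gradient with respect to p reduces to the expectation of reward * (x - p). The chain rule through the sigmoid then multiplies by p(1 - p). The function names, the reward standardization, the hyperparameters, and the toy bit-matching objective are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nes_step(logits, fitness_fn, rng, pop_size=50, lr=0.5):
    """One generation: sample bit vectors, score them, update the logits."""
    probs = [sigmoid(z) for z in logits]
    population = [
        [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
    ]
    rewards = [fitness_fn(bits) for bits in population]

    # Standardize rewards (an assumption; a common variance-reduction step).
    mean = sum(rewards) / pop_size
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / pop_size)
    if std < 1e-12:
        return list(logits)  # every sample scored the same; no signal
    utilities = [(r - mean) / std for r in rewards]

    new_logits = []
    for i, (z, p) in enumerate(zip(logits, probs)):
        # Monte Carlo estimate of the natural gradient w.r.t. p_i.
        nat_grad_p = sum(
            u * (bits[i] - p) for u, bits in zip(utilities, population)
        ) / pop_size
        # Chain rule through the sigmoid: dp/dz = p * (1 - p).
        new_logits.append(z + lr * nat_grad_p * p * (1.0 - p))
    return new_logits


# Toy objective (hypothetical): reward is the number of bits matching a target.
target = [1, 0, 1, 1, 0, 0, 1, 0]


def matches(bits):
    return sum(b == t for b, t in zip(bits, target))


rng = random.Random(0)
logits = [0.0] * len(target)
for _ in range(200):
    logits = nes_step(logits, matches, rng)
```

In the real project each sampled bit vector would parameterize a binary network and the reward would come from an RL episode; here a bit-matching score stands in for that evaluation.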

Building the project

This project has a fairly large number of heterogeneous dependencies, which makes building it something of a pain. A Poetry build hook should build the whole project when you run poetry install. The full build chain is:

  • Create device object files for the GPU with nvcc.
  • Transpile the pyx files, which wrap the CUDA code in a Python-accessible way, into C++ with Cython.
  • Compile the newly created cpp files into shared object libraries that Python can import.
  • Install any Python dependencies and perform Python module installation tasks.

Usage

Executing poetry run python genome/demo.py will run a demo that trains a small neural network to balance a pole. If you're on a graphical system, it should render episodes periodically as the model learns. It also writes logs to outputs, which you can inspect with tensorboard --logdir=outputs to watch training metrics evolve.

Every run saves its own logs, so they require unique names. If you want to run the same command twice, you can delete the old log files under that name, or pick a new name.

Contributors

  • maxwells-daemons
