opendeep's Introduction

OpenDeep

OpenDeep: a fully modular & extensible deep learning framework in Python

Developer hub: http://www.opendeep.org/

OpenDeep is a deep learning framework for Python built from the ground up in Theano with a focus on flexibility and ease of use for both industry data scientists and cutting-edge researchers. OpenDeep is a modular and easily extensible framework for constructing any neural network architecture to solve your problem.

Use OpenDeep to:

  • Quickly prototype complex networks through a focus on complete modularity and containers similar to Torch.
  • Configure and train existing state-of-the-art models.
  • Write your own models from scratch in Theano and plug into OpenDeep for easy training and dataset integration.
  • Use visualization and debugging tools to see exactly what is happening with your neural net architecture.
  • Plug into your existing Numpy/Scipy/Pandas/Scikit-learn pipeline.
  • Run on the CPU or GPU.

This library is currently undergoing rapid development and is in its alpha stages.

Quick example usage

Train and evaluate a Multilayer Perceptron (MLP - your generic feedforward neural network for classification) on the MNIST handwritten digit dataset:

from __future__ import print_function  # makes print() behave the same on Python 2 and 3

from opendeep.models import Prototype, Dense, Softmax
from opendeep.models.utils import Noise
from opendeep.optimization.loss import Neg_LL
from opendeep.optimization import AdaDelta
from opendeep.data import MNIST
from theano.tensor import matrix, lvector

print("Getting data...")
data = MNIST()

print("Creating model...")
# Inputs are flattened 28x28 images; None leaves the batch dimension unspecified.
in_shape = (None, 28*28)
in_var = matrix('xs')
mlp = Prototype()
# The first layer is given its input shape and variable explicitly;
# the remaining layers are added by class with their keyword arguments.
mlp.add(Dense(inputs=(in_shape, in_var), outputs=512, activation='relu'))
mlp.add(Noise, noise='dropout', noise_level=0.5)
mlp.add(Dense, outputs=512, activation='relu')
mlp.add(Noise, noise='dropout', noise_level=0.5)
mlp.add(Softmax, outputs=10, out_as_probs=False)

print("Training...")
# Integer class labels, so the loss is built with one_hot=False.
target_var = lvector('ys')
loss = Neg_LL(inputs=mlp.models[-1].p_y_given_x, targets=target_var, one_hot=False)

optimizer = AdaDelta(model=mlp, loss=loss, dataset=data, epochs=10)
optimizer.train()

print("Predicting...")
predictions = mlp.run(data.test_inputs)

# Fraction of test examples whose predicted label matches the target.
print("Accuracy: ", float(sum(predictions==data.test_targets)) / len(data.test_targets))

Congrats, you just:

  • set up a dataset (MNIST)
  • instantiated a Prototype container model
  • added fully-connected (dense) layers and dropout noise to create an MLP
  • trained it with an AdaDelta optimizer
  • and predicted some outputs given inputs!

Working example!

Installation

Because OpenDeep is still in alpha, you have to install via setup.py. Also, please make sure you have these dependencies installed first.

Dependencies

  • Theano: Theano and its dependencies are required to use OpenDeep. You need to install the bleeding-edge version directly from their GitHub, which has installation instructions here.
  • Six: Python 2/3 compatibility library.
  • Pillow (PIL) (optional): image manipulation functionality.
  • PyYAML (optional): used for YAML parsing of config files.
  • Bokeh (optional): if you want live charting/plotting of values during training or testing.
  • NLTK (optional): if you want NLP functions like word tokenization.

All of these Python dependencies (but not system-specific ones like CUDA or HDF5) can be installed with pip install -r requirements.txt inside the root OpenDeep folder.

Install from source

  1. Navigate to your desired installation directory and clone the GitHub repository:

    git clone https://github.com/vitruvianscience/opendeep.git
  2. Navigate to the top-level folder (named opendeep after cloning; it contains the file setup.py) and run setup.py in develop mode:

    cd opendeep
    python setup.py develop

Using python setup.py develop instead of the usual python setup.py install links the installed package to your repository checkout, so pulling the latest files from git updates the whole package. No need to reinstall when you get the latest files.

That's it! Now you should be able to import opendeep into Python modules.

More Information

Source code: https://github.com/vitruvianscience/opendeep

Documentation and tutorials: http://www.opendeep.org/

User group: opendeep-users

Developer group: opendeep-dev

Twitter: @opendeep

We would love any help in making this the best library possible! Feel free to fork the repository and join the Google groups!

Why OpenDeep?

  • Modularity. A lot of recent deep learning progress has come from combining multiple models. Existing libraries are either too confusing or not easily extensible enough to perform novel research and also quickly set up existing algorithms at scale. This need for transparency and modularity is the main motivating factor for creating the OpenDeep library, where we hope novel research and industry use can both be easily implemented.
  • Ease of use. Many libraries require a lot of familiarity with deep learning or their specific package structures. OpenDeep's goal is to be the best-documented deep learning library and have smart enough default code that someone without a background can start training models, while experienced practitioners can easily create and customize their own algorithms.
  • State of the art. Building on its modularity and ease of use, OpenDeep aims to maintain state-of-the-art performance as new algorithms and papers are published. As a research library, it treats citing and crediting the authors and code it draws on as very important.

opendeep's People

Contributors

adammenges, erogol, gregretkowski, jimmycallin, mbeissinger, vitruvianscience, warvito

opendeep's Issues

Training assumes float32, but uses float64 if that's the default configuration

When Theano is installed on a 64-bit computer, it uses float64 by default. Running through the hello world example then causes an error, because training assumes the input is float32.

from opendeep.models.container import Prototype
from opendeep.models.single_layer.basic import BasicLayer, SoftmaxLayer
from opendeep.optimization.adadelta import AdaDelta
from opendeep.data.standard_datasets.image.mnist import MNIST

mlp = Prototype()
mlp.add(BasicLayer(input_size=28*28, output_size=512, activation='rectifier', noise='dropout'))
mlp.add(BasicLayer(output_size=512, activation='rectifier', noise='dropout'))
mlp.add(SoftmaxLayer(output_size=10))

trainer = AdaDelta(model=mlp, dataset=MNIST())
trainer.train()

This raises:

TypeError: Cannot convert Type TensorType(float64, matrix) (of Variable Subtensor{int32:int32:}.0) into Type TensorType(float32, matrix). You can try to manually convert Subtensor{int32:int32:}.0 into a TensorType(float32, matrix).
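
One possible workaround, sketched under the assumption that the mismatch comes from Theano's floatX setting, is to force floatX=float32 through the THEANO_FLAGS environment variable before Theano is first imported. The helper name theano_flags_with_float32 is made up for this illustration and is not part of OpenDeep or Theano.

```python
import os


def theano_flags_with_float32(existing=""):
    """Return a THEANO_FLAGS string that sets floatX=float32 and keeps other flags."""
    kept = [flag.strip() for flag in existing.split(",")
            if flag.strip() and not flag.strip().startswith("floatX=")]
    return ",".join(kept + ["floatX=float32"])


# Theano reads THEANO_FLAGS when it is imported, so this has to run
# before the first `import theano` (or `import opendeep`) in the process.
os.environ["THEANO_FLAGS"] = theano_flags_with_float32(os.environ.get("THEANO_FLAGS", ""))
```

The same setting can also be made permanent in the Theano configuration file instead of per process.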

Error by MemoryDataset

When I initialize MemoryDataset as follows, it raises a ValueError:

OD_DATA = datasets.MemoryDataset(train_np, valid_X=val_np)

ValueError Traceback (most recent call last)
in ()
----> 1 OD_DATA = datasets.MemoryDataset(train_np, valid_X=val_np)

/home/retina18/Downloads/opendeep/opendeep/data/dataset.pyc in __init__(self, train_X, train_Y, valid_X, valid_Y, test_X, test_Y)
283 self.train_Y = sharedX(numpy.array(train_Y))
284
--> 285 if valid_X:
286 valid_X = numpy.array(valid_X)
287 self._valid_shape = valid_X.shape
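
The traceback above is cut off before the error message. A likely cause, given that the arrow points at the truth test on valid_X, is that NumPy refuses to evaluate a multi-element array as a boolean and raises ValueError. The sketch below reproduces that behaviour and shows an explicit None check as one possible fix; has_subset is a hypothetical helper, not an OpenDeep function.

```python
import numpy as np

valid_X = np.zeros((4, 3))  # any array with more than one element

try:
    if valid_X:  # NumPy cannot reduce a multi-element array to a single bool
        pass
    raised = False
except ValueError:
    raised = True


def has_subset(data):
    """Hypothetical check: a subset is present whenever it is not None."""
    return data is not None


print(raised)               # True
print(has_subset(valid_X))  # True
print(has_subset(None))     # False
```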

Error by MemoryDataset 2

I am trying to use my custom dataset with the provided AutoEncoder example, but as soon as it starts it raises the following error:

aining)
    225                      str(type(self.model)), func_i+1, len(self.train_functions), self.n_epoch, str(continue_training))
    226 
--> 227             log.debug("Train dataset size is: %s", self.dataset.getDataShape(datasets.TRAIN))
    228             if self.dataset.hasSubset(datasets.VALID):
    229                 log.debug("Valid dataset size is: %s", self.dataset.getDataShape(datasets.VALID))

/home/retina18/Downloads/opendeep/opendeep/data/dataset.py in getDataShape(self, subset)
    111         if subset is TRAIN:
    112             log.error("No training shape implemented")
--> 113             raise NotImplementedError("No training shape implemented")
    114         elif subset is VALID:
    115             log.error("No valid shape implemented")

NotImplementedError: No training shape implemented
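
The traceback suggests that the dataset in use falls through to a base-class getDataShape that only raises NotImplementedError. A minimal sketch of the usual remedy, overriding the method in the subclass, is shown below. The TRAIN, VALID and TEST constants, the Dataset base class and ArrayDataset are stand-ins written for this illustration, modelled only on the names visible in the traceback; they are not the real OpenDeep definitions.

```python
# Stand-in constants and base class (assumptions, not OpenDeep code).
TRAIN, VALID, TEST = 0, 1, 2


class Dataset(object):
    def getDataShape(self, subset):
        # Mirrors the behaviour seen in the traceback: the base class has no shape.
        raise NotImplementedError("No shape implemented for subset %r" % (subset,))


class ArrayDataset(Dataset):
    """Hypothetical in-memory dataset that remembers the shape of each subset."""

    def __init__(self, train_shape, valid_shape=None, test_shape=None):
        self._shapes = {TRAIN: train_shape, VALID: valid_shape, TEST: test_shape}

    def getDataShape(self, subset):
        # Overriding the method is what stops the NotImplementedError.
        return self._shapes[subset]
```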

Import Errors

Hi!
I just installed OpenDeep for a workshop at the OSDC, but I am getting some importing errors:

ImportError: cannot import name DenoisingAutoencoder
ImportError: cannot import name SoftmaxLayer

Do you have any suggestions as to what I am doing wrong?

Thanks,
Rick

Python3 build

Howdy! OpenDeep looks great and I'm keen to get it working on Kaggle Scripts. I had a go at building it in our Python Docker container, but hit a string/bytes error in the setup script:

File "setup.py", line 49, in <module>
  long_description=read('README.rst'),
File "setup.py", line 28, in read
  return sep.join(buf)
TypeError: sequence item 0: expected str instance, bytes found

Tried a few hacky workarounds, but couldn't get it to land. Is this a Python3 issue? Any ideas on how to fix it?
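
The error message indicates that read() in setup.py joins bytes objects with a str separator. Below is a minimal sketch of a read helper that returns text on both Python 2 and 3 by opening files through io.open with an explicit encoding. The keyword arguments encoding and sep are assumptions, since the original function body is not shown in the report.

```python
import io


def read(*filenames, **kwargs):
    """Concatenate the given text files and return a single unicode string."""
    encoding = kwargs.get("encoding", "utf-8")
    sep = kwargs.get("sep", u"\n")
    buf = []
    for filename in filenames:
        # io.open decodes to text on Python 2 and 3 alike, so join() never sees bytes.
        with io.open(filename, encoding=encoding) as f:
            buf.append(f.read())
    return sep.join(buf)
```

With a helper of this shape, the long_description=read('README.rst') call in the traceback would receive a text string rather than bytes.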

multivariate time series example

Hi!
Great library!!
I was wondering whether there are plans for a multivariate time series example: multiple input sequences used to predict a single output sequence,
i.e. time series prediction with multiple input sequences mapped to a single output.
Many thanks,
Best,
Andrew

ImportError

I tried your sample code provided under the title "Tutorial: Your First Model". However, it seems that there is no such file called "cost" in the directory "utils". As a result, I fail to import binary_crossentropy from opendeep.utils.cost.

Spurious warning about default file name

On line 729 of opendeep/models/model.py the default file name is set to "config":

def save_args(self, args_file="config"):

A call to file_ops.get_extension_type() in the method logs a warning "Didn't recognize file extension..." which is spurious because save_args() later appends ".pkl" to the args_file in this case. Suggested fix is to make the default file name "config.pkl" in the first place. This will have exactly the same default behaviour without the warning.
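
The suggested fix can be illustrated without OpenDeep. The sketch below uses os.path.splitext as a stand-in for the extension check; resolve_args_file and its warning text are hypothetical, and the real save_args() and file_ops.get_extension_type() may behave differently.

```python
import os
import warnings


def resolve_args_file(args_file="config.pkl"):
    """Return the file name to pickle to, warning only when no extension was given."""
    root, ext = os.path.splitext(args_file)
    if not ext:
        # Stand-in for the library's warning; the default no longer reaches this branch.
        warnings.warn("No file extension on %r; appending .pkl" % args_file)
        return root + ".pkl"
    return args_file
```

With "config.pkl" as the default, calling the function without arguments yields the same file name as before but skips the warning branch.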

Zip objects not indexable in Python3

On line 111 of opendeep/models/container/prototype.py, current_inputs is calculated as follows:

current_inputs = zip(previous_out_sizes, previous_outs)

In Python 2, zip() returns a list, but in Python 3 it returns a lazy zip iterator that cannot be indexed. This causes a failure later, when calling code tries to index the result. A simple fix is to wrap the expression in list().
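
A small standalone illustration of the proposed fix follows. The two input lists hold made-up placeholder values, not data taken from OpenDeep.

```python
# Hypothetical stand-ins for the sizes and outputs of the previous layers.
previous_out_sizes = [(None, 512), (None, 10)]
previous_outs = ["hidden_out", "softmax_out"]

# list() materializes the pairs, so indexing works on Python 2 and Python 3.
# On Python 3, indexing the bare zip(...) object would raise TypeError instead.
current_inputs = list(zip(previous_out_sizes, previous_outs))

first = current_inputs[0]
print(first)  # ((None, 512), 'hidden_out')
```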

Bokeh API change: cursession gone

On line 18 of build/lib/opendeep/monitor/plot.py, symbol cursession is imported from bokeh.plotting:

from bokeh.plotting import (curdoc, cursession, figure, output_server, push, show)

The symbol cursession no longer exists in the latest bokeh version (0.11) available via pip. Workaround: install version 0.10.
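
Besides pinning bokeh to 0.10, code that imports cursession can guard the import so that a newer bokeh disables live plotting rather than failing at import time. This is a sketch of that pattern, not the change made in OpenDeep; the HAS_CURSESSION flag is a name invented for this example.

```python
try:
    # Importable with bokeh 0.10 according to the report above; gone in 0.11.
    from bokeh.plotting import cursession
    HAS_CURSESSION = True
except ImportError:
    # Covers both a missing bokeh install and a bokeh without cursession.
    cursession = None
    HAS_CURSESSION = False

if not HAS_CURSESSION:
    print("Live plotting disabled: bokeh.plotting.cursession is unavailable.")
```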
