
boltzmann.jl's Introduction

Boltzmann.jl


Restricted Boltzmann machines and deep belief networks in Julia

Installation

Pkg.add("Boltzmann")

To install the latest development version:

Pkg.clone("https://github.com/dfdx/Boltzmann.jl")

RBM Basic Usage

Train RBM:

using Boltzmann

X = randn(100, 2000)    # 2000 observations (examples) 
                        #  with 100 variables (features) each
X = (X + abs(minimum(X))) / (maximum(X) - minimum(X)) # scale X to [0..1]
rbm = GRBM(100, 50)     # define Gaussian RBM with 100 visible (input) 
                        #  and 50 hidden (output) variables
fit(rbm, X)             # fit model to data 

(For a more meaningful dataset, see the MNIST example.)

After the model is fitted, you can extract the learned coefficients (a.k.a. weights):

W = coef(rbm)

transform data vectors into a new, higher-level representation (e.g. for further classification):

Xt = transform(rbm, X)  # vectors of X have length 100, vectors of Xt have length 50

or generate vectors similar to the given ones (e.g. for recommendation; see the example here):

x = ... 
x_new = generate(rbm, x)

RBMs can handle both dense and sparse arrays. They cannot, however, handle DataArrays, because it is up to the application how to treat missing values.
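For illustration, fitting on sparse input might look like this (a sketch; the density and dimensions are arbitrary):

using Boltzmann
# on Julia >= 0.7, sprand requires `using SparseArrays`
Xs = sprand(100, 2000, 0.05)         # sparse data: 100 features, 2000 observations, ~5% non-zeros
rbm_sparse = BernoulliRBM(100, 50)   # binary units suit values in [0, 1]
fit(rbm_sparse, Xs)                  # the same fit call, now on a sparse matrix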

RBM Kinds

This package provides implementations of the two most popular kinds of restricted Boltzmann machines:

  • BernoulliRBM: RBM with binary visible and hidden units
  • GRBM: RBM with Gaussian visible and binary hidden units

The Bernoulli RBM is the classic variant and works well for modeling binary (e.g. like/dislike) and nearly binary (e.g. logistic-based) data. The Gaussian RBM works better when the visible variables approximately follow a normal distribution, which is often the case, e.g. for image data.
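As a quick illustration of choosing between the two kinds (the dimensions here are arbitrary):

rbm_bin   = BernoulliRBM(784, 256)   # binary or near-binary visible data
rbm_gauss = GRBM(784, 256)           # approximately Gaussian visible data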

Deep Belief Networks

DBNs are created as a stack of named RBMs. Below is an example of training a DBN on the MNIST dataset:

using Boltzmann
using MNIST

X, y = traindata()
X = X[:, 1:1000]                     # take only 1000 observations for speed
X = X / (maximum(X) - (minimum(X)))  # normalize to [0..1]

layers = [("vis", GRBM(784, 256)),
          ("hid1", BernoulliRBM(256, 100)),
          ("hid2", BernoulliRBM(100, 100))]
dbn = DBN(layers)
fit(dbn, X)
transform(dbn, X)

Deep Autoencoders

Once built, a DBN can be converted into a deep autoencoder. Continuing the previous example:

dae = unroll(dbn)

DAEs cannot be trained directly, but they can be used to transform input data:

transform(dae, X)

In this case the output has the same dimensionality as the input, but with noise removed.
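For example, continuing with dae and X from above:

Xr = transform(dae, X)   # denoised version of the input
size(Xr) == size(X)      # true: same dimensionality as the input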

Integration with Mocha

Mocha.jl is an excellent deep learning framework implementing auto-encoders and a number of fine-tuning algorithms. Boltzmann.jl allows you to save a pretrained model in a Mocha-compatible file format for later use in supervised learning. Below is a snippet of the essential API; the complete code is available in the Mocha Export Example:

# pretraining and exporting in Boltzmann.jl
dbn_layers = [("vis", GRBM(100, 50)),
              ("hid1", BernoulliRBM(50, 25)),
              ("hid2", BernoulliRBM(25, 20))]
dbn = DBN(dbn_layers)
fit(dbn, X)
save_params(DBN_PATH, dbn)

# loading in Mocha.jl
backend = CPUBackend()
data = MemoryDataLayer(tops=[:data, :label], batch_size=500, data=Array[X, y])
vis = InnerProductLayer(name="vis", output_dim=50, tops=[:vis], bottoms=[:data])
hid1 = InnerProductLayer(name="hid1", output_dim=25, tops=[:hid1], bottoms=[:vis])
hid2 = InnerProductLayer(name="hid2", output_dim=20, tops=[:hid2], bottoms=[:hid1])
loss = SoftmaxLossLayer(name="loss",bottoms=[:hid2, :label])
net = Net("TEST", backend, [data, vis, hid1, hid2])

h5open(DBN_PATH) do h5
    load_network(h5, net)
end

boltzmann.jl's People

Contributors

catawbasam, dfdx, eric-tramel, femtocleaner[bot], fissoreg, jbn, jfsantos, juliatagbot, marylou-gabrie, rofinn


boltzmann.jl's Issues

Hi,

I've been using Boltzmann.jl for my project, and I'd first like to thank you for your amazing work. The questions I've come across are:
a) How can I find the error rate for each iteration?
b) How can I unroll the DBN layers?
c) How can I change the functions inside the RBM (say, use a tanh function instead of the logistic)?

Since I am a newbie to Julia, I don't know how to alter the source code.
Any help would be appreciated.

Generic Test Harness & Benchmarking

Given the flexible workflow for building custom RBMs, it would be helpful to have a generic test harness & benchmarking function for people to test their custom RBMs with.

NOTE: I can work on this.

DBN redesign

The current DBN implementation simply uses a list of (anonymous) RBMs, but a more common approach for layered models is to give every layer a name (e.g. see scikit-learn's Pipeline or the design of Mocha.jl). It helps with debugging, allows working with named entities (instead of indexes), and enables interoperability with other tools.

I have prepared a new implementation that adds names to layers and allows saving a DBN to Mocha-compatible HDF5 files (see the example for details), but it may miss some important use cases. I'm interested in the following:

  1. Is this implementation clear? Is it convenient?
  2. Does it cover all important use cases? If not, what else should be done?

Specifically, I'm curious to hear @jfsantos 's opinion.

Add support for Persistent Contrastive Divergence

I noticed that only contrastive divergence (CD) is supported for training in this library. Do you think it would be too hard to add support for PCD as well? The most problematic part I can imagine is that we would need to keep the previous state of the Gibbs sampler so it can be initialized from that state instead of from the current sample.

Persistent CD is wrong

Hello,

the following implementation of persistent_contdiv doesn't look right to me; am I missing something?

Boltzmann.jl/src/rbm.jl

Lines 214 to 232 in 0625ed0

"""
Persistent contrastive divergence sampler. Options:
* n_gibbs - number of gibbs sampling loops
"""
function persistent_contdiv(rbm::AbstractRBM, vis::Mat, ctx::Dict)
n_gibbs = @get(ctx, :n_gibbs, 1)
persistent_chain = @get_array(ctx, :persistent_chain, size(vis), vis)
if size(persistent_chain) != size(vis)
# persistent_chain not initialized or batch size changed
# re-initialize
persistent_chain = vis
end
# take positive samples from real data
v_pos, h_pos, _, _ = gibbs(rbm, vis)
# take negative samples from "fantasy particles"
persistent_chain, _, v_neg, h_neg = gibbs(rbm, vis, n_times=n_gibbs)
return v_pos, h_pos, v_neg, h_neg
end

The problem is that persistent_chain is local to the function and the :persistent_chain field in ctx is never updated. I'll make a PR to propose the correct implementation.
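For illustration, one possible correction (a sketch only, reusing the gibbs, @get, and @get_array helpers from the snippet above; the actual fix is the one proposed in the PR) starts the negative phase from the stored chain and writes the updated chain back into ctx:

function persistent_contdiv(rbm::AbstractRBM, vis::Mat, ctx::Dict)
    n_gibbs = @get(ctx, :n_gibbs, 1)
    persistent_chain = @get_array(ctx, :persistent_chain, size(vis), vis)
    if size(persistent_chain) != size(vis)
        # persistent_chain not initialized or batch size changed; re-initialize
        persistent_chain = vis
    end
    # positive samples still come from the real data
    v_pos, h_pos, _, _ = gibbs(rbm, vis)
    # negative samples continue the persistent ("fantasy") chain, not vis
    _, _, v_neg, h_neg = gibbs(rbm, persistent_chain, n_times=n_gibbs)
    # store the updated chain so the next mini-batch continues from it
    ctx[:persistent_chain] = v_neg
    return v_pos, h_pos, v_neg, h_neg
end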

PS: thanks for the great package! ;)

Tests are too slow

Currently the tests run for more than 2 hours on Travis, which is unreasonably long. The main cause is the brute-force approach in which we test every supported element type with every possible visible and hidden unit type, for both dense and sparse input, plus different sets of options. The principal question is how to exclude most of these flows while keeping coverage high. Any suggestions are welcome.

CRBM momentum only applied to W

I have seen that your function only seems to apply momentum to dW for the weights, but not to any of the other parameters. Is this what you intended?

function grad_apply_momentum!(crbm::ConditionalRBM{T}, X::Mat{T},
                              dtheta::Tuple, ctx::Dict) where T
    dW, dA, dB, db, dc = dtheta
    momentum = @get(ctx, :momentum, 0.9)
    dW_prev = @get_array(ctx, :dW_prev, size(dW), zeros(T, size(dW)))
    # same as: dW += momentum * dW_prev
    axpy!(momentum, dW_prev, dW)
end
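For comparison, a version that applies momentum to every parameter might look roughly like this (a sketch only; the :dA_prev, :dB_prev, :db_prev, and :dc_prev context keys and the function name are hypothetical, only :dW_prev exists in the package):

function grad_apply_momentum_all!(crbm::ConditionalRBM{T}, X::Mat{T},
                                  dtheta::Tuple, ctx::Dict) where T
    momentum = @get(ctx, :momentum, 0.9)
    # hypothetical context keys holding the previous increment of each parameter
    prev_keys = (:dW_prev, :dA_prev, :dB_prev, :db_prev, :dc_prev)
    for (grad, key) in zip(dtheta, prev_keys)
        grad_prev = haskey(ctx, key) ? ctx[key] : zeros(T, size(grad))
        axpy!(momentum, grad_prev, grad)   # grad += momentum * grad_prev
        ctx[key] = copy(grad)              # remember this increment for the next mini-batch
    end
end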

Implicit conversion of data types

From #19:

It would be nice if the precision was a bit more flexible. I might want a Float32 RBM, but it isn't maintainable for me to update all of my code to convert Float64 context variables and input data to Float32.

It sounds like a good idea, especially in the context of fitting large datasets, where a user may not be able to convert an entire dataset in memory but can still convert it on a per-batch basis (a user-side sketch is given below). However, I have some concerns regarding this idea.

  • This is an implicit conversion, and explicit is better than implicit. If, for example, a user created a Float64 RBM but got Float32 data, we would not warn them about it. As an example, Julia's BLAS functions never allow implicit conversions and always demand an exact type.
  • It's unclear which functions should be allowed to implicitly convert data types. Should it be only fit(), all exported functions, or all functions?
  • Such a change is error-prone from a performance perspective. Say we put the implicit-conversion code into a sample() function and then some overloaded fit() method passes data of a different type into it: this leads to a conversion on every call to sample(), which significantly slows down the process without any hint to the developer or end user.

Does it make sense, or am I just paranoid?
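For concreteness, the user-side per-batch conversion mentioned above could look roughly like this (a sketch; the batch size, the loop, and the :n_epochs option key are assumptions for illustration only):

batch_size = 100
for start in 1:batch_size:size(X, 2)
    stop  = min(start + batch_size - 1, size(X, 2))
    batch = convert(Matrix{Float32}, X[:, start:stop])  # only one mini-batch is converted at a time
    fit(rbm, batch, Dict(:n_epochs => 1))               # hypothetical option key
end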

cc: @Rory-Finnegan

[PkgEval] Boltzmann may have a testing issue on Julia 0.4 (2015-06-30)

PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.3) and the nightly build of the unstable version (0.4). The results of this script are used to generate a package listing enhanced with testing results.

On Julia 0.4

  • On 2015-06-27 the testing status was Tests pass.
  • On 2015-06-30 the testing status changed to Tests fail.

This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.

Test log:

>>> 'Pkg.add("Boltzmann")' log
INFO: Cloning cache of Boltzmann from git://github.com/dfdx/Boltzmann.jl.git
INFO: Installing ArrayViews v0.6.2
INFO: Installing BinDeps v0.3.12
INFO: Installing Blosc v0.1.2
INFO: Installing Boltzmann v0.2.1
INFO: Installing Distributions v0.7.3
INFO: Installing HDF5 v0.4.17
INFO: Installing PDMats v0.3.3
INFO: Installing SHA v0.0.4
INFO: Installing StatsBase v0.6.15
INFO: Installing URIParser v0.0.5
INFO: Building Blosc
INFO: Building HDF5
WARNING: beginswith is deprecated, use startswith instead.
 in depwarn at ./deprecated.jl:62
 in beginswith at deprecated.jl:30
 in available_versions at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:116
 in package_available at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:111
 in can_provide at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:608
 in _find_library at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:451
 in satisfy! at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:738 (repeats 2 times)
 in anonymous at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:793
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in evalfile at loading.jl:175 (repeats 2 times)
 in anonymous at pkg/entry.jl:652
 in cd at ./file.jl:22
 in build! at pkg/entry.jl:651
 in build! at pkg/entry.jl:646
 in build at pkg/entry.jl:663
 in resolve at ./pkg/entry.jl:472
 in edit at pkg/entry.jl:26
 in anonymous at task.jl:365
while loading /home/vagrant/.julia/v0.4/HDF5/deps/build.jl, in expression starting on line 24
WARNING: beginswith is deprecated, use startswith instead.
 in depwarn at ./deprecated.jl:62
 in beginswith at deprecated.jl:30
 in available_versions at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:124
 in package_available at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:111
 in can_provide at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:608
 in _find_library at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:451
 in satisfy! at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:738 (repeats 2 times)
 in anonymous at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:793
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in evalfile at loading.jl:175 (repeats 2 times)
 in anonymous at pkg/entry.jl:652
 in cd at ./file.jl:22
 in build! at pkg/entry.jl:651
 in build! at pkg/entry.jl:646
 in build at pkg/entry.jl:663
 in resolve at ./pkg/entry.jl:472
 in edit at pkg/entry.jl:26
 in anonymous at task.jl:365
while loading /home/vagrant/.julia/v0.4/HDF5/deps/build.jl, in expression starting on line 24
INFO: Package database updated

>>> 'Pkg.test("Boltzmann")' log
Julia Version 0.4.0-dev+5700
Commit 147fa0b* (2015-06-29 20:31 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Nehalem)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
INFO: Computing test dependencies for Boltzmann...
INFO: Cloning cache of Logging from git://github.com/kmsquire/Logging.jl.git
INFO: Cloning cache of MNIST from git://github.com/johnmyleswhite/MNIST.jl.git
INFO: Cloning cache of Mocha from git://github.com/pluskid/Mocha.jl.git
INFO: Installing Logging v0.1.1
INFO: Installing MNIST v0.0.1
INFO: Installing Mocha v0.0.8
INFO: Building Blosc
INFO: Building HDF5
WARNING: beginswith is deprecated, use startswith instead.
 in depwarn at ./deprecated.jl:62
 in beginswith at deprecated.jl:30
 in available_versions at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:116
 in package_available at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:111
 in can_provide at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:608
 in _find_library at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:451
 in satisfy! at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:738 (repeats 2 times)
 in anonymous at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:793
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in evalfile at loading.jl:175 (repeats 2 times)
 in anonymous at pkg/entry.jl:652
 in cd at ./file.jl:22
 in build! at pkg/entry.jl:651
 in build! at pkg/entry.jl:646
 in build at pkg/entry.jl:663
 in resolve at ./pkg/entry.jl:472
 in test! at pkg/entry.jl:712
 in test at pkg/entry.jl:740
 in anonymous at pkg/dir.jl:31
 in cd at file.jl:22
 in cd at pkg/dir.jl:31
 in test at pkg.jl:71
 in process_options at ./client.jl:281
 in _start at ./client.jl:405
while loading /home/vagrant/.julia/v0.4/HDF5/deps/build.jl, in expression starting on line 24
WARNING: beginswith is deprecated, use startswith instead.
 in depwarn at ./deprecated.jl:62
 in beginswith at deprecated.jl:30
 in available_versions at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:124
 in package_available at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:111
 in can_provide at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:608
 in _find_library at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:451
 in satisfy! at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:738 (repeats 2 times)
 in anonymous at /home/vagrant/.julia/v0.4/BinDeps/src/dependencies.jl:793
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in evalfile at loading.jl:175 (repeats 2 times)
 in anonymous at pkg/entry.jl:652
 in cd at ./file.jl:22
 in build! at pkg/entry.jl:651
 in build! at pkg/entry.jl:646
 in build at pkg/entry.jl:663
 in resolve at ./pkg/entry.jl:472
 in test! at pkg/entry.jl:712
 in test at pkg/entry.jl:740
 in anonymous at pkg/dir.jl:31
 in cd at file.jl:22
 in cd at pkg/dir.jl:31
 in test at pkg.jl:71
 in process_options at ./client.jl:281
 in _start at ./client.jl:405
while loading /home/vagrant/.julia/v0.4/HDF5/deps/build.jl, in expression starting on line 24
INFO: Building Mocha
Running `g++ -fPIC -Wall -O3 -shared -fopenmp -o libmochaext.so im2col.cpp pooling.cpp`
INFO: Testing Boltzmann
Warning: could not import Base.@math_const into Distributions
ERROR: LoadError: LoadError: LoadError: LoadError: LoadError: LoadError: LoadError: UndefVarError: @math_const not defined
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:55
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:52
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in include at ./boot.jl:254
 in include_from_node1 at loading.jl:133
 in process_options at ./client.jl:305
 in _start at ./client.jl:405
while loading /home/vagrant/.julia/v0.4/Distributions/src/constants.jl, in expression starting on line 6
while loading /home/vagrant/.julia/v0.4/Distributions/src/Distributions.jl, in expression starting on line 252
while loading /home/vagrant/.julia/v0.4/Boltzmann/src/rbm.jl, in expression starting on line 2
while loading /home/vagrant/.julia/v0.4/Boltzmann/src/core.jl, in expression starting on line 2
while loading /home/vagrant/.julia/v0.4/Boltzmann/src/Boltzmann.jl, in expression starting on line 15
while loading /home/vagrant/.julia/v0.4/Boltzmann/test/testrbm.jl, in expression starting on line 2
while loading /home/vagrant/.julia/v0.4/Boltzmann/test/runtests.jl, in expression starting on line 2
==============================[ ERROR: Boltzmann ]==============================

failed process: Process(`/home/vagrant/julia/bin/julia --check-bounds=yes --code-coverage=none --color=no /home/vagrant/.julia/v0.4/Boltzmann/test/runtests.jl`, ProcessExited(1)) [1]

================================================================================
INFO: Removing Logging v0.1.1
INFO: Removing MNIST v0.0.1
INFO: Removing Mocha v0.0.8
ERROR: Boltzmann had test errors
 in error at ./error.jl:21
 in test at pkg/entry.jl:746
 in anonymous at pkg/dir.jl:31
 in cd at file.jl:22
 in cd at pkg/dir.jl:31
 in test at pkg.jl:71
 in process_options at ./client.jl:281
 in _start at ./client.jl:405

>>> End of log

Code Coverage

It would be nice to keep track of test coverage, especially given the number of code paths available with the newly refactored code. I'm a fan of codecov, but coveralls also works.

Package Status?

Are folks actively using this package? I haven't been using RBMs much lately, but I'd be happy to add some fixes for 0.5 and 0.6 if there is still demand.

Incorrect Momentum Implementation

Previously, a bug existed in update_weights! where the learning rate was multiplied by the momentum when calculating the per-mini-batch updates to the RBM weights. This was corrected in the last pull request from @marylou-gabrie.

However, this change has uncovered another bug. I was finding in practice that both CD and PCD failed to converge (or decrease the pseudo-likelihood at all) for non-negligible momentum values (i.e. values above 0.1 or so). After investigating the component values, I found them to be quite off.

It turns out that the last weight gradient was not contributing to future weight gradients: it was only being added (scaled by the momentum, of course) to the current set of weights, but it was not being added to the gradient itself, which is later copied into the RBM structure for future use.

After making the change I reference below, I was able to get reasonable results. The reason this issue never appeared before is that with a small learning rate (the default is around 0.1), further scaled by the training batch size, the momentum term was incorrectly scaled down so much that its contribution was negligible, so the training errors were not apparent. Only after the momentum contribution became non-negligible, by removing the incorrect learning-rate scaling, did the error surface.

The good news is that it was a quick fix.

Original

function update_weights!(rbm, h_pos, v_pos, h_neg, v_neg, lr, buf)
    dW = buf
    # dW = (h_pos * v_pos') - (h_neg * v_neg')
    gemm!('N', 'T', 1.0, h_neg, v_neg, 0.0, dW)
    gemm!('N', 'T', 1.0, h_pos, v_pos, -1.0, dW)
    # rbm.W += lr * dW
    axpy!(lr, dW, rbm.W)
    # rbm.W += rbm.momentum * rbm.dW_prev
    axpy!(rbm.momentum, rbm.dW_prev, rbm.W) 
    # save current dW
    copy!(rbm.dW_prev, dW)
end

Modified

function update_weights!(rbm, h_pos, v_pos, h_neg, v_neg, lr, buf)
    dW = buf
    # dW = (h_pos * v_pos') - (h_neg * v_neg')
    gemm!('N', 'T', lr, h_neg, v_neg, 0.0, dW)
    gemm!('N', 'T', lr, h_pos, v_pos, -1.0, dW)
    # rbm.dW += rbm.momentum * rbm.dW_prev
    axpy!(rbm.momentum, rbm.dW_prev, dW)
    # rbm.W += lr * dW
    axpy!(1.0, dW, rbm.W)
    # save current dW
    copy!(rbm.dW_prev, dW)
end

Momentum implementation

In rbm.jl, the implementation of the momentum in update_weights! does not seem right. The previous weight increment, rbm.dW_prev, gets multiplied by both the momentum and the learning rate. It should only be multiplied by the momentum (cf. Eq. 12 of G. Hinton's Practical Guide), considering momentum values between 0.5 and 0.9.
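For reference, the update rule in question (a paraphrase of Eq. 12 from Hinton's guide, in LaTeX notation) scales the previous increment by the momentum only:

\Delta w_{t} = m \, \Delta w_{t-1} + \epsilon \left( \langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}} \right)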

Support for deep belief networks

Hi,

I started implementing some functions to train a DBN based on your RBM implementations. It's kind of a hack right now, but I think the API can be improved and the functions can be generalized a bit.

My repository is here in case you want to take a look. I actually have an updated version in which only the first RBM is Gaussian-Bernoulli and all the others are Bernoulli, but I haven't pushed it to GitHub yet. If you think this is a nice fit for this library, I will start implementing it in a branch of Boltzmann.jl instead of in my own library.

LinearAlgebra dependency

Just installed Boltzmann today. I get the following warning:

┌ Warning: Package Boltzmann does not have LinearAlgebra in its dependencies:
│ - If you have Boltzmann checked out for development and have
│   added LinearAlgebra as a dependency but haven't updated your primary
│   environment's manifest file, try `Pkg.resolve()`.
│ - Otherwise you may need to report an issue with Boltzmann

hi

I used this package a year back, and now I would like to work on it again. I would like to develop a DAE as shown below. Is this architecture possible with this package? I have designed the same architecture in MATLAB. I saw the unroll(dbn) function, but I am not sure how to design this in Julia. Any suggestion would help me.

[screenshot: DAE architecture from "A Deep Learning Approach to Machine Transliteration" (Deselaers et al.), p. 233]

Documentation

It would probably be helpful for all exported methods and types to have doc strings.
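For example, a docstring for an exported method might look like this (illustrative wording only):

"""
    transform(rbm, X)

Propagate the observations in X (one observation per column) through rbm
and return their hidden-layer representation.
"""
transform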

Several issues with mnist example

I'm trying this package for the first time and am running into several problems with the MNIST example.

WARNING: Method definition fit(Boltzmann.ConditionalRBM, AbstractArray{#T<:Any, 2}) in module Boltzmann at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/conditional.jl:251 overwritten at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/conditional.jl:279.
WARNING: both ImageView and Base export "view"; uses of it in module Main must be qualified

The following error can be solved by adding
opts = convert(Dict{Any,Any},opts) in function fit{T}(rbm::RBM{T}, X::Mat, opts = Dict{Any,Any}())
and
ctx = convert(Dict{Any,Any},ctx) in function fit{T}(crbm::ConditionalRBM, X::Mat{T}, ctx = Dict{Any,Any}())

ERROR: LoadError: MethodError: Cannot `convert` an object of type Boltzmann.#pseudo_likelihood to an object of type Integer
This may have arisen from a call to the constructor Integer(...),
since type constructors fall back to convert methods.
 in setindex!(::Dict{Symbol,Integer}, ::Function, ::Symbol) at ./dict.jl:634
 in macro expansion at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/utils.jl:37 [inlined]
 in fit(::Boltzmann.RBM{Float64,Boltzmann.Degenerate,Distributions.Bernoulli}, ::Array{Float64,2}, ::Dict{Symbol,Integer}) at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/rbm.jl:317
 in (::StatsBase.#kw##fit)(::Array{Any,1}, ::StatsBase.#fit, ::Boltzmann.RBM{Float64,Boltzmann.Degenerate,Distributions.Bernoulli}, ::Array{Float64,2}) at ./<missing>:0
 in run_mnist() at /local/home/fredrikb/.julia/v0.5/Boltzmann/examples/mnistexample.jl:31

The following error can be solved by qualifying calls to axpy!, e.g. Base.LinAlg.axpy!:

LoadError: UndefVarError: axpy! not defined
 in grad_apply_momentum!(::Boltzmann.RBM{Float64,Boltzmann.Degenerate,Distributions.Bernoulli}, ::Array{Float64,2}, ::Tuple{Array{Float64,2},Array{Float64,1},Array{Float64,1}}, ::Dict{Any,Any}) at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/rbm.jl:233
 in update_classic!(::Boltzmann.RBM{Float64,Boltzmann.Degenerate,Distributions.Bernoulli}, ::Array{Float64,2}, ::Tuple{Array{Float64,2},Array{Float64,1},Array{Float64,1}}, ::Dict{Any,Any}) at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/rbm.jl:287
 in fit_batch!(::Boltzmann.RBM{Float64,Boltzmann.Degenerate,Distributions.Bernoulli}, ::Array{Float64,2}, ::Dict{Any,Any}) at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/rbm.jl:301
 in macro expansion at /local/home/fredrikb/.julia/v0.5/Boltzmann/src/rbm.jl:328 [inlined]
 in macro expansion at ./util.jl:226 [inlined]
 in fit(::Boltzmann.RBM{Float64,Boltzmann.Degenerate,Distributions.Bernoulli}, ::Array{Float64,2}, ::Dict{Symbol,Integer}) at /lo...

convert errors

I'm using Julia v0.5 and encounter the following problem when calling the RBM function:

MethodError: Cannot convert an object of type Tuple{Int64,Int64} to an object of type Array{Float64,1}
This may have arisen from a call to the constructor Array{Float64,1}(...),
since type constructors fall back to convert methods.
in #RBM#2(::Float64, ::Type{T}, ::Type{T}, ::Type{T}, ::Type{T}, ::Int64, ::Int64) at /Users/cary/.julia/v0.5/Boltzmann/src/rbm.jl:69
in Boltzmann.RBM{T,V,H}(::Type{T}, ::Type{T}, ::Type{T}, ::Int64, ::Int64) at /Users/cary/.julia/v0.5/Boltzmann/src/rbm.jl:69

This can be solved by changing how the map function is applied in the RBM constructor:

function RBM(T::Type, V::Type, H::Type,
             n_vis::Int, n_hid::Int; sigma=0.01)
    RBM{T,V,H}(map(T, rand(Normal(0, sigma), [n_hid, n_vis]...)),
             zeros(n_vis), zeros(n_hid))
end

Refactoring

As recently discussed in #9, it would be nice to refactor the project to better reflect the different kinds of RBMs and the different ways to learn them. Currently I can think of the following variations:

RBM kinds

  • ordinary RBMs (with Bernoulli and Gaussian variables, as well as using pure probabilities for visible layer)
  • Conditional RBMs
  • CLRBM - implementation of RBM for OpenCL (TBD)
  • I also have a particular interest in implementing the original Harmonium by Paul Smolensky, as it may work better for very large RBMs with a sparse weight matrix

gradient calculation

  • contrastive divergence
  • persistent contrastive divergence
  • EMF approximation by @sphinxteam

weight update options

  • learning rate
  • momentum
  • weight decay
  • sparsity

Given these requirements, I imagine a minimal RBM setup to look something like this (for all but the CLRBM type):

type RBM
    W::Matrix{Float64}
    vbias::Vector{Float64}
    hbias::Vector{Float64}
    gradient::Function
    update!::Function
end

where gradient() and update!() are closures initialized with all needed options. A single fit() method can then use these closures to learn the RBM and provide monitoring.
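A sketch of how such closures might be wired up (the factory name and the two helper functions are illustrative only, not part of the package):

function make_rbm(n_vis::Int, n_hid::Int; lr=0.1, momentum=0.9, n_gibbs=1)
    W = 0.01 * randn(n_hid, n_vis)
    # bake the learning options into closures at construction time
    gradient = (rbm, vis) -> contrastive_divergence(rbm, vis, n_gibbs)         # assumed helper
    update!  = (rbm, grad) -> apply_momentum_and_lr!(rbm, grad, lr, momentum)  # assumed helper
    RBM(W, zeros(n_vis), zeros(n_hid), gradient, update!)
end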

Note that this doesn't restrict implementations to a single type, but instead provides a contract, i.e. it specifies the fields required for compatibility between different methods for learning and inference.

@Rory-Finnegan @eric-tramel Does this make sense to you? Do your models fit into this approach?

Learning rate scaling with visible units

In rbm.jl, the learning rate gets rescaled by the number of visible units in the function fit_batch!.
Is there any particular reason why?
If not, this operation seems a little unexpected and should maybe be removed; users can rescale their learning rate directly if they need to (a sketch is given below).
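For illustration, user-side rescaling might look like this (a sketch; the :lr option key and the direction of the internal rescaling are assumptions, not the package's documented behaviour):

n_vis = size(X, 1)                      # number of visible units (features per observation)
fit(rbm, X, Dict(:lr => 0.1 / n_vis))   # fold the rescaling into the user-chosen learning rate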
