autonn's Introduction

AutoNN

AutoNN is a functional wrapper for MatConvNet, implementing automatic differentiation.

It builds on MatConvNet's low-level functions and Matlab's math operators, to create a modern deep learning API with automatic differentiation at its core. The guiding principles are:

  • Concise syntax for fast research prototyping, mixing math and deep network blocks freely.
  • No boilerplate code to create custom layers, implemented as Matlab functions operating on GPU arrays.
  • Minimal execution kernel for backpropagation, with a focus on speed.

Compared to the SimpleNN and DagNN wrappers for MatConvNet, AutoNN is less verbose and has lower computational overhead.

Requirements

  • A recent Matlab (preferably 2016b onwards, though older versions may also work).
  • MatConvNet (preferably the most recent version, though others may still work).

AutoNN in a nutshell

Defining an objective function with AutoNN is as simple as:

% define inputs and learnable parameters
x = Input() ;
y = Input() ;
w = Param('value', randn(1, 100)) ;
b = Param('value', 0) ;

% combine them using math operators, which define the prediction
prediction = w * x + b ;

% define a loss
loss = sum(sum((prediction - y).^2)) ;

% compile and run the network
net = Net(loss) ;
net.eval({x, rand(100, 1), y, 0.5}) ;

% display parameter derivatives
net.getDer(w)

AutoNN also allows you to use MatConvNet layers and custom functions.

Here's a simplified 20-layer ResNet:

images = Input() ;

% initial convolution
x = vl_nnconv(images, 'size', [3 3 3 64], 'stride', 4) ;

% iterate blocks
for k = 1:20
  % compose a residual block, based on the previous output
  res = vl_nnconv(x, 'size', [3 3 64 64], 'pad', 1) ;  % convolution
  res = vl_nnbnorm(res) ;  % batch-normalization
  res = vl_nnrelu(res) ;  % ReLU
  
  % add it to the previous output
  x = x + res ;
end

% pool features across spatial dimensions, and do final prediction
pooled = mean(mean(x, 1), 2) ;
prediction = vl_nnconv(pooled, 'size', [1 1 64 1000]) ;

All of MatConvNet's layer functions are overloaded, as well as a growing list of Matlab math operators and functions. The derivatives for these functions are defined whenever possible, so that they can be composed to create differentiable models. A full list can be found here.

Finally, there are several classes to aid training, such as standard datasets, solvers, models, and statistics plotting. It is easy to mix them, and you retain full control over the training loop. For example:

% load dataset
dataset = datasets.CIFAR10('/data/cifar') ;

% create solver
solver = solvers.Adam() ;
solver.learningRate = 0.0001 ;

for epoch = 1:100  % iterate epochs
  for batch = dataset.train()  % iterate batches
    % draw samples
    [images, labels] = dataset.get(batch) ;

    % evaluate network to compute gradients
    net.eval({'images', images, 'labels', labels}) ;

    % take one gradient descent step
    solver.step(net) ;
  end
end

Documentation

Tutorial

The easiest way to learn more is to follow this short tutorial. It covers all the basic concepts and a good portion of the API.

Help pages

Comprehensive documentation is available by typing help autonn into the Matlab console. This lists all the classes and methods, with short descriptions, and provides links to other help pages.

Converting SimpleNN/DagNN models

For a quicker start or to load pre-trained models, you may want to import them from the existing wrappers. Check help Layer.fromDagNN.

Examples

The examples directory has heavily-commented samples. These can be grouped in two categories:

  • The minimal examples (in examples/minimal) are very short and self-contained. They are scripts so you can inspect and explore the resulting variables in the command window.

  • The full examples (in examples/cnn and examples/rnn) demonstrate usage of the AutoNN training packages. These include several standard solvers (e.g. Adam, AdaGrad), CNN models (including automatically downloading pre-trained models), and datasets (e.g. ImageNet, CIFAR-10). You can override the parameters on the command window, and experiment with different models and settings.

Screenshots

Some gratuitous screenshots, though the important bits are in the code above really:

[Figure: training diagnostics plot]

[Figure: graph topology plot]

Authors

AutoNN was developed by João F. Henriques at the Visual Geometry Group (VGG), University of Oxford.

We gratefully acknowledge contributions by: Sam Albanie, Ryan Webster, Ankush Gupta, David Novotny, Aravindh Mahendran, Stefano Woerner.

autonn's People

Contributors

albanie, davnov134, jotaf98, ryanwebster90, stefanowoerner

autonn's Issues

Error running minimal examples

Edit: I updated MatConvNet from https://github.com/vlfeat/matconvnet and now it works.

I get an error when running minimal_lstm and minimal_network.

minimal_lstm
Index exceeds matrix dimensions.

Error in vl_argparse (line 94)
value = args{ai+1} ;

Error in vl_nnloss (line 134)
opts = vl_argparse(opts, varargin, 'nonrecursive') ;

Error in Net/eval (line 86)
[out{:}] = layer.func(args{:}) ;

Error in minimal_lstm (line 73)
net.eval({'x', data_x, 'y', data_y}) ;

////////////////////////////

minimal_network
Index exceeds matrix dimensions.

Error in vl_argparse (line 94)
value = args{ai+1} ;

Error in vl_nnloss (line 134)
opts = vl_argparse(opts, varargin, 'nonrecursive') ;

Error in Net/eval (line 86)
[out{:}] = layer.func(args{:}) ;

Error in minimal_network (line 42)
net.eval({'x', data_x(:,:,:,idx), 'y', data_y(idx)}) ;

I am using MatConvNet version 24, win 64, Matlab 2016a

Thank you so much for your help in advance.

autograd/tf.gradients - is there support?

Hello,
I'm trying to implement in MatConvNet/autonn a network that is implemented here in PyTorch and here in TensorFlow. I need to define a network that uses gradients on the fly to calculate some other gradients and updates.
image
Note that access to 2nd order derivative (of loss) is needed at construction time, since

Loss(S; Θ) = F(S_bar; Θ) + β * G(S_breve; Θ') = F(S_bar; Θ) + β * G(S_breve; Θ - α * ∂F(S_bar; Θ)/∂Θ)
∂Loss(S; Θ)/∂Θ = ∂F(S_bar; Θ)/∂Θ + β * ∂G(S_breve; Θ')/∂Θ
∂G(Θ')/∂Θ = ∂G(Θ')/∂Θ' * ∂Θ'/∂Θ, and
∂Θ'/∂Θ = 1 - α * ∂²F(Θ)/∂Θ²
∂Loss(S; Θ)/∂Θ = ∂F(S_bar; Θ)/∂Θ + β * ∂G(S_breve; Θ')/∂Θ' - α*β * ∂G(S_breve; Θ')/∂Θ' * ∂²F(Θ)/∂Θ²

Calculating ∂F(S_bar; Θ)/∂Θ and ∂G(S_breve; Θ')/∂Θ' is simple: just run F(S_bar; Θ) and G(S_breve; Θ') backwards. The problem is how to obtain ∂²F(Θ)/∂Θ²?

In PyTorch, the crucial step is implemented here using autograd.grad during network assembly (before compile!), so that proper differentiation occurs during backpropagation. Here is a TensorFlow implementation using tf.gradients.

Can this be done in autonn and if so - how?
Thx

PS From what I understand, getDer and setDer are run-time methods that provide access to the numeric value of a derivative. I need to use the 2nd order derivative in network construction, so access to the 1st order derivative at build time is needed.
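One possible workaround, sketched in plain Matlab rather than with any AutoNN API: the last term of the gradient only needs the product of ∂G/∂Θ' with the Hessian of F, and that product can be approximated by differencing two first-order gradients, so no build-time access to derivatives is required. Here gradF is a hypothetical stand-in for a routine that returns ∂F/∂Θ (for example, a backward pass at perturbed parameters), and the quadratic F, alpha and epsilon are illustrative values only.

```matlab
% Sketch only: approximate the Hessian-vector product H*v with central
% differences of first-order gradients.
A = [3 1; 1 2] ;                     % symmetric, so the Hessian of F below is exactly A
gradF = @(theta) A * theta ;         % hypothetical gradient of F(theta) = 0.5*theta'*A*theta

theta = [0.5; -1.0] ;                % current parameters (illustrative)
v = [1; 2] ;                         % vector to multiply, e.g. dG/dTheta'
epsilon = 1e-4 ;                     % finite-difference step (assumed value)

% H*v ~ (gradF(theta + eps*v) - gradF(theta - eps*v)) / (2*eps)
Hv = (gradF(theta + epsilon * v) - gradF(theta - epsilon * v)) / (2 * epsilon) ;

alpha = 0.1 ;                        % inner step size from the formula above (assumed value)
metaTerm = v - alpha * Hv ;          % dG/dTheta' * (1 - alpha * d2F/dTheta2)
```

This is a finite-difference approximation, not exact second-order differentiation; for a non-quadratic F its accuracy depends on the choice of epsilon.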

Selective Weight Vector for Loss Func

Hi @jotaf98,
how can we get the indices of the maximum scores in the logits? I want to calculate a verification loss for a classified person.
For example, let scr be the logits of the network. For a given person, fea_vect is multiplied by the corresponding vector W(maxInd, :), where W = Param('value', randn(numPers, feaVectLength)):

[~, maxInd] = max(scr, [], 3) ;
loss_2 = (tanh(W(maxInd, :) * fea_vect) - lbl).^2 ;

But I get an error when using max:
Error using Layer/max
Too many output arguments.

Also, is W trainable over the corresponding vectors?

An Error usage of vl_nnaxpy in SE-ResNet-50-mcn

Hi, @jotaf98 and @albanie
I'm now trying to use the SE-ResNet-50-mcn, SE-ResNeXt-50-32x4d-mcn and SE-BN-Inception-mcn networks. SE-BN-Inception-mcn works well, but the ResNet-type networks give an error in vl_nnaxpy.
When I look in debug mode, I see that "out" and nnaxpy have the same output size, but the assignment fails.

image

I also tried a few things but couldn't get anywhere. Here is the error:

image

Non static matrix creation

I'm under the impression that matrix creation operators (e.g. Layer.randn) should be non-static, because they are often used with size. For example:

x = Input() ;
b = ones(size(x,2),1);
y = x*b;

Modifying these functions to accommodate this would improve the usability of autonn.

Is the function 'vl_argparse' missing?

When I clone this project and try to test an Input() call, line 37 of 'matlab\Input.m' calls a function named 'vl_argparse', which is reported as undefined. The similar function 'vl_argparsepos' (and other parts) has the same problem. So I guess you may have forgotten to push it at some point? :)

Copy weights

Hi
What is the easiest way to copy weights between two AutoNN networks?
Best Regards

KL

My recent work uses deep learning to solve the tracking problem. I need to combine a KL divergence with my loss function. How can I implement the forward and backward propagation of the KL divergence in autonn?
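As a reference for what the forward and backward passes have to compute, here is a minimal sketch in plain Matlab. It assumes a discrete target distribution p with strictly positive entries and a prediction q obtained from a softmax over network scores z; the variable names and values are illustrative and are not part of AutoNN.

```matlab
p = [0.1 0.6 0.3] ;          % target distribution (positive, sums to 1)
z = [0.2 1.5 -0.3] ;         % hypothetical network scores (logits)

% forward: softmax (shifted by the maximum for stability), then KL(p || q)
e = exp(z - max(z)) ;
q = e / sum(e) ;
kl = sum(p .* (log(p) - log(q))) ;

% backward: with q = softmax(z) and sum(p) = 1, dKL/dz simplifies to q - p
dz = q - p ;
```

Since the forward pass uses only exp, log, sum and elementwise arithmetic, the same expression may be composable from overloaded operators; that depends on which overloads the installed version provides.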

Dead Link : Math Functions

A minor issue but there is a dead link on the tutorial page under Math Functions :
" The full list of overloads can be seen with methods('Layer'), or here. "

An error with optimizeGraph function

I found that the optimizeGraph function causes a problem when it is used together with the workspaceNames function.
Under certain circumstances, it seems that optimizeGraph discards the original layer names defined by workspaceNames.

Below is the simplified code that can reproduce the error.
You can run it section-wise, and it will generate the error in the second section.

clear all; clc;

%% Error doesn't occur
S = Input('name','S');

loss_A = sum((S-1).^2,1);
loss_B = sum((S-2).^2,1);
loss_T = mean(loss_A + loss_B, 2);

Layer.workspaceNames();

myNet = Net(loss_T);
myNet.eval({'S',10});
find(contains({myNet.forward.name},'loss_A'))
myNet.getValue('loss_A')

%% Error occurs
% It can be fixed when we configure [opts.optimizeGraph = false] in the function [@Net\compile], or when we put ( - ) sign into the sum function.

S = Input('name','S');

loss_A = - sum((S-1).^2,1);
loss_B = - sum((S-2).^2,1);
loss_T = mean(loss_A + loss_B, 2);

Layer.workspaceNames();

myNet = Net(loss_T);
myNet.eval({'S',10});
find(contains({myNet.forward.name},'loss_A'))
myNet.getValue('loss_A')

As described in the code comments, the error can be avoided either by setting opts.optimizeGraph = false in the function @Net\compile, or by moving the - signs inside the sum functions.

But a cleaner solution would be to retain the original layer names by adding a few lines to the optimizeGraph function.

Different results of AutoNN and DagNN?

Hi all.

I used DagNN to train my NN for image auto white balancing and achieved a final objective of approximately 0.1 (Euclidean distance with the pdist function).

Yesterday I found AutoNN, an impressive wrapper for MatConvNet, and rebuilt my architecture in it. I used the same layer structure and initialized with the same parameters, but the objective is 6~8 after the same number of epochs.

Any idea about what mistake I am making would be appreciated.

Here is the code for the network construction in DagNN and AutoNN:

% DagNN code
opts.batchSize = [];
opts.imageSize = [384 384];
opts.averageImage = zeros(3,1) ;
opts.colorDeviation = zeros(3) ;
opts.cudnnWorkspaceLimit = 4*1024*1024*1204 ; % 4GB
opts = vl_argparse(opts, varargin) ;

net = dagnn.DagNN() ;

% -------------------------------------------------------------------------
% Add input section
% -------------------------------------------------------------------------

% Block #1
net.addLayer('conv1',...
             dagnn.Conv('size', [1 1 3 8], 'hasBias', true, 'stride', [1 1], 'pad', [0 0 0 0]),...
             {'inputimage'},...
             {'conv1'},...
             {'conv1f'  'conv1b'}); 
net.addLayer('relu1',...
             dagnn.ReLU(),...
             {'conv1'},...
             {'relu1'},...
             {});
net.addLayer('pool1',...
             dagnn.Pooling('method', 'max', 'poolSize', [2 2], 'stride', [2 2], 'pad', [0 0 0 0]),...
             {'relu1'},...
             {'pool1'},...
             {});

% Block #2
net.addLayer('conv2',...
             dagnn.Conv('size', [5 5 8 32], 'hasBias', true, 'stride', [1 1], 'pad', [2 2 2 2]),...
             {'pool1'},...
             {'conv2'},...
             {'conv2f'  'conv2b'});
net.addLayer('relu2',...
             dagnn.ReLU(),...
             {'conv2'},...
             {'relu2'},...
             {});
net.addLayer('pool2',...
             dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]),...
             {'relu2'},...
             {'pool2'},...
             {});

% Block #3
net.addLayer('conv3',...
             dagnn.Conv('size', [3 3 32 128], 'hasBias', true, 'stride', [3 3], 'pad', [3 3 3 3]),...
             {'pool2'},...
             {'conv3'},...
             {'conv3f'  'conv3b'});
net.addLayer('relu3',...
             dagnn.ReLU(),...
             {'conv3'},...
             {'relu3'},...
             {});
net.addLayer('pool3',...
             dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]),...
             {'relu3'},...
             {'pool3'},...
             {});

% Block #4
net.addLayer('conv4',...
             dagnn.Conv('size', [1 1 128 256], 'hasBias', true, 'stride', [2 2], 'pad', [0 0 0 0]),...
             {'pool3'},...
             {'conv4'},...
             {'conv4f'  'conv4b'});
net.addLayer('relu4',...
             dagnn.ReLU(),...
             {'conv4'},...
             {'relu4'},...
             {});

% Block #5
net.addLayer('conv5',...
             dagnn.Conv('size', [9 9 256 64], 'hasBias', true, 'stride', [1 1], 'pad', [0 0 0 0]),...
             {'relu4'},...
             {'conv5'},...
             {'conv5f'  'conv5b'}); 
net.addLayer('relu5',...
             dagnn.ReLU(),...
             {'conv5'},...
             {'relu5'},...
             {});

% Block #6
net.addLayer('cat1',...
             dagnn.Concat('dim', 3),...
             {'relu5', 'inputsensor', 'inputgyro'},...
             {'cat1'});
         
% Block #7: Multi-Layer-Perceptron
net.addLayer('fc1',...
             dagnn.Conv('size', [1 1 73 512], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]),...
             {'cat1'},...
             {'fc1'},...
             {'conv6f'  'conv6b'});
net.addLayer('relu6',...
             dagnn.ReLU(),...
             {'fc1'},...
             {'relu6'},...
             {});

% Block #8
net.addLayer('prediction',...
             dagnn.Conv('size', [1 1 512 2], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]),...
             {'relu6'},...
             {'prediction'},...
             {'conv7f'  'conv7b'});

% Block #9: pdist  
net.addLayer('objective',...
             dagnn.PDist('p', 2, 'aggregate', true),...
             {'prediction', 'label'},...
             {'objective'},...
             {}); 

% -------------------------------------------------------------------------
%                                                           Meta parameters
% -------------------------------------------------------------------------

net.meta.imageSize = opts.imageSize ;
net.meta.averageImage = opts.averageImage ;

lr = [0.001*ones(1,3), 0.0005*ones(1,3), 0.0001*ones(1,3), 0.00005*ones(1,3), 0.00001*ones(1,5)] ;
net.meta.trainOpts.learningRate = lr ;
net.meta.trainOpts.numEpochs = numel(lr) ;
net.meta.trainOpts.momentum = 0.9;
net.meta.trainOpts.batchSize = opts.batchSize ;
net.meta.trainOpts.numSubBatches = 1 ;
net.meta.trainOpts.weightDecay = 0.0001 ;

% params init
f = 1/100;
f_ind = net.layers(1).paramIndexes(1);                                             
b_ind = net.layers(1).paramIndexes(2);                                             
net.params(f_ind).value = 10*f*randn(size(net.params(f_ind).value), 'single');     
net.params(f_ind).learningRate = 1;                                                
net.params(f_ind).weightDecay = 1;                                                 
for l=2:length(net.layers)                                                         
	if(strcmp(class(net.layers(l).block), 'dagnn.Conv'))                           
		f_ind = net.layers(l).paramIndexes(1);                                     
		b_ind = net.layers(l).paramIndexes(2);
		[h,w,in,out] = size(net.params(f_ind).value);
		net.params(f_ind).value = f*randn(size(net.params(f_ind).value), 'single');
		net.params(f_ind).learningRate = 1;                                        
		net.params(f_ind).weightDecay = 1;                                         
		net.params(b_ind).value = f*randn(size(net.params(b_ind).value), 'single');
		net.params(b_ind).learningRate = 0.5;  
		net.params(b_ind).weightDecay = 1;
	end
end
% AutoNN code
opts.batchSize = 50;
opts.imageSize = [384 384];
opts.averageImage = zeros(3,1) ;
opts.colorDeviation = zeros(3) ;
opts.cudnnWorkspaceLimit = 4*1024*1024*1204 ; % 4GB
opts.learningRate = [0.001*ones(1,3), 0.0005*ones(1,3), 0.0001*ones(1,3), 0.00005*ones(1,3), 0.00001*ones(1,5)] ;
opts = vl_argparse(opts, varargin) ;

f = 1/100; % initialization parameter
% -------------------------------------------------------------------------
% Add input section
% -------------------------------------------------------------------------

inputimage = Input();
inputsensor = Input();
inputgyro = Input();
label = Input();

% Block #1
% create parameters explicitly
filterSize1 = [1 1 3 8];
filters1 = Param('value', 10*f*randn(filterSize1(1),filterSize1(2),filterSize1(3),filterSize1(4), 'single'), 'learningRate', 1, 'weightDecay', 1);
biases1 = Param('value', zeros(filterSize1(4), 1, 'single'), 'learningRate', 1, 'weightDecay', 1);
conv1 = vl_nnconv(inputimage, filters1, biases1, 'stride', [1 1], 'pad', [0 0 0 0]);
relu1 = vl_nnrelu(conv1);
pool1 = vl_nnpool(relu1, 2, 'stride', 2);

% Block #2
filterSize2 = [5 5 8 32];
filters2 = Param('value', f*randn(filterSize2(1),filterSize2(2),filterSize2(3),filterSize2(4), 'single'), 'learningRate', 1, 'weightDecay', 1);
biases2 = Param('value', f*randn(1,filterSize2(4), 'single'), 'learningRate', 0.5, 'weightDecay', 1);
conv2 = vl_nnconv(pool1, filters2, biases2, 'stride', [1 1], 'pad', [2 2 2 2]);
relu2 = vl_nnrelu(conv2);
pool2 = vl_nnpool(relu2, 2, 'stride', 2);

% Block #3
filterSize3 = [3 3 32 128];
filters3 = Param('value', f*randn(filterSize3(1),filterSize3(2),filterSize3(3),filterSize3(4), 'single'), 'learningRate', 1, 'weightDecay', 1);
biases3 = Param('value', f*randn(1, filterSize3(4), 'single'), 'learningRate', 0.5, 'weightDecay', 1);
conv3 = vl_nnconv(pool2, filters3, biases3, 'stride', [3 3], 'pad', [3 3 3 3]);
relu3 = vl_nnrelu(conv3);
pool3 = vl_nnpool(relu3, 2, 'stride', 2);

% Block #4
filterSize4 = [1 1 128 256];
filters4 = Param('value', f*randn(filterSize4(1),filterSize4(2),filterSize4(3),filterSize4(4), 'single'), 'learningRate', 1, 'weightDecay', 1);
biases4 = Param('value', f*randn(1, filterSize4(4), 'single'), 'learningRate', 0.5, 'weightDecay', 1);
conv4 = vl_nnconv(pool3, filters4, biases4, 'stride', [2 2], 'pad', [0 0 0 0]);
relu4 = vl_nnrelu(conv4);

% Block #5
filterSize5 = [9 9 256 64];
filters5 = Param('value', f*randn(filterSize5(1),filterSize5(2),filterSize5(3),filterSize5(4), 'single'), 'learningRate', 1, 'weightDecay', 1);
biases5 = Param('value', f*randn(1, filterSize5(4), 'single'), 'learningRate', 0.5, 'weightDecay', 1);
conv5 = vl_nnconv(relu4, filters5, biases5, 'stride', [1 1], 'pad', [0 0 0 0]);
relu5 = vl_nnrelu(conv5);

% Block #6: concat
cat6 = cat(3, relu5, inputsensor, inputgyro);

% Block #7: Multi-Layer-Perceptron
filterSize7 = [1 1 73 512];
filters7 = Param('value', f*randn(filterSize7(1),filterSize7(2),filterSize7(3),filterSize7(4), 'single'), 'learningRate', 1, 'weightDecay', 1);
biases7 = Param('value', f*randn(1, filterSize7(4), 'single'), 'learningRate', 0.5, 'weightDecay', 1);
fc7 = vl_nnconv(cat6, filters7, biases7, 'stride', [1 1], 'pad', [0 0 0 0]);
relu7 = vl_nnrelu(fc7);

% Block #8: prediction
filterSize8 = [1 1 512 2];
filters8 = Param('value', f*randn(filterSize8(1),filterSize8(2),filterSize8(3),filterSize8(4), 'single'), 'learningRate', 1, 'weightDecay', 1);
biases8 = Param('value', f*randn(1, filterSize8(4), 'single'), 'learningRate', 0.5, 'weightDecay', 1);
prediction8 = vl_nnconv(relu7, filters8, biases8, 'stride', [1 1], 'pad', [0 0 0 0]);

% Block #9: pdist
objective = vl_nnpdist(prediction8, label, 2, 'aggregate', true);

% layers name assignment
Layer.workspaceNames();

% compile the network
inputimage.gpu = true;
net = Net(objective);

net.meta.imageSize = opts.imageSize ;
net.meta.averageImage = opts.averageImage ;

net.meta.trainOpts.learningRate = opts.learningRate ;
net.meta.trainOpts.numEpochs = numel(opts.learningRate) ;
net.meta.trainOpts.momentum = 0.85 ;
net.meta.trainOpts.batchSize = opts.batchSize ;
net.meta.trainOpts.numSubBatches = 1 ;
net.meta.trainOpts.weightDecay = 0.0001 ;

try
    layer = Layer.fromCompiledNet(net);
    layer{1}.sequentialNames;
    layer{1}.plotPDF();
catch
end

How are W and b randomly initialized?

Hi, in the file vl_nnlstm_params.m, lines 40-41:

W = Param('value', -noise + 2 * noise * rand(4 * d, d + m, 'single')) ;
 b = Param('value', -noise + 2 * noise * rand(4 * d, 1, 'single'));

Where do the dimensions of W and b come from? It seems W is 4d x (d+m), and b is 4d x 1? Thanks.
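A likely reading, assuming the standard LSTM formulation with d hidden units and m input features: the four gates each need a d x (d+m) weight matrix acting on the stacked previous state and input, and stacking the four matrices gives 4d x (d+m), with one bias per gate unit (4d x 1). The sketch below is plain Matlab; the gate order, the [h; x] stacking and the value of noise are assumptions for illustration, not taken from vl_nnlstm_params.m.

```matlab
d = 3 ; m = 2 ;                          % hidden units and input features (illustrative)
noise = 0.1 ;                            % assumed initialization range
W = -noise + 2 * noise * rand(4 * d, d + m, 'single') ;
b = -noise + 2 * noise * rand(4 * d, 1, 'single') ;

h = zeros(d, 1, 'single') ;              % previous hidden state
c = zeros(d, 1, 'single') ;              % previous cell state
x = randn(m, 1, 'single') ;              % current input

gates = W * [h; x] + b ;                 % 4d x 1: four stacked d x 1 pre-activations

sigm = @(t) 1 ./ (1 + exp(-t)) ;
gi = sigm(gates(1:d)) ;                  % input gate (order is an assumption)
gf = sigm(gates(d+1:2*d)) ;              % forget gate
go = sigm(gates(2*d+1:3*d)) ;            % output gate
gg = tanh(gates(3*d+1:4*d)) ;            % candidate cell update

c = gf .* c + gi .* gg ;                 % new cell state, d x 1
h = go .* tanh(c) ;                      % new hidden state, d x 1
```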

softmaxlog implementation in autonn

Hi @jotaf98, I want to implement a new loss function in autonn.
First, I tried to write softmaxlog like this:

xc = vl_nnconv(xc,'size', [1, 1, 4096, 2], 'stride', 1,'pad',0);
lbl = Input();

Xmax = max(xc ,[],3);
tmp_vect = xc - Xmax;
exp_vect = exp(tmp_vect);
obj_sum = Xmax + log(sum(exp_vect,3)) - xc(1,1,lbl);

But I didn't get the same results as vl_nnloss with softmaxlog. How can I write this in autonn?
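One possible source of the mismatch is the indexing xc(1,1,lbl): on a 1 x 1 x C x N batch it selects channels of the first sample only, whereas the loss needs the score of each sample's own label, summed over the batch. The following plain-Matlab sketch computes the numerically stable log-sum-exp form with per-sample label selection; the sizes and label values are illustrative, and it is not claimed to match vl_nnloss in every option.

```matlab
C = 2 ; N = 4 ;                                  % classes and batch size (illustrative)
xc = randn(1, 1, C, N, 'single') ;               % scores, 1 x 1 x C x N
lbl = [1 2 2 1] ;                                % one label in 1..C per sample

Xmax = max(xc, [], 3) ;                          % 1 x 1 x 1 x N
ex = exp(bsxfun(@minus, xc, Xmax)) ;             % shifted exponentials
lse = Xmax + log(sum(ex, 3)) ;                   % log-sum-exp per sample

% pick the score of the true class for each sample with linear indexing
xcFlat = reshape(xc, C, N) ;
idx = sub2ind([C N], lbl, 1:N) ;
trueScore = reshape(xcFlat(idx), 1, 1, 1, N) ;

lossPerSample = lse - trueScore ;                % -log softmax at the true class
loss = sum(lossPerSample(:)) ;                   % summed over the batch
```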

Does it have a fully connected layer?

I found that some commonly used layers, including the convolutional layer (vl_nnconv), the batch normalization layer (vl_nnbnorm), and the activation layer (vl_nnrelu), are defined in separate functions.

But I wonder why it doesn't support a fully connected layer in the same manner, even though we can define one using Param or otherwise.

Transfer Learning for DagNN

Hi again,
Is there an easy way to fine-tune GoogleNet in AutoNN?
Can I use the "deepCopy" or "replace" methods?
Or should I first use the DagNN structure to generate a new network and then convert it to AutoNN?

Architecture diagram print

Hi
How can I display the architecture diagram (or graph topology plot) in AutoNN, as is done by Dagnn.print() or vl_simplenn_display()?
Best Regards

Insert Layers

Hi

Sorry for being so lame, but I am migrating from SimpleNN directly to AutoNN and don't have much knowledge of object-oriented programming either.

With a compiled AutoNN network, is it possible to insert layers, such as adding dropout, or to remove layers such as pooling?

If not, is the only way around it to recreate another network (using Layer.create()) with parameters copied from the original one where required?

Best Regards
Wajahat

adding lstm layer

I'm working on a project using an RNN. I would like to add a new many-to-one LSTM layer. Can anyone suggest a solution?

An error occurred when using the power operator in GPU mode.

I had trouble with the power operator, so I am asking for help.

Imagine a quadratic function regression of y=a*x.^b
What I wanted to do was find the a and b parameter values from known x, y data.
So I declared a and b as Param objects, and formulated the loss function as an MSE.

I ran the code below:

clear all; clc; close all;
x_true = [-3:7e-5:3]';
y_true = 3*x_true.^2;
x = Input('name','x');
y = Input('name','y');
a = Param('value', 1); a.diagnostics = true;
b = Param('value', 1); b.diagnostics = true;
y_hat = a*x.^b;
loss = sum((y-y_hat).^2);

Layer.workspaceNames();
net = Net(loss);
net.useGpu(1);
stats = Stats({'loss'});
solver = solvers.Adam();
solver.learningRate = 1e-2;

for i = 1:1000
    net.eval({'x', x_true, 'y', y_true});
    solver.step(net);
    plotDiagnostics(net,100)
    stats.update(net);
    stats.print();
    stats.push('train');
end

And I got the error message below:

Error using gpuArray/log
LOG: needs to return a complex result, but this is not supported for real input X on the GPU. Use LOG(COMPLEX(X)) instead.

Of course, I tried log(complex(arg)), but it seems that AutoNN doesn't support the complex() operator.
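The message probably comes from the derivative with respect to the exponent: d/db (a*x.^b) = a * x.^b .* log(x), and log(x) is complex wherever x is negative, which x_true = -3:7e-5:3 contains. The plain-Matlab sketch below reproduces this on the CPU and shows one way around it, under the assumption that the model only depends on |x| (as with an even exponent); the sample values are illustrative.

```matlab
x = [-2 -1 0.5 3] ;                 % a few sample inputs, some negative
a = 3 ; b = 2 ;

hasComplexLog = ~isreal(log(x)) ;   % true here, because x has negative entries

% Assumed workaround: fit the model on |x|, so that the log stays real.
% Exact zeros would still give 0 * -Inf = NaN and would need masking.
xp = abs(x) ;
y = a * xp.^b ;                     % forward value
dy_da = xp.^b ;                     % derivative with respect to a
dy_db = a * xp.^b .* log(xp) ;      % derivative with respect to b (real-valued)
```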

Autonn to Dagnn

Hi
How can I convert a saved network from AutoNN to DagNN?
Regards

weighted multi loss derivative

Hi,
In autonn, can we specify a weighted multi-loss like in dagnn:
derOutputs ={'objective1', 0.3,'objective2',0.3,'objective3',1}
Or can we only define the number of loss functions in eval by using derOutput?

If we can assign derOutput as derOutput = [.2 .5 .8], how can we make sure the weights are in the right order?

External Pixel Weight as Input()

Hi,
I'm trying to use AutoNN for segmentation. When I use pixel weights, I get an error.
How can I add the weights as an input? Thanks...

Loss Func:
Weigts_ = Input('W');
loss = sum( sum( Weigts_.*(imgs5 - label).^2 , 2) ,1) ./...
(size(imgs5,1)*size(imgs5,2));

Error :
Struct contents reference from a non-struct array object.
Error in Net/eval (line 69)
if numel(net.inputVarsInfo) < var || ~isequal(net.inputVarsInfo{var}.size,size(net.vars{var}))
Error in seg (line 125)
net.eval({'images', images, 'labels', labels,'W',W_mat }) ;

Memory efficiency

thanks Joao for this awesome repository :)

Autonn saves all variables in network evaluation, including those in layers which do not need the forward pass for derivative computation, like reshape, repmat, sum, etc. All layer derivatives are saved as well. So the convenience of not writing your derivatives by hand comes with a severe memory overhead.

There is no mechanism in place for aggressive deletion of intermediate variables, like in MatConvNet's dagnn package (although there is optimizeVars for vl_nnrelu only). I've submitted a pull request for deletion of derivatives during the backward pass, which only requires a few lines of code.

To implement the full memory savings of MatConvNet, we could also delete variables on the forward pass. I've taken this approach:

(1) add precious property to Layer
(2) add varsFanOut to Net, set varsFanOut during compilation
(3) during eval, update a copy of varsFanOut, then delete the vars with varsFanOut == 0, i.e. vars which are no longer needed
(4) set 'precious' to false for layers like transpose or non-differentiable layers.

The code is here: precious-layers. It saves memory for functions which do not need the forward input at all. Non-differentiable layers (numInputDer = 0) are made non-precious by default during compilation.
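Steps (2) and (3) can be illustrated with a small plain-Matlab simulation. It is a forward-only sketch on a hypothetical four-layer graph; inputsOf, outputOf and the per-variable precious flags are made-up stand-ins and do not reproduce the actual Net or Layer data structures.

```matlab
% Hypothetical graph: layer k reads inputsOf{k} and writes variable outputOf(k)
inputsOf = {1, 2, [2 3], 4} ;
outputOf = [2 3 4 5] ;
numVars  = 5 ;
precious = [true false false false true] ;   % keep the network input and final output

% (2) at compile time: count how many layers read each variable
varsFanOut = zeros(1, numVars) ;
for k = 1:numel(inputsOf)
  in = inputsOf{k} ;
  varsFanOut(in) = varsFanOut(in) + 1 ;
end

% (3) at eval time: decrement a copy, free variables nobody will read again
vars = cell(1, numVars) ;
vars{1} = rand(4, 1) ;
remaining = varsFanOut ;
freed = false(1, numVars) ;
for k = 1:numel(inputsOf)
  in = inputsOf{k} ;
  acc = 0 ;
  for j = in, acc = acc + vars{j} ; end      % stand-in for the layer's forward function
  vars{outputOf(k)} = 2 * acc ;
  remaining(in) = remaining(in) - 1 ;
  done = in(remaining(in) == 0 & ~precious(in)) ;
  vars(done) = {[]} ;                        % release the memory
  freed(done) = true ;
end
```

In a training pass, a variable whose value is needed by a backward function would also have to be marked precious, which is where the size-only dependence of functions such as reshape and sum becomes awkward.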

I've run into the following issue: the commonly used functions repmat, mean, sum, reshape, permute and ipermute do not need the input variables on the backward pass for computation, but they do need their size, and for this reason have to be precious. The solution requires modifying the arguments of the backward functions during forward evaluation, which is messy.

Any ideas? Is there a better approach?

issues in the nutshell example

Hi,

I am new to autonn and looking into the demo, and there might be a problem in the example.
By modifying the definition of prediction to :
prediction = w * x' + b ;
instead of

prediction = w * x + b ;

( and change the def of w to w = Param('value', randn(100,1)) ;)

I am able to run the demo (but then this w*x' produces a square matrix and is nonsense); otherwise, the exception in sum_der (line 229) always occurs.

For the original definition of x and w, the x_sz of line 220 in autonn_der.m is vector [1,1] with only two input arguments, and thus the line 224 would be
dim = find([1,1, 2] ~= 1, 1) ; % which returns 3

Then in line 229,
x_sz(dim) becomes x_sz(3) and it exceeds the max dimension of x_sz.

  • The current version of MatConvNet is 1.0-beta24, and the version of Matlab is R2016b.

Layer/vl_nnconvt works incorrectly when using the 'size' parameter

The help page of vl_nnconvt writes:

F is a SINGLE array of dimension FW x FH x K x FD where (FH,FW)
are the filter height and width, K the number of filters in the
bank, and FD the depth of a filter (the same as the depth of
image X). Filter k is givenby elements F(:,:,k,:); this differ
from VL_NNCONV() where a filter is given by elements
F(:,:,:,k). FD must be the same as the input depth D.

Therefore, when trying to create a convolution transpose layer to go from a layer with (for example) 1024 feature channels to 512 channels by supplying the 'size' parameter, you would do:

deconv1 = vl_nnconvt(relu10, 'size', [2, 2, 512, 1024]);

However, while this creates a filter bank of the correct size, it ends up creating the bias array with a size of 1024x1. This naturally causes the following error:

Error using vl_nnconvt
The number of elements of BIASES is not the same as the dimenison of the filters.

This is presumably caused by the discrepancy between the order of parameters of the methods vl_nnconv and vl_nnconvt (which is emphasized in the help page of vl_nnconvt), looking at the way Layer/vl_nnconvt is written:

% simply create a conv layer first, then switch the function handle

layer = vl_nnconv(varargin{:}) ;
layer.func = @vl_nnconvt ;
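The mismatch can be made concrete with a plain-Matlab size check. It assumes, as the error message suggests, that the bias length is taken from the 4th element of 'size' (the vl_nnconv convention), while vl_nnconvt needs it to match the 3rd; the variable names are illustrative.

```matlab
sz = [2 2 512 1024] ;                         % [FH FW K FD] as passed to 'size'

biasLenConv  = sz(4) ;                        % vl_nnconv: one bias per filter F(:,:,:,k)
biasLenConvt = sz(3) ;                        % vl_nnconvt: one bias per filter F(:,:,k,:)

filters = zeros(sz, 'single') ;               % filter bank with the intended layout
biases  = zeros(biasLenConvt, 1, 'single') ;  % 512 x 1, the length vl_nnconvt expects
```

Until the 'size' handling distinguishes the two conventions, creating the filters and biases explicitly with these sizes and passing them as arguments, instead of using 'size', is one way to sidestep the problem.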

Operator overloading of the function "gradient"

Hi,

Is there any way to overload MATLAB's built-in function "gradient"?

I tried Layer.create and Layer.fromFunction first, but it seems that we cannot use them for this purpose.
I inserted a debug stop mark into the custom function I made to check what's going on inside the function.
And I found that the input argument was fed as a Layer object, not as an array.
So, I guess those customization methods are designed to use the basic overloaded operators only.

The problem is that there are operations extracting elements of arrays inside 'gradient.m', which are currently not supported by AutoNN.

Way to use a network object in parallel

Hi,

I'm curious whether there is a way to use a Net object in parallel.

I tried parfor and parfeval with parallel.pool.Constant, but it does not seem to work.

If you have tried it before and have some example code, please let me know.

I would really appreciate that.

Best regards,

Overloading the sampling functions of probability distributions

Hi,

I'm trying to overload some sampling functions of probability distributions according to Jankowiak(2018) http://arxiv.org/abs/1806.01851

Fortunately, Matlab already supports functions like gamrnd, so all I have to do is overload those functions by modifying Layer.m and autonn_der.m.

However, as you can see in equation (58) of the paper, it requires the original output value of the overloaded function, which is represented as z in the paper. It seems that all the functions in autonn_der.m currently have the form dx = function(x, dy). I don't know how to pass the output value (y) and change it to the form dx = function(x, y, dy).

Can you give me a hint about how I can give the output value in the backpropagation stage?

Eval in test mode gives constant error value 0.5

Hi again, I'm sorry to keep you busy.
First, I have finished the dagg2autonn translation work with the information you provided. Thank you for that. Now I'm working with ResNet. Training works well, but in test mode the error is a constant 0.5 at every iteration. When I use forward mode, testing works. What is wrong in test mode?
Also, in test mode, when I test on the training data, I get the same results.

net.eval({'data', images, 'label', labels}, 'test') ;
net.eval({'data', images, 'label', labels}, 'forward') ;
loss1 : objective
loss2 : error

image (untitled)

Multi Loss Backpropagation Error

Hi again,
I'm writing a network which has two loss functions. I use a pretrained ResNet-50 and feed its single feature vector to both loss functions. This is the code for my last layers:
image

I then set derOutput like this:
image

When I evaluate the net, I get the error below:

image

Is this error caused by the two backward passes both arriving at the same feature vector?

Debug architecture filter size mismatch

I have created an architecture with AutoNN to train.
There is some mismatch between the filter sizes. Therefore, I get the following error:


FILTERS are larger than the DATA (including padding).

Error in Net/eval (line 95)
[out{:}] = layer.func(args{:}) ;

Error in cnn_train_autonn>processEpoch (line 329)
net.eval(inputs, 'normal', params.derOutputs, s ~= 1) ;


How can I find out where in the network this mismatch is?

Can't run the CNN example and debug the code

Hello, I have the following configuration:
1. Windows 7 64-bit
2. Matlab 2017b
3. Visual Studio 2015
4. MatConvNet beta 25
5. CUDA 8.0.61
No cuDNN in MatConvNet (it would not compile, so I gave up).

First I used the MatConvNet release from the official website. I could run autonn/examples/cnn/mnist_example, but when I debugged the code I got the error "MATLAB has encountered an internal problem and needs to close." I thought a newer version might fix it, so I switched to the latest version from GitHub. Now I cannot run autonn/examples/cnn/mnist_example at all: it goes straight to the same error, "MATLAB has encountered an internal problem and needs to close." Thanks.

plotPDF function example

Hello;

Could you please give me an example of using plotPDF?
More clearly, what's the obj argument?
I get the following error when passing a compiled Net to the plotPDF function:

No appropriate method, property, or field 'plotPDF' for class 'Net'.

Thanks.

Class Center Loss Definition

Hi again, I'm here with a new question after only a few days :)

I want to define the last logit layer as a class center loss like exp(-||x-mu1||^2) (the exp is not important at first, so I did not include it). I have 4k classes and wrote this code:
net_auto = Layer.fromDagNN(net_dagx);
W = Param('value',randn(1,1,4096,4000,'single') ,'learningRate',2);
net_1 = sum( (W-net_auto{1}).^2,3);

When I looked inside vl_wsum, I saw these sizes and got an error message:
image

The first size is for W and the second is for the mini-batch data. I thought AutoNN processed the mini-batch one sample at a time, and I wrote all my code on that assumption. I then tried a for loop:
for i = 1:40
net_1a{i} = sum( (W-net_auto{1}(1,1,:,i)).^2,3);
end
net_1 = cat(4,net_1a{:});

This gets past vl_wsum, but now vl_nnloss gives an error.
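
A possible way around the size mismatch, sketched on plain arrays with toy sizes: W has 4000 entries along dimension 4 while the mini-batch (assumed here to be 1x1x4096x40, as the loop over 40 suggests) has 40, so an elementwise subtraction cannot pair them. Moving the classes to a fifth dimension lets Matlab's implicit expansion (2016b onwards) pair every sample with every center. The variable names and sizes below are illustrative, and whether each of these operations is overloaded for Layer objects is an assumption that has not been checked here.

```matlab
% Toy sizes standing in for D = 4096 features, K = 4000 classes, N = 40 samples.
D = 8; K = 5; N = 3;
x = randn(1, 1, D, N, 'single');     % mini-batch of features, samples along dim 4
W = randn(1, 1, D, 1, K, 'single');  % class centers, classes along dim 5

% (1x1xDx1xK) - (1x1xDxN) expands to 1x1xDxNxK;
% summing over dim 3 gives squared distances of size 1x1x1xNxK.
dist = sum((W - x) .^ 2, 3);

% Reorder to a 1x1xKxN layout: one score per class, per sample.
scores = permute(dist, [1 2 5 4 3]);

% Cross-check one entry (class 2, sample 3) against a direct computation.
ref = sum((W(1, 1, :, 1, 2) - x(1, 1, :, 3)) .^ 2);
disp(abs(scores(1, 1, 2, 3) - ref));
```

At the full sizes the expanded intermediate has 4096 x 40 x 4000 single-precision elements, roughly 2.6 GB, so memory may be the limiting factor for this direct form.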
