
capslayer's Introduction

CapsLayer: An advanced library for capsule theory

Capsule theory is a line of research proposed by Geoffrey E. Hinton et al. It describes the shortcomings of Convolutional Neural Networks and how capsules could potentially circumvent problems such as the "pixel attack", leading to more robust neural network architectures built from capsule layers.

We expect this theory to contribute to the deep learning field, and we are excited about it. For that reason we are proud to introduce CapsLayer, an advanced library for capsule theory that integrates capsule-relevant technologies, provides relevant analysis tools, develops related application examples, and, probably most important, promotes the development of capsule theory.

This library is based on TensorFlow and has a similar API, but is designed for capsule layers and models.

Features

If you want us to support more features, let us know by opening an issue or sending an email to [email protected]

Documentation

Contributions

Feel free to send a pull request or open an issue.

Citation

If you find it useful, please cite our project using the following BibTeX entry:

@misc{HuadongLiao2017,
title = {CapsLayer: An advanced library for capsule theory},
author = {Huadong Liao and Jiawei He},
year = {2017},
publisher = {GitHub},
journal = {GitHub Project},
howpublished = {\url{http://naturomics.com/CapsLayer}},
}

Note: We are considering writing a paper for this project, but until then, please cite the above BibTeX entry if you find it helpful.

License

Apache 2.0 license.

capslayer's People

Contributors

naturomics, noahdragon, rhrahul, scottwedge, tarrysingh


capslayer's Issues

Performance issues in capslayer/data/datasets/cifar10/reader.py

Hello, I found a performance issue in the definition of __call__(self, batch_size, mode) in
capslayer/data/datasets/cifar10/reader.py:
dataset = dataset.map(parse_fun) is called without num_parallel_calls.
I think adding it would make your program more efficient.

The same issue also exists in three other calls to dataset = dataset.map(parse_fun).

Here is the TensorFlow documentation that supports this.

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.

Parameter selection

Hello, may I take up some of your time? Thank you for the capsule network project. I have fed my data to the system, but the accuracy is only 60%. The data are 180*180 grayscale images. I only need classification (binary), not reconstruction. Which parameters should I focus on and adjust to improve accuracy? Thanks.

ValueError when using cl.layers.conv2d

I am trying to build my own capsule network using this library, but receive a value error from tf.get_variable() in the transforming() function called within the cl.layers.conv2d layer.

Since the variable scope is set within the transforming function I don't think this is something I can fix without poking around in the source code (but happy to be corrected if it is something I have done wrong).

When running:

class basicCapsNet(object):
    ...
        with tf.variable_scope('Conv_Caps'):
            net, activation = cl.layers.conv2d(inputs=net,
                                               activation=activation,
                                               filters=1,
                                               out_caps_dims=[16, 1],
                                               kernel_size=1,
                                               strides=(1, 1),
                                               padding="valid",
                                               routing_method="DynamicRouting",
                                               reuse=reuse)
            self.conv_caps = (net, activation)
            self.voice_mask = net
    ...
model = basicCapsNet(mixed_mag, voice_mag, is_training=False)

I get an error:

ValueError: Variable basic_caps_net/Conv_Caps/conv2d/transforming/transformation_matrix does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

CapsLayer Neural Network Example

Hi naturomics,
Thank you for the CapsLayer API. There are some things I don't understand about it. For example, do we use a pooling layer when creating a neural network? Have you made a sample CapsLayer neural network for the MNIST dataset? If so, would you share it with me? Please help me figure out where to start. Thanks.

cifar10 dataset "Maximum allowed size exceeded" Error

Hi,
Following the tutorials, I prepared the cifar10 dataset. I'm using MatrixCapsNet as the model. After training reached step 499, I got the following error (I reduced the batch size down to 1, but the result is the same). What is the solution?

step: 491, loss: 0.425, time: 0.034 sec/step
step: 492, loss: 0.425, time: 0.034 sec/step
step: 493, loss: 0.427, time: 0.033 sec/step
step: 494, loss: 0.425, time: 0.034 sec/step
step: 495, loss: 0.366, time: 0.033 sec/step
step: 496, loss: 0.433, time: 0.034 sec/step
step: 497, loss: 0.374, time: 0.033 sec/step
step: 498, loss: 0.358, time: 0.034 sec/step
step: 499, loss: 0.382, time: 0.033 sec/step
evaluating, it will take a while...
Traceback (most recent call last):
File "main.py", line 237, in <module>
tf.app.run()
File "/home/atakan/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "main.py", line 231, in main
train(net, data_loader)
File "main.py", line 137, in train
plot_activation(np.hstack((probs, targets)), step=step, save_to=path)
File "/home/atakan/.local/lib/python3.6/site-packages/capslayer-0.1.5-py3.6.egg/capslayer/plotlib/figure.py", line 52, in plot_activation
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/pyplot.py", line 697, in savefig
res = fig.savefig(*args, **kwargs)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/figure.py", line 1573, in savefig
self.canvas.print_figure(*args, **kwargs)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/backends/backend_qt5agg.py", line 222, in print_figure
FigureCanvasAgg.print_figure(self, *args, **kwargs)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/backend_bases.py", line 2252, in print_figure
**kwargs)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 545, in print_png
FigureCanvasAgg.draw(self)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 464, in draw
self.figure.draw(self.renderer)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/figure.py", line 1144, in draw
renderer, self, dsu, self.suppressComposite)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images
a.draw(renderer)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_base.py", line 2426, in draw
mimage._draw_list_compositing_images(renderer, self, dsu)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images
a.draw(renderer)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/axis.py", line 1136, in draw
ticks_to_draw = self._update_ticks(renderer)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/axis.py", line 969, in _update_ticks
tick_tups = [t for t in self.iter_ticks()]
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/axis.py", line 969, in <listcomp>
tick_tups = [t for t in self.iter_ticks()]
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/axis.py", line 912, in iter_ticks
majorLocs = self.major.locator()
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/ticker.py", line 1367, in __call__
return self.tick_values(dmin, dmax)
File "/home/atakan/anaconda3/lib/python3.6/site-packages/matplotlib/ticker.py", line 1371, in tick_values
np.arange(vmin + self.offset, vmax + 1, self._base))
ValueError: Maximum allowed size exceeded

Routing by agreement with Transformer-based for NMT

Hello all :)

I’m trying to use routing by agreement with a Transformer-based model for an NMT task. The idea is to use the output of each attention head as an input capsule for a capsule network, fusing the semantic and spatial information from different heads to help improve the correctness of the output sentence. As below:

routing

The implementation code is here, and Pytorch issue is here.

I have got very bad results. I would appreciate any suggestions to work on.

I look forward to your feedback.

Support for MSCOCO dataset

How can we apply this API on the MSCOCO dataset?
In the MNIST dataset, we have one label for each image, but in MSCOCO each image has multiple labels (categories) or multiple captions. How should the network be modified in this case?

Prediction problem

Amazing work!
But there is one strange thing that I can't figure out. I'm trying to train on my own dataset, which is almost MNIST but slightly extended to 21 symbols. The records are exactly like MNIST, as is the loader, and so on, but one random class is never predicted (for example, 7). I have no idea why. The output contains 21 labels, but the 7th label always has very, very small values.

CapsNet on RGB 256*256 data

Could you please tell us how to apply your network to classify 256*256 RGB data?

The structure of my dataset is:

Train:
    Class 1:
        0001.jpg
        0002.jpg
        0003.jpg
    Class 2:
        0001.jpg
        0002.jpg
        0003.jpg
Thank you so much in advance.
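
As a starting point for a folder layout like the one above, the sketch below walks the class sub-folders and assigns each image an integer label. It uses only the Python standard library. The helper name list_labeled_images is hypothetical and is not part of CapsLayer, and decoding the images and feeding them to a model is left out.

```python
import os

def list_labeled_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return (class_names, samples) for a root folder with one sub-folder per class.

    samples is a list of (file_path, label_index) pairs; labels follow the
    alphabetical order of the class folder names.
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    samples = []
    for label, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(extensions):
                samples.append((os.path.join(class_dir, file_name), label))
    return class_names, samples
```

Calling list_labeled_images("Train") on the structure shown would give two class names and six (path, label) pairs, which can then be passed to whatever image decoding and batching code the model expects.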

In the provided MNIST example not all capsules are seeing activation probability

Hello naturomics! Thank you for this great code that is helping me understand capsule networks.
In the documentation I saw that all of the capsules show activation, but when I ran the code 3 capsules were not activated.
I ran the code multiple times and saw that around 2 to 4 capsules have no activation probability.
I've attached pictures after 500 and 49500 steps, along with the example activation chart you provided.
Could you please help me solve this issue?

activation_500
activation_49500
results_mnist_vecCapsNetactivations

Error "Expected binary or unicode string, got None" While trying to predict model output

I am trying to use capslayer in Tensorflow (1.4.0) estimator API. Everything works well when I train the model. But for prediction it doesn't work. I have copied most of the code from vectorCapsNet.py file and commented all image reconstruction related code. My model definition looks like this:

def baseline(x, params, is_training):
  x = layers.batch_norm(x, is_training=is_training)
  conv1 = tf.contrib.layers.conv2d(x,
                                 num_outputs=256,
                                 kernel_size=9,
                                 stride=1, padding='VALID')
  primaryCaps, activation = capslayer.layers.primaryCaps(conv1,
                                                       filters=32,
                                                       kernel_size=9,
                                                       strides=2,
                                                       out_caps_shape=[8, 1])

  primaryCaps = tf.reshape(primaryCaps, shape=[params.batch_size, -1, 8, 1])

  digitCaps, activation = capslayer.layers.fully_connected(primaryCaps, 
                                                         activation, 
                                                         num_outputs=params.num_classes, 
                                                         out_caps_shape=[16, 1], 
                                                         routing_method='DynamicRouting')


  return digitCaps, activation

Error Description:
TypeError: Expected binary or unicode string, got None
During handling of the above exception, another exception occurred:
TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [None, 6, 6, 32, 8, 1]. Consider casting elements to a supported type.
(For the second error, it points to this line.)

I am new to TensorFlow, but it seems I am getting a [None, 6, 6]-shaped output from the conv1 layer defined in the baseline function. If it's supposed to be like that, maybe we should use this method for reshaping the tensor?

Adding layer

Hey,
As you might remember, I opened an issue on your other CapsNet repo and asked about adding layers. I used this repo (CapsLayer) and saw the same thing. Say I just add one fully connected layer (similar to DigitCaps but with a different number of capsules, say 20). This is the only part of the code that I'm changing:

# return digitCaps: [batch_size, 20, 16, 1], activation: [batch_size, num_label]
with tf.variable_scope('DigitCaps_layer1'):
    primaryCaps = tf.reshape(primaryCaps, shape=[cfg.batch_size, -1, 8, 1])
    digitCaps1, activation = capslayer.layers.fully_connected(primaryCaps, prim_act, num_outputs=20,
                                                              out_caps_shape=[16, 1], routing_method='DynamicRouting')

# return digitCaps: [batch_size, num_label, 16, 1], activation: [batch_size, num_label]
with tf.variable_scope('DigitCaps_layer2'):
    self.digitCaps, self.activation = capslayer.layers.fully_connected(digitCaps1, activation,
                                                                       num_outputs=10, out_caps_shape=[16, 1], routing_method='DynamicRouting')

The training and validation accuracy go down to around 10%!
I thought it might be because the number of parameters increases from 8 million to 10 million when this single layer is added.
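
A rough check of that parameter estimate, as a sketch only: it assumes one unshared in_dim x out_dim transformation matrix per (input capsule, output capsule) pair, as in the dynamic routing paper, and 6*6*32 = 1152 primary capsules for MNIST-sized input. CapsLayer's fully_connected may count weights differently, and biases and the reconstruction network are ignored.

```python
def fc_capsule_params(num_in_caps, in_dim, num_out_caps, out_dim):
    """Weights in a capsule layer with one in_dim x out_dim matrix per (input, output) pair."""
    return num_in_caps * num_out_caps * in_dim * out_dim

primary_caps = 6 * 6 * 32  # 1152 capsules, assumed MNIST-sized primary layer

# Original model: primary capsules (8-d) -> 10 digit capsules (16-d).
single_layer = fc_capsule_params(primary_caps, 8, 10, 16)

# Modified model: primary capsules -> 20 capsules (16-d) -> 10 capsules (16-d).
stacked = (fc_capsule_params(primary_caps, 8, 20, 16)
           + fc_capsule_params(20, 16, 10, 16))

print(single_layer)            # 1474560
print(stacked)                 # 3000320
print(stacked - single_layer)  # 1525760
```

Under these assumptions the extra layer adds roughly 1.5 million weights, which is in line with the jump reported above. The count alone does not explain the drop to chance-level accuracy.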

i have issue with cifar10

When I trained on the cifar10 dataset, the training accuracy was unsatisfactory: the maximum accuracy is 65%.
I want to know whether this is the expected accuracy.

Performance issues in /capslayer/data/datasets (by P3)

Hello! I've found a performance issue in /capslayer/data/datasets: batch() should be called before map(), which could make your program more efficient. Here is the TensorFlow documentation that supports it.

A detailed description is listed below:

  • /cifar10/reader.py: dataset.batch(batch_size)(here) should be called before dataset.map(parse_fun)(here).
  • /fashion_mnist/reader.py: dataset.batch(batch_size)(here) should be called before dataset.map(parse_fun)(here).
  • /mnist/reader.py: dataset.batch(batch_size)(here) should be called before dataset.map(parse_fun)(here).
  • /cifar100/reader.py: dataset.batch(batch_size)(here) should be called before dataset.map(parse_fun)(here).

Besides, you need to check whether the function called in map() (e.g., parse_fun in dataset.map(parse_fun)) is affected, so that the changed code works properly. For example, if parse_fun takes data with shape (x, y, z) as input before the fix, it would need to take data with shape (batch_size, x, y, z) after the fix.
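
The shape caveat can be illustrated without TensorFlow. The sketch below is a plain-Python stand-in for the two orderings, not tf.data code: parse_one, parse_batch and batch are hypothetical helpers, and the "images" are tiny made-up lists of pixel values.

```python
def parse_one(image):
    """Per-example parse: scale one flat image from [0, 255] to [0, 1]."""
    return [pixel / 255.0 for pixel in image]

def parse_batch(images):
    """Batched parse: same scaling, but the input has a leading batch axis."""
    return [[pixel / 255.0 for pixel in image] for image in images]

def batch(items, batch_size):
    """Group a list into consecutive chunks of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

images = [[0, 51, 255], [102, 153, 204], [255, 0, 0]]

# map() then batch(): the parse function is called once per example.
map_then_batch = batch([parse_one(img) for img in images], batch_size=2)

# batch() then map(): the parse function is called once per batch.
batch_then_map = [parse_batch(chunk) for chunk in batch(images, batch_size=2)]

print(map_then_batch == batch_then_map)  # True
```

Both orderings give the same result only because parse_batch was rewritten to handle the extra leading axis. Passing parse_one to the batched pipeline unchanged would fail, which is the check the issue asks for.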

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.

Hard coded DigitCaps num_outputs?

self.digitCaps, self.activation = capslayer.layers.fully_connected(primaryCaps, activation, num_outputs=10, out_caps_shape=[16, 1], routing_method='DynamicRouting')

Currently, num_outputs for the DigitCaps fully connected layer is hard coded to 10, which I believe represents the number of logits found in MNIST. For model portability, should this perhaps be set to num_outputs=self.num_label?

While on the topic, I don't think I understand the 16 in out_caps_shape=[16, 1], but I haven't really grokked the original paper either. Is there some performance or accuracy benefit in twiddling with the out_caps_shape parameter?

Multi-label classification

Thank you for this project. Is it possible to do a multi-label classification with CapsNet such that the softmax output can predict multiple classes for each input image?

Thanks so much,
Abby
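
One way to think about this question, as a sketch rather than a CapsLayer feature: a softmax forces the class scores to sum to 1, so it cannot be confident about several classes at once, whereas per-class capsule activations in [0, 1] can each be compared with a threshold independently. The activation values and the 0.5 threshold below are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax; the outputs always sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multilabel_predict(activations, threshold=0.5):
    """Return every class index whose activation reaches the threshold."""
    return [i for i, a in enumerate(activations) if a >= threshold]

activations = [0.05, 0.91, 0.10, 0.87]  # made-up capsule activations in [0, 1]
print(multilabel_predict(activations))  # [1, 3]
```

Training for this setting would also need a loss that treats each class independently (for example a per-class margin or sigmoid loss) and multi-hot labels; that part is not shown here.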

If I use another dataset with different dimensions

Do I only have to change the dimension in line 124 of main.py? Is that the only dimension that needs to be changed?

I'm trying to train this with the FER2013 face emotion dataset, which has images of size 48x48. I load the dataset properly and have changed line 124 accordingly, but I don't see a good training trend such as rising accuracy!
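
When the input size changes, the number of primary capsules changes with it, so any hard-coded reshape or capsule count has to follow. The sketch below estimates that count, assuming two 9x9 'VALID' convolutions with strides 1 and 2 and 32 primary capsule filters, as in the vector CapsNet model. Those layer settings are an assumption here; check them against the model actually being trained.

```python
def valid_conv_out(size, kernel, stride):
    """Output width of a convolution with 'VALID' padding."""
    return (size - kernel) // stride + 1

def num_primary_capsules(image_size, filters=32):
    """Grid width and capsule count after conv (k=9, s=1) then primary caps (k=9, s=2)."""
    after_conv1 = valid_conv_out(image_size, kernel=9, stride=1)
    grid = valid_conv_out(after_conv1, kernel=9, stride=2)
    return grid, grid * grid * filters

print(num_primary_capsules(28))  # (6, 1152)
print(num_primary_capsules(48))  # (16, 8192)
```

Under these assumptions a 48x48 input produces about seven times as many primary capsules as a 28x28 MNIST image, which also increases memory use and routing cost.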

fashion mnist support

Hi,

first of all, thanks for this great project!

I noticed that in models/main.py (line 206) the fashion mnist dataset is called 'fashion-mnist'.

    # Deciding which dataset to use
    if cfg.dataset == 'mnist' or cfg.dataset == 'fashion-mnist':

But in capslayer/data/datasets/fashion_mnist it is called 'fashion_mnist', so main.py doesn't run with the fashion mnist dataset.
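
One possible way to make such a check tolerant of both spellings is to normalize the name before comparing it. This is only a sketch: normalize_dataset_name is a hypothetical helper, not something that exists in CapsLayer.

```python
def normalize_dataset_name(name):
    """Map spellings such as 'fashion-mnist' to the package-style 'fashion_mnist'."""
    return name.strip().lower().replace("-", "_")

for raw in ("mnist", "fashion-mnist", "Fashion_MNIST"):
    print(raw, "->", normalize_dataset_name(raw))
```

The comparison in main.py could then test the normalized value against 'mnist' and 'fashion_mnist', so either spelling in the configuration selects the same dataset package.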

Another thing:
in capslayer/data/datasets/fashion_mnist/writer.py MNIST_FILES should be FASHION_MNIST_FILES:

def load_fashion_mnist(path, split):
    split = split.lower()
    image_file, label_file = [os.path.join(path, file_name) for file_name in MNIST_FILES[split]]

Question for E_step in EM Routing

In the function E_step in the file CapsLayer-master/capslayer/core/routing.py

The Gaussian probability is calculated as follows:
normalized_vote = cl.divide(tf.square(vote - pose), 2 * tf.exp(log_var))
log_probs = normalized_vote + cl.log(2 * np.pi) + log_var
log_probs = -0.5 * cl.reduce_sum(log_probs, axis=-1, keepdims=True)

Since you are pulling out the -0.5 in the third line, shouldn't the first line have tf.exp(log_var) instead of 2 * tf.exp(log_var)?

Also in the following two lines:
log_activation_logit = log_activation + log_probs
log_activation_logit = log_probs

I think the second line should not be there.

I'm just starting to read about capsule net and I'm trying to match the implementation with the pseudocode in the paper so I could be wrong...
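
The factor-of-2 question can be checked numerically for a single scalar dimension. The sketch below transcribes the three quoted lines literally into plain Python, alongside the textbook form log N(v; mu, var) = -0.5 * ((v - mu)^2 / var + log(2*pi) + log var), and compares both with the standard library's NormalDist. The input values are arbitrary, and this says nothing about how the rest of E_step uses the result.

```python
import math
from statistics import NormalDist

def log_prob_as_posted(vote, pose, log_var):
    """Literal scalar transcription of the three lines quoted in the question."""
    normalized_vote = (vote - pose) ** 2 / (2 * math.exp(log_var))
    return -0.5 * (normalized_vote + math.log(2 * math.pi) + log_var)

def log_prob_reference(vote, pose, log_var):
    """Textbook Gaussian log-density, with the 1/2 applied only once."""
    normalized_vote = (vote - pose) ** 2 / math.exp(log_var)
    return -0.5 * (normalized_vote + math.log(2 * math.pi) + log_var)

vote, pose, log_var = 1.3, 0.4, math.log(0.25)
exact = math.log(NormalDist(mu=pose, sigma=math.sqrt(0.25)).pdf(vote))

print(abs(log_prob_reference(vote, pose, log_var) - exact) < 1e-9)  # True
print(abs(log_prob_as_posted(vote, pose, log_var) - exact) < 1e-9)  # False
```

Taken at face value, the quoted lines halve the squared-distance term twice, which is consistent with the suspicion raised in the question.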

ModuleNotFoundError: No module named 'capslayer'

I cannot work out how to import and use the layers functions (for building my own model). import CapsLayer works, so the module is installed and in the path, but if I try:

import CapsLayer.capslayer.layers.layers as cl #(or any other combination I can think of)

It results in:

ModuleNotFoundError Traceback (most recent call last)
in ()
----> 1 import CapsLayer.capslayer.layers.layers as cl

C:\Program Files\Anaconda3\lib\site-packages\CapsLayer\capslayer\__init__.py in ()
1 from __future__ import absolute_import
2
----> 3 from capslayer import layers
4 from capslayer import data
5 from capslayer.ops import losses

ModuleNotFoundError: No module named 'capslayer'

CapsNet on 227*227 data and 196 classes

I'm trying to train this with 227*227 images in 196 classes. I wrote my own load_data(), but during training the error "Resource exhausted, OOM when allocating tensor with shape [128,3872,196,16,1]" occurs (128 is the batch size).
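
To see why this allocation fails, it helps to work out the size of that one tensor. The sketch below assumes 4 bytes per element (float32), which the error message does not state. The smaller batch size of 8 is just an example value.

```python
import math

def tensor_bytes(shape, bytes_per_element=4):
    """Memory for one dense tensor; 4 bytes per element assumes float32."""
    return math.prod(shape) * bytes_per_element

size = tensor_bytes([128, 3872, 196, 16, 1])
print(size)                    # 6217007104
print(round(size / 2**30, 2))  # 5.79

# Same tensor with a batch size of 8 instead of 128.
print(round(tensor_bytes([8, 3872, 196, 16, 1]) / 2**30, 2))  # 0.36
```

At about 5.8 GiB for a single intermediate tensor, before gradients and routing iterations are counted, reducing the batch size, the number of primary capsules (3872), or the input resolution are the obvious levers.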

Variable batch size problem

I'm trying to integrate this capsule implementation with an RNN. I'm using the latest version of TensorFlow.

I'm getting errors like this:

Traceback (most recent call last):
  File "train.py", line 46, in <module>
    Inputs = CAPSULE_NET(x_expanded, phase_train, 'CAPSULE_NET_1')
  File "/home/user/Testing/DeepLearning/Systems/Experimental/Capsule/capsulenet.py", line 43, in CAPSULE_NET
    primaryCapsules, activation = primaryCaps(conv1, method='logistic', filters=32, kernel_size=9, strides=2, out_caps_shape=[8, 1])
  File "/home/user/Testing/DeepLearning/Systems/Experimental/Capsule/layers.py", line 91, in primaryCaps
    pose = tf.reshape(pose, shape=pose_shape)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3997, in reshape
    "Reshape", tensor=tensor, shape=shape, name=name)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 513, in _apply_op_helper
    raise err
TypeError: Failed to convert object of type <type 'list'> to Tensor. Contents: [None, 24, 24, 32, 8, 1]. Consider casting elements to a supported type.

This can be resolved by changing

pose_shape = pose.get_shape().as_list()[:3] + [filters] + out_caps_shape

to

pose_shape = np.array([-1] + pose.get_shape().as_list()[1:3] + [filters] + out_caps_shape, dtype=np.int32)

The tensor has the following shape: [None, 24, 24, 32, 8, 1]. The batch size is variable, which is why it is None. I tried fixing many of these problems in the code, all related to the batch size being a None TensorFlow dimension (or -1, as in NumPy). I'm now stuck on some of these problems in the EM part.
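
The shape logic behind that workaround can be written as a small plain-Python helper. This is a sketch: dynamic_reshape_target is a hypothetical name, static_shape stands in for the result of pose.get_shape().as_list(), and the channel count of 256 in the example is made up.

```python
def dynamic_reshape_target(static_shape, filters, out_caps_shape):
    """Build a reshape target, using -1 for an unknown (None) batch dimension."""
    batch = -1 if static_shape[0] is None else static_shape[0]
    return [batch] + list(static_shape[1:3]) + [filters] + list(out_caps_shape)

print(dynamic_reshape_target([None, 24, 24, 256], filters=32, out_caps_shape=[8, 1]))
# [-1, 24, 24, 32, 8, 1]
print(dynamic_reshape_target([64, 24, 24, 256], filters=32, out_caps_shape=[8, 1]))
# [64, 24, 24, 32, 8, 1]
```

tf.reshape accepts at most one -1 in the target shape and infers it from the remaining dimensions, so this only works where the batch axis is the single unknown dimension.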

What do you guys think can be done to properly support variable batch size?

Thanks
