capsule_text_classification's Issues
hdf5 files for other datasets, such as MR
Could you please provide the hdf5 files for the other datasets? I found that generating the hdf5 file for the MR dataset requires a lot of memory.
For a single label task, how do you handle the output of the model ?
Hello. In your code's experiments on multi-label data, any label whose capsule vector norm exceeds 0.5 is set to 1. This setup clearly does not fit single-label tasks, so I would like to know how you set up the output for single-label tasks. Looking forward to your answer, thanks!
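The thresholding rule described above can be swapped for an argmax in the single-label case. A minimal sketch, assuming `activations` holds the per-class capsule vector lengths (the variable names here are illustrative, not from the repository):

```python
import numpy as np

# activations: per-class capsule vector lengths, shape (batch, num_classes)
activations = np.array([[0.2, 0.7, 0.4],
                        [0.9, 0.1, 0.3]])

# multi-label setting (as described in the issue): threshold the norms at 0.5
multi_label = (activations > 0.5).astype(int)   # [[0, 1, 0], [1, 0, 0]]

# single-label setting: pick the class whose capsule is longest
single_label = activations.argmax(axis=-1)      # [1, 0]
```

Since each example must get exactly one class, the threshold becomes irrelevant and only the relative ordering of the capsule lengths matters.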
Some parameters in the code don't match the paper
For the capsule-A model:
1. N-gram convolutional layer: the paper says the convolution stride is 1, but the code uses 2.
2. Primary layer: regarding the value of C, the code sets it to 32, but my output is prim poses dimension: (25, 99, 1, 16, 16).
Is C actually 16?
PyTorch implementation
Hi, is there a PyTorch implementation of this code?
Orphan Category
On some tasks, running CapsNet directly performs worse than TextCNN. To handle the influence of background noise, you proposed three strategies: Orphan Category, Leaky-Softmax, and Coefficients Amendment. The code seems to contain only the Coefficients Amendment part. Will you upload the code for the other two methods?
Questions about text classification
Hello, I ran into some problems when running your code: a dimension error (I did not change the model part of your code). In the routing step, when computing the coupling coefficients with b = b + K.batch_dot(outputs, u_hat_vecs, [2, 3]), outputs has shape [224, 16, 16, 16] and u_hat_vecs has shape [224, 16, 48, 16], and the error is: Dimensions must be equal, but are 224 and 16 for 'capsule_3/conv2/add_2' (op: 'Add') with input shapes: [224,16,48], [224,16,16,16,48]. Second question: dynamic routing in a capsule network is an iterative process, but you don't seem to truncate backpropagation through it. Should backpropagation be stopped there?
I used TensorFlow matrix multiplication: [224,16,16,16] × [224,16,16,48] gives [224,16,16,48]; I then summed over the second dimension to get [224,16,48] and added it to b, which fixed the routing step. But then the reshape of poses later ran into related problems.
My environment is Python 3.6 and TensorFlow 1.14.0, so this is probably an environment/configuration issue. I want to use your model as a baseline and cite your paper. Is the code free of typos, or is this a Python 3.6 vs. Python 2 issue?
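On the backpropagation-truncation question, here is a framework-agnostic sketch of dynamic routing; the names and shapes are illustrative, and the comment marks where a TensorFlow/Keras implementation would typically wrap the agreement update in tf.stop_gradient. This is my reading of the question, not the repository's code:

```python
import numpy as np

def squash(v, axis=-1, eps=1e-7):
    # shrink v to length in [0, 1) while keeping its direction
    sq = np.sum(v * v, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def dynamic_routing(u_hat, iterations=3):
    # u_hat: prediction vectors, shape (batch, in_caps, out_caps, dim)
    b = np.zeros(u_hat.shape[:-1])  # routing logits (batch, in_caps, out_caps)
    for it in range(iterations):
        # coupling coefficients: softmax over the output capsules
        c = np.exp(b - b.max(axis=2, keepdims=True))
        c = c / c.sum(axis=2, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=1)   # (batch, out_caps, dim)
        v = squash(s)
        if it < iterations - 1:
            # agreement update; a TF/Keras port would wrap u_hat (or this
            # whole update) in tf.stop_gradient here to truncate backprop
            # through the routing iterations
            b = b + np.einsum('bod,biod->bio', v, u_hat)
    return v
```

Whether truncation is needed is debated in the capsule literature; some implementations backpropagate through all routing iterations, others only through the last.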
About MR dataset
When I run your code on the MR dataset, I can't reproduce the results in your paper. Please tell me how you set up the experiment and how you processed the original data.
Where to get the data
Where can I get the data?
What is leaky-softmax
Hello, I have read your paper, but I do not understand leaky-softmax.
Could you give me its equation? Thanks!
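For reference, my understanding of leaky-softmax from the related capsule-network literature (an assumption about the intended formulation, not this repository's code): it is an ordinary softmax computed with one extra zero logit appended, so some routing probability can "leak" to none of the real capsules:

```python
import numpy as np

def leaky_softmax(logits):
    # append a zero "leak" logit, take the softmax over the extended
    # vector, then drop the leak column; the kept probabilities sum to < 1
    leak = np.zeros(logits.shape[:-1] + (1,))
    z = np.concatenate([logits, leak], axis=-1)
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    p = e / e.sum(axis=-1, keepdims=True)
    return p[..., :-1]
```

With all-zero logits over three classes, each kept probability is 0.25 and the remaining 0.25 leaks away, whereas a plain softmax would assign 1/3 to each class.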
The format of the input data
Would you please tell us the format of the input data, i.e., how to use your code on one's own data? Thank you very much.
main.py: error: unrecognized arguments: -- model_type CNN --learning_rate 0.0005
python ./main.py --model_type CNN --learning_rate 0.0005
python ./main.py --model_type capsule-A --learning_rate 0.001
It turned out there was an extra space before --model_type.
How do you split the dataset in the paper?
Could you explain in detail how you split each dataset in the paper, or provide the splits? Some datasets don't have a test set or a validation set.
The loss doesn't change
Hello, after I changed weight_sharing from true to false, the loss didn't change until 100 iterations. After that the model works properly, but the loss decays much more slowly. Can you give me some suggestions? I believe the key issue lies in the squash function, but I don't know how to amend it.
Thank you!
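For context on the squash suspicion above, a common numerically-stabilized form of the squash function adds a small epsilon before the division; a loss that stays flat early in training is sometimes traced to this division being unstable for near-zero vectors. This is a sketch of the standard formulation, not necessarily what the repository uses:

```python
import numpy as np

def squash(v, axis=-1, eps=1e-7):
    # squash(v) = ||v||^2 / (1 + ||v||^2) * v / ||v||, with eps added
    # under the square root so near-zero vectors don't divide by zero
    sq_norm = np.sum(np.square(v), axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * v / np.sqrt(sq_norm + eps)
```

For v = [3, 4] the norm is 5, so the output keeps the direction of v and has length 25/26 ≈ 0.96; short vectors are shrunk toward zero much more aggressively.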
Why can't I reproduce the results in your paper?
Why can't I get the results reported in your paper? The final result never converges. Did you take the maximum value as the final result, or did you modify some hyperparameters? Could we discuss this? Thanks!
ValueError: num_outputs should be int or long, got 9.
Hello, I need help.
Traceback (most recent call last):
File "./main.py", line 167, in
poses, activations = baseline_model_cnn(X_embedding, args.num_classes)
File "D:\PycharmProjects\pythonProject\capsule_text_classification-master\network.py", line 18, in baseline_model_cnn
activations = tf.sigmoid(slim.fully_connected(nets, num_classes, scope='final_layer', activation_fn=None))
File "D:\Anaconda\envs\py27\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 183, in func_with_args
return func(*args, **current_args)
File "D:\Anaconda\envs\py27\lib\site-packages\tensorflow\contrib\layers\python\layers\layers.py", line 1822, in fully_connected
(num_outputs,))
ValueError: num_outputs should be int or long, got 9.
got an error
Thank you for sharing, but when I run your code, there is an error: ValueError: Dimensions must be equal, but are 84840 and 16 for 'capsule_3/conv2/add_2' (op: 'Add') with input shapes: [84840,16,48], [84840,16,16,16,48]. I changed the input data, but nothing else. Could you give me some suggestions?
Code issue
Datasets
Hi, could you share the other datasets?
The dataset link is invalid!
Your dataset link is invalid. Can you fix it and provide a working link?
capsule-B F1 85.8?
In the paper, the capsule-B F1 score on the Reuters-Multilabel dataset is 85.8, but the best score I can get is 83.7.
python ./main.py -- model_type capsule-A --learning_rate 0.001
Epoch: 2 Val accuracy: 89.9% Loss: 0.0612
ER: 0.095 Precision: 0.635 Recall: 0.575 F1: 0.566
Epoch: 3 Val accuracy: 93.3% Loss: 0.0391
ER: 0.594 Precision: 0.912 Recall: 0.770 F1: 0.816
Epoch: 4 Val accuracy: 94.7% Loss: 0.0326
ER: 0.615 Precision: 0.939 Recall: 0.788 F1: 0.837
Epoch: 5 Val accuracy: 95.8% Loss: 0.0299
ER: 0.428 Precision: 0.948 Recall: 0.692 F1: 0.777
Epoch: 6 Val accuracy: 96.0% Loss: 0.0272
ER: 0.348 Precision: 0.958 Recall: 0.661 F1: 0.759
Next update
May I ask when the experiment will be updated?
"Coefficients Amendmen" strategy isn't implement in code
Hi,@andyweizhao :
I found that "Coefficients Amendmen" strategy isn't implement in code.It have been commented out。
"Coefficients Amendmen" strategy can't be improve the performance, so commented out it, isn't it?
Library requirements
Please list your requirements from a working environment by running pip freeze. I tried with the following, but I'm getting theano.tensor.var.AsTensorError: ('Cannot convert Tensor("capsule_3/primary/Reshape:0", shape=(25, 99, 1, 16, 16), dtype=float32) to TensorType', <class 'tensorflow.python.framework.ops.Tensor'>):
absl-py==0.6.0
astor==0.7.1
backports.weakref==1.0.post1
bleach==1.5.0
enum34==1.1.6
funcsigs==1.0.2
futures==3.2.0
gast==0.2.0
grpcio==1.16.0
h5py==2.8.0
html5lib==0.9999999
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.0.5
Markdown==3.0.1
mock==2.0.0
numpy==1.15.3
pbr==5.1.0
protobuf==3.6.1
PyYAML==3.13
scikit-learn==0.17.1
scipy==1.1.0
six==1.11.0
tensorboard==1.11.0
tensorflow==1.4.1
tensorflow-tensorboard==0.4.0
termcolor==1.1.0
Theano==0.8.0
Werkzeug==0.14.1
How should the activation function at line 101 of layer.py be understood?
Thanks for sharing. There is one place in the code I don't understand:
beta_a = _get_weights_wrapper(
name='beta_a', shape=[1, shape[-1]]
)
activations = K.sqrt(K.sum(K.square(poses), axis=-1)) + beta_a
My understanding is that activations is the magnitude of the vector, but beta_a is a randomly generated variable, so why is it added to activations?
I hope you can clarify this when you have time.
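For what it's worth, my reading (an assumption, not the author's answer): beta_a is only randomly *initialized*; it is a trainable per-capsule bias that is learned along with the rest of the network, so the activation is "capsule length plus a learned offset". A numpy sketch of the forward computation, with illustrative shapes:

```python
import numpy as np

poses = np.random.rand(2, 5, 16)   # (batch, num_caps, pose_dim), illustrative
beta_a = np.zeros((1, 5))          # trainable bias in the real model,
                                   # zeros here just for the sketch

# capsule length plus learned offset, mirroring layer.py line 101:
# activations = K.sqrt(K.sum(K.square(poses), axis=-1)) + beta_a
activations = np.sqrt(np.sum(np.square(poses), axis=-1)) + beta_a
```

A learned bias lets the network shift each capsule's activation threshold rather than relying on the raw vector length alone.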
py-2 to py-3, int(num_classes)
capsule_text_classification/network.py
Line 18 in f62ba4b
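slim.fully_connected validates num_outputs against plain integer types, so a class count that arrives as a NumPy integer (or a Python 2 long) fails that check with the ValueError shown in the earlier issue. A minimal sketch of the cast the issue title points at; the helper name is mine:

```python
import numpy as np

def as_num_outputs(n):
    # coerce e.g. np.int64 (or a Python 2 long) to the plain int that
    # slim.fully_connected's num_outputs validation expects
    return int(n)

# e.g. network.py line 18 would pass as_num_outputs(num_classes)
# instead of num_classes directly
```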
About CapsuleConv, FullyConnected layers.
Hello,
I was interested to read the paper.
I would like to clarify the following.
Even though vec_transformationByMat is supposed to be used in the layers according to the paper, vec_transformationByConv is applied in the code instead. It seems that vec_transformationByMat is never used anywhere in the code.
Thanks in advance.