
tt-pytorch's People

Contributors

alexgrinch, deepsourcebot, elena-orlova, geom-score, khrulkovv


tt-pytorch's Issues

Is there a way to control the compression ratio of tt_embedding?

TT embedding provides a good embedding compression ratio. However, I don't know how to control the compression rate by choosing the shapes and TT-ranks. My goal is to apply tt_embedding to CTR tasks. I have the following two features, each embedded with tt_embedding, but the settings below have led to a large accuracy drop (3%). I wonder if I can only use a small compression rate, e.g., 2x-5x. How can I best set the shapes to achieve a target compression rate? Thanks!

voc_size=820508 emb_size=192
A TT-Matrix of size 900000 x 192, underlying tensorshape: [90, 100, 100] x [4, 6, 8], TT-ranks: [1, 160, 160, 1] on device 'cpu' with compression rate 11.12

voc_size=2903321 emb_size=192
A TT-Matrix of size 3000000 x 192, underlying tensorshape: [125, 150, 160] x [4, 6, 8], TT-ranks: [1, 160, 160, 1] on device 'cpu' with compression rate 24.69
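
For what it's worth, the printed compression rate is just the dense parameter count divided by the total size of the TT cores, so you can search over shapes and TT-ranks for a target ratio. A minimal sketch in plain Python (tt_compression_rate is my own helper name, not part of the library):

def tt_compression_rate(row_modes, col_modes, ranks, n_rows, n_cols):
    """Dense parameter count divided by TT parameter count.

    row_modes, col_modes: mode factorizations, e.g. [90, 100, 100] and [4, 6, 8]
    ranks: TT-ranks including the boundary ones, e.g. [1, 160, 160, 1]
    n_rows, n_cols: the (padded) dense size, e.g. 900000 x 192
    """
    tt_params = sum(
        ranks[k] * m * n * ranks[k + 1]
        for k, (m, n) in enumerate(zip(row_modes, col_modes))
    )
    return n_rows * n_cols / tt_params

# Reproduces the 11.12 reported above:
print(tt_compression_rate([90, 100, 100], [4, 6, 8], [1, 160, 160, 1], 900000, 192))

Since the middle cores (rank x mode_in x mode_out x rank) dominate the count, hitting a low target like 2x-5x mostly means raising the TT-ranks, and a small grid search over this function will find shape/rank settings whose ratio lands in that range.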

`TTLayer` doesn't work with non-contiguous input matrix

Reporting a bug that makes it hard to use your TTLayer in some models (for example, in a Tensor Train GRU).

<ipython-input-38-60c832c24e05> in forward(self, X)
     63             # this is also called d^{(t)}
     64             updated_hidden_state_value = torch.tanh(
---> 65                 self.input_to_updated_hidden_value(X_part)
     66                 + self.hidden_state_to_updated_hidden_value(self.hidden_state * self.reset_gate)
     67             )

~/soft/conda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~/soft/conda/lib/python3.6/site-packages/t3nsor/layers.py in forward(self, x)
    126             return t3.tt_dense_matmul(weight_t, x_t).transpose(0, 1)
    127         else:
--> 128             return t3.tt_dense_matmul(weight_t, x_t).transpose(0, 1) + self.bias

~/soft/conda/lib/python3.6/site-packages/t3nsor/ops.py in tt_dense_matmul(tt_matrix_a, matrix_b)
     77     # data is (K, j0, ..., jd-2) x jd-1 x 1
     78     data = matrix_b.transpose(0, 1)
---> 79     data = data.view(-1, a_raw_shape[1][-1], 1)
     80 
     81     for core_idx in reversed(range(ndims)):

RuntimeError: invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at /opt/conda/conda-bld/pytorch_1549628766161/work/aten/src/THC/generic/THCTensor.cpp:220

I think it shows up for the following reason:

The forward of my model receives a tensor X of size 32 x 85 x 1131, where 32 is the number of samples in the batch, 85 is the number of frames (it's video data), and 1131 is the number of pixels.

Now, my model feeds TTLayer slices like X[:, 0, :], because it first processes the zeroth elements of the sequences in the batch, then the first ones, and so on. Accordingly, X[:, 0, :].is_contiguous() returns False.
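
A minimal workaround (just a sketch): make the slice contiguous on the caller's side, or guard the view inside tt_dense_matmul itself.

# Caller-side: force a contiguous copy before the layer sees the slice.
X_part = X[:, 0, :].contiguous()

# Or inside t3nsor/ops.py (reshape would also work, since it copies
# automatically when the input is non-contiguous):
data = matrix_b.transpose(0, 1).contiguous()
data = data.view(-1, a_raw_shape[1][-1], 1)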

meaning of shape parameter in to_tt_matrix?

I have been trying to use your to_tt_matrix and to_tt_tensor for tensor decomposition in a personal project. However, I do not understand what the additional shape parameter in t3nsor.decompositions.to_tt_matrix() stands for.

Your comments in the code suggest it can also contain None values, which is quite puzzling to me. I would appreciate your answer on this.
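
Not the author, but judging by the TT-Matrix printout in the first issue above ("size 900000 x 192, underlying tensorshape: [90, 100, 100] x [4, 6, 8]"), shape appears to give the factorizations of the row and column dimensions into TT modes. A hedged sketch of that reading (the max_tt_rank keyword is an assumption; check the actual signature in decompositions.py):

import torch
import t3nsor

mat = torch.randn(900000, 192)

# shape[0] factors the rows, shape[1] factors the columns:
# prod([90, 100, 100]) == 900000 and prod([4, 6, 8]) == 192.
# A None entry presumably asks the library to pick that factorization itself.
tt_mat = t3nsor.decompositions.to_tt_matrix(
    mat, shape=[[90, 100, 100], [4, 6, 8]], max_tt_rank=160
)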

In TTLayer weight_t must be declared as parameter, but it isn't

The TTLayer.forward method uses only the weight_t attribute, not weight. Moreover, weight_t is not a reference to weight; it is a copy of weight that is then transposed. Hence, after training, weight and weight_t hold completely different values.

I suspect that either weight shouldn't be a torch parameter at all, or the TensorTrain.transpose method should be changed so that it doesn't produce a copy.

In any case, I attach a patch that fixes the problem of ttlayer.to(device) not correctly transferring the layer to that device. It's a somewhat dirty workaround, but it should be applied until a proper fix is made.

diff --git a/t3nsor/layers.py b/t3nsor/layers.py
index 65af3c3..c1346f1 100644
--- a/t3nsor/layers.py
+++ b/t3nsor/layers.py
@@ -110,8 +110,12 @@ class TTLinear(nn.Module):
 
         self.shape = shape
         self.weight = init.to_parameter()
-        self.parameters = self.weight.parameter
-        self.weight_t = t3.transpose(self.weight)
+        self.weight_parameter = self.weight.parameter
+        # actually weight probably shouldn't be assigned to self and
+        # shouldn't be a parameter because it doesn't participate in self.forward
+
+        self.weight_t = t3.transpose(self.weight).to_parameter()
+        self.weight_t_parameter = self.weight_t.parameter
 
         if bias:
             self.bias = torch.nn.Parameter(1e-2 * torch.ones(out_features))
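
With the patch applied, a quick sanity check that the TT cores now travel with the module (a sketch; the TTLinear constructor arguments shown are assumptions for illustration):

from t3nsor.layers import TTLinear

# Argument names assumed here; adjust to the real signature.
layer = TTLinear(in_features=1024, out_features=256)
layer.to('cuda')

# Before the patch the cores of weight_t were plain tensors and stayed on
# the CPU after .to(...); registered as a parameter, they move along.
print({p.device for p in layer.parameters()})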

High GPU memory consumption

Hi,
I tried to integrate the TTLayer into Transformer-XL;
however, I found that it consumes much more memory than usual.
Did you experience such problems? Do you know any way around this?

(BTW, I also applied a few fixes for multi-GPU training; e.g., tensor-train objects are not moved to the GPU when you call model.to(device), which breaks the model in distributed training.)

svd_fix

[root@gpu-4 sentiment]# mkdir logdir
[root@gpu-4 sentiment]# python train.py --gpu=0 --embed_dim=256 --dataset=imdb --n_epochs=100
Traceback (most recent call last):
  File "train.py", line 42, in <module>
    import t3nsor as t3
  File "../t3nsor/__init__.py", line 7, in <module>
    from t3nsor.decompositions import to_tt_tensor
  File "../t3nsor/decompositions.py", line 5, in <module>
    from t3nsor.utils import svd_fix
ImportError: cannot import name 'svd_fix'

About Accuracy

I ran this:

python train.py --gpu=0 --embed_dim=256 --dataset=imdb --n_epochs=100

(screenshot of the training output)

Is there a problem with the formula used to compute the accuracy?
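
For comparison, a standard accuracy computation in PyTorch looks like this (a generic sketch, not the repo's train.py):

import torch

def accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # logits: (batch, num_classes); labels: (batch,) of class indices.
    preds = logits.argmax(dim=1)
    return (preds == labels).float().mean().item()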

tt_conv is not implemented here

I have read the paper, and it is very interesting work, but I was hoping to use it with convolutional layers, and the Conv layer is not implemented here.

I have searched other GitHub repositories, and there is no PyTorch implementation of the Conv layer either. Can you point me to any code or work that shows how it could be implemented in PyTorch?
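
Not from the paper, but one way to prototype a TT convolution in PyTorch is im2col plus a TT-factorized linear map: nn.Unfold turns the convolution into a matrix multiplication, and the dense nn.Linear below marks the slot where a TT layer such as t3nsor's TTLinear would go. A sketch for stride 1 and no padding:

import torch
import torch.nn as nn

class UnfoldConv2d(nn.Module):
    """Convolution as im2col + linear map; swap self.linear for a TT layer."""
    def __init__(self, c_in: int, c_out: int, k: int):
        super().__init__()
        self.k = k
        self.unfold = nn.Unfold(kernel_size=k)
        # Placeholder for a TT-factorized map over c_in * k * k inputs.
        self.linear = nn.Linear(c_in * k * k, c_out, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, _, h, w = x.shape
        patches = self.unfold(x)                    # (N, c_in*k*k, L)
        out = self.linear(patches.transpose(1, 2))  # (N, L, c_out)
        h_out, w_out = h - self.k + 1, w - self.k + 1
        return out.transpose(1, 2).reshape(n, -1, h_out, w_out)

With self.linear.weight set to conv.weight.view(c_out, -1) this reproduces an nn.Conv2d with the same kernel, so replacing the dense map by a TT-factorized one gives a naive (memory-hungry on the unfold) TT convolution.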
