
tt-pytorch's People

Contributors

alexgrinch, deepsourcebot, elena-orlova, geom-score, khrulkovv


tt-pytorch's Issues

Is there a way to control the compression ratio of tt_embedding?

TT embedding provides a good embedding compression ratio. However, I don't know how to control the compression rate by choosing the shapes and TT-ranks. My goal is to apply tt_embedding to CTR tasks. I have the following two features, each embedded with tt_embedding, but the settings below have led to a large accuracy drop (3%). I wonder if I can only use a small compression rate, e.g., 2x-5x. How can I best set the shapes to achieve a target compression rate? Thanks!

voc_size=820508 emb_size=192
A TT-Matrix of size 900000 x 192, underlying tensorshape: [90, 100, 100] x [4, 6, 8], TT-ranks: [1, 160, 160, 1] on device 'cpu' with compression rate 11.12

voc_size=2903321 emb_size=192
A TT-Matrix of size 3000000 x 192, underlying tensorshape: [125, 150, 160] x [4, 6, 8], TT-ranks: [1, 160, 160, 1] on device 'cpu' with compression rate 24.69
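
For what it's worth, the printed compression rate is just the dense parameter count divided by the total size of the TT cores, so you can search over shapes and TT-ranks for a target ratio. A minimal sketch in plain Python (tt_compression_rate is my own helper name, not part of the library):

def tt_compression_rate(row_modes, col_modes, ranks, n_rows, n_cols):
    """Dense parameter count divided by TT parameter count.

    row_modes, col_modes: mode factorizations, e.g. [90, 100, 100] and [4, 6, 8]
    ranks: TT-ranks including the boundary ones, e.g. [1, 160, 160, 1]
    n_rows, n_cols: the (padded) dense size, e.g. 900000 x 192
    """
    tt_params = sum(
        ranks[k] * m * n * ranks[k + 1]
        for k, (m, n) in enumerate(zip(row_modes, col_modes))
    )
    return n_rows * n_cols / tt_params

# Reproduces the 11.12 reported above:
print(tt_compression_rate([90, 100, 100], [4, 6, 8], [1, 160, 160, 1], 900000, 192))

Since the middle cores (rank x mode_in x mode_out x rank) dominate the count, hitting a low target like 2x-5x mostly means raising the TT-ranks, and a small grid search over this function will find shape/rank settings whose ratio lands in that range.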

`TTLayer` doesn't work with non-contiguous input matrix

Reporting a bug that makes it hard to use your TTLayer in some models (for example, in a Tensor Train GRU).

<ipython-input-38-60c832c24e05> in forward(self, X)
     63             # this is also called d^{(t)}
     64             updated_hidden_state_value = torch.tanh(
---> 65                 self.input_to_updated_hidden_value(X_part)
     66                 + self.hidden_state_to_updated_hidden_value(self.hidden_state * self.reset_gate)
     67             )

~/soft/conda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~/soft/conda/lib/python3.6/site-packages/t3nsor/layers.py in forward(self, x)
    126             return t3.tt_dense_matmul(weight_t, x_t).transpose(0, 1)
    127         else:
--> 128             return t3.tt_dense_matmul(weight_t, x_t).transpose(0, 1) + self.bias

~/soft/conda/lib/python3.6/site-packages/t3nsor/ops.py in tt_dense_matmul(tt_matrix_a, matrix_b)
     77     # data is (K, j0, ..., jd-2) x jd-1 x 1
     78     data = matrix_b.transpose(0, 1)
---> 79     data = data.view(-1, a_raw_shape[1][-1], 1)
     80 
     81     for core_idx in reversed(range(ndims)):

RuntimeError: invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at /opt/conda/conda-bld/pytorch_1549628766161/work/aten/src/THC/generic/THCTensor.cpp:220

I think it shows up for the following reason:

The forward of my model receives a tensor X of size 32 x 85 x 1131, where 32 is the number of samples in the batch, 85 is the number of frames (it's video data), and 1131 is the number of pixels.

Now, my model feeds TTLayer slices like X[:, 0, :], because it first processes the zeroth elements of the sequences in the batch, then the first ones, and so on. Accordingly, X[:, 0, :].is_contiguous() returns False.
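
A minimal workaround (just a sketch): make the slice contiguous on the caller's side, or guard the view inside tt_dense_matmul itself.

# Caller-side: force a contiguous copy before the layer sees the slice.
X_part = X[:, 0, :].contiguous()

# Or inside t3nsor/ops.py (reshape would also work, since it copies
# automatically when the input is non-contiguous):
data = matrix_b.transpose(0, 1).contiguous()
data = data.view(-1, a_raw_shape[1][-1], 1)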

meaning of shape parameter in to_tt_matrix?

I have been trying to use your to_tt_matrix and to_tt_tensor for tensor decomposition in a personal project. However, I do not understand what the additional shape parameter in t3nsor.decompositions.to_tt_matrix() stands for.

Your comments in the code suggest it can also contain None values, which is quite puzzling to me. I would appreciate your answer on this.
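
Not the author, but judging by the TT-Matrix printout in the first issue above ("size 900000 x 192, underlying tensorshape: [90, 100, 100] x [4, 6, 8]"), shape appears to give the factorizations of the row and column dimensions into TT modes. A hedged sketch of that reading (the max_tt_rank keyword is an assumption; check the actual signature in decompositions.py):

import torch
import t3nsor

mat = torch.randn(900000, 192)

# shape[0] factors the rows, shape[1] factors the columns:
# prod([90, 100, 100]) == 900000 and prod([4, 6, 8]) == 192.
# A None entry presumably asks the library to pick that factorization itself.
tt_mat = t3nsor.decompositions.to_tt_matrix(
    mat, shape=[[90, 100, 100], [4, 6, 8]], max_tt_rank=160
)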

In TTLayer weight_t must be declared as parameter, but it isn't

The TTLayer.forward method uses only the weight_t attribute, not weight. Moreover, weight_t is not a reference to weight; it is a copy of weight that is then transposed. Hence, after training, weight and weight_t hold completely different values.

I suspect that either weight shouldn't be a torch parameter at all, or the TensorTrain.transpose method should be changed so that it doesn't produce a copy.

In any case, I attach a patch that fixes the problem of ttlayer.to(device) not correctly transferring the layer to that device. It's a somewhat dirty workaround, but it should be applied until a proper fix is made.

diff --git a/t3nsor/layers.py b/t3nsor/layers.py
index 65af3c3..c1346f1 100644
--- a/t3nsor/layers.py
+++ b/t3nsor/layers.py
@@ -110,8 +110,12 @@ class TTLinear(nn.Module):
 
         self.shape = shape
         self.weight = init.to_parameter()
-        self.parameters = self.weight.parameter
-        self.weight_t = t3.transpose(self.weight)
+        self.weight_parameter = self.weight.parameter
+        # actually weight probably shouldn't be assigned to self and
+        # shouldn't be a parameter because it doesn't participate in self.forward
+
+        self.weight_t = t3.transpose(self.weight).to_parameter()
+        self.weight_t_parameter = self.weight_t.parameter
 
         if bias:
             self.bias = torch.nn.Parameter(1e-2 * torch.ones(out_features))
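
With the patch applied, a quick sanity check that the TT cores now travel with the module (a sketch; the TTLinear constructor arguments shown are assumptions for illustration):

from t3nsor.layers import TTLinear

# Argument names assumed here; adjust to the real signature.
layer = TTLinear(in_features=1024, out_features=256)
layer.to('cuda')

# Before the patch the cores of weight_t were plain tensors and stayed on
# the CPU after .to(...); registered as a parameter, they move along.
print({p.device for p in layer.parameters()})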

High GPU memory consumption

Hi,
I tried to integrate the TTLayer into Transformer-XL;
however, I found that it consumes much more memory than usual.
Did you experience such problems? Do you know any way around this?

(BTW, I also applied a few fixes for multi-GPU training; e.g., tensor-train objects are not moved to the GPU when you call model.to(device), which breaks the model in distributed training.)

svd_fix

[root@gpu-4 sentiment]# mkdir logdir
[root@gpu-4 sentiment]# python train.py --gpu=0 --embed_dim=256 --dataset=imdb --n_epochs=100
Traceback (most recent call last):
  File "train.py", line 42, in <module>
    import t3nsor as t3
  File "../t3nsor/__init__.py", line 7, in <module>
    from t3nsor.decompositions import to_tt_tensor
  File "../t3nsor/decompositions.py", line 5, in <module>
    from t3nsor.utils import svd_fix
ImportError: cannot import name 'svd_fix'

About Accuracy

I ran this:

python train.py --gpu=0 --embed_dim=256 --dataset=imdb --n_epochs=100

(screenshot of the training output)

Is there a problem with the formula used to compute the accuracy?
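
For comparison, a standard accuracy computation in PyTorch looks like this (a generic sketch, not the repo's train.py):

import torch

def accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # logits: (batch, num_classes); labels: (batch,) of class indices.
    preds = logits.argmax(dim=1)
    return (preds == labels).float().mean().item()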

tt_conv is not implemented here

I have read the paper, and it is very interesting work, but I was hoping to use it with convolutional layers, and the Conv layer is not implemented here.

I have searched other GitHub repositories, and there is no PyTorch implementation of the Conv layer either. Can you point me to any code or work that shows how it could be implemented in PyTorch?
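
Not from the paper, but one way to prototype a TT convolution in PyTorch is im2col plus a TT-factorized linear map: nn.Unfold turns the convolution into a matrix multiplication, and the dense nn.Linear below marks the slot where a TT layer such as t3nsor's TTLinear would go. A sketch for stride 1 and no padding:

import torch
import torch.nn as nn

class UnfoldConv2d(nn.Module):
    """Convolution as im2col + linear map; swap self.linear for a TT layer."""
    def __init__(self, c_in: int, c_out: int, k: int):
        super().__init__()
        self.k = k
        self.unfold = nn.Unfold(kernel_size=k)
        # Placeholder for a TT-factorized map over c_in * k * k inputs.
        self.linear = nn.Linear(c_in * k * k, c_out, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, _, h, w = x.shape
        patches = self.unfold(x)                    # (N, c_in*k*k, L)
        out = self.linear(patches.transpose(1, 2))  # (N, L, c_out)
        h_out, w_out = h - self.k + 1, w - self.k + 1
        return out.transpose(1, 2).reshape(n, -1, h_out, w_out)

With self.linear.weight set to conv.weight.view(c_out, -1) this reproduces an nn.Conv2d with the same kernel, so replacing the dense map by a TT-factorized one gives a naive (memory-hungry on the unfold) TT convolution.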
