Comments (12)
Just FYI - I have (scipy/numpy) code that handles this (see link below). Agree it would be a nice addition to tensorly!
https://github.com/ahwillia/tensortools/blob/master/tensortools/least_squares.py
from tensorly.
Agreed, let's make that happen!
Robust tensor PCA in TensorLy already handles missing values; ideally this should be the case for all decompositions.
Hi! Has it been worked on? If not I would like to start working on it
Nice posts. Would you like to create a pull request to add support for missing values?
Currently the Robust Tensor PCA handles missing values but Tucker and CP don't.
There are a number of things that we want to add, like having the option of choosing the solver as @ahwillia suggested.
I'd love to do a PR. Sadly, at this point in time, I don't know where the masking would need to be done in the code. Would you have a clue? If the factorisation can be reduced to least squares, this should be trivial.
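For context, a minimal sketch of what masking via least squares could look like, in pure NumPy rather than tensorly's actual internals (the function name and shapes here are illustrative assumptions): each factor row is updated by solving a least-squares problem restricted to the observed entries.

```python
import numpy as np

def masked_als_step(X, mask, U, V):
    """One alternating-least-squares pass that fits only observed entries.

    X    : (m, n) data matrix, arbitrary values where mask == 0
    mask : (m, n) binary array, 1 = observed, 0 = missing
    U    : (m, r) factor being updated; V : (n, r) fixed factor
    """
    for i in range(X.shape[0]):
        obs = mask[i] == 1
        if obs.any():
            # Solve min_u || V[obs] @ u - X[i, obs] || over observed columns.
            U[i], *_ = np.linalg.lstsq(V[obs], X[i, obs], rcond=None)
    return U

# Usage: recover a rank-1 matrix with one entry hidden by the mask.
rng = np.random.default_rng(0)
X = np.outer([1.0, 2.0, 3.0], [4.0, 5.0])
mask = np.ones_like(X)
mask[0, 1] = 0                                   # hide one entry
U = rng.standard_normal((3, 1))
V = rng.standard_normal((2, 1))
for _ in range(50):                              # alternate the two updates
    U = masked_als_step(X, mask, U, V)
    V = masked_als_step(X.T, mask.T, V, U)
err = np.abs((U @ V.T - X) * mask).max()         # error on observed entries
```

The point is only that the missing entries never enter the objective, so whatever values they hold in X are irrelevant.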
You're welcome to take a crack at it @ShivangiM!
Hi @ShivangiM, any luck with this?
@JeanKossaifi not yet, been busy lately.
Hi all
I am wondering whether robust_pca handles missing values as intended.
I understand the requested format for the missing-value mask, but it seems that the underlying data array X cannot contain NaN values, so you have to put some numerical placeholder at the missing data points in X.
However, I have noticed that the results are sensitive to the particular numerical value used for the missing points, which I think cannot be the intended behaviour? Is there an assumed value that missing points must have?
Sorry for lack of code, am on a mobile as not allowed GitHub at work!
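To illustrate the concern in pure NumPy (this is not tensorly's robust_pca, just a sketch of the expected behaviour): if the mask is applied to the residual, the fitted objective is identical whatever placeholder fills the missing entries, so sensitivity to the fill value would indicate the mask is not being applied everywhere.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[1.0, 0.0], [1.0, 1.0]])   # entry (0, 1) is missing
L = np.array([[1.5, 9.0], [3.0, 4.0]])      # some candidate low-rank fit

def masked_loss(X, fill):
    Xf = np.where(mask == 1, X, fill)       # substitute a placeholder value
    return np.sum(((Xf - L) * mask) ** 2)   # masked residual ignores it

# Identical for any fill value -> a correctly masked loss is fill-invariant.
a = masked_loss(X, 0.0)
b = masked_loss(X, 1e6)
```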
Hi, I am trying to use CP decomposition on my experimental data (part of the data is missing), and I noticed that the parafac function provides a "mask" parameter for handling missing values.
It works when the tensor is 2-dimensional, e.g. tl.tensor([[1., 2.], [3., 4.]]), with a 0/1 mask array of the same shape as the tensor.
However, when I repeat this with a higher-dimensional tensor, e.g. tl.tensor([ [[1., 2.], [3., 4.]], [[5.,6.],[7.,8.]] ]), it doesn't work.
The error is as follows:
Traceback (most recent call last):
File "t.py", line 40, in <module>
factors = parafac(X, rank=2, mask=kk)
File "/home/a/Documents/tensorly/tensorly/decomposition/candecomp_parafac.py", line 185, in parafac
tensor = tensor*mask + tl.kruskal_to_tensor((None, factors), mask=1-mask)
File "/home/a/Documents/tensorly/tensorly/kruskal_tensor.py", line 188, in kruskal_to_tensor
full_tensor = T.sum(khatri_rao([factors[0]*weights]+factors[1:], mask=mask), axis=1)
File "/home/a/Documents/tensorly/tensorly/tenalg/_khatri_rao.py", line 98, in khatri_rao
return T.kr(matrices, weights=weights, mask=mask)
File "/home/a/Documents/tensorly/tensorly/backend/__init__.py", line 160, in inner
return _get_backend_method(name)(*args, **kwargs)
File "/home/a/Documents/tensorly/tensorly/backend/numpy_backend.py", line 69, in kr
return np.einsum(operation, *matrices).reshape((-1, n_columns))*mask
ValueError: operands could not be broadcast together with shapes (16,2) (2,2,2,2)
The direct problem seems to be in
np.einsum(operation, *matrices).reshape((-1, n_columns))*mask
I suspected the multiplication needed to be a matrix product via np.dot(), so I changed this line to
np.dot(np.einsum(operation, *matrices).reshape((-1, n_columns)), mask)
and then the error becomes:
Traceback (most recent call last):
File "t.py", line 40, in <module>
factors = parafac(X, rank=2, mask=kk)
File "/home/a/Documents/tensorly/tensorly/decomposition/candecomp_parafac.py", line 185, in parafac
tensor = tensor*mask + tl.kruskal_to_tensor((None, factors), mask=1-mask)
File "/home/a/Documents/tensorly/tensorly/kruskal_tensor.py", line 190, in kruskal_to_tensor
return fold(full_tensor, 0, shape)
File "/home/a/Documents/tensorly/tensorly/base.py", line 77, in fold
return T.moveaxis(T.reshape(unfolded_tensor, full_shape), 0, mode)
File "/home/a/Documents/tensorly/tensorly/backend/__init__.py", line 160, in inner
return _get_backend_method(name)(*args, **kwargs)
File "<__array_function__ internals>", line 6, in reshape
File "/home/a/.local/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 301, in reshape
return _wrapfunc(a, 'reshape', newshape, order=order)
File "/home/a/.local/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 61, in _wrapfunc
return bound(*args, **kwds)
ValueError: cannot reshape array of size 64 into shape (2,2,2,2)
I don't really understand what is going wrong here. What should I do to achieve my goal? Any help would be appreciated, ideally with a code example. Thanks very much for any suggestions.
@milanlanlan I would suggest opening a separate issue report for this. It looks like mask
needs to be reshaped, or else multiplied before the reshape (I'm not sure which).
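To illustrate the shape mismatch in pure NumPy, independently of tensorly: the Khatri-Rao product of four 2x2 factors has shape (16, 2), while the tensor-shaped mask is (2, 2, 2, 2); elementwise multiplication fails to broadcast unless the mask is first flattened to a column.

```python
import numpy as np

factors = [np.ones((2, 2)) for _ in range(4)]

# Khatri-Rao (column-wise Kronecker) product of the four factors: (16, 2).
kr = factors[0]
for f in factors[1:]:
    kr = np.einsum('ir,jr->ijr', kr, f).reshape(-1, kr.shape[1])

mask = np.ones((2, 2, 2, 2))                 # mask in the tensor's shape

broadcast_failed = False
try:
    _ = kr * mask                            # (16, 2) * (2, 2, 2, 2)
except ValueError:                           # shapes cannot broadcast
    broadcast_failed = True

masked = kr * mask.reshape(-1, 1)            # (16, 2) * (16, 1) broadcasts
```

So flattening (or vectorising) the mask to match the unfolded shape is one plausible fix, consistent with the error message in the traceback above.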
Fixed by #173