Comments (12)
Just FYI - I have (scipy/numpy) code that handles this (see link below). Agree it would be a nice addition to tensorly!
https://github.com/ahwillia/tensortools/blob/master/tensortools/least_squares.py
from tensorly.
Agreed, let's make that happen!
Robust tensor PCA in TensorLy already handles missing values; ideally this should be the case for all decompositions.
Hi! Has it been worked on? If not I would like to start working on it
Nice posts. Would you like to create a pull request to add support for missing values?
Currently the Robust Tensor PCA handles missing values but Tucker and CP don't.
There are a number of things that we want to add, like having the option of choosing the solver as @ahwillia suggested.
I'd love to do a PR. Sadly, at this point in time, I don't know where the masking would need to be done in the code. Would you have a clue? If the factorisation can be reduced to least squares, this should be trivial.
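For context, a minimal sketch of what masking via least squares could look like, in pure NumPy rather than tensorly's actual internals (the function name and shapes here are illustrative assumptions): each factor row is updated by solving a least-squares problem restricted to the observed entries.

```python
import numpy as np

def masked_als_step(X, mask, U, V):
    """One alternating-least-squares pass that fits only observed entries.

    X    : (m, n) data matrix, arbitrary values where mask == 0
    mask : (m, n) binary array, 1 = observed, 0 = missing
    U    : (m, r) factor being updated; V : (n, r) fixed factor
    """
    for i in range(X.shape[0]):
        obs = mask[i] == 1
        if obs.any():
            # Solve min_u || V[obs] @ u - X[i, obs] || over observed columns.
            U[i], *_ = np.linalg.lstsq(V[obs], X[i, obs], rcond=None)
    return U

# Usage: recover a rank-1 matrix with one entry hidden by the mask.
rng = np.random.default_rng(0)
X = np.outer([1.0, 2.0, 3.0], [4.0, 5.0])
mask = np.ones_like(X)
mask[0, 1] = 0                                   # hide one entry
U = rng.standard_normal((3, 1))
V = rng.standard_normal((2, 1))
for _ in range(50):                              # alternate the two updates
    U = masked_als_step(X, mask, U, V)
    V = masked_als_step(X.T, mask.T, V, U)
err = np.abs((U @ V.T - X) * mask).max()         # error on observed entries
```

The point is only that the missing entries never enter the objective, so whatever values they hold in X are irrelevant.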
You're welcome to take a crack at it @ShivangiM!
Hi @ShivangiM, any luck with this?
@JeanKossaifi not yet, been busy lately.
Hi all
I am wondering whether robust_pca handles missing values as intended.
I understand the requested format for the missing-value mask, but it seems that the underlying data array X cannot contain NaN values, so you have to put some numerical placeholder at the missing data points in X.
However, I have noticed that the results are sensitive to the particular numerical value used for the missing points, which I think cannot be the intended behaviour? Is there an assumed value that missing points must have?
Sorry for lack of code, am on a mobile as not allowed GitHub at work!
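To illustrate the concern in pure NumPy (this is not tensorly's robust_pca, just a sketch of the expected behaviour): if the mask is applied to the residual, the fitted objective is identical whatever placeholder fills the missing entries, so sensitivity to the fill value would indicate the mask is not being applied everywhere.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[1.0, 0.0], [1.0, 1.0]])   # entry (0, 1) is missing
L = np.array([[1.5, 9.0], [3.0, 4.0]])      # some candidate low-rank fit

def masked_loss(X, fill):
    Xf = np.where(mask == 1, X, fill)       # substitute a placeholder value
    return np.sum(((Xf - L) * mask) ** 2)   # masked residual ignores it

# Identical for any fill value -> a correctly masked loss is fill-invariant.
a = masked_loss(X, 0.0)
b = masked_loss(X, 1e6)
```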
Hi, I am trying to use CP decomposition on my experimental data (part of the data is missing), and I noticed that the parafac function provides a "mask" parameter for handling missing values.
It works when the tensor is 2-dimensional, e.g. tl.tensor([[1., 2.], [3., 4.]]), with a 0/1 mask array of the same shape as the tensor.
However, when I repeat this with a higher-dimensional tensor, e.g. tl.tensor([ [[1., 2.], [3., 4.]], [[5.,6.],[7.,8.]] ]), it doesn't work.
The error is as follows:
Traceback (most recent call last):
File "t.py", line 40, in <module>
factors = parafac(X, rank=2, mask=kk)
File "/home/a/Documents/tensorly/tensorly/decomposition/candecomp_parafac.py", line 185, in parafac
tensor = tensor*mask + tl.kruskal_to_tensor((None, factors), mask=1-mask)
File "/home/a/Documents/tensorly/tensorly/kruskal_tensor.py", line 188, in kruskal_to_tensor
full_tensor = T.sum(khatri_rao([factors[0]*weights]+factors[1:], mask=mask), axis=1)
File "/home/a/Documents/tensorly/tensorly/tenalg/_khatri_rao.py", line 98, in khatri_rao
return T.kr(matrices, weights=weights, mask=mask)
File "/home/a/Documents/tensorly/tensorly/backend/__init__.py", line 160, in inner
return _get_backend_method(name)(*args, **kwargs)
File "/home/a/Documents/tensorly/tensorly/backend/numpy_backend.py", line 69, in kr
return np.einsum(operation, *matrices).reshape((-1, n_columns))*mask
ValueError: operands could not be broadcast together with shapes (16,2) (2,2,2,2)
The direct problem seems to be in
np.einsum(operation, *matrices).reshape((-1, n_columns))*mask
I suspected the multiplication needed to be a matrix product via np.dot(), so I changed this line to
np.dot(np.einsum(operation, *matrices).reshape((-1, n_columns)), mask)
and then the error becomes:
Traceback (most recent call last):
File "t.py", line 40, in <module>
factors = parafac(X, rank=2, mask=kk)
File "/home/a/Documents/tensorly/tensorly/decomposition/candecomp_parafac.py", line 185, in parafac
tensor = tensor*mask + tl.kruskal_to_tensor((None, factors), mask=1-mask)
File "/home/a/Documents/tensorly/tensorly/kruskal_tensor.py", line 190, in kruskal_to_tensor
return fold(full_tensor, 0, shape)
File "/home/a/Documents/tensorly/tensorly/base.py", line 77, in fold
return T.moveaxis(T.reshape(unfolded_tensor, full_shape), 0, mode)
File "/home/a/Documents/tensorly/tensorly/backend/__init__.py", line 160, in inner
return _get_backend_method(name)(*args, **kwargs)
File "<__array_function__ internals>", line 6, in reshape
File "/home/a/.local/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 301, in reshape
return _wrapfunc(a, 'reshape', newshape, order=order)
File "/home/a/.local/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 61, in _wrapfunc
return bound(*args, **kwds)
ValueError: cannot reshape array of size 64 into shape (2,2,2,2)
I don't really understand what is going wrong here. What should I do to achieve my goal? Any help would be appreciated, ideally with a code example. Thanks very much for any suggestions.
@milanlanlan I would suggest opening a separate issue report for this. It looks like mask
needs to be reshaped, or else multiplied before the reshape (I'm not sure which).
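To illustrate the shape mismatch in pure NumPy, independently of tensorly: the Khatri-Rao product of four 2x2 factors has shape (16, 2), while the tensor-shaped mask is (2, 2, 2, 2); elementwise multiplication fails to broadcast unless the mask is first flattened to a column.

```python
import numpy as np

factors = [np.ones((2, 2)) for _ in range(4)]

# Khatri-Rao (column-wise Kronecker) product of the four factors: (16, 2).
kr = factors[0]
for f in factors[1:]:
    kr = np.einsum('ir,jr->ijr', kr, f).reshape(-1, kr.shape[1])

mask = np.ones((2, 2, 2, 2))                 # mask in the tensor's shape

broadcast_failed = False
try:
    _ = kr * mask                            # (16, 2) * (2, 2, 2, 2)
except ValueError:                           # shapes cannot broadcast
    broadcast_failed = True

masked = kr * mask.reshape(-1, 1)            # (16, 2) * (16, 1) broadcasts
```

So flattening (or vectorising) the mask to match the unfolded shape is one plausible fix, consistent with the error message in the traceback above.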
Fixed by #173