
pyttb's Introduction

Copyright 2022 National Technology & Engineering Solutions of Sandia,
LLC (NTESS). Under the terms of Contract DE-NA0003525 with NTESS, the
U.S. Government retains certain rights in this software.


pyttb: Python Tensor Toolbox

Welcome to pyttb, a set of Python classes and functions for manipulating dense, sparse, and structured tensors, along with algorithms for computing low-rank tensor models.

Tensor Classes:

  • tensor: dense tensors
  • sptensor: sparse tensors
  • ktensor: Kruskal tensors
  • tenmat: matricized tensors
  • ttensor: Tucker tensors

Tensor Algorithms:

  • cp_als, cp_apr: Canonical Polyadic (CP) decompositions
  • tucker_als: Tucker decompositions

Getting Started

For full details see our documentation.

Quick Start

pyttb is available on PyPI:

pip install pyttb

or install from source (run from the root of a cloned repository):

pip install .

Contributing

Check out our contributing guide.

pyttb's People

Contributors

brian-kelley, deepblockdeepak, dmdunla, etphipp, jdtuck, jeremy-myers, kshitiz305, ntjohnson1

pyttb's Issues

Explore integrating pre-commit into repo

Pre-commit hooks might make tasks such as linting and formatting easier. We need to explore how easy it is to opt in to or out of a pre-commit setup, and whether that requires additional instructions.

Support general cores in TTENSOR

ttensor currently supports only dense cores. The MATLAB version supports any core that follows an interface. We can probably use protocols and mypy to sanity check this. Alternatively, we may consider creating a base class for some shared pyttb tensor functionality.
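
One way to express "any core that follows an interface" is a typing.Protocol, which mypy can check statically. The sketch below is illustrative only: the name CoreLike, its members (shape, norm, ttm), and the ToyDenseCore stand-in are assumptions made for this example, not pyttb's actual interface.

```python
from typing import Protocol, Sequence, Tuple, runtime_checkable


@runtime_checkable
class CoreLike(Protocol):
    """Hypothetical minimal interface for a Tucker core (not pyttb's API)."""

    @property
    def shape(self) -> Tuple[int, ...]: ...

    def norm(self) -> float: ...

    def ttm(self, matrices: Sequence, dims: Sequence[int]) -> "CoreLike": ...


class ToyDenseCore:
    """Stand-in core used only to exercise the protocol."""

    def __init__(self, shape: Tuple[int, ...]) -> None:
        self._shape = tuple(shape)

    @property
    def shape(self) -> Tuple[int, ...]:
        return self._shape

    def norm(self) -> float:
        return 0.0  # placeholder value

    def ttm(self, matrices: Sequence, dims: Sequence[int]) -> "ToyDenseCore":
        return self  # placeholder: a real core would contract with matrices


def accepts_core(core: object) -> bool:
    # runtime_checkable only verifies that the members exist, not their
    # signatures; mypy performs the stricter structural check statically.
    return isinstance(core, CoreLike)
```

A shared base class would instead give every tensor type common behavior through inheritance; the protocol route leaves existing classes untouched and only describes what a core must provide.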

Testing: implement tests for full coverage

If possible, implement tests to provide full coverage of TensorToolbox code:

pytest --cov=pyttb  tests/ --cov-report=term-missing
Name                   Stmts   Miss  Cover   Missing
----------------------------------------------------
pyttb\__init__.py         23      1    96%   31
pyttb\cp_als.py           87      0   100%
pyttb\cp_apr.py          567     78    86%   84, 86, 88, 110, 197-198, 229, 247-249, 342, 351, 374-384, 419-432, 464-469, 495, 536, 540-541, 649, 658, 681-691, 719-732, 781-786, 808, 813, 819, 871-872, 1016, 1034-1037, 1044-1045, 1230, 1297-1299, 1343
pyttb\export_data.py      63      1    98%   54
pyttb\import_data.py      60      4    93%   14, 24, 61, 72
pyttb\khatrirao.py        22      0   100%
pyttb\ktensor.py         468      5    99%   772, 812-816
pyttb\pyttb_utils.py     237      2    99%   293, 328
pyttb\sptenmat.py          4      0   100%
pyttb\sptensor.py        914      4    99%   117, 562, 608, 637
pyttb\sptensor3.py         4      0   100%
pyttb\sumtensor.py         4      0   100%
pyttb\symktensor.py        4      0   100%
pyttb\symtensor.py         4      0   100%
pyttb\tenmat.py          182      1    99%   183
pyttb\tensor.py          557      3    99%   1105, 1219, 1394
pyttb\ttensor.py           4      0   100%
----------------------------------------------------
TOTAL                   3204     99    97%

tensor: implement __radd__

__radd__ is required to add a tensor to a non-tensor when the non-tensor is the first summand:

>>> import TensorToolbox as ttb
>>> import numpy as np
>>> T = ttb.tensor.from_data(np.arange(2))
>>> T
tensor of shape 2
data[:] =
[0 1]

>>> T + 1
tensor of shape 2
data[:] =
[1 2]

>>> 1 + T
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'tensor'
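
The failure above follows from Python's operator dispatch: `1 + T` first tries `int.__add__`, and only falls back to `T.__radd__` if that returns NotImplemented. A minimal sketch of the fix, using a toy list-backed class (ToyTensor is a stand-in invented for this example, not the pyttb tensor class):

```python
class ToyTensor:
    """Minimal stand-in for a dense tensor (not the pyttb class)."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        if isinstance(other, ToyTensor):
            return ToyTensor(a + b for a, b in zip(self.data, other.data))
        if isinstance(other, (int, float)):
            return ToyTensor(a + other for a in self.data)
        return NotImplemented

    # For `1 + tensor`, int.__add__ returns NotImplemented, so Python then
    # calls tensor.__radd__(1). Addition commutes, so __add__ can be reused.
    __radd__ = __add__


T = ToyTensor([0, 1])
left = (T + 1).data   # tensor on the left: handled by __add__
right = (1 + T).data  # tensor on the right: handled by __radd__
```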

Address ttm dims=0 ambiguity

Identified in #53. MATLAB is 1-indexed, so a negative index denotes all indices EXCEPT the one listed. Python is 0-indexed, so -0 is the same as 0 and the convention becomes ambiguous. One proposal is tensor.ttm(dims=None, exclude_dims=None). We would need to check whether the same ambiguity exists in the other tensor implementations, so that they can be updated uniformly, and check the algorithms that use these tensors.
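
A sketch of how the proposed pair of keyword arguments could be resolved into a list of modes. The helper name resolve_dims and its error handling are assumptions for illustration; only the dims/exclude_dims parameter names come from the proposal.

```python
def resolve_dims(ndims, dims=None, exclude_dims=None):
    """Return the list of 0-based modes to operate on.

    Hypothetical helper: `dims` lists modes to use, `exclude_dims` lists
    modes to skip. At most one of the two may be given.
    """
    if dims is not None and exclude_dims is not None:
        raise ValueError("Specify either dims or exclude_dims, not both")

    chosen = dims if dims is not None else exclude_dims
    if chosen is not None:
        for d in chosen:
            if not 0 <= d < ndims:
                raise ValueError(f"Mode {d} is out of range for {ndims} modes")

    if exclude_dims is not None:
        excluded = set(exclude_dims)
        return [d for d in range(ndims) if d not in excluded]
    if dims is None:
        return list(range(ndims))  # default: every mode
    return list(dims)


only_first = resolve_dims(3, dims=[0])           # mode 0 only
all_but_first = resolve_dims(3, exclude_dims=[0])  # unambiguous "not 0"
```

With explicit exclusion, mode 0 can be excluded without relying on the sign of the index.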

python3.9 and mttkrp

Several people were unable to run the code below using Python 3.9, probably with different variants of 3.9. One person said that he fixed the issue by reverting to an older version of numpy, though I didn't get which one.

import pyttb as ttb
import numpy as np


# 2D Example dense
# ---------------------------------------------------------------
print("--------------------------------------------------------")
print("DENSE EXAMPLE (high-level; we'll dig into MTTKRP later):")
print("--------------------------------------------------------")

weights = np.array([1,2])
fm0 = np.array([[3,3], [4,5]])
fm1 = np.array([[3,3], [4,5], [6,7]])

K = ttb.ktensor.from_data(weights, [fm0, fm1])

print(f'dense Kruskal tensor: {K.full()}')

print("--------------------------------------------------------")
dcmp, init, stats = ttb.cp_als(K, 2)
print("--------------------------------------------------------")

print(f'decomposition after cp_als: {dcmp}')

print(dcmp)

ValidateK = ttb.ktensor.from_data(dcmp.weights,dcmp.factor_matrices)

print("--------------------------------------------------------")
print(f'validation of solution: {ValidateK.full()}')
print("--------------------------------------------------------")


# same 2D Example sparse
# ----------------------------------------------------------------
print("--------------------------------------------------------")
print("SPARSE EXAMPLE (high-level; we'll dig into MTTKRP later):")
print("--------------------------------------------------------")
weights = np.array([1,2])
subs = np.array([[0,0], [0,1], [0,2], [1,0], [1,1], [1,2]])
vals = np.array([[27],[42],[60],[42],[66],[94]])
shape = (2,3)

spk = ttb.sptensor.from_data(subs, vals, shape)

print(spk.full())

sp_dcmp, sp_init, sp_stats = ttb.cp_als(spk, 2)

print(sp_dcmp)

sp_ValidateK = ttb.ktensor.from_data(sp_dcmp.weights,sp_dcmp.factor_matrices)

print(sp_ValidateK.full())

print("-----------------------------------")
print("BACK TO DENSE EXAMPLE : MTTKRP demo")
print("-----------------------------------")

A = np.array([[3,3], [4,5]])
B = np.array([[3,3], [4,5], [6,7]])

print(f'A: {A}')
print(f'B: {B}')

kr = ttb.khatrirao(A, B)
print(f'Khatri-Rao(A,B): {kr}')

fm0 = np.array([[1,1], [1,1]])
fm1 = np.array([[1,1], [1,1], [1,1]])
fm2 = np.array([[1,1], [1,1], [1,1], [1,1]])
weights = np.array([1,1])
Kd = ttb.ktensor.from_data(weights, [fm0, fm1,fm2])
print(f'Kd.full: {Kd.full()}')

print(f'Show that MTTKRP is the time bottleneck')
Kf = ttb.ktensor.from_function(np.random.random_sample, (200, 30, 40), 2)
kfdcmp, kfinit, kfstats = ttb.cp_als(Kf.full(), 3)

U = [np.ones((2, 2)), np.ones((3, 2)), np.ones(((4, 2)))]
print(f'U: {U}')

print("---------------------------------------------")
print(f'COMPUTE MTTKRP using the pyttb tenmat class:')
print("---------------------------------------------")

Kdmat0 = ttb.tenmat.from_tensor_type(Kd.full(), rdims = np.array([0]))
print(f'-------------------------------------------:')
print(f'Matricized Tensor along way 0 is this matrix:')
print(f'-------------------------------------------:')
print(f'{Kdmat0}')
Kdmat1 = ttb.tenmat.from_tensor_type(Kd.full(), rdims = np.array([1]))
print(f'-------------------------------------------:')
print(f'Matricized Tensor along way 1 is this matrix:')
print(f'-------------------------------------------:')
print(f'{Kdmat1}')
print(f'-------------------------------------------------:')
print(f'Khatri-Rao product of U[1], U[2] (we would not actually compute this):')
print(f'-------------------------------------------------:')
Ukr0 = ttb.khatrirao([U[1], U[2]])
print(f'{Ukr0}')
print(f'-------------------------------------------:')
print(f'Matricized tensor times Khatri-Rao product:')
print(f'-------------------------------------------:')
print(f'{Kdmat0.data @ Ukr0}')
print(f'-------------------------------------------:')

Kdmat1 = ttb.tenmat.from_tensor_type(Kd.full(), rdims = np.array([1]))
print(f'-------------------------------------------:')
print(f'Matricized Tensor along way 1 is this matrix:')
print(f'-------------------------------------------:')
print(f'{Kdmat1}')
Ukr1 = ttb.khatrirao([U[0], U[2]])
print(f'-------------------------------------------:')
print(f'Khatri-Rao product of U[0], U[2] (we would not actually compute this):')
print(f'-------------------------------------------:')
print(f'{Ukr1}')
print(f'-------------------------------------------:')
print(f'Matricized tensor times Khatri-Rao product:')
print(f'-------------------------------------------:')
print(f'{Kdmat1.data @ Ukr1}')
print(f'-------------------------------------------:')

print(f'Next, COMPUTE MTTKRP using the pyttb mttkrp function:')
print(f'this uses indexing to avoid computing the Khatri-Rao product')
print(f'after that, we will break down exactly what an efficient MTTKRP does')
print(f'-----------along way0:---------------------:')
print(f'Kd.mttkrp(U, 0):')
print(f'{Kd.mttkrp(U, 0)}')
print(f'-----------along way1:---------------------:')
print(f'Kd.mttkrp(U, 1):')
print(f'{Kd.mttkrp(U, 1)}')

print(f'-------------------------------------------:')
print(f'-dissecting an efficient MTTKRP-------------:')
print(f'-------------------------------------------:')
print(f'The idea is to have a factoring of the tensor in mind,')
print(f'then iteratively multiply each factor rather than computing')
print(f'the Khatri-Rao product')
print(f'-------------------------------------------:')
print(f'recall that this is our tensor.  We show the whole tensor first:')
print(f'{Kd.full()}')
print(f'-------------------------------------------:')
print(f'then show its factor matrices.  These typically start out with some') 
print(f'random init, but we happened to define this example with structure:')
print(f'-------------------------------------------:')
print(f'{Kd}')
print(f'-------------------------------------------:')
print(f'To compute CP_ALS, we could start with U matrices equal to the factor')
print(f'matrices:')
print(f'-------------------------------------------:')
U = Kd.factor_matrices
print(f'U: {U}')
print(f'-------------------------------------------:')
print(f'First, we find the right dimensions for the mttkrp internal computation')
print(f'this will be the number of tensor factors by the number of columns')
print(f'in a factor matrix (these are the same; thus square).')
print(f'There are TWO factors in this tensor even though there are THREE factor matrices')
print(f'The number of weights is equal to the number of factors (TWO) and')
print(f'also equal to the number of columns in a factor matrix.')
print(f'The number of factor matrices corresponds to the number of "toes" in')
print(f'a "chicken foot" (outer product) in the tensor decomposition')
print(f'We initialize a 2 x 2 matrix with the factor weights')
print(f'The ultimate MTTKRP result along way 0 will be the dimension of ')
print(f'factor_matrices[0] @ W =   (2 x 2) x (2 x 2) = (2 x 2)')
print(f'-------------------------------------------:')
print(f'Start with the factor weights W (dim 2 x 2)')
print(f'-------------------------------------------:')
W = np.tile(Kd.weights[:, None], (1, 2))
print(W) 
print(f'-------------------------------------------:')
print(f'we now compute factor_matrix[1].T @ U[1] (full matrix mult)')
print(f'-------------------------------------------:')
W1 = Kd.factor_matrices[1].T @ U[1]
print(f'{W1}')
W *= W1
print(f'-------------------------------------------:')
print(f'running total of W (element-wise multiplication): {W}')
print(f'-------------------------------------------:')
print(f'we next compute factor_matrix[2].T @ U[2] (full matrix mult)')
W2 = Kd.factor_matrices[2].T @ U[2]
W *= W2
print(f'{W2}')
print(f'-------------------------------------------:')
print(f'running total: {W}')
print(f'-------------------------------------------:')
print(f'the final mttkrp result is now factor_matrices[0] @ W (full matrix mult):')
print(f'{Kd.factor_matrices[0] @ W}')

print(f'-------------------------------------------:')
print(f'Repeat for MTTKRP along way 1:')
print(f'-------------------------------------------:')
print(f'U: {U}')
print(f'-------------------------------------------:')
print(f'Again, we find the right dimensions for the mttkrp internal computation')
print(f'this is still the number of tensor factors by the number of columns')
print(f'in a factor matrix. (these are the same, thus square)')
print(f'There are TWO factors in this tensor even though there are THREE factor matrices')
print(f'The number of weights is equal to the number of factors (TWO) and')
print(f'also equal to the number of columns in a factor matrix.')
print(f'The number of factor matrices corresponds to the number of "toes" in')
print(f'a "chicken foot" (outer product) in the tensor decomposition')
print(f'We initialize a 2 x 2 matrix with the factor weights')
print(f'The ultimate MTTKRP result along way 1 will be the dimension of ')
print(f'factor_matrices[1] @ W =   (3 x 2) x (2 x 2) = (3 x 2)')
print(f'-------------------------------------------:')
W = np.tile(Kd.weights[:, None], (1, 2))
print(W) 
print(f'-------------------------------------------:')
print(f'we now compute factor_matrix[0].T @ U[0] (full matrix mult)')
print(f'-------------------------------------------:')
W1 = Kd.factor_matrices[0].T @ U[0]
print(f'{W1}')
W *= W1
print(f'-------------------------------------------:')
print(f'running total of W (element-wise multiplication): {W}')
print(f'-------------------------------------------:')
print(f'we next compute factor_matrix[2].T @ U[2] (full matrix mult)')
print(f'-------------------------------------------:')
W2 = Kd.factor_matrices[2].T @ U[2]
W *= W2
print(f'{W2}')
print(f'-------------------------------------------:')
print(f'running total: {W}')
print(f'-------------------------------------------:')
print(f'the final mttkrp result is now factor_matrices[1] @ W (full matrix mult):')
print(f'-------------------------------------------:')
print(f'{Kd.factor_matrices[1] @ W}')
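
The two routes walked through in the script (matricized tensor times an explicit Khatri-Rao product, versus the structured product that uses only the factor matrices) can be compared with plain NumPy, independent of any pyttb or Python version. This is a sketch under stated assumptions: the random factors, the einsum-based Khatri-Rao product, and the variable names are chosen for this example and are not pyttb code.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 2, 3, 4, 2

# Kruskal tensor: weights plus one factor matrix per mode
lam = rng.random(R)
A, B, C = rng.random((I, R)), rng.random((J, R)), rng.random((K, R))
X = np.einsum("r,ir,jr,kr->ijk", lam, A, B, C)  # dense form of the ktensor

# Matrices to multiply against (the role played by U in the script)
U = [rng.random((n, R)) for n in (I, J, K)]

# Explicit route: mode-0 unfolding times the Khatri-Rao product of U[2], U[1]
X0 = X.reshape(I, J * K, order="F")  # column index is j + J*k
kr = np.einsum("kr,jr->kjr", U[2], U[1]).reshape(J * K, R)  # row index k*J + j
explicit = X0 @ kr

# Structured route: never forms X or the Khatri-Rao product
W = np.tile(lam[:, None], (1, R))    # start from the weights
W = W * (B.T @ U[1]) * (C.T @ U[2])  # element-wise running product
structured = A @ W

print(np.allclose(explicit, structured))
```

The structured route only multiplies small R x R matrices and one factor matrix, which is why it avoids the cost of the full Khatri-Rao product.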

tensor: verify all uses of np.reshape

Calls to np.reshape should pass order='F' to match the column-major behavior of reshape in Matlab:

np.reshape(..., order='F')

Since this has led to bugs, make sure all instances are called correctly.
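
A small NumPy illustration of why the argument matters: the default row-major order fills rows first, whereas order='F' fills columns first, as Matlab's reshape does.

```python
import numpy as np

data = np.arange(1, 7)  # 1..6, like 1:6 in Matlab

c_order = np.reshape(data, (2, 3))             # NumPy default: rows fill first
f_order = np.reshape(data, (2, 3), order="F")  # Matlab-style: columns fill first

print(c_order)  # [[1 2 3], [4 5 6]]
print(f_order)  # [[1 3 5], [2 4 6]]
```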

Fix cp_apr warnings during test

TensorToolbox/cp_apr.py:945
  /Users/dmdunla/dev/pyttb/TensorToolbox/cp_apr.py:945: DeprecationWarning: invalid escape sequence \_
    """

tests/test_cp_apr.py::test_cpapr_pqnr
  /Users/dmdunla/dev/pyttb/TensorToolbox/cp_apr.py:798: RuntimeWarning: divide by zero encountered in double_scalars
    tmp_rho = 1 / (tmp_delm.dot(tmp_delg.transpose()))
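
Both warnings have common remedies, sketched here in isolation. The function names and the choice to return None for a near-zero denominator are assumptions for illustration; how cp_apr should actually handle that case is not stated in this issue.

```python
import numpy as np


def documented():
    r"""Computes x\_new from x\_old.

    The r prefix makes this a raw docstring, so \_ is kept literally and
    Python does not warn about an invalid escape sequence.
    """


def guarded_inverse(delm, delg, eps=1e-12):
    """Return 1 / (delm . delg), or None if the dot product is near zero.

    Hypothetical guard: the caller decides what to do when None comes back.
    """
    denom = float(np.dot(delm, delg))
    if abs(denom) < eps:
        return None
    return 1.0 / denom


orthogonal = guarded_inverse(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
regular = guarded_inverse(np.array([1.0, 1.0]), np.array([1.0, 1.0]))
```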

Implement sumtensor class

%SUMTENSOR Class for implicit sum of other tensors.
%
%SUMTENSOR Methods:
% disp - Command window display of a sumtensor.
% display - Command window display of a sumtensor.
% double - Convert sumtensor to double array.
% full - Convert a sumtensor to a (dense) tensor.
% innerprod - Efficient inner product with a sumtensor.
% isscalar - False for sumtensors.
% mttkrp - Matricized tensor times Khatri-Rao product for sumtensor.
% ndims - Return the number of dimensions for a sumtensor.
% norm - Frobenius norm of a sumtensor.
% plus - Plus for sumtensor.
% size - Size of a sumtensor.
% subsref - Subscript reference for sumtensor.
% sumtensor - Tensor stored as sum of tensors.
% ttv - Tensor times vector for sumtensor.
% uminus - Unary minus for sumtensor.
% uplus - Unary plus for sumtensor.
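
The core idea behind the class, an implicit sum whose parts are only combined when needed, can be sketched in a few lines. ToySumTensor is invented for this example and holds only dense NumPy arrays; the Matlab class listed above can hold other tensor types as parts.

```python
import numpy as np


class ToySumTensor:
    """Illustrative implicit sum of dense arrays (not the pyttb class)."""

    def __init__(self, *parts):
        shapes = {p.shape for p in parts}
        if len(shapes) != 1:
            raise ValueError("All parts must have the same shape")
        self.parts = list(parts)

    @property
    def shape(self):
        return self.parts[0].shape

    def full(self):
        # Only here is the sum formed explicitly
        return sum(self.parts[1:], self.parts[0].copy())

    def innerprod(self, other):
        # <A + B, X> = <A, X> + <B, X>, so each part is handled separately
        return float(sum(np.vdot(p, other) for p in self.parts))

    def __neg__(self):
        # Unary minus distributes over the parts
        return ToySumTensor(*(-p for p in self.parts))


s = ToySumTensor(np.ones((2, 3)), 2 * np.ones((2, 3)))
dense = s.full()
ip = s.innerprod(np.ones((2, 3)))
```

Linear operations such as innerprod, mttkrp, and ttv can all follow the same pattern of applying the operation to each part and adding the results.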

import_data/sptensor: Check dimensions on import

The import_data method does not check if the indices provided in a sptensor file are valid.

Here is some data from a file:

$ more sptensor2.tns 
sptensor
3 
3 3 1 
3
1 1 1 1
1 3 2 22
2 2 2 3

And here is an example of not checking for valid indices. The data loads without error, even though the last two nonzeros in the file both have a third subscript of 2, which is out of bounds for a third dimension of size 1:

$ python
>>> import pyttb as ttb
>>> S = ttb.import_data('sptensor2.tns')
>>> S
Sparse tensor of shape (3, 3, 1) with 3 nonzeros 
        [0, 0, 0] = 1.0
        [0, 2, 1] = 22.0
        [1, 1, 1] = 3.0
>>> S[1,1,1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dmdunla/dev/github.com/pyttb/pyttb/sptensor.py", line 1355, in __getitem__
    [subs, shape] = ttb.tt_renumber(subs, self.shape, region)
  File "/Users/dmdunla/dev/github.com/pyttb/pyttb/pyttb_utils.py", line 477, in tt_renumber
    newsubs[:, i], newshape[i] = tt_renumberdim(
  File "/Users/dmdunla/dev/github.com/pyttb/pyttb/pyttb_utils.py", line 514, in tt_renumberdim
    newidx = idx_map[idx]
IndexError: index 1 is out of bounds for axis 0 with size 1
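
A minimal sketch of the missing check, applied to the subscripts and shape from the file above. The helper name find_out_of_bounds is hypothetical, and the 1-based default is an assumption inferred from the transcript, where the file entry 1 1 1 is displayed as [0, 0, 0].

```python
def find_out_of_bounds(subs, shape, index_base=1):
    """Return (row, mode, subscript) for every subscript outside `shape`."""
    bad = []
    for row, sub in enumerate(subs):
        for mode, (s, size) in enumerate(zip(sub, shape)):
            # Valid subscripts run from index_base to index_base + size - 1
            if not index_base <= s < size + index_base:
                bad.append((row, mode, s))
    return bad


shape = (3, 3, 1)
subs = [(1, 1, 1), (1, 3, 2), (2, 2, 2)]  # subscripts as written in the file
problems = find_out_of_bounds(subs, shape)
print(problems)  # [(1, 2, 2), (2, 2, 2)]
```

An importer could raise an error at load time whenever this list is non-empty, instead of failing later during indexing.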

cp_als: align output formatting to Matlab version

Current output:

CP_ALS:
 Iter 0: f = 0.29487863930212066 f-delta = 0.29487863930212066
 Iter 1: f = 0.29566273499455864 f-delta = 0.0007840956924379805
 Iter 2: f = 0.29598343630628876 f-delta = 0.0003207013117301205
 Iter 3: f = 0.29617630790336913 f-delta = 0.0001928715970803685
 Iter 4: f = 0.29631901697531704 f-delta = 0.00014270907194791072
 Iter 5: f = 0.2964251708541269 f-delta = 0.00010615387880985594
 Iter 6: f = 0.2965041651264738 f-delta = 7.899427234692169e-05
 Final f = 0.29650416512647393

Matlab output:

CP_ALS:
 Iter  1: f = 2.948786e-01 f-delta = 2.9e-01
 Iter  2: f = 2.956627e-01 f-delta = 7.8e-04
 Iter  3: f = 2.959834e-01 f-delta = 3.2e-04
 Iter  4: f = 2.961763e-01 f-delta = 1.9e-04
 Iter  5: f = 2.963190e-01 f-delta = 1.4e-04
 Iter  6: f = 2.964252e-01 f-delta = 1.1e-04
 Iter  7: f = 2.965042e-01 f-delta = 7.9e-05
 Final f = 2.965042e-01 
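
The Matlab layout can be reproduced with Python format specifications: a two-character iteration field, six decimals in scientific notation for f, and one decimal for f-delta. The function names below are hypothetical; the numbers are taken from the output above.

```python
def format_iteration(iteration, fit, fit_delta):
    """Format one progress line; `iteration` is 1-based, as in Matlab."""
    return f" Iter {iteration:2d}: f = {fit:.6e} f-delta = {fit_delta:.1e}"


def format_final(fit):
    """Format the closing line."""
    return f" Final f = {fit:.6e}"


first = format_iteration(1, 0.29487863930212066, 0.29487863930212066)
last = format_iteration(7, 0.2965041651264738, 7.899427234692169e-05)
final = format_final(0.29650416512647393)
print(first)
print(last)
print(final)
```

Note that matching Matlab also means numbering iterations from 1 rather than 0.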

Use standard license

Currently, GitHub does not recognize the LICENSE file, because it was created manually rather than from a standard license text.

tensor.__getitem__ error slicing on more than one dimension

>>> shape = (3, 3, 3, 3)
>>> data = np.arange(1, 82)
>>> tensorInstance = ttb.tensor().from_data(data, shape)
>>> T = ttb.tensor().from_data(data, shape)
>>> T[0,0,0,0]
1
>>> T[0,0,0,:]
tensor of shape 3
data[:] = 
[ 1 28 55]
>>> T[0,0,:,:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\users\dmdunla\dev\github.com\pyttb\pyttb\tensor.py", line 1305, in __getitem__
    a = ttb.tensor.from_data(np.transpose(newdata, np.concatenate((kpdims, rmdims))))
  File "<__array_function__ internals>", line 5, in transpose
  File "C:\Users\dmdunla\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 660, in transpose
    return _wrapfunc(a, 'transpose', axes)
  File "C:\Users\dmdunla\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
ValueError: axes don't match array
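
For reference, the values that the failing slice should contain can be computed with plain NumPy, assuming the tensor stores its data in column-major (order='F') layout. That assumption is consistent with the T[0,0,0,:] output shown above.

```python
import numpy as np

shape = (3, 3, 3, 3)
data = np.arange(1, 82)

# Column-major layout, consistent with the [ 1 28 55] result above
arr = np.reshape(data, shape, order="F")

last_mode = arr[0, 0, 0, :]
last_two_modes = arr[0, 0, :, :]
print(last_mode)       # [ 1 28 55]
print(last_two_modes)  # rows: [1 28 55], [10 37 64], [19 46 73]
```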

Add pylint and typing across pyttb

This will probably need to be split across multiple sub-tasks or PRs. The proposal and initial support landed in #54. They are not automatically enforced yet.

tensor.mttkrp does not work with d-way tensors when d > 3

Below are the current outputs from tensor.mttkrp in pyttb and the Matlab Tensor Toolbox. For dimensions 0, 3, 4, pyttb works, but the other dimensions throw a ValueError during the reshape of the khatrirao product.

pyttb:

>>> T = ttb.tensor.from_data(np.arange(1,np.prod(shape)+1), shape)
>>> U = [];
>>> for s in shape: U.append(np.ones((s,2)))
...
>>> T.mttkrp(U,0)
array([[129600., 129600.],
       [129960., 129960.]])
>>> T.mttkrp(U,1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\dmdunla\dev\github.com\pyttb\pyttb\tensor.py", line 603, in mttkrp
    Ur = np.reshape(ttb.khatrirao(U[0:self.ndims - 2], reverse=True), (szl, 1, R), order='F')
  File "<__array_function__ internals>", line 5, in reshape
  File "C:\Users\dmdunla\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 299, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "C:\Users\dmdunla\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 58, in _wrapfunc
    return bound(*args, **kwds)
ValueError: cannot reshape array of size 48 into shape (2,1,2)
>>> T.mttkrp(U,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\dmdunla\dev\github.com\pyttb\pyttb\tensor.py", line 603, in mttkrp
    Ur = np.reshape(ttb.khatrirao(U[0:self.ndims - 2], reverse=True), (szl, 1, R), order='F')
  File "<__array_function__ internals>", line 5, in reshape
  File "C:\Users\dmdunla\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 299, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "C:\Users\dmdunla\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 58, in _wrapfunc
    return bound(*args, **kwds)
ValueError: cannot reshape array of size 48 into shape (6,1,2)
>>> T.mttkrp(U,3)
array([[45000., 45000.],
       [48456., 48456.],
       [51912., 51912.],
       [55368., 55368.],
       [58824., 58824.]])
>>> T.mttkrp(U,4)
array([[ 7260.,  7260.],
       [21660., 21660.],
       [36060., 36060.],
       [50460., 50460.],
       [64860., 64860.],
       [79260., 79260.]])

Matlab:

>> T = tensor(1:prod(shape),shape);
>> U = {}; for i = 1:length(shape), U{i} = ones(shape(i),2); end
>> mttkrp(T,U,1)

ans =

      129600      129600
      129960      129960

>> mttkrp(T,U,2)

ans =

       86040       86040
       86520       86520
       87000       87000

>> mttkrp(T,U,3)

ans =

       63270       63270
       64350       64350
       65430       65430
       66510       66510

>> mttkrp(T,U,4)

ans =

       45000       45000
       48456       48456
       51912       51912
       55368       55368
       58824       58824

>> mttkrp(T,U,5)

ans =

        7260        7260
       21660       21660
       36060       36060
       50460       50460
       64860       64860
       79260       79260
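
A straightforward NumPy reference that works for any number of modes can serve as a test oracle while the reshape logic is fixed. This is a sketch, not the pyttb implementation: it contracts one mode at a time instead of forming a Khatri-Rao product. The shape (2, 3, 4, 5, 6) is not stated in the transcripts; it is inferred here from the sizes of the outputs.

```python
import numpy as np


def mttkrp_reference(data, factors, mode):
    """MTTKRP by direct contraction, one output column at a time."""
    R = factors[0].shape[1]
    out = np.zeros((data.shape[mode], R))
    for r in range(R):
        t = data
        # Contract from the last mode to the first so that the axis
        # numbers of the remaining modes stay valid.
        for k in reversed(range(data.ndim)):
            if k == mode:
                continue
            t = np.tensordot(t, factors[k][:, r], axes=([k], [0]))
        out[:, r] = t
    return out


shape = (2, 3, 4, 5, 6)  # inferred from the outputs above
T = np.reshape(np.arange(1, np.prod(shape) + 1), shape, order="F")
U = [np.ones((s, 2)) for s in shape]

mode0 = mttkrp_reference(T, U, 0)  # matches mttkrp(T,U,1) in Matlab
mode1 = mttkrp_reference(T, U, 1)  # matches mttkrp(T,U,2) in Matlab
mode2 = mttkrp_reference(T, U, 2)  # matches mttkrp(T,U,3) in Matlab
```

Because each contraction removes exactly one axis, the same loop handles d-way tensors for any d, including the modes that currently raise a ValueError.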
