
nestedtensor's Introduction

BIG UPDATE: NestedTensor in core!

March 15 2022

We recently landed a minimal version of NestedTensor in core PyTorch! Operator coverage and migration of features are possible, but must be backed by issues (feature requests). If you have demand for specific NestedTensor operators, please open a feature request on pytorch/pytorch. To make your submission more impactful, please include your motivation, use case and a list of operators.

The nestedtensor package prototype

If you are here because you ran into a runtime error due to a missing feature or some kind of bug, please open an issue and fill in the appropriate template. If you have general feedback about this prototype you can use our suggested template, or just open a free-form issue if you like. Thank you for contributing to this project!

Tutorials

If you are new to this project, we recommend you take a look at our whirlwind introduction to get started.

Autograd support

Due to missing extensibility features in PyTorch, nestedtensor currently lacks autograd support. We're actively working on this and recognize that it severely limits the applicability of the project. Please run nestedtensor operations within the inference mode context to prevent any adverse interactions with the autograd system.

For example

import torch
import nestedtensor

sentences = [torch.randn(10, 5), torch.randn(5, 5), torch.randn(9, 5)]
with torch.inference_mode():
    nt = nestedtensor.nested_tensor(sentences)
    nt.sum(1)

Binaries

Due to the development velocity of PyTorch, the nestedtensor project is built on top of, and dependent on, a fixed, recent PyTorch nightly.

Version  Python  CUDA       Wheels
0.1.1    3.6     CPU-only   nestedtensor
0.1.1    3.7     CPU-only   nestedtensor
0.1.1    3.8     CPU-only   nestedtensor
0.1.1    3.6     CUDA 10.2  nestedtensor
0.1.1    3.7     CUDA 10.2  nestedtensor
0.1.1    3.8     CUDA 10.2  nestedtensor

When installing a binary, please specify the corresponding torch nightly link archive to automatically pull in the correct PyTorch nightly.

CPU

pip install https://download.pytorch.org/nestedtensor/whl/nightly/cpu/py3.7/nestedtensor-0.1.1_cpu-cp37-cp37m-linux_x86_64.whl -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

CUDA 10.2

pip install https://download.pytorch.org/nestedtensor/whl/nightly/cu102/py3.7/nestedtensor-0.1.1_cu102-cp37-cp37m-linux_x86_64.whl -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html

Why consider using this? / Dealing with dynamic shapes

In general we batch data for efficiency, but usually batched kernels need, or greatly benefit from, regular, statically-shaped data.

One way of dealing with dynamic shapes, then, is via padding and masking. Various projects construct masks that, together with a data Tensor, are used as a representation for lists of dynamically shaped Tensors.

Obviously this is inefficient from a memory and compute perspective if the Tensors within this list are sufficiently diverse.
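
To make the padding-and-masking approach concrete, here is a minimal sketch (not taken from any particular project) that pads a list of variable-length Tensors to a common shape and builds the corresponding boolean mask:

import torch
from torch.nn.utils.rnn import pad_sequence

# Three "sentences" of different lengths, each with 5 features.
sentences = [torch.randn(10, 5), torch.randn(5, 5), torch.randn(9, 5)]

# Pad to the longest sequence; the result has shape (3, 10, 5).
padded = pad_sequence(sentences, batch_first=True)

# mask[i, j] is True where padded[i, j] holds real data.
lengths = torch.tensor([s.size(0) for s in sentences])
mask = torch.arange(padded.size(1))[None, :] < lengths[:, None]

Every downstream operation now has to carry this mask along and remember to apply it, which is exactly the kind of bookkeeping this approach imposes.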

You can also trace through the codebase where these masks are used and observe the kind of code this approach often leads to. See for example universal_sentence_embedding.

PyTorch also has one-off operator support that aims to handle dynamic shapes via extra arguments such as a padding index. While these functions are fast and sometimes memory efficient, they don't provide a consistent interface.
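
Two well-known examples of such one-off arguments, shown here purely for illustration, are the padding_idx argument of nn.Embedding and the ignore_index argument of F.cross_entropy:

import torch
import torch.nn.functional as F

# padding_idx maps index 0 to a constant zero vector that receives no gradient updates.
emb = torch.nn.Embedding(num_embeddings=100, embedding_dim=8, padding_idx=0)

# ignore_index makes the loss skip padded target positions.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 7, -100, 2])
loss = F.cross_entropy(logits, targets, ignore_index=-100)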

Other users simply gave up and started writing for-loops, or discovered that batching didn't help.

We want a single abstraction that is consistent, fast, memory efficient and readable, and the nestedtensor project aims to provide exactly that.

How does nestedtensor help here?

NestedTensors are a generalization of torch Tensors that ease working with data of different shapes and lengths. In a nutshell, Tensors have scalar entries (e.g. floats) and NestedTensors have Tensor entries. However, note that a NestedTensor is still a Tensor. That means it needs to have a single dimension, single dtype, single device and single layout.

Tensor entry constraints:

  • Each Tensor constituent is of the dtype, layout and device of the containing NestedTensor.
  • The dimension of a constituent Tensor must be less than the dimension of the NestedTensor.
  • An empty NestedTensor is of dimension zero.
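
As a small illustration, here is a minimal sketch of these constraints in action, assuming the prototype's nested_tensor constructor and its dim() and nested_size() accessors:

import torch
import nestedtensor

with torch.inference_mode():
    nt = nestedtensor.nested_tensor([
        torch.randn(10, 5),
        torch.randn(5, 5),
    ])
    # A single dtype and device shared by the NestedTensor and its constituents.
    print(nt.dtype, nt.device)
    # The NestedTensor's dimension is one higher than that of its 2-dim constituents.
    print(nt.dim())
    # Per-constituent sizes: (10, 5) and (5, 5).
    print(nt.nested_size())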

Prototype classification

The nestedtensor package is a prototype intended for early stage feedback and testing. It is on the road to a beta classification, but there is no definitive timeline yet. See PyTorch feature classification for what prototype, beta and stable mean.

Dependencies

  • pytorch (installed from nestedtensor/third_party/pytorch submodule)
  • torchvision (needed for examples and tests)
  • ipython (needed for examples)
  • notebook (needed for examples)

Contribution

The project is under active development. If you have a suggestion or found a bug, please file an issue!

nestedtensor's People

Contributors

anjali411, cpuhrsch, ebetica, facebook-github-bot, izdeby, justanhduc, malfet, samuelmarks, seemethere, stas00, xuzhao9


nestedtensor's Issues

Windows support?

Currently I cannot install the package with MSVC toolset 14.11.25503 on Windows 10, Python 3.7 Anaconda. The command causing the error is here.

Add PyTorch fork with NestedTensor support as submodule

🚀 Feature

Motivation

In order for NestedTensor to move forward as a project it will need access to more and more internals of PyTorch, some of which might not be extensible right now. For example, it might want to extend autograd in uncommon ways to support broadcasting of NestedTensor/Tensor objects or add specialized NestedTensor support to vmap, etc.

Pitch

Add https://github.com/cpuhrsch/pytorchnestedtensor as a submodule pinned to branch nestedtensorsupport.

Change NestedTensor to build and create binaries based on PyTorch built from that branch instead of PyTorch nightlies.

Since this is a prototype this only needs support for Linux and CUDA GPUs.

Alternatives

Alternatives to this entire approach were discussed separately, but we can revisit them in this issue if that background is necessary. As a concrete technical alternative, we could also build a nightly from that fork and then switch this project over to depend on that nightly binary.

Additional context

As an additional ask, it'd be great if the package associated with this solution raised a conflict error with PyTorch binaries, so that users don't accidentally end up with two versions of PyTorch installed under different names. Otherwise installing NestedTensor could be very frustrating, since it's not necessarily transparent that NestedTensor builds its own version of PyTorch.

Add support for permute

aten reference: permute.

Semantics here are limited if we're to maintain the view functionality of pytorch.

A user may permute either the tensor dimensions or the nestedtensor dimensions, but not a mix of both. Permutations of the per-tensor dimensions are simply done with a map operation, as sketched below. Permutations of a nestedtensor dimension require us to implement a "rotation" within the tree. This is a useful operation in general and may live in the csrc/utils folder.
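
To illustrate the per-tensor case, here is a Python-level sketch (an illustrative assumption, not how the csrc implementation would look) that permutes the dimensions of every constituent via unbind and rebuilds the NestedTensor:

import torch
import nestedtensor

def permute_constituents(nt, *dims):
    # Hypothetical helper: apply Tensor.permute to every constituent;
    # the nestedtensor dimensions themselves are left untouched.
    return nestedtensor.nested_tensor([t.permute(*dims) for t in nt.unbind()])

with torch.inference_mode():
    nt = nestedtensor.nested_tensor([torch.randn(10, 5), torch.randn(9, 5)])
    transposed = permute_constituents(nt, 1, 0)  # constituents become (5, 10) and (5, 9)

Note that this sketch copies rather than returning a view, which is exactly the limitation around view semantics mentioned above.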

Nested Tensor of dimension 0

torch.tensor supports creating empty tensors, so both

torch.tensor([])
torch.tensor([[], []])

work.

However

nestedtensor.nested_tensor([[], []])

raises a RuntimeError. Can we add support for empty constituents to Nested Tensors as well?

K2 ragged tensor cooperation?

Hi, first of all, thanks for this awesome project. It could be very useful for sequential machine learning with convnets.
I stumbled upon a related project, k2, which also aims at implementing ragged tensors as part of its toolkit. Since k2 aims to be eventually compatible with PyTorch, would it make sense to join forces regarding the nested tensor functionality? Do nestedtensor and k2 know about each other? Regards, Jan

Cannot use nestedtensor in Dataloader?

I got a pickling error in multiprocessing saying it can't pickle the nestedtensor object. Does this mean I can't use nestedtensor in a Dataset?
I guess using nestedtensor in a Dataset is important, since nestedtensor always copies tensors, which would double the memory usage.

Traceback (most recent call last):
  File "/home/justanhduc/anaconda3/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/justanhduc/anaconda3/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: can't pickle nestedtensor._C._BufferNestedTensor objects
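
One possible workaround, sketched here as an assumption rather than an official recommendation: have the DataLoader hand back plain lists of Tensors (so no NestedTensor ever needs to be pickled across the worker boundary) and only construct the NestedTensor in the main process:

import torch
import nestedtensor
from torch.utils.data import DataLoader, Dataset

class VariableLengthDataset(Dataset):
    def __init__(self, tensors):
        self.tensors = tensors  # list of Tensors with differing first dimensions

    def __len__(self):
        return len(self.tensors)

    def __getitem__(self, idx):
        return self.tensors[idx]

dataset = VariableLengthDataset([torch.randn(n, 5) for n in (10, 5, 9, 7)])

# collate_fn=list keeps each batch a plain Python list, which pickles fine.
loader = DataLoader(dataset, batch_size=2, num_workers=2, collate_fn=list)

for batch in loader:
    with torch.inference_mode():
        nt_batch = nestedtensor.nested_tensor(batch)  # built in the main process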

Regarding the performance of tensorwise

I found out that tensorwise actually just runs a for loop over the nested tensors. I benchmarked tensorwise against map, a list comprehension and an explicit for loop. (Un)surprisingly, tensorwise performs much slower than the others. Here is the benchmark:

import torch as T
import nestedtensor as nt

crit = lambda x, y: T.mean((x - y) ** 2)


@nt.tensorwise()
def loss_nt(a, b):
    return crit(a, b)


def loss_map(a, b):
    return sum(map(crit, a, b)) / len(a)


def loss_for(a, b):
    return sum([crit(a_, b_) for a_, b_ in zip(a, b)]) / len(a)


def loss_expfor(a, b):
    loss = []
    for a_, b_ in zip(a, b):
        loss.append(crit(a_, b_))
    return sum(loss) / len(loss)


# 64 samples of shape (5000, 3), as a single batched Tensor and (below) as per-sample lists.
p1 = T.arange(64 * 5000 * 3).cuda().view(64, 5000, 3).float()
p2 = T.arange(64 * 5000 * 3).cuda().view(64, 5000, 3).float()

p1_list = list(p1[:, None])
p2_list = list(p2[:, None])

p1_nt = nt.as_nested_tensor(p1_list).cuda()
p2_nt = nt.as_nested_tensor(p2_list).cuda()

# Time each variant with CUDA events, synchronizing before reading the timers.
start = T.cuda.Event(enable_timing=True)
end = T.cuda.Event(enable_timing=True)

for i in range(100):
    start.record()
    loss_nt(p1_nt, p2_nt)
    end.record()
    T.cuda.synchronize()
    total_nt = start.elapsed_time(end)

    start.record()
    loss_map(p1_list, p2_list)
    end.record()
    T.cuda.synchronize()
    total_map = start.elapsed_time(end)

    start.record()
    loss_for(p1_list, p2_list)
    end.record()
    T.cuda.synchronize()
    total_for = start.elapsed_time(end)

    start.record()
    crit(p1, p2)
    end.record()
    T.cuda.synchronize()
    total = start.elapsed_time(end)

    start.record()
    loss_expfor(p1_list, p2_list)
    end.record()
    T.cuda.synchronize()
    total_expfor = start.elapsed_time(end)

    print(i, total_nt, total_map, total_for, total_expfor, total)

Is it because tensorwise is not implemented in C++ yet?
If the implementation of tensorwise is final, then I wonder whether tensorwise is just for convenience, not for performance?

to_tensor_mask() API should allow empty tensors

As of right now, we do not support empty tensors in a nested tensor when calling the to_tensor_mask() method.

Example case:

        nt1 = nt.nested_tensor([
            nt.nested_tensor([
                nt.nested_tensor([
                    torch.tensor([], dtype=torch.float)
                ]),
                nt.nested_tensor([
                    torch.tensor([1], dtype=torch.float),
                    torch.tensor([1], dtype=torch.float)
                ]),
                nt.nested_tensor([
                    torch.tensor([2, 3], dtype=torch.float)
                ]),
            ])
        ])

It's unclear what the result should look like for this nested tensor.

A possible solution to this problem is to introduce a nested_size flag for the nested_tensor_from_tensor_mask() method, but we need more info to come up with the best solution.
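
For contrast, in the non-empty case the intended semantics are clear. A sketch of what to_tensor_mask() produces there (the exact padding value and mask dtype in the comments are assumptions):

import torch
import nestedtensor as nt

with torch.inference_mode():
    a = nt.nested_tensor([torch.tensor([1., 2.]), torch.tensor([3.])])
    tensor, mask = a.to_tensor_mask()
    # tensor is padded out to the largest constituent, e.g.
    #   [[1., 2.],
    #    [3., 0.]]
    # mask marks which entries hold real data, e.g.
    #   [[True, True],
    #    [True, False]]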

Masking follow up

  • exclude scalar case from get_max_size
  • update test cases for test_ntftm_test_multi_tensor_mix_mask3 (remove empty tensor)
  • use nested_size() instead of unbind() in get_max_size
  • look into getting rid of tolist()

Failing tests

================================================= test session starts =================================================
platform win32 -- Python 3.7.4, pytest-5.2.0, py-1.8.0, pluggy-0.13.0
rootdir: C:\Users\justanhduc\Downloads\nestedtensor
plugins: arraydiff-0.3, doctestplus-0.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 23 items

test\test_nested_tensor_autograd.py . [ 4%]
test\test_nested_tensor_class.py .....F.F... [ 52%]
test\test_nested_tensor_functional.py ... [ 65%]
test\test_nested_tensor_masking.py .... [ 82%]
test\test_nested_tensor_tensorwise.py ...F [100%]

====================================================== FAILURES =======================================================
__________________________________________ TestNestedTensor.test_nested_dim ___________________________________________

self = <test_nested_tensor_class.TestNestedTensor testMethod=test_nested_dim>

def test_nested_dim(self):
  nt = torch.nested_tensor([torch.tensor(3)])

test\test_nested_tensor_class.py:99:


C:\ProgramData\Anaconda3\lib\site-packages\nestedtensor\nested\creation.py:102: in nested_tensor
buffer_ = _create_buffer(data)


data = [tensor(3)]

def _create_buffer(data):
    if isinstance(data, torch.Tensor):
        return data.flatten()  # This data will be copied implicitly via cat
  return torch.cat([_create_buffer(data_) for data_ in data], dim=0)

E RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

C:\ProgramData\Anaconda3\lib\site-packages\nestedtensor\nested\creation.py:77: RuntimeError
______________________________________ TestNestedTensor.test_scalar_constructor _______________________________________

self = <test_nested_tensor_class.TestNestedTensor testMethod=test_scalar_constructor>

def test_scalar_constructor(self):
  dim_one_nested_tensor = torch.nested_tensor([1.0])

test\test_nested_tensor_class.py:57:


C:\ProgramData\Anaconda3\lib\site-packages\nestedtensor\nested\creation.py:102: in nested_tensor
buffer_ = _create_buffer(data)


data = [tensor(1.)]

def _create_buffer(data):
    if isinstance(data, torch.Tensor):
        return data.flatten()  # This data will be copied implicitly via cat
  return torch.cat([_create_buffer(data_) for data_ in data], dim=0)

E RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

C:\ProgramData\Anaconda3\lib\site-packages\nestedtensor\nested\creation.py:77: RuntimeError
_____________________________________ TestTensorWise.test_tensorwise_tensor_kwarg _____________________________________

self = <test_nested_tensor_tensorwise.TestTensorWise testMethod=test_tensorwise_tensor_kwarg>

def test_tensorwise_tensor_kwarg(self):

    @nestedtensor.tensorwise(unbind_args=['out'])
    def simple_fn(t1, t2, t3=None):
        result = t1 * 2 + t2
        if t3 is not None:
            result = result + t3
        return result

    a = torch.tensor([1, 2])
    b = torch.tensor([7, 8])
    nt1 = torch.nested_tensor([a, b])
    nt2 = torch.nested_tensor([b, a])
  c1 = simple_fn(a, b, t3=torch.tensor((0.5, 0.7)))

test\test_nested_tensor_tensorwise.py:65:


C:\ProgramData\Anaconda3\lib\site-packages\nestedtensor\nested\utils.py:190: in decorator
return f(*_args, **_kwargs)


t1 = tensor([1, 2]), t2 = tensor([7, 8]), t3 = tensor([0.5000, 0.7000])

@nestedtensor.tensorwise(unbind_args=['out'])
def simple_fn(t1, t2, t3=None):
    result = t1 * 2 + t2
    if t3 is not None:
      result = result + t3

E RuntimeError: expected device cpu and dtype Float but got device cpu and dtype Long

test\test_nested_tensor_tensorwise.py:58: RuntimeError
============================================ 3 failed, 20 passed in 2.31s =============================================

There are a bunch of failing tests due to creation from scalars. I am not sure whether you want to fix it on the API side or the test side, so I will leave it to you. The last one is simply a typo, so I can make a PR for that if you'd like.

Potential early-adopter ✋, How to use it?

Hi @fmassa!

I like your next NestedTensor abstraction.

My use case

  • I'm currently passing a dictionary into a couple of modules (loss, layers). However, each module takes different tensors.

  • I did an awful hack in my dataloader to support this.

It seems that your abstraction serves my use case. That's what I understood from your DETR project. Am I wrong?

Cheers 🍻

[DESIGN] as_nested_tensor doesn't behave the same as as_tensor

The as_nested_tensor constructor doesn't behave exactly the same way as the as_tensor constructor. While as_tensor won't copy the data if the input is already a tensor with the same metadata (device/dtype/etc.) and will copy otherwise, as_nested_tensor will never copy and will throw an error when it's impossible to construct a nested tensor from the given input without a copy.

Options:

1. Keep everything as it is; we agree that the semantics are different.
2. Keep the behavior the same but rename as_nested_tensor to something else to avoid confusion.

Pros and cons should be discussed.
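
For reference, this is the as_tensor behavior being compared against (standard PyTorch, shown only for illustration):

import torch

t = torch.tensor([1.0, 2.0])

# Same dtype and device: as_tensor returns the input itself, no copy.
assert torch.as_tensor(t) is t

# Different dtype requested: as_tensor converts, which implies a copy.
assert torch.as_tensor(t, dtype=torch.float64) is not t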

error: ‘imag_out’ is not a member of ‘at’

I have this problem when installing the current master:

/home/adnguyen/Downloads/nestedtensor/nestedtensor/csrc/unary.cpp:86:27: error: ‘imag_out’ is not a member of ‘at’
   add_unary(m, c, "imag", at::imag_out);
                           ^~
/home/adnguyen/Downloads/nestedtensor/nestedtensor/csrc/unary.cpp:96:27: error: ‘real_out’ is not a member of ‘at’
   add_unary(m, c, "real", at::real_out);
                           ^~
error: command 'gcc' failed with exit status 1

My environment:
python: 3.7.7.
pytorch: latest nightly version.
