
performer-pytorch's Introduction

Performer - Pytorch


An implementation of Performer, a linear attention-based transformer variant with a Fast Attention Via positive Orthogonal Random features approach (FAVOR+).

Install

$ pip install performer-pytorch

If you plan on training an autoregressive model, you must also install the extra requirements:

$ pip install -r requirements.txt

Usage

Performer Language Model

import torch
from performer_pytorch import PerformerLM

model = PerformerLM(
    num_tokens = 20000,
    max_seq_len = 2048,             # max sequence length
    dim = 512,                      # dimension
    depth = 12,                     # layers
    heads = 8,                      # heads
    causal = False,                 # auto-regressive or not
    nb_features = 256,              # number of random features, if not set, will default to (d * log(d)), where d is the dimension of each head
    feature_redraw_interval = 1000, # how frequently to redraw the projection matrix, the more frequent, the slower the training
    generalized_attention = False,  # defaults to softmax approximation, but can be set to True for generalized attention
    kernel_fn = torch.nn.ReLU(),    # the kernel function to be used, if generalized attention is turned on, defaults to ReLU
    reversible = True,              # reversible layers, from Reformer paper
    ff_chunks = 10,                 # chunk feedforward layer, from Reformer paper
    use_scalenorm = False,          # use scale norm, from 'Transformers without Tears' paper
    use_rezero = False,             # use rezero, from 'Rezero is all you need' paper
    ff_glu = True,                  # use GLU variant for feedforward
    emb_dropout = 0.1,              # embedding dropout
    ff_dropout = 0.1,               # feedforward dropout
    attn_dropout = 0.1,             # post-attn dropout
    local_attn_heads = 4,           # 4 heads are local attention, 4 others are global performers
    local_window_size = 256,        # window size of local attention
    rotary_position_emb = True,     # use rotary positional embedding, which endows linear attention with relative positional encoding with no learned parameters. should always be turned on unless you want to go back to the old absolute positional encoding
    shift_tokens = True             # shift tokens by 1 along sequence dimension before each block, for better convergence
)

x = torch.randint(0, 20000, (1, 2048))
mask = torch.ones_like(x).bool()

model(x, mask = mask) # (1, 2048, 20000)
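
To train this autoregressively (per the install note above), you can wrap a causal PerformerLM in the package's AutoregressiveWrapper. A minimal sketch, assuming the wrapper is importable from performer_pytorch.autoregressive_wrapper and that its forward returns the language-modeling loss (both assumptions; the exact interface may differ across versions):

import torch
from performer_pytorch import PerformerLM
from performer_pytorch.autoregressive_wrapper import AutoregressiveWrapper  # import path is an assumption

lm = PerformerLM(num_tokens = 20000, max_seq_len = 1024, dim = 512, depth = 6, heads = 8, causal = True)
model = AutoregressiveWrapper(lm)

x = torch.randint(0, 20000, (1, 1025))  # one extra token, assuming the wrapper shifts targets internally
loss = model(x)                         # assumed to return the cross-entropy loss
loss.backward()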

Plain Performer, if you are working with say images or other modalities

import torch
from performer_pytorch import Performer

model = Performer(
    dim = 512,
    depth = 1,
    heads = 8,
    causal = True
)

x = torch.randn(1, 2048, 512)
model(x) # (1, 2048, 512)
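
For example, a batch of images can be flattened into a sequence of patch tokens before being fed in. A minimal sketch using einops (already a dependency of this package); the 8x8 patching and linear projection here are illustrative, not part of the library:

import torch
from torch import nn
from einops import rearrange
from performer_pytorch import Performer

model = Performer(dim = 512, depth = 1, heads = 8, causal = False)

imgs = torch.randn(1, 3, 64, 64)                                                     # (batch, channels, height, width)
patches = rearrange(imgs, 'b c (h p1) (w p2) -> b (h w) (c p1 p2)', p1 = 8, p2 = 8)  # (1, 64, 192)
proj = nn.Linear(3 * 8 * 8, 512)                                                     # illustrative patch-to-dim projection
x = proj(patches)                                                                    # (1, 64, 512)
model(x)                                                                             # (1, 64, 512)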

Encoder / Decoder - Made possible by Thomas Melistas

import torch
from performer_pytorch import PerformerEncDec

SRC_SEQ_LEN = 4096
TGT_SEQ_LEN = 4096
GENERATE_LEN = 512

enc_dec = PerformerEncDec(
    dim = 512,
    tie_token_embed = True,
    enc_num_tokens = 20000,
    enc_depth = 6,
    enc_heads = 8,
    enc_max_seq_len = SRC_SEQ_LEN,
    dec_num_tokens = 20000,
    dec_depth = 6,
    dec_heads = 8,
    dec_max_seq_len = TGT_SEQ_LEN,
)

src = torch.randint(0, 20000, (1, SRC_SEQ_LEN))
tgt = torch.randint(0, 20000, (1, TGT_SEQ_LEN))
src_mask = torch.ones_like(src).bool()
tgt_mask = torch.ones_like(tgt).bool()

# train
enc_dec.train()
loss = enc_dec(src, tgt, enc_mask = src_mask, dec_mask = tgt_mask)
loss.backward()

# generate
generate_in = torch.randint(0, 20000, (1, SRC_SEQ_LEN)).long()
generate_out_prime = torch.tensor([[0.]]).long() # prime with <bos> token
samples = enc_dec.generate(generate_in, generate_out_prime, seq_len = GENERATE_LEN, eos_token = 1) # assume 1 is id of stop token
print(samples.shape) # (1, <= GENERATE_LEN) decode the tokens

Standalone self-attention layer with linear complexity with respect to sequence length, for replacing trained full-attention transformer self-attention layers.

import torch
from performer_pytorch import SelfAttention

attn = SelfAttention(
    dim = 512,
    heads = 8,
    causal = False,
).cuda()

x = torch.randn(1, 1024, 512).cuda()
attn(x) # (1, 1024, 512)

Cross attention works similarly

import torch
from performer_pytorch import CrossAttention

attn = CrossAttention(
    dim = 512,
    heads = 8
).cuda()

x = torch.randn(1, 1024, 512).cuda()
context = torch.randn(1, 512, 512).cuda()

attn(x, context = context) # (1, 1024, 512)

To minimize model surgery, you could also simply rewrite the code, so that the attention step is done by the FastAttention module, as follows.

import torch
from performer_pytorch import FastAttention

# queries / keys / values with heads already split and transposed to first dimension
# 8 heads, dimension of head is 64, sequence length of 512
q = torch.randn(1, 8, 512, 64)
k = torch.randn(1, 8, 512, 64)
v = torch.randn(1, 8, 512, 64)

attn_fn = FastAttention(
    dim_heads = 64,
    nb_features = 256,
    causal = False
)

out = attn_fn(q, k, v) # (1, 8, 512, 64)
# now merge heads and combine outputs with Wo
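
For instance, the head merge and output projection might look like the following sketch (to_out here is a hypothetical stand-in for the trained model's existing Wo):

import torch
from torch import nn
from einops import rearrange

out = torch.randn(1, 8, 512, 64)                 # stand-in for the FastAttention output above
merged = rearrange(out, 'b h n d -> b n (h d)')  # merge heads: (1, 512, 8 * 64)
to_out = nn.Linear(8 * 64, 512)                  # hypothetical Wo: (heads * dim_head) -> model dim
final = to_out(merged)                           # (1, 512, 512)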

Advanced

At the end of training, if you wish to fix the projection matrices to get the model to output deterministically, you can invoke the following

model.fix_projection_matrices_()

Now your model will have fixed projection matrices across all layers
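
A quick sanity check (a sketch reusing model, x and mask from the language-model example above; eval() disables dropout so the comparison is meaningful):

model.eval()
with torch.no_grad():
    out1 = model(x, mask = mask)
    out2 = model(x, mask = mask)
assert torch.allclose(out1, out2)  # projection matrices are fixed, so repeated runs agree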

Citations

@misc{choromanski2020rethinking,
    title   = {Rethinking Attention with Performers},
    author  = {Krzysztof Choromanski and Valerii Likhosherstov and David Dohan and Xingyou Song and Andreea Gane and Tamas Sarlos and Peter Hawkins and Jared Davis and Afroz Mohiuddin and Lukasz Kaiser and David Belanger and Lucy Colwell and Adrian Weller},
    year    = {2020},
    eprint  = {2009.14794},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}
@inproceedings{kitaev2020reformer,
    title       = {Reformer: The Efficient Transformer},
    author      = {Nikita Kitaev and Lukasz Kaiser and Anselm Levskaya},
    booktitle   = {International Conference on Learning Representations},
    year        = {2020},
    url         = {https://openreview.net/forum?id=rkgNKkHtvB}
}
@inproceedings{katharopoulos_et_al_2020,
    author  = {Katharopoulos, A. and Vyas, A. and Pappas, N. and Fleuret, F.},
    title   = {Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention},
    booktitle = {Proceedings of the International Conference on Machine Learning (ICML)},
    year    = {2020}
}
@misc{bachlechner2020rezero,
    title   = {ReZero is All You Need: Fast Convergence at Large Depth},
    author  = {Thomas Bachlechner and Bodhisattwa Prasad Majumder and Huanru Henry Mao and Garrison W. Cottrell and Julian McAuley},
    year    = {2020},
    url     = {https://arxiv.org/abs/2003.04887}
}
@article{1910.05895,
    author  = {Toan Q. Nguyen and Julian Salazar},
    title   = {Transformers without Tears: Improving the Normalization of Self-Attention},
    year    = {2019},
    eprint  = {arXiv:1910.05895},
    doi     = {10.5281/zenodo.3525484},
}
@misc{shazeer2020glu,
    title   = {GLU Variants Improve Transformer},
    author  = {Noam Shazeer},
    year    = {2020},
    url     = {https://arxiv.org/abs/2002.05202}
}
@misc{roy*2020efficient,
    title   = {Efficient Content-Based Sparse Attention with Routing Transformers},
    author  = {Aurko Roy* and Mohammad Taghi Saffar* and David Grangier and Ashish Vaswani},
    year    = {2020},
    url     = {https://arxiv.org/pdf/2003.05997.pdf}
}
@misc{su2021roformer,
    title   = {RoFormer: Enhanced Transformer with Rotary Position Embedding},
    author  = {Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu},
    year    = {2021},
    eprint  = {2104.09864},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL},
    url     = {https://arxiv.org/abs/2104.09864}
}

performer-pytorch's People

Contributors

alexandredey, erotemic, gulnazaki, lucidrains, norabelrose


performer-pytorch's Issues

Saving checkpoints during training and loading

Hi, thanks for this awesome repo. I would like to know the correct way of saving an AutoregressiveWrapper model.
Is it torch.save(model.net.state_dict(), 'checkpoint.pt')?
Then how should I load it back for generation?
Thanks.
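
One plausible round trip, sketched under the assumptions that model is an AutoregressiveWrapper instance (as in the Usage sketch above) and that its generate mirrors enc_dec.generate; not an authoritative answer:

torch.save(model.state_dict(), 'checkpoint.pt')  # saving the wrapper's own state_dict also saves the inner net's weights

# later: rebuild the wrapper with identical hyperparameters, then
model.load_state_dict(torch.load('checkpoint.pt'))
model.eval()
samples = model.generate(torch.zeros(1, 1).long(), seq_len = 100)  # assumed signature, mirroring enc_dec.generate above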

use performer for image detection

Hello @lucidrains, thanks for this great work!

I am currently working on image detection with the DETR transformer and am having trouble training it from scratch (mainly because of GPU resources ^^). So I was looking around for ways to improve efficiency and found the "Rethinking Attention with Performers" paper. At the moment I am getting into the paper and think I understand the main concept :P
So I was wondering: is it possible to exchange the attention layers of DETR with Performer layers? Do you think this could solve my problem of training the DETR transformer from scratch?

[Feature] Adding fixed positional embeddings as an option

I believe that, although using learnable positional embeddings is the trend nowadays, it would help to use fixed embeddings (sinusoidal, as in the original implementation) in relatively small-dataset scenarios, where it would be hard to learn a meaningful embedding. At least, it would be interesting to compare both methods.

I see you included fixed embeddings in the reformer implementation, but don't you think it would be more efficient to calculate them once during the initialization? (like here)

Btw, I read a cool paper that compares fixed positional embeddings and the ones learned by BERT, GPT-2 and RoBERTa.

If you prefer, I could do a PR on this adding the implementation in the above pytorch tutorial but it is no big deal.

Difficulty installing on a Windows machine

The environment:

Windows 10 20H1,
python 3.8.5
pytorch 1.7.0
Cuda compilation tools, release 10.1, V10.1.243
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include\torch\csrc\api\include -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include\TH -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -Id:\ProgramData\Miniconda3\envs\pytorch\include -Id:\ProgramData\Miniconda3\envs\pytorch\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.26.28801\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.26.28801\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\fast_transformers\hashing\hash_cuda.cu -o C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -arch=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: C:/Users/rains/AppData/Local/Temp/easy_install-fdtfkv90/pytorch-fast-transformers-0.3.0/build/temp.win-amd64-3.8/Release/fast_transformers/hashing/hash_cuda.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include\torch\csrc\api\include -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include\TH -Id:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -Id:\ProgramData\Miniconda3\envs\pytorch\include -Id:\ProgramData\Miniconda3\envs\pytorch\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.26.28801\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.26.28801\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\fast_transformers\hashing\hash_cuda.cu -o C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -arch=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cuda -D_GLIBCXX_USE_CXX11_ABI=0
d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\ATen/core/boxing/impl/boxing.h(100): warning: integer conversion resulted in a change of sign

d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\ATen/record_function.h(13): warning: modifier is ignored on an enum specifier

d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\ATen/core/op_registration/op_whitelist.h(39): warning: integer conversion resulted in a change of sign

d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\torch/csrc/jit/ir/ir.h(1347): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized

d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\torch/csrc/autograd/profiler.h(106): warning: modifier is ignored on an enum specifier

d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\torch/csrc/autograd/profiler.h(138): warning: modifier is ignored on an enum specifier

d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include/torch/csrc/api/include\torch/nn/modules/transformerlayer.h(73): warning: extra ";" ignored

C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/include\xutility(4109): error: function "torch::OrderedDict<Key, Value>::Item::operator=(const torch::OrderedDict<std::string, at::Tensor>::Item &) [with Key=std::string, Value=at::Tensor]" (declared implicitly) cannot be referenced -- it is a deleted function
          detected during:
            instantiation of "_OutIt std::_Move_unchecked1(_InIt, _InIt, _OutIt, std::false_type) [with _InIt=torch::OrderedDict<std::string, at::Tensor>::Item *, _OutIt=torch::OrderedDict<std::string, at::Tensor>::Item *]"
(4125): here
            instantiation of "_OutIt std::_Move_unchecked(_InIt, _InIt, _OutIt) [with _InIt=torch::OrderedDict<std::string, at::Tensor>::Item *, _OutIt=torch::OrderedDict<std::string, at::Tensor>::Item *]"
C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/include\vector(1353): here
            instantiation of "std::vector<_Ty, _Alloc>::iterator std::vector<_Ty, _Alloc>::erase(std::vector<_Ty, _Alloc>::const_iterator) [with _Ty=torch::OrderedDict<std::string, at::Tensor>::Item, _Alloc=std::allocator<torch::OrderedDict<std::string, at::Tensor>::Item>]"
d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include\torch/csrc/api/include/torch/ordered_dict.h(419): here
            instantiation of "void torch::OrderedDict<Key, Value>::erase(const Key &) [with Key=std::string, Value=at::Tensor]"
d:/ProgramData/Miniconda3/envs/pytorch/lib/site-packages/torch/include/torch/csrc/api/include\torch/nn/modules/container/parameterdict.h(51): here

2 errors detected in the compilation of "C:/Users/rains/AppData/Local/Temp/tmpxft_00002f64_00000000-10_hash_cuda.cpp1.ii".
hash_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 1516, in _run_ninja_build
    subprocess.run(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 152, in save_modules
    yield saved
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 193, in setup_context
    yield
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 254, in run_setup
    _execfile(setup_script, ns)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 43, in _execfile
    exec(code, globals, locals)
  File "C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\setup.py", line 209, in <module>
  File "C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\setup.py", line 182, in setup_package
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\bdist_egg.py", line 167, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\bdist_egg.py", line 153, in call_command
    self.run_command(cmdname)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\install_lib.py", line 11, in run
    self.build()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
    _build_ext.run(self)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 340, in run
    self.build_extensions()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 653, in build_extensions
    build_ext.build_extensions(self)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 626, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 1233, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 1538, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 3, in <module>
    setup(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\install.py", line 67, in run
    self.do_egg_install()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\install.py", line 117, in do_egg_install
    cmd.run(show_deprecation=False)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 408, in run
    self.easy_install(spec, not self.no_deps)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 650, in easy_install
    return self.install_item(None, spec, tmpdir, deps, True)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 697, in install_item
    self.process_distribution(spec, dist, deps)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 741, in process_distribution
    distros = WorkingSet([]).resolve(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\pkg_resources\__init__.py", line 764, in resolve
    dist = best[req.key] = env.best_match(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\pkg_resources\__init__.py", line 1049, in best_match
    return self.obtain(req, installer)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\pkg_resources\__init__.py", line 1061, in obtain
    return installer(requirement)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 669, in easy_install
    return self.install_item(spec, dist.location, tmpdir, deps)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 695, in install_item
    dists = self.install_eggs(spec, download, tmpdir)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 880, in install_eggs
    return self.build_and_install(setup_script, setup_base)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 1150, in build_and_install
    self.run_setup(setup_script, setup_base, args)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\easy_install.py", line 1134, in run_setup
    run_setup(setup_script, args)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 257, in run_setup
    raise
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 193, in setup_context
    yield
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 164, in save_modules
    saved_exc.resume()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 139, in resume
    raise exc.with_traceback(self._tb)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 152, in save_modules
    yield saved
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 193, in setup_context
    yield
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 254, in run_setup
    _execfile(setup_script, ns)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\sandbox.py", line 43, in _execfile
    exec(code, globals, locals)
  File "C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\setup.py", line 209, in <module>
  File "C:\Users\rains\AppData\Local\Temp\easy_install-fdtfkv90\pytorch-fast-transformers-0.3.0\setup.py", line 182, in setup_package
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\bdist_egg.py", line 167, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\bdist_egg.py", line 153, in call_command
    self.run_command(cmdname)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\install_lib.py", line 11, in run
    self.build()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
    _build_ext.run(self)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 340, in run
    self.build_extensions()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 653, in build_extensions
    build_ext.build_extensions(self)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\setuptools\command\build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\distutils\command\build_ext.py", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 626, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 1233, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "d:\ProgramData\Miniconda3\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py", line 1538, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

Causal AutoRegressive Doubt

def causal_linear_attention_noncuda(q, k, v):
    k_cumsum = k.cumsum(dim=-2)
    context = torch.einsum('...nd,...ne->...nde', k, v)
    context = context.cumsum(dim=-3)
    context /= k_cumsum.unsqueeze(dim=-1)
    out = torch.einsum('...nde,...nd->...ne', context, q)
    return out

I'm not able to understand why context is being divided by k_cumsum. I understand the normalization done in the bidirectional case, as it is mentioned in the paper:

D_inv = 1. / torch.einsum('...nd,...d->...n', q, k.sum(dim = -2))

but the paper doesn't mention anything about the normalization of causal attention.

Can you please help me understand it? Any help would be highly appreciated. :)
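
One way to see it: the division plays the role of a causal analogue of D^-1, where the key sum in the bidirectional normalizer becomes a per-position prefix sum. A sketch of that reading (illustrative, not necessarily the repo's exact code):

import torch

def causal_linear_attention_sketch(q, k, v):
    # position n is normalized by q_n . (sum of k_j for j <= n),
    # i.e. the k.sum in the bidirectional D_inv becomes k.cumsum
    k_cumsum = k.cumsum(dim = -2)
    D_inv = 1. / torch.einsum('...nd,...nd->...n', q, k_cumsum)
    context = torch.einsum('...nd,...ne->...nde', k, v).cumsum(dim = -3)
    return torch.einsum('...nde,...nd,...n->...ne', context, q, D_inv)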

Inverse of renormalization matrix being used?

Hi! Thanks for the speedy reference implementation in Pytorch.

I noticed something odd: I think you're multiplying by D (in eq 4) not D^-1 in linear_attention:

def linear_attention(q, k, v):
    D_inv = torch.einsum('...nd,...d->...n', q, k.sum(dim = -2))
    context = torch.einsum('...nd,...ne->...de', k, v)
    out = torch.einsum('...de,...nd,...n->...ne', context, q, D_inv)
    return out

The tensor named D_inv is actually D. Unless I'm missing something obvious, which hopefully I am.

[tagging @ncilfone because we discussed this recently]
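
For reference, a reciprocal form matching the D_inv definition quoted in the previous issue would read as follows (a sketch, not necessarily how the repo resolved it):

import torch

def linear_attention(q, k, v):
    D_inv = 1. / torch.einsum('...nd,...d->...n', q, k.sum(dim = -2))  # now actually D^-1
    context = torch.einsum('...nd,...ne->...de', k, v)
    return torch.einsum('...de,...nd,...n->...ne', context, q, D_inv)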

Is it slower than the original BERT when training?

As the paper suggests, I transferred the weights of BERT into Performer and fine-tuned it on some downstream tasks. It seems much slower than BERT training, even though I already installed fast-transformers for CUDA. Is this common, or did I do something wrong?

Causal model running on GPU

Hi, I am trying to run the LM model with causal = True on the GPU, but I am running into some issues.

I am trying to run the following example:

import torch
from torch import nn
from performer_pytorch import PerformerLM

model = PerformerLM(
    num_tokens = 20000,
    max_seq_len = 2048,             # max sequence length
    dim = 512,                      # dimension
    depth = 6,                      # layers
    heads = 8,                      # heads
    causal = True,                 # auto-regressive or not
    nb_features = 256,              # number of random features, if not set, will default to (d * log(d)), where d is the dimension of each head
    generalized_attention = False,  # defaults to softmax approximation, but can be set to True for generalized attention
    kernel_fn = nn.ReLU(),          # the kernel function to be used, if generalized attention is turned on, defaults to Relu
    reversible = True,              # reversible layers, from Reformer paper
    ff_chunks = 10,                 # chunk feedforward layer, from Reformer paper
).cuda()

x = torch.randint(0, 20000, (1, 2048)).cuda()
model(x) # (1, 2048, 20000)

And I am getting this error:

Traceback (most recent call last):
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-a530c03a976e>", line 20, in <module>
    model(x) # (1, 2048, 20000)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 253, in forward
    x = self.performer(x, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 238, in forward
    return self.net(x, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 160, in forward
    out =  _ReversibleFunction.apply(x, blocks, args)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 113, in forward
    x = block(x, **kwarg)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 65, in forward
    y1 = x1 + self.f(x2, record_rng=self.training, **f_args)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 40, in forward
    return self.net(*args, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 170, in forward
    return self.fn(self.norm(x), **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 216, in forward
    out = self.fast_attention(q, k, v)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 159, in forward
    out = attn_fn(q, k, v)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 110, in causal_linear_attention
    return CausalDotProduct.apply(q, k, v)
  File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/fast_transformers/causal_product/__init__.py", line 48, in forward
    product
TypeError: 'NoneType' object is not callable

My system has:
TITAN RTX
CUDA Version: 10.2
Driver Version: 440.100

FixNorm alongside ScaleNorm

Hello there,

Do you think it would be possible to add FixNorm in conjunction with ScaleNorm? In the Transformers without Tears paper it does improve results, and it seems straightforward to add. In the tied-embedding case one just has to change the word embedding initialization to uniform with range [−0.01, 0.01] and then do l2-normalization; otherwise, do the scaling in the final linear layer, if I get it right.
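
A sketch of the embedding-side change described above (illustrative only; this is not currently part of performer-pytorch):

import torch
import torch.nn.functional as F

emb = torch.nn.Embedding(20000, 512)
torch.nn.init.uniform_(emb.weight, -0.01, 0.01)          # uniform init in [-0.01, 0.01]
with torch.no_grad():
    emb.weight.copy_(F.normalize(emb.weight, dim = -1))  # l2-normalize each embedding row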

Also, shouldn't the final normalization layer in the PerformerLM differ for the various normalization methods?

Applying decoder input mask?

Hi,
I'm trying to implement the basic transformer architecture from "Attention Is All You Need", but replacing MultiheadAttention with performer_pytorch.SelfAttention; however, the expected mask for the decoder input is apparently not of shape n x n?
I've tried different setups, but no success. Any tips/ideas? I've only glanced through the paper.

Any performance comparison on standard benchmarks?

Dear author, thank you very much for this helpful PyTorch Performer toolbox.

Have you compared the performance of different transformer variants on standard benchmarks at different sequence lengths?
What is the actual improvement of Performer?
Reporting these comparison results would be really helpful.

Load weights of transformer into PerformerLM

Hi, is there an explanation (or even an implementation) of how to load pre-trained BERT (or RoBERTa, GPT-2, etc.) model weights into a PerformerLM model? According to the paper, "Backwards compatibility with pretrained models is available as a benefit from softmax approximation, via small finetuning (required due to error propagation)".
If I understand your README correctly, there is a hint that SelfAttention should map onto the transformer self-attention layers, but how, and what fine-tuning is required?
e.g., when SelfAttention has:

(to_qkv): Linear(in_features=768, out_features=2304, bias=False)
(to_out): Linear(in_features=768, out_features=768, bias=True)

and Transformer has:

(attention): BertAttention(
  (self): BertSelfAttention(
    (query): Linear(in_features=768, out_features=768, bias=True)
    (key): Linear(in_features=768, out_features=768, bias=True)
    (value): Linear(in_features=768, out_features=768, bias=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (output): BertSelfOutput(
    (dense): Linear(in_features=768, out_features=768, bias=True)
    (LayerNorm): FusedLayerNorm(torch.Size([768]), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
)
Thanks.

SelfAttention layer seems to have large error relative to nn.MultiheadAttention?

Hi,

I started trying to use this and the first thing I did was compare the error between the performer_pytorch.SelfAttention layer and torch.nn.MultiheadAttention for different sizes of the random feature map. I was a little surprised to see that it never went below 100% relative error.
Am I doing something wrong? This analysis was done using this colab notebook: https://colab.research.google.com/drive/1vemlPOySWtDdB2Xfm7YCE7--PYtIbelS?usp=sharing

Thanks!

Bug in FastAttention.forward()

On lines 171 to 174 of performer_pytorch.py, forward() creates a projection_matrix each time it runs, instead of saving the projection_matrix after creating it the first time. I think there should be a line between lines 172 and 173:
self.projection_matrix = projection_matrix

A small question regarding `softmax_kernel`

First things first, great repo.

I'm trying to understand the renormalizing in softmax_kernel, though:


if is_query:
    data_dash = ratio * (
        torch.exp(data_dash - diag_data -
                  torch.max(data_dash, dim=-1, keepdim=True).values) + eps)
else:
    data_dash = ratio * (
        torch.exp(data_dash - diag_data - torch.max(data_dash)) + eps)

In this segment of code, the argument is_query is used to switch between two different computations.

I reckon that this part is to alleviate numerical problems. I wonder why the computation for query features and key features should be different (in that the max op is different)?

I'd really appreciate it if you could shed some light on this question so I can understand it.

Regarding DDP and reversible networks

Hi, I'm trying to figure out how to combine DDP with setting the network to be reversible.

My code basically looks like this:

import pytorch_lightning as pl
from performer_pytorch import Performer
...
model = nn.Sequential([...,Performer(...,reversible=True)])
trainer = pl.Trainer(...
                    distributed_backend='ddp',
                    ...)
trainer.fit(model,train_loader,val_loader)

Now all combinations work for me (ddp/not reversible, not ddp/reversible, not ddp/not reversible) except for ddp and reversible.

The error I get is:

RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons:

  1. Use of a module parameter outside the forward function. Please make sure model parameters are not shared across multiple concurrent forward-backward passes
  2. Reused parameters in multiple reentrant backward passes. For example, if you use multiple checkpoint functions to wrap the same part of your model, it would result in the same set of parameters been used by different reentrant backward passes multiple times, and hence marking a variable ready multiple times. DDP does not support such use cases yet.

I've seen multiple people have similar issues: huggingface/transformers#7160 ,pytorch/pytorch#46166 , tatp22/linformer-pytorch#23

Do you have any suggestions for how to deal with this issue? I'm not really familiar with the inner workings of DDP and the autograd engine, so I'm not sure how to fix this myself.

Input ordering is not explicitly stated

Since the PyTorch implementation of a transformer expects input ordering to be (sequence_length, batch_size, feature_dimension), is this also the case for the Performer implementation?

For instance, using SelfAttention as such:

attn = SelfAttention(
    dim = 512,
    heads = 8,
    causal = False,
).cuda()

Should it be:
x = torch.randn(batch_size, sequence_length, feature_dimension).cuda()

or:
x = torch.randn(sequence_length, batch_size, feature_dimension).cuda()

attn(x)

?

definition of layer_drop()

Hi,
it seems that the function layer_drop(...), used by forward(...) of the class SequentialSequence, is not defined?

Question: Scaling down number of random features depending on number of heads?

The theory in the paper gives a result with guarantees for nb_features = O(dim * log(dim)).
When using multiple heads, e.g. dim = 512, heads = 8, you get a lower dimensionality per head; is it then reasonable to scale down to nb_features = O((dim/heads) * log(dim/heads))? Or is the variance too high when the number of features gets too low? Do you have any intuition for this? I'm feeling a bit unsure.

Relative position encoding

Is this architecture incompatible with relative position encoding a la Shaw et al 2018 or Transformer XL?

Question: torch.max term used in `softmax_kernel`

Hi! Thanks for the implementation of performer-pytorch. It really helps!

I feel confused about the torch.max terms in softmax_kernel, shown below. I reckon they are for some kind of normalization, but I did not find the corresponding equations in the original paper. Could you please help explain this term? And why the need to discriminate between query and key?

    if is_query:
        data_dash = ratio * (
            torch.exp(data_dash - diag_data -
                    torch.max(data_dash, dim=-1, keepdim=True).values) + eps)
    else:
        data_dash = ratio * (
            torch.exp(data_dash - diag_data - torch.max(data_dash)) + eps)

pip install error

Windows 10
Started with Python 3.9, migrated to Anaconda Python 3.8
Started w/ PyTorch 1.7 w/ CUDA 11, migrated to 1.6 w/ CUDA 10.2

Can't for the life of me get these to compile. Always the same errors.

(base) C:\Users\David>pip install performer-pytorch --no-cache-dir
Collecting performer-pytorch
Downloading performer_pytorch-0.2.0-py3-none-any.whl (9.0 kB)
Requirement already satisfied: torch>=1.6 in c:\users\david\miniconda3\lib\site-packages (from performer-pytorch) (1.6.0)
Collecting pytorch-fast-transformers>=0.3.0
Downloading pytorch-fast-transformers-0.3.0.tar.gz (78 kB)
|████████████████████████████████| 78 kB 1.5 MB/s
Collecting einops>=0.3
Downloading einops-0.3.0-py2.py3-none-any.whl (25 kB)
Collecting future
Downloading future-0.18.2.tar.gz (829 kB)
|████████████████████████████████| 829 kB 504 kB/s
Requirement already satisfied: numpy in c:\users\david\miniconda3\lib\site-packages (from torch>=1.6->performer-pytorch) (1.19.2)
Building wheels for collected packages: pytorch-fast-transformers, future
Building wheel for pytorch-fast-transformers (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'C:\Users\David\miniconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py'"'"'; file='"'"'C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\David\AppData\Local\Temp\pip-wheel-7to4wnke'
cwd: C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers
Complete output (245 lines):
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-3.8
creating build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\masking.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\transformers.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\utils.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\weight_mapper.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers_init_.py -> build\lib.win-amd64-3.8\fast_transformers
creating build\lib.win-amd64-3.8\fast_transformers\aggregate
copying fast_transformers\aggregate_init_.py -> build\lib.win-amd64-3.8\fast_transformers\aggregate
creating build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\attention_layer.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\causal_linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\clustered_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\conditional_full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\exact_topk_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\improved_clustered_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\improved_clustered_causal_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\local_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\reformer_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention_init_.py -> build\lib.win-amd64-3.8\fast_transformers\attention
creating build\lib.win-amd64-3.8\fast_transformers\attention_registry
copying fast_transformers\attention_registry\registry.py -> build\lib.win-amd64-3.8\fast_transformers\attention_registry
copying fast_transformers\attention_registry\spec.py -> build\lib.win-amd64-3.8\fast_transformers\attention_registry
copying fast_transformers\attention_registry_init_.py -> build\lib.win-amd64-3.8\fast_transformers\attention_registry
creating build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders\attention_builders.py -> build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders\base.py -> build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders\transformer_builders.py -> build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders_init_.py -> build\lib.win-amd64-3.8\fast_transformers\builders
creating build\lib.win-amd64-3.8\fast_transformers\causal_product
copying fast_transformers\causal_product_init_.py -> build\lib.win-amd64-3.8\fast_transformers\causal_product
creating build\lib.win-amd64-3.8\fast_transformers\clustering
copying fast_transformers\clustering_init_.py -> build\lib.win-amd64-3.8\fast_transformers\clustering
creating build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events\event.py -> build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events\event_dispatcher.py -> build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events\filters.py -> build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events_init_.py -> build\lib.win-amd64-3.8\fast_transformers\events
creating build\lib.win-amd64-3.8\fast_transformers\feature_maps
copying fast_transformers\feature_maps\base.py -> build\lib.win-amd64-3.8\fast_transformers\feature_maps
copying fast_transformers\feature_maps\fourier_features.py -> build\lib.win-amd64-3.8\fast_transformers\feature_maps
copying fast_transformers\feature_maps_init_.py -> build\lib.win-amd64-3.8\fast_transformers\feature_maps
creating build\lib.win-amd64-3.8\fast_transformers\hashing
copying fast_transformers\hashing_init_.py -> build\lib.win-amd64-3.8\fast_transformers\hashing
creating build\lib.win-amd64-3.8\fast_transformers\local_product
copying fast_transformers\local_product_init_.py -> build\lib.win-amd64-3.8\fast_transformers\local_product
creating build\lib.win-amd64-3.8\fast_transformers\recurrent
copying fast_transformers\recurrent\transformers.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent
copying fast_transformers\recurrent_utils.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent
copying fast_transformers\recurrent_init_.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent
creating build\lib.win-amd64-3.8\fast_transformers\sparse_product
copying fast_transformers\sparse_product_init_.py -> build\lib.win-amd64-3.8\fast_transformers\sparse_product
creating build\lib.win-amd64-3.8\fast_transformers\clustering\hamming
copying fast_transformers\clustering\hamming_init_.py -> build\lib.win-amd64-3.8\fast_transformers\clustering\hamming
creating build\lib.win-amd64-3.8\fast_transformers\recurrent\attention
copying fast_transformers\recurrent\attention_init_.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention
creating build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention\attention_layer.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention\full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention\linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention_init_.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
creating build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention\attention_layer.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention\full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention\linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention_init_.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
running build_ext
C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py:270: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified
warnings.warn('Error checking compiler version for {}: {}'.format(compiler, error))
building 'fast_transformers.hashing.hash_cpu' extension
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\hashing
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\hashing\hash_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_hash_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\hashing\hash_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing\hash_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing\hash_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing\hash_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.aggregate.aggregate_cpu' extension
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\aggregate
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\aggregate\aggregate_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/aggregate/aggregate_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=aggregate_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_aggregate_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/aggregate/aggregate_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\aggregate\aggregate_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/aggregate\aggregate_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/aggregate\aggregate_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/aggregate\aggregate_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.clustering.hamming.cluster_cpu' extension
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\clustering
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\clustering\hamming
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\clustering\hamming\cluster_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/clustering/hamming/cluster_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=cluster_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\clustering\hamming\cluster_cpu.cpp(161): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_cluster_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/clustering/hamming/cluster_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\clustering\hamming\cluster_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/clustering/hamming\cluster_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/clustering/hamming\cluster_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/clustering/hamming\cluster_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.sparse_product.sparse_product_cpu' extension
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\sparse_product
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\sparse_product\sparse_product_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product/sparse_product_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=sparse_product_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_sparse_product_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product/sparse_product_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\sparse_product\sparse_product_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product\sparse_product_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product\sparse_product_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product\sparse_product_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.sparse_product.clustered_sparse_product_cpu' extension
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\sparse_product\clustered_sparse_product_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product/clustered_sparse_product_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=clustered_sparse_product_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_clustered_sparse_product_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product/clustered_sparse_product_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\sparse_product\clustered_sparse_product_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product\clustered_sparse_product_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product\clustered_sparse_product_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/sparse_product\clustered_sparse_product_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.causal_product.causal_product_cpu' extension
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\causal_product
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\causal_product\causal_product_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/causal_product/causal_product_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=causal_product_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_causal_product_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/causal_product/causal_product_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\causal_product\causal_product_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/causal_product\causal_product_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/causal_product\causal_product_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/causal_product\causal_product_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.local_product.local_product_cpu' extension
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\local_product
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\local_product\local_product_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product/local_product_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=local_product_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_local_product_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product/local_product_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\local_product\local_product_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product\local_product_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product\local_product_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product\local_product_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.hashing.hash_cuda' extension
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\hashing\hash_cuda.cu -o C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -arch=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: C:/Users/David/AppData/Local/Temp/pip-install-29aa16lb/pytorch-fast-transformers/build/temp.win-amd64-3.8/Release/fast_transformers/hashing/hash_cuda.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\hashing\hash_cuda.cu -o C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -arch=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cuda -D_GLIBCXX_USE_CXX11_ABI=0
C:/Users/David/miniconda3/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Users/David/miniconda3/lib/site-packages/torch/include\ATen/record_function.h(18): warning: modifier is ignored on an enum specifier

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(483): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(496): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(510): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(523): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/autograd/profiler.h(97): warning: modifier is ignored on an enum specifier

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/autograd/profiler.h(126): warning: modifier is ignored on an enum specifier

4 errors detected in the compilation of "C:/Users/David/AppData/Local/Temp/tmpxft_00000890_00000000-10_hash_cuda.cpp1.ii".
hash_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1509, in _run_ninja_build
subprocess.run(
File "C:\Users\David\miniconda3\lib\subprocess.py", line 512, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py", line 209, in
setup_package()
File "C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py", line 182, in setup_package
setup(
File "C:\Users\David\miniconda3\lib\site-packages\setuptools_init_.py", line 153, in setup
return distutils.core.setup(**attrs)
File "C:\Users\David\miniconda3\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\David\miniconda3\lib\site-packages\wheel\bdist_wheel.py", line 290, in run
self.run_command('build')
File "C:\Users\David\miniconda3\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\David\miniconda3\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "C:\Users\David\miniconda3\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\David\miniconda3\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
_build_ext.run(self)
File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 340, in run
self.build_extensions()
File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 649, in build_extensions
build_ext.build_extensions(self)
File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "C:\Users\David\miniconda3\lib\site-packages\setuptools\command\build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 528, in build_extension
objects = self.compiler.compile(sources,
File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 622, in win_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1228, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1529, in _run_ninja_build
raise RuntimeError(message)
RuntimeError: Error compiling objects for extension

ERROR: Failed building wheel for pytorch-fast-transformers
Running setup.py clean for pytorch-fast-transformers
Building wheel for future (setup.py) ... done
Created wheel for future: filename=future-0.18.2-py3-none-any.whl size=491062 sha256=1b0fe17fb39afd4d0503db42ed21c6036471913fffd551b09b5110cfea5bd51b
Stored in directory: C:\Users\David\AppData\Local\Temp\pip-ephem-wheel-cache-7qwnu102\wheels\8e\70\28\3d6ccd6e315f65f245da085482a2e1c7d14b90b30f239e2cf4
Successfully built future
Failed to build pytorch-fast-transformers
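
Note: the four `a member with an in-class initializer must be const` errors above come from nvcc compiling this PyTorch release's C++ headers (torch/csrc/jit/api/module.h) under MSVC, not from pytorch-fast-transformers' own sources; the CPU extensions all built fine. A known nvcc/MSVC header incompatibility of this kind is usually version-specific, so it helps to capture the exact toolchain before retrying or reporting. A minimal sketch using only standard torch APIs:

import platform
import torch
from torch.utils.cpp_extension import CUDA_HOME

# Toolchain snapshot for the failure above (standard APIs only).
print("python      :", platform.python_version())
print("torch       :", torch.__version__)
print("torch cuda  :", torch.version.cuda)       # CUDA version torch was built against
print("CUDA_HOME   :", CUDA_HOME)                # toolkit whose nvcc is used (v10.2 here)
print("gpu visible :", torch.cuda.is_available())

If the nvcc version and the CUDA version torch was built against disagree, that mismatch is the usual culprit; otherwise upgrading to a newer PyTorch, whose later releases appear to have fixed these headers for MSVC, is the common way out.
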
Installing collected packages: pytorch-fast-transformers, einops, performer-pytorch, future
Running setup.py install for pytorch-fast-transformers ... error
ERROR: Command errored out with exit status 1:
command: 'C:\Users\David\miniconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py'"'"'; __file__='"'"'C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\David\AppData\Local\Temp\pip-record-ectzpnnx\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\David\miniconda3\Include\pytorch-fast-transformers'
cwd: C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers
Complete output (247 lines):
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.8
creating build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\masking.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\transformers.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\utils.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\weight_mapper.py -> build\lib.win-amd64-3.8\fast_transformers
copying fast_transformers\__init__.py -> build\lib.win-amd64-3.8\fast_transformers
creating build\lib.win-amd64-3.8\fast_transformers\aggregate
copying fast_transformers\aggregate\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\aggregate
creating build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\attention_layer.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\causal_linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\clustered_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\conditional_full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\exact_topk_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\improved_clustered_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\improved_clustered_causal_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\local_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\reformer_attention.py -> build\lib.win-amd64-3.8\fast_transformers\attention
copying fast_transformers\attention\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\attention
creating build\lib.win-amd64-3.8\fast_transformers\attention_registry
copying fast_transformers\attention_registry\registry.py -> build\lib.win-amd64-3.8\fast_transformers\attention_registry
copying fast_transformers\attention_registry\spec.py -> build\lib.win-amd64-3.8\fast_transformers\attention_registry
copying fast_transformers\attention_registry\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\attention_registry
creating build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders\attention_builders.py -> build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders\base.py -> build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders\transformer_builders.py -> build\lib.win-amd64-3.8\fast_transformers\builders
copying fast_transformers\builders\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\builders
creating build\lib.win-amd64-3.8\fast_transformers\causal_product
copying fast_transformers\causal_product\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\causal_product
creating build\lib.win-amd64-3.8\fast_transformers\clustering
copying fast_transformers\clustering\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\clustering
creating build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events\event.py -> build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events\event_dispatcher.py -> build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events\filters.py -> build\lib.win-amd64-3.8\fast_transformers\events
copying fast_transformers\events\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\events
creating build\lib.win-amd64-3.8\fast_transformers\feature_maps
copying fast_transformers\feature_maps\base.py -> build\lib.win-amd64-3.8\fast_transformers\feature_maps
copying fast_transformers\feature_maps\fourier_features.py -> build\lib.win-amd64-3.8\fast_transformers\feature_maps
copying fast_transformers\feature_maps\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\feature_maps
creating build\lib.win-amd64-3.8\fast_transformers\hashing
copying fast_transformers\hashing\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\hashing
creating build\lib.win-amd64-3.8\fast_transformers\local_product
copying fast_transformers\local_product\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\local_product
creating build\lib.win-amd64-3.8\fast_transformers\recurrent
copying fast_transformers\recurrent\transformers.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent
copying fast_transformers\recurrent\_utils.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent
copying fast_transformers\recurrent\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent
creating build\lib.win-amd64-3.8\fast_transformers\sparse_product
copying fast_transformers\sparse_product\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\sparse_product
creating build\lib.win-amd64-3.8\fast_transformers\clustering\hamming
copying fast_transformers\clustering\hamming\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\clustering\hamming
creating build\lib.win-amd64-3.8\fast_transformers\recurrent\attention
copying fast_transformers\recurrent\attention\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention
creating build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention\attention_layer.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention\full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention\linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
copying fast_transformers\recurrent\attention\cross_attention\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\cross_attention
creating build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention\attention_layer.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention\full_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention\linear_attention.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
copying fast_transformers\recurrent\attention\self_attention\__init__.py -> build\lib.win-amd64-3.8\fast_transformers\recurrent\attention\self_attention
running build_ext
[identical compiler warnings and compile/link output for the hash_cpu, aggregate_cpu, cluster_cpu, sparse_product_cpu, clustered_sparse_product_cpu and causal_product_cpu extensions, repeated verbatim by the setup.py install attempt]
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/causal_product\causal_product_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/causal_product\causal_product_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.local_product.local_product_cpu' extension
creating C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers\local_product
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\local_product\local_product_cpu.cpp /FoC:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product/local_product_cpu.obj -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=local_product_cpu -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-fopenmp'
cl : Command line warning D9002 : ignoring unknown option '-ffast-math'
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\David\miniconda3\lib\site-packages\torch\lib /LIBPATH:C:\Users\David\miniconda3\libs /LIBPATH:C:\Users\David\miniconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_local_product_cpu C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product/local_product_cpu.obj /OUT:build\lib.win-amd64-3.8\fast_transformers\local_product\local_product_cpu.cp38-win_amd64.pyd /IMPLIB:C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product\local_product_cpu.cp38-win_amd64.lib
Creating library C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product\local_product_cpu.cp38-win_amd64.lib and object C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/local_product\local_product_cpu.cp38-win_amd64.exp
Generating code
Finished generating code
building 'fast_transformers.hashing.hash_cuda' extension
Emitting ninja build file C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\hashing\hash_cuda.cu -o C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -arch=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: C:/Users/David/AppData/Local/Temp/pip-install-29aa16lb/pytorch-fast-transformers/build/temp.win-amd64-3.8/Release/fast_transformers/hashing/hash_cuda.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\David\miniconda3\lib\site-packages\torch\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\David\miniconda3\lib\site-packages\torch\include\TH -IC:\Users\David\miniconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\David\miniconda3\include -IC:\Users\David\miniconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" -c C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\fast_transformers\hashing\hash_cuda.cu -o C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\build\temp.win-amd64-3.8\Release\fast_transformers/hashing/hash_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -arch=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cuda -D_GLIBCXX_USE_CXX11_ABI=0
C:/Users/David/miniconda3/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Users/David/miniconda3/lib/site-packages/torch/include\ATen/record_function.h(18): warning: modifier is ignored on an enum specifier

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(483): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(496): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(510): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/jit/api/module.h(523): error: a member with an in-class initializer must be const

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/autograd/profiler.h(97): warning: modifier is ignored on an enum specifier

C:/Users/David/miniconda3/lib/site-packages/torch/include\torch/csrc/autograd/profiler.h(126): warning: modifier is ignored on an enum specifier

4 errors detected in the compilation of "C:/Users/David/AppData/Local/Temp/tmpxft_00003088_00000000-10_hash_cuda.cpp1.ii".
hash_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1509, in _run_ninja_build
    subprocess.run(
  File "C:\Users\David\miniconda3\lib\subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py", line 209, in <module>
    setup_package()
  File "C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py", line 182, in setup_package
    setup(
  File "C:\Users\David\miniconda3\lib\site-packages\setuptools\__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "C:\Users\David\miniconda3\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "C:\Users\David\miniconda3\lib\site-packages\setuptools\command\install.py", line 61, in run
    return orig.install.run(self)
  File "C:\Users\David\miniconda3\lib\distutils\command\install.py", line 545, in run
    self.run_command('build')
  File "C:\Users\David\miniconda3\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "C:\Users\David\miniconda3\lib\distutils\command\build.py", line 135, in run
    self.run_command(cmd_name)
  File "C:\Users\David\miniconda3\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "C:\Users\David\miniconda3\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "C:\Users\David\miniconda3\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
    _build_ext.run(self)
  File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 340, in run
    self.build_extensions()
  File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 649, in build_extensions
    build_ext.build_extensions(self)
  File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "C:\Users\David\miniconda3\lib\site-packages\setuptools\command\build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\Users\David\miniconda3\lib\distutils\command\build_ext.py", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 622, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1228, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "C:\Users\David\miniconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1529, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error compiling objects for extension
----------------------------------------

ERROR: Command errored out with exit status 1: 'C:\Users\David\miniconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py'"'"'; __file__='"'"'C:\Users\David\AppData\Local\Temp\pip-install-29aa16lb\pytorch-fast-transformers\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\David\AppData\Local\Temp\pip-record-ectzpnnx\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\David\miniconda3\Include\pytorch-fast-transformers' Check the logs for full command output.

Performance gain from replacing original attention with fast attention in this repo?

I do see a clear memory gain when I have long sequences with a small feature dimension.

But I do not see any speed gain, and under the same conditions the loss drops more slowly.

This is probably because my dataset and task are not a good fit for the Performer (sequences not long enough, task not hard enough, etc.),

but I just want to know whether anyone has seen a tangible performance gain after switching to the Performer-style fast attention.

For your information, I merely replaced this simple attention mechanism:

import math
import torch
import torch.nn.functional as F

def attention(query, key, value, mask=None, dropout=None):
    "Compute 'Scaled Dot Product Attention'"
    d_k = query.size(-1)
    scores = torch.matmul(query, key.transpose(-2, -1)) \
             / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, -1e9)
    p_attn = F.softmax(scores, dim = -1)
    if dropout is not None:
        p_attn = dropout(p_attn)
    return torch.matmul(p_attn, value), p_attn

to FastAttention in this repo.
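
For reference, this is roughly what the replacement side looks like; a sketch using the FastAttention module this repo exports (q/k/v already split into heads; shapes and arguments assumed from the README):

import torch
from performer_pytorch import FastAttention

q = torch.randn(1, 8, 512, 64)  # (batch, heads, seq_len, dim_head)
k = torch.randn(1, 8, 512, 64)
v = torch.randn(1, 8, 512, 64)

attn_fn = FastAttention(
    dim_heads = 64,
    nb_features = 256,
    causal = False
)

out = attn_fn(q, k, v)  # (1, 8, 512, 64); note that no attention matrix is returned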

Thank you!

There are no tests in this project, use_rezero=True is non-functional

Tests are needed to validate that models can train in various configurations.
I built and ran simple tests (I am trying to get authorization to contribute them as a PR) and found that use_rezero=True kills the gradient and results in a Performer model that cannot learn. The fix consists of initializing the rezero parameter with a small value, but not zero (e.g., 1E-3 works in my tests). Zero prevents any signal from passing through the module, so the parameter never moves away from zero.
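
A minimal sketch of the proposed fix (an illustrative ReZero wrapper, not the repo's exact code):

import torch
from torch import nn

class ReZero(nn.Module):
    def __init__(self, fn, init_eps = 1e-3):
        super().__init__()
        self.fn = fn
        # initialize the gate slightly away from zero so the residual branch carries signal
        self.g = nn.Parameter(torch.tensor(init_eps))

    def forward(self, x, **kwargs):
        return self.fn(x, **kwargs) * self.g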

Extra FF when using cross attention

Hello Phil,

I have noticed that when using cross attention, a whole new block (with an attention and a FeedForward layer) is added, while only an attention layer should be added between the self-attention and the FF layer.

Is there any reason for this?

Suggestion: Renormalization step for linear attention

When I used generalized attention without normalizing the attention matrix, I got some crazy big losses (because norms explode inside the network). Now I'm using a renormalization step (basically computing D^-1 as described in the paper):

import torch

def linear_attention(q, k, v, renormalize = True):
    # context summarizes keys and values: (..., d, e)
    context = torch.einsum('...nd,...ne->...de', k, v)
    out = torch.einsum('...de,...nd->...ne', context, q)
    if renormalize:
        # D^-1 from the paper: divide each position n by q_n . (sum over positions of k)
        Dinv = 1.0 / torch.einsum('...nd,...d->...n', q, torch.einsum('...nd->...d', k))
        out = torch.einsum('...n,...ne->...ne', Dinv, out)
    return out

This renormalizes the output (the renormalize kwarg gates the step), and I think it's a nice feature.
I might have messed up the dimensions and stuff here, so please tell me if something doesn't make sense ;)
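
A quick sanity check for the version above (positive random features assumed, as FAVOR+ produces):

q = torch.rand(2, 8, 128, 64)   # stand-ins for positive feature maps
k = torch.rand(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)
out = linear_attention(q, k, v)                            # renormalized, bounded magnitudes
out_raw = linear_attention(q, k, v, renormalize = False)   # norms can grow with sequence length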

Redrawing normalized samples using QR slows down training

Doing the QR-decomposition:

import torch

def orthogonal_matrix_chunk(cols, device = None):
    unstructured_block = torch.randn((cols, cols), device = device)
    q, _ = torch.qr(unstructured_block, some = True)
    return q.t()

Slows down training substantially (at least for batch sizes of ~4). For example, in my own experiments I get ~2.5 batches/s per GPU without redrawing, and ~1.4 batches/s with redrawing.

I found one solution in GPyTorch, which dispatches small QR factorizations to the CPU:

cornellius-gp/gpytorch#1224

Perhaps a similar strategy could be used? I think num_cols should never really be more than ~100 here, so perhaps you should always use the CPU?
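
A sketch of that strategy (hypothetical helper name; assumes the projection stays small enough that a CPU factorization plus a device copy beats a GPU QR):

import torch

def orthogonal_matrix_chunk_cpu(cols, device = None):
    # run the small QR on the CPU, then move the orthogonal factor to the target device
    unstructured_block = torch.randn((cols, cols))
    q, _ = torch.qr(unstructured_block, some = True)
    return q.t().to(device)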

Causal for images

Hello, thank you for the repo!
I have a question: you gave the example for "Plain Performer, if you are working with say images or other modalities". There, the Performer parameter "causal" is set to True. Can you explain why, if possible? Is it somehow an autoregressive task?
Thanks!

[Feature] EncoderDecoder framework, similar to ReformerEncDec

Hello Phil,

Nice job on this great architecture. I want to use it as an Encoder Decoder within Deepspeed, so I am thinking of writing a wrapper similar to the one you did for Reformer. Do you have any tips on what to pay attention to (no pun intended), and whether I need to use padding as in Autopadder?

Thanks

wrong implementation for autoregressive self-attention

Hi, I found that you use fast_transformers' CUDA kernel, but it does not contain the normalization part, which needs a cumsum outside the CausalDotProduct (in causal_linear_attention).
If I didn't miss something, the result of your code should be wrong... but I am not 100% sure.
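
For concreteness, the missing step would look something like this sketch (assuming fast_transformers exposes CausalDotProduct as an autograd Function, as this repo imports it; the einsum layout is illustrative):

import torch
from fast_transformers.causal_product import CausalDotProduct

def causal_linear_attention(q, k, v, eps = 1e-6):
    # numerator: causal sum over j <= i of (phi(q_i) . phi(k_j)) v_j, from the CUDA kernel
    out = CausalDotProduct.apply(q, k, v)
    # denominator: D_i = phi(q_i) . cumsum(phi(k))_i, applied outside the kernel
    k_cumsum = k.cumsum(dim = -2)
    Dinv = 1.0 / (torch.einsum('...nd,...nd->...n', q, k_cumsum) + eps)
    return out * Dinv.unsqueeze(-1)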

RuntimeError: CUDA error: no kernel image is available for execution on the device

Hi, thanks for the code!
I ran into a problem when executing:

import torch
from performer_pytorch import Performer

model = Performer(
    dim = 512,
    depth = 1,
    heads = 8,
    causal = True
).cuda()

x = torch.randn(1, 2048, 512).cuda()
y = model(x) # (1, 2048, 512)
x.shape, y.shape

It works fine if I don't use CUDA, but when I do, it says:


RuntimeError Traceback (most recent call last)
in
10
11 x = torch.randn(1, 2048, 512).cuda()
---> 12 y = model(x) # (1, 2048, 512)
13 x.shape, y.shape

/ext3/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),

/ext3/miniconda3/lib/python3.8/site-packages/performer_pytorch/performer_pytorch.py in forward(self, x, **kwargs)
341
342 def forward(self, x, **kwargs):
--> 343 return self.net(x, **kwargs)
344
345 class PerformerLM(nn.Module):

/ext3/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),

/ext3/miniconda3/lib/python3.8/site-packages/performer_pytorch/reversible.py in forward(self, x, **kwargs)
136
137 for (f, g), (f_args, g_args) in layers_and_args:
--> 138 x = x + f(x, **f_args)
139 x = x + g(x, **g_args)
140 return x

/ext3/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),

/ext3/miniconda3/lib/python3.8/site-packages/performer_pytorch/performer_pytorch.py in forward(self, x, **kwargs)
215 self.fn = fn
216 def forward(self, x, **kwargs):
--> 217 return self.fn(self.norm(x), **kwargs)
218
219 class Chunk(nn.Module):

/ext3/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),

/ext3/miniconda3/lib/python3.8/site-packages/performer_pytorch/performer_pytorch.py in forward(self, x, context, mask, context_mask)
297 attn_outs.append(out)
298
--> 299 out = torch.cat(attn_outs, dim = 1)
300 out = rearrange(out, 'b h n d -> b n (h d)')
301 out = self.to_out(out)

RuntimeError: CUDA error: no kernel image is available for execution on the device

Floating point exception @ loss.backward()

Hello, thank you for this rich library for PyTorch users.
I'm implementing a seq2seq system for OCR, and I tried to use your implementation for the decoder side of it.
The encoder is CRNN-based (CNN + BLSTM). I added PerformerLM(..) wrapped with AutoregressiveWrapper(..).
The loss returned from the model seems to be fine:
tensor(4.2084, device='cuda:0', grad_fn=)
However, loss.backward() raises this exception:
Floating point exception (core dumped)
Is there something wrong with my implementation, since the encoder cannot be wrapped with the same autoregressive function?
Many thanks

Decoder randomly outputs NaN tensor.

Hi,

I just noticed a misbehavior of the decoder: it seems to output NaN tensors randomly.

  • Problem
    AutoregressiveWrapper.generate randomly outputs NaN tensor and fails with RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

  • How to Reproduce the bug
    Set decoder to cuda device dec = PerformerLM(**dec_kwargs).to('cuda:1'), and repeat decoding inside the AutoregressiveWrapper:

    • performer_pytorch/autoregressive_wrapper.py(63)
    ...
    for _ in range(seq_len):
        x = out[:, -self.max_seq_len:]
        input_mask = input_mask[:, -self.max_seq_len:]
        logits = self.net(x, mask=input_mask, **kwargs)[:, -1, :] <<-- HERE
    
    • output
    (Pdb) self.net(x, mask=input_mask, **kwargs)[:, -1, :]
    tensor([[-0.6147,  0.4647,  0.8009,  ..., -0.3772, -0.5126, -0.3495]],
           device='cuda:1')
    (Pdb) self.net(x, mask=input_mask, **kwargs)[:, -1, :]
    tensor([[-0.6792,  0.3940,  0.6685,  ..., -0.5081, -0.4801, -0.2691]],
           device='cuda:1')
    (Pdb) self.net(x, mask=input_mask, **kwargs)[:, -1, :]
    tensor([[-0.6146,  0.4647,  0.8011,  ..., -0.3772, -0.5128, -0.3496]],
           device='cuda:1')
    (Pdb) self.net(x, mask=input_mask, **kwargs)[:, -1, :]
    tensor([[-0.0530, -0.0343,  0.0998,  ...,  0.6310, -0.1682, -0.7353]],
           device='cuda:1')
    (Pdb) self.net(x, mask=input_mask, **kwargs)[:, -1, :]
    tensor([[nan, nan, nan,  ..., nan, nan, nan]], device='cuda:1') <<-- It randomly outputs NaN tensor.
    

Any ideas why this happens?
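
One way to narrow this down (a debugging sketch, not part of the library): register forward hooks that fail loudly at the first module emitting a NaN:

import torch

def install_nan_hooks(model):
    def check(module, inputs, output):
        if torch.is_tensor(output) and torch.isnan(output).any():
            raise RuntimeError(f'NaN produced by {module.__class__.__name__}')
    for module in model.modules():
        module.register_forward_hook(check)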

No fp16 support from fast-transformers (CausalDotProduct)

There is the following RuntimeError when using fp16 (for causal attention): expected scalar type Float but found Half.

That is because fast-transformers' CausalDotProduct doesn't support fp16. Do you think there is any workaround? Using Float is bad news for memory usage, and it also disables the DeepSpeed ZeRO optimizations.
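
One possible workaround (a sketch, not something either library currently provides): upcast just around the causal product and cast back, keeping the rest of the network in fp16:

import torch
from fast_transformers.causal_product import CausalDotProduct

def causal_dot_product_fp32(q, k, v):
    # the CUDA kernel only accepts float32, so convert around the call
    out = CausalDotProduct.apply(q.float(), k.float(), v.float())
    return out.to(q.dtype)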

Show what the performance on enwik8 is across your projects

Hello @lucidrains, I'm a very big fan of your work. It is of such high quality that every new project you release leaves me sleepless until I've tried it.

You have many different transformer variants, such as reformer, memory-xl, performer... and apparently you already test them on enwik8.

Would it be possible to post in the README a table with the enwik8 runtime, memory usage, and some performance metric? That would be awesome for comparing the different implementations.

Thanks again for your work!!

Triangular matrices?

Does the current implementation provide triangular matrices (to constrain the attention always to the "left" of the sequence, both for input and encoded values), as described in the last section of the original paper?

is dependency on pytorch-fast-transformers necessary?

I am getting errors installing with pip on Mac when the dependency tries to build (log below).
Looking at the code of performer, I don't see imports from pytorch-fast-transformers, so I am wondering whether the dependency could be removed. Did I miss the code that uses it?

pip install pytorch-fast-transformers
Looking in indexes: https://artifactory.prod.adnxs.net/artifactory/api/pypi/pypi/simple, https://artifactory.prod.adnxs.net/artifactory/api/pypi/pypi--remote--pypi/simple
Collecting pytorch-fast-transformers
Using cached https://artifactory.prod.adnxs.net/artifactory/api/pypi/pypi/packages/packages/03/9b/38905999695b381a1e239b91afce219892a23614248fc024e04558f36237/pytorch-fast-transformers-0.3.0.tar.gz (78 kB)
Requirement already satisfied: torch in /Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages (from pytorch-fast-transformers) (1.6.0)
Requirement already satisfied: numpy in /Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages (from torch->pytorch-fast-transformers) (1.18.2)
Requirement already satisfied: future in /Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages (from torch->pytorch-fast-transformers) (0.18.2)
Building wheels for collected packages: pytorch-fast-transformers
Building wheel for pytorch-fast-transformers (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /Users/fcampagne/miniconda3/envs/cqa/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py'"'"'; __file__='"'"'/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-wheel-harjez01
cwd: /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/
Complete output (159 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-3.7
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/transformers.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/weight_mapper.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/masking.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/utils.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/causal_linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/improved_clustered_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/clustered_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/improved_clustered_causal_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/reformer_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/attention_layer.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/local_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/exact_topk_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/conditional_full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/local_product
copying fast_transformers/local_product/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/local_product
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/transformer_builders.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/attention_builders.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/base.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/aggregate
copying fast_transformers/aggregate/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/aggregate
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering
copying fast_transformers/clustering/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
copying fast_transformers/recurrent/transformers.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
copying fast_transformers/recurrent/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
copying fast_transformers/recurrent/_utils.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/hashing
copying fast_transformers/hashing/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/hashing
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
copying fast_transformers/attention_registry/registry.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
copying fast_transformers/attention_registry/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
copying fast_transformers/attention_registry/spec.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/event.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/event_dispatcher.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/filters.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
copying fast_transformers/feature_maps/fourier_features.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
copying fast_transformers/feature_maps/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
copying fast_transformers/feature_maps/base.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/causal_product
copying fast_transformers/causal_product/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/causal_product
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/sparse_product
copying fast_transformers/sparse_product/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/sparse_product
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering/hamming
copying fast_transformers/clustering/hamming/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering/hamming
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention
copying fast_transformers/recurrent/attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/attention_layer.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/attention_layer.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
running build_ext
/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py:252: UserWarning:

                             !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (g++) is not compatible with the compiler Pytorch was
built with for this platform, which is clang++ on darwin. Please
use clang++ to to compile your extension. Alternatively, you may
compile PyTorch from source using g++, and then you can also use
g++ to compile your extension.

See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
with compiling PyTorch from source.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                            !! WARNING !!

platform=sys.platform))

building 'fast_transformers.hashing.hash_cpu' extension
creating /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7
creating /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers
creating /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing
Emitting ninja build file /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o.d -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/TH -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/THC -I/Users/fcampagne/miniconda3/envs/cqa/include/python3.7m -c -c /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/fast_transformers/hashing/hash_cpu.cpp -o /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o
c++ -MMD -MF /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o.d -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/TH -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/THC -I/Users/fcampagne/miniconda3/envs/cqa/include/python3.7m -c -c /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/fast_transformers/hashing/hash_cpu.cpp -o /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
clang: error: unsupported option '-fopenmp'
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1515, in _run_ninja_build
    env=env)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py", line 209, in <module>
    setup_package()
  File "/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py", line 204, in setup_package
    install_requires=["torch"]
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 204, in run
    self.run_command('build')
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 649, in build_extensions
    build_ext.build_extensions(self)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 478, in unix_wrap_ninja_compile
    with_cuda=with_cuda)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1233, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1529, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error compiling objects for extension

ERROR: Failed building wheel for pytorch-fast-transformers
Running setup.py clean for pytorch-fast-transformers
Failed to build pytorch-fast-transformers
Installing collected packages: pytorch-fast-transformers
Running setup.py install for pytorch-fast-transformers ... error
ERROR: Command errored out with exit status 1:
command: /Users/fcampagne/miniconda3/envs/cqa/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py'"'"'; __file__='"'"'/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-record-mydh3h_8/install-record.txt --single-version-externally-managed --compile --install-headers /Users/fcampagne/miniconda3/envs/cqa/include/python3.7m/pytorch-fast-transformers
cwd: /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/
Complete output (161 lines):
running install
running build
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-3.7
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/transformers.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/weight_mapper.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/masking.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
copying fast_transformers/utils.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/causal_linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/improved_clustered_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/clustered_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/improved_clustered_causal_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/reformer_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/attention_layer.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/local_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/exact_topk_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/conditional_full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
copying fast_transformers/attention/linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/local_product
copying fast_transformers/local_product/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/local_product
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/transformer_builders.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/attention_builders.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
copying fast_transformers/builders/base.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/builders
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/aggregate
copying fast_transformers/aggregate/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/aggregate
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering
copying fast_transformers/clustering/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
copying fast_transformers/recurrent/transformers.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
copying fast_transformers/recurrent/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
copying fast_transformers/recurrent/_utils.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/hashing
copying fast_transformers/hashing/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/hashing
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
copying fast_transformers/attention_registry/registry.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
copying fast_transformers/attention_registry/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
copying fast_transformers/attention_registry/spec.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/attention_registry
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/event.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/event_dispatcher.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
copying fast_transformers/events/filters.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/events
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
copying fast_transformers/feature_maps/fourier_features.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
copying fast_transformers/feature_maps/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
copying fast_transformers/feature_maps/base.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/feature_maps
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/causal_product
copying fast_transformers/causal_product/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/causal_product
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/sparse_product
copying fast_transformers/sparse_product/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/sparse_product
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering/hamming
copying fast_transformers/clustering/hamming/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/clustering/hamming
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention
copying fast_transformers/recurrent/attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/attention_layer.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
copying fast_transformers/recurrent/attention/cross_attention/linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/cross_attention
creating build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/full_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/attention_layer.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
copying fast_transformers/recurrent/attention/self_attention/linear_attention.py -> build/lib.macosx-10.9-x86_64-3.7/fast_transformers/recurrent/attention/self_attention
running build_ext
/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py:252: UserWarning:

                               !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (g++) is not compatible with the compiler Pytorch was
built with for this platform, which is clang++ on darwin. Please
use clang++ to to compile your extension. Alternatively, you may
compile PyTorch from source using g++, and then you can also use
g++ to compile your extension.

See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
with compiling PyTorch from source.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                              !! WARNING !!

  platform=sys.platform))
building 'fast_transformers.hashing.hash_cpu' extension
creating /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7
creating /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers
creating /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing
Emitting ninja build file /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o.d -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/TH -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/THC -I/Users/fcampagne/miniconda3/envs/cqa/include/python3.7m -c -c /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/fast_transformers/hashing/hash_cpu.cpp -o /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o
c++ -MMD -MF /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o.d -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/include -arch x86_64 -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/TH -I/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/include/THC -I/Users/fcampagne/miniconda3/envs/cqa/include/python3.7m -c -c /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/fast_transformers/hashing/hash_cpu.cpp -o /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/build/temp.macosx-10.9-x86_64-3.7/fast_transformers/hashing/hash_cpu.o -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=hash_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
clang: error: unsupported option '-fopenmp'
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1515, in _run_ninja_build
    env=env)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py", line 209, in <module>
    setup_package()
  File "/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py", line 204, in setup_package
    install_requires=["torch"]
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/setuptools/command/install.py", line 61, in run
    return orig.install.run(self)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/install.py", line 545, in run
    self.run_command('build')
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 649, in build_extensions
    build_ext.build_extensions(self)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 478, in unix_wrap_ninja_compile
    with_cuda=with_cuda)
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1233, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "/Users/fcampagne/miniconda3/envs/cqa/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1529, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error compiling objects for extension
----------------------------------------

ERROR: Command errored out with exit status 1: /Users/fcampagne/miniconda3/envs/cqa/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py'"'"'; file='"'"'/private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-install-r1kdymgh/pytorch-fast-transformers/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /private/var/folders/fz/lpt1n7hs4r966dkbcp1_jhfc0000gn/T/pip-record-mydh3h_8/install-record.txt --single-version-externally-managed --compile --install-headers /Users/fcampagne/miniconda3/envs/cqa/include/python3.7m/pytorch-fast-transformers Check the logs for full command output.
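For context: this failure is expected on macOS, because Apple's system clang does not support the `-fopenmp` flag that pytorch-fast-transformers passes to the compiler. One possible workaround, untested here and assuming a Homebrew installation on an Intel Mac (adjust the LLVM paths for your setup), is to build the extension with Homebrew's LLVM toolchain, which does ship with OpenMP support:

$ brew install llvm libomp
$ CC=/usr/local/opt/llvm/bin/clang CXX=/usr/local/opt/llvm/bin/clang++ pip install pytorch-fast-transformers

Note the PyTorch warning earlier in the log: extensions on darwin are expected to be built with clang++, so using Homebrew's clang is generally safer than swapping in g++.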

Results are not deterministic in eval mode

Hi!

It seems that results are not deterministic at all, even in eval mode. This can be explained by the random-feature kernel construction procedure described in the paper, but is there a way to get deterministic results?

I use PerformerLM with the following parameters:

    self.performer = PerformerLM(
        num_tokens=16000,
        max_seq_len=512,
        dim=4,
        depth=4,
        heads=4,
        causal=True,
        nb_features=None,
        generalized_attention=False,
        reversible=False,
    )
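The nondeterminism comes from FAVOR+ drawing random projection matrices; in older versions of this library they could be redrawn on every forward pass regardless of train/eval mode. Below is a minimal sketch of two ways to pin them down. `fix_projection_matrices_()` exists in recent versions of performer-pytorch (verify against your installed version); reseeding the RNG before each forward call is a portable fallback that works even on versions without that helper.

import torch
from performer_pytorch import PerformerLM

model = PerformerLM(
    num_tokens = 16000,
    max_seq_len = 512,
    dim = 4,
    depth = 4,
    heads = 4,
    causal = True
)
model.eval()

x = torch.randint(0, 16000, (1, 512))

# Option 1: stop the random projection matrices from ever being redrawn.
# Recent versions of performer-pytorch expose this helper on the model.
model.fix_projection_matrices_()

with torch.no_grad():
    out1 = model(x)
    out2 = model(x)
assert torch.allclose(out1, out2)

# Option 2 (portable fallback): reseed the RNG before each forward pass,
# so any projection redraw samples the same matrices every time.
with torch.no_grad():
    torch.manual_seed(0)
    out3 = model(x)
    torch.manual_seed(0)
    out4 = model(x)
assert torch.allclose(out3, out4)

Either way, eval-mode outputs should then be reproducible across repeated calls on the same input.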
