Comments (7)
Hi @sshleifer,
Could you list the steps to reproduce this error? Also please provide environment info.
Thanks,
from fastseq.
The cluster setup is hard to explain, but we fixed it by running export TORCH_CUDA_ARCH_LIST="6.0;6.1;7.0" before building the extension.
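As a minimal sketch of how such a value could be assembled, the snippet below formats a list of compute capabilities into the semicolon-separated form that TORCH_CUDA_ARCH_LIST expects. The helper name format_arch_list and the example cluster are assumptions made for illustration; they are not part of fastseq or fairseq.

```python
import os


def format_arch_list(capabilities):
    """Format (major, minor) compute capabilities as a TORCH_CUDA_ARCH_LIST value.

    Hypothetical helper for illustration only.
    """
    unique = sorted(set(capabilities))  # drop duplicates, keep a stable order
    return ";".join(f"{major}.{minor}" for major, minor in unique)


# Assumed example cluster: P100 (6.0), GTX 1080 (6.1) and V100 (7.0) nodes.
# On a real node, torch.cuda.get_device_capability() returns such a tuple.
cluster_capabilities = [(7, 0), (6, 0), (6, 1), (7, 0)]
arch_list = format_arch_list(cluster_capabilities)

# The variable only affects builds started from this process (or its children),
# so it has to be set before `python setup.py build_ext --inplace` runs.
os.environ["TORCH_CUDA_ARCH_LIST"] = arch_list
print(arch_list)  # 6.0;6.1;7.0
```

Listing every architecture present in the cluster makes the build emit a kernel image for each of them, which is what avoids the "no kernel image is available" error when the job lands on a different GPU.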
Another question: is there an advantage to NGramRepeatBlock inheriting from nn.Module?
from fastseq.
@sshleifer We are open to pulling some changes back into fairseq.
I am trying to use your repeat ngram extension, but when I switch GPUs (without rebuilding the extension) it breaks with
RuntimeError: CUDA error: no kernel image is available for execution on the device
If I rerun python setup.py build_ext --inplace, it works again. Any clues on how to build the extension so that it works on a different GPU (same CUDA version, same Python version, same torch) than the one where it was built?
Also, we're considering pulling some of these changes back into fairseq, if that's alright with you guys!
from fastseq.
Awesome! If you guys tell me your Twitter handles or some other link, I will make sure to credit you when I tweet. The speedup for ngram blocking is really impressive; it will get merged into fairseq/master soon.
I'm also trying to prioritize including the other changes:
- MultiheadAttention: einsum
- SequenceGenerator: parallel post-processing
- BeamSearch: ?
- TransformerEncoder, TransformerModel: delete reorder_encoder_out
Are the last two changes important? Do you guys have a sense of why?
Is the MultiheadAttention change just to save memory, or is it also faster?
Thanks and sorry for all the questions.
from fastseq.
Thanks, Sam. It would be great if you mentioned our project, https://github.com/microsoft/fastseq, and our Twitter handle, @fastseq.
- MultiheadAttention: einsum combined with reorder_incremental_state is both faster and saves memory at the same batch size. Memory copies take a lot of time, especially when the input is long.
- Remove reorder_encoder_out, because there is no need to duplicate the encoder output beam-size times. There is some analysis here and here.
- SequenceGenerator: parallel post-processing.
- BeamSearch: compatibility change for fairseq v0.9.0 only; it just replaces torch.div with torch.floor_divide.
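
To illustrate why the BeamSearch change matters, here is a small plain-Python sketch of the index arithmetic, under the assumption that beam search picks candidates from a flattened (beam x vocabulary) score table. The function name split_flat_indices and the toy numbers are hypothetical and are not fairseq's code. The beam index has to come from integer (floor) division; true division would give a fractional value that cannot be used as an index.

```python
def split_flat_indices(flat_indices, vocab_size):
    """Split flat candidate indices into (beam index, token id) pairs.

    Hypothetical helper: assumes candidates were selected from scores
    flattened from shape (beam, vocab) to (beam * vocab,).
    """
    beams = [i // vocab_size for i in flat_indices]   # floor division
    tokens = [i % vocab_size for i in flat_indices]   # remainder
    return beams, tokens


vocab_size = 5
flat = [0, 7, 14]
beams, tokens = split_flat_indices(flat, vocab_size)
print(beams)   # [0, 1, 2]
print(tokens)  # [0, 2, 4]

# True division does not produce a usable beam index:
print(7 / vocab_size)  # 1.4
```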
from fastseq.
@sshleifer Thanks for your interest! I think this issue has been resolved. I will close it, but feel free to reopen it if you have more questions.
from fastseq.
@sshleifer I saw that ngram blocking has been merged into fairseq/main. Did you get a chance to try the other changes? We now have papers (FastSeq and EL-Attention) that describe the changes. They may make it clearer where the speedup comes from.
from fastseq.