Comments (5)
Hmmm, it's the first time I see this being reported here. Would it be possible for you to try with a different torch version to see if you still have the error?
from transformers.
Hey @LysandreJik, thanks for your answer
Yes, so I checked pytorch and it seems the issue occurs at torch version 2.1.0:
- transformers 4.43.0 and torch 2.0.1 is ok
- transformers 4.43.0 and torch 2.1.0 gives me a bus error
(transformers 4.42.0 is still ok for all torch versions)
I did not dig further into torch changes though, let me know
Interesting, it might be good to open an issue on the PyTorch slack in that case
I will check in this direction, thanks. (A bit off topic, but is the Slack available on invite only?)
Also, if this helps anyone at some point: the hidden layer outputs seem to explode to full NaN at some point before being passed into the projection layer, where the bus error occurs.
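For anyone trying to reproduce this kind of diagnosis, one way to catch the first NaN before it propagates into the crashing layer is a forward hook on every submodule. A minimal sketch (the `nn.Sequential` model here is a hypothetical stand-in; the thread does not name the actual model or projection layer):

```python
import torch
from torch import nn

# Hypothetical toy model standing in for the real one from the thread.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 4))

def nan_check(name):
    """Build a forward hook that raises as soon as a module emits NaNs."""
    def hook(module, inputs, output):
        if torch.isnan(output).any():
            raise RuntimeError(f"NaN detected in output of {name}")
    return hook

# Register the hook on every submodule so the first NaN is reported
# before it reaches the next layer.
for name, module in model.named_modules():
    if name:  # skip the top-level container itself
        module.register_forward_hook(nan_check(name))

# Force a NaN input just to demonstrate the hook firing.
try:
    model(torch.full((2, 8), float("nan")))
except RuntimeError as e:
    print(e)
```

Running the model on real inputs with these hooks in place pinpoints which layer first produces NaNs, which is how one would confirm the hidden states blow up before the projection layer.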
Sorry, I meant the PyTorch GitHub 🤦