Comments (4)
Yes, you're right about the cause! Rather than trying to merge a proper chat template for Blenderbot (which is very obsolete by now), I'll just rewrite the doc to use a different model.
from transformers.
@Rocketknight1 I'm getting the same error when I try to use some models like Gemma. I can try to use the chat template parameter, but I'm not sure what the format is for the Gemma model (I can look it up in tokenizer_config.json, right?). Is manually setting the template now what we have to do whenever we get this error, for models that don't accept "role": "system"? What would be the workaround?
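To answer the parenthetical: yes, when a model ships a chat template, it is stored under the "chat_template" key of tokenizer_config.json in the model repo, so you can look it up there with nothing but the standard library. A minimal sketch (the config snippet below is a trimmed, hypothetical example, not the real Gemma file):

```python
# Sketch: read the chat template out of a tokenizer_config.json.
# NOTE: this config is a trimmed, hypothetical example; download the real
# file from the model repo on the Hub to see the actual template.
import json

config_text = """
{
  "tokenizer_class": "GemmaTokenizer",
  "chat_template": "{% for m in messages %}<start_of_turn>{{ m['role'] }} {{ m['content'] }}<end_of_turn>{% endfor %}"
}
"""

config = json.loads(config_text)
template = config.get("chat_template")  # None for base models without one
print(template)
```

Base models typically have no "chat_template" key at all, which is exactly why `apply_chat_template` errors out on them.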
Hi @PhilipAmadasun, the most likely cause is that you're loading the base Gemma models, like gemma-2-2b, instead of the models that are "instruction tuned" for chat, like gemma-2-2b-it. The base models are just simple language models and don't support chat, and therefore don't have a chat template. If you use a model trained for chat, it should work!
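For anyone hitting this, here is a rough sketch of the kind of string a chat-capable tokenizer's `apply_chat_template` builds for a Gemma-style model, including the rejection of the "system" role mentioned above. The turn markers are an assumption based on Gemma's published chat format; the authoritative version is the "chat_template" entry in the model's tokenizer_config.json.

```python
# Minimal sketch of what tokenizer.apply_chat_template produces for an
# instruction-tuned Gemma model. Turn markers are an assumption based on
# Gemma's published chat format, not pulled from the real template.

def gemma_style_prompt(messages, add_generation_prompt=True):
    prompt = ""
    for message in messages:
        role = message["role"]
        if role == "system":
            # Gemma's template has no system role; a common workaround is
            # to fold the system text into the first user turn instead.
            raise ValueError("Gemma chat templates do not accept a system role")
        # Gemma names the assistant turn "model".
        turn = "model" if role == "assistant" else role
        prompt += f"<start_of_turn>{turn}\n{message['content']}<end_of_turn>\n"
    if add_generation_prompt:
        # Open a model turn so generation continues from here.
        prompt += "<start_of_turn>model\n"
    return prompt

messages = [{"role": "user", "content": "Hi!"}]
print(gemma_style_prompt(messages))
```

With a real tokenizer you would call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` instead; the sketch only illustrates the shape of the output and why a "system" message fails.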
Also @NielsRogge, the fix has now been merged.