Comments (4)
感谢您对我们工作的关注。
实际上,我们在tokenizer_config.json
的chat_template
中已经预留了System Prompt的位置:
"chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{{ bos_token }}{% for message in messages %}{% if message['role'] == 'user' %}{{ 'User: ' + message['content'] + '\n\n' }}{% elif message['role'] == 'assistant' %}{{ 'Assistant: ' + message['content'] + eos_token }}{% elif message['role'] == 'system' %}{{ message['content'] + '\n\n' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ 'Assistant:' }}{% endif %}"
在使用时需要保证system消息唯一且位于第一条的位置就能得到符合我们预期的结果。
from deepseek-llm.
感谢您对我们工作的关注。
实际上,我们在
tokenizer_config.json
的chat_template
中已经预留了System Prompt的位置:"chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{{ bos_token }}{% for message in messages %}{% if message['role'] == 'user' %}{{ 'User: ' + message['content'] + '\n\n' }}{% elif message['role'] == 'assistant' %}{{ 'Assistant: ' + message['content'] + eos_token }}{% elif message['role'] == 'system' %}{{ message['content'] + '\n\n' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ 'Assistant:' }}{% endif %}"
在使用时需要保证system消息唯一且位于第一条的位置就能得到符合我们预期的结果。
在coder 33B这个模型上其实也有实验过:
你们的found_item参数似乎无法起作用。不管我的system是否设置了,模型还是会将默认的“you are a AI programming assistant ···”这个提示词一起输出。
虽然我也注意到你们更新了tokenizer_config.json,里面抛弃了这个参数。
from deepseek-llm.
在包括Tokenizer在内的部分实现细节上,chat和coder模型并不完全一致。使用不同模型时请以对应模型的chat_template
为准。在chat模型中并没有默认指定System Prompt,如有需要可以自行在消息列表中添加。
from deepseek-llm.
在包括Tokenizer在内的部分实现细节上,chat和coder模型并不完全一致。使用不同模型时请以对应模型的
chat_template
为准。在chat模型中并没有默认指定System Prompt,如有需要可以自行在消息列表中添加。
我的意思是:在之前coder对应的chat_template
里,带了一个叫做found_item
的参数,以此来控制System Prompt。
在此基础上,我发现他确实能把用户自定义的System Prompt
信息带进去,但是仍然会输出你们预先设置的提示词
“you are a AI programming assistant ···”
。这意味着两个system prompt。
然后很遗憾,用户的这个并没有起到作用。
按你们这个模板的话,预设的这个提示词应该不显示才对。
from deepseek-llm.
Related Issues (20)
- 67B-Instructor – will it be released shortly/ever? HOT 1
- Will finetune scripts be provided? HOT 1
- Programming Language in LeetCode Weekly Contest HOT 3
- Inquiry about Prompt Engineering and Handling Toxicity/Hallucination
- Missing files in released pretrain ckpts HOT 1
- AlignBench测评结果复现求助 HOT 2
- TriviaQA结果复现求助 HOT 4
- AWS CLI 使用问题与 deepseek-ai S3 桶访问问题 HOT 1
- Training data distribution HOT 1
- 关于vllm使用的疑问 HOT 1
- 请问LLM和coder的base model结构是一样的吗?还是有什么区别呢? HOT 1
- Deepseek SFT数据包含system应该如何处理? HOT 1
- Scaling laws data HOT 1
- Could you please release intermediate pretraining checkpoints at HuggingFace?
- Deepseek VL? HOT 1
- 关于模型指标有一些疑问 HOT 1
- Humaneval, use base model or instruct finetuned model? HOT 1
- 贵团队是否会升级长上下文的版本? HOT 1
- Is the compute calculation wrong for Chinchilla in the paper? HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from deepseek-llm.