Giter Club home page Giter Club logo

chat_templates's Introduction

chat_templates

This is a repository that includes proper chat templates (or input formats) for large language models (LLMs), to support transformers's chat_template feature.

We know that different models are trained with different input formats, especially for those instruction-tuned or chat models. This is especially noted in transformers's new chat_template feature. However, I found that popular models (e.g., vicuna, falcon) on HuggingFace do not include this parameter in their tokenizer_config.json files, which may make it troublesome to properly run these models. Also, the chat_template feature requires to implement a Jinja template, which may be not intuitive to be directly done in the json files.

So I collect proper chat templates of several popular models from official reference or implementations, which are put under chat_templates. If you are interested to include more chat templates, feel free to open a pull request.

If you find this repo useful, please kindly cite it:

@misc{zheng-2024-chat-templates,
  author = {Zheng, Chujie},
  title = {Chat Templates for HuggingFace Large Language Models},
  year = {2024},
  howpublished = {\url{https://github.com/chujiezheng/chat_templates}}
}

Updates

  • [05/2024] Added support for Nvidia's ChatQA models
  • [04/2024] Added support for Microsoft's Phi-3 models
  • [04/2024] Added support for Meta's Llama-3 models
  • [02/2024] Added support for Google's Gemma models
  • [02/2024] Added usage explanation for generation_configs
  • [01/2024] Added support for Alibaba's Qwen2 models

What are Contained in This Repo?

  • chat_templates contains the jinja files of collected chat templates, which can be directly replaced in the Huggingface tokenizers.

  • generation_configs contains the corresponding json configs used for controlling the ending of response generations. Specially, the stop_token_ids should be directly passed into the generate method by the eos_token_id argument.

Supported Models

Model (Family) Template File Reference Comment
llama-3-instruct New llama-3-instruct.jinja link Official template
Meta-Llama-3-8B/70B-Instruct
qwen2-chat New chatml.jinja link ChatML format
Qwen1.5-0.4B/1.8B/4B/7B/14B/72B-Chat
mistral-instruct New mistral-instruct.jinja link Mistral-7B-Instruct-v0.2/0.3
System message allowed
phi-3 New phi-3.jinja link Official template
Phi-3-mini-4k/128k-instruct
gemma-it New gemma-it.jinja link gemma-2b/7b-it
System message allowed
chatqa New chatqa.jinja link Llama3-ChatQA-1.5-8B/70B
Context message allowed
llama-2-chat llama-2-chat.jinja link Official template
Llama-2-7b/13b/70b-chat-hf
mistral-instruct-v0.1 mistral-instruct-v0.1.jinja link Mistral-7B-Instruct-v0.1
System message allowed
openchat openchat.jinja link openchat-3.5
zephyr zephyr.jinja link zephyr-7b-alpha/beta
yi-chat chatml.jinja link ChatML format
Yi-6B/34B-Chat
orca-2 chatml.jinja link ChatML format
Orca-2-7b/13b
vicuna vicuna.jinja link vicuna-7b/13b-v1.5
falcon-instruct falcon-instruct.jinja link falcon-7b/40b-instruct
starling-lm openchat.jinja link Starling-LM-7B-alpha/beta
solar-instruct solar-instruct.jinja link SOLAR-10.7B-Instruct-v1.0
alpaca alpaca.jinja link alpaca-style models, like Platypus2-13B
amberchat amberchat.jinja link AmberChat, AmberSafe
saiga saiga.jinja link saiga, a series of Russian models

Note: mistral-instruct-v0.1 is slightly different from mistral-instruct (for v0.2/0.3)

Examples of Setting chat_template

Important Note: As mentioned in this issue, the messages should contain at least one user message. It is strongly not recommented to pass only the system message, as there may result in unexpected outputs (because the models are not trained in this way).

Example 1: llama-3-instruct

This example may check if the jinja file is correctly implemented.

from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", token="YOUR_OWN_TOKEN")
messages = [
    {'role': 'system', 'content': 'This is a system prompt.'},
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]
print('###### Default (yet Correct) Chat Template ######')
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
print('###### Corrected Chat Template ######')
chat_template = open('./chat_templates/llama-3-instruct.jinja').read()
chat_template = chat_template.replace('    ', '').replace('\n', '')
toker.chat_template = chat_template
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))

Expected output:

###### Default (yet Correct) Chat Template ######
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

This is a system prompt.<|eot_id|><|start_header_id|>user<|end_header_id|>

This is the first user input.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

This is the first assistant response.<|eot_id|><|start_header_id|>user<|end_header_id|>

This is the second user input.<|eot_id|><|start_header_id|>assistant<|end_header_id|>


###### Corrected Chat Template ######
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

This is a system prompt.<|eot_id|><|start_header_id|>user<|end_header_id|>

This is the first user input.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

This is the first assistant response.<|eot_id|><|start_header_id|>user<|end_header_id|>

This is the second user input.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Example 2: llama-2-chat

This example may check if the jinja file is correctly implemented.

from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token="YOUR_OWN_TOKEN")
messages = [
    {'role': 'system', 'content': 'This is a system prompt.'},
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]
print('###### Default (yet Correct) Chat Template ######')
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
print('###### Corrected Chat Template ######')
chat_template = open('./chat_templates/llama-2-chat.jinja').read()
chat_template = chat_template.replace('    ', '').replace('\n', '')
toker.chat_template = chat_template
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))

Expected output:

###### Default (yet Correct) Chat Template ######
<s>[INST] <<SYS>>
This is a system prompt.
<</SYS>>

This is the first user input. [/INST] This is the first assistant response. </s><s>[INST] This is the second user input. [/INST]
###### Corrected Chat Template ######
<s>[INST] <<SYS>>
This is a system prompt.
<</SYS>>

This is the first user input. [/INST] This is the first assistant response. </s><s>[INST] This is the second user input. [/INST]

Example 3: mistral-instruct

For mistral-instruct (also gemma-it), it does not natively support the system message, so passing the system message would raise error.

from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("lmsys/vicuna-7b-v1.5")
messages = [
    {'role': 'system', 'content': 'This is a system prompt.'},
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]
print('###### Default (but Improper) Chat Template ######')
# raising error
#print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
print('###### Corrected Chat Template ######')
chat_template = open('./chat_templates/mistral-instruct.jinja').read()
chat_template = chat_template.replace('    ', '').replace('\n', '')
toker.chat_template = chat_template
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))

Expected output:

###### Default (but Error-Raising) Chat Template ######
jinja2.exceptions.TemplateError: Conversation roles must alternate user/assistant/user/assistant/...
###### Corrected Chat Template ######
<s>[INST] This is a system prompt.

This is the first user input. [/INST] This is the first assistant response. </s>[INST] This is the second user input. [/INST]

Example 4: vicuna

NOTE: In fast-chat, vicuna does not add linebreaks between roles' messages. But I found that adding linebreaks leads to a bit better performance (especially for the v1.5 version).

Also, I found vicuna-7/13/33b-v1.3 may not work well when given a system message different from its default one. So I would recommend to use vicuna-7/13b-v1.5 instead.

from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("lmsys/vicuna-7b-v1.5")
messages = [
    {'role': 'system', 'content': 'This is a system prompt.'},
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]
print('###### Default (but Improper) Chat Template ######')
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
print('###### Corrected Chat Template ######')
chat_template = open('./chat_templates/vicuna.jinja').read()
chat_template = chat_template.replace('    ', '').replace('\n', '')
toker.chat_template = chat_template
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))

Expected output:

###### Default (but Improper) Chat Template ######
<s>[INST] <<SYS>>
This is a system prompt.
<</SYS>>

This is the first user input. [/INST] This is the first assistant response. </s><s>[INST] This is the second user input. [/INST]
###### Corrected Chat Template ######
<s>This is a system prompt.

USER: This is the first user input.
ASSISTANT: This is the first assistant response.</s>
USER: This is the second user input.
ASSISTANT:

Star History

Star History Chart

chat_templates's People

Contributors

chujiezheng avatar dimaioksha avatar alvant avatar nestordemeure avatar nayohan avatar garyxcj avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.