I tried to reproduce zephyr-7b-gemma-v0.1 using the e

Cannot reproduce zephyr-7b-gemma-v0.1 (alignment-handbook, closed)

jasonyux commented on June 2, 2024

Cannot reproduce zephyr-7b-gemma-v0.1


Comments (3)

jasonyux commented on June 2, 2024

This relates to how the model is trained by the run_dpo.py script. In that script, chat data is first formatted with the tokenizer's chat template and then fed into the Trainer. Unless you use (maybe) the latest version of fschat (which uses hardcoded templates), fschat will not use that same template, which leads to performance degradation.


jasonyux commented on June 2, 2024

It seems that the issue is with chat templates used by fastchat during evaluation. Using the following templates to test H4's gemma models recovers the reported performance:

from fastchat.conversation import Conversation, SeparatorStyle, register_conv_template

register_conv_template(
    Conversation(
        name="templ=h4_gemma_chatml",
        system_template="<bos><|im_start|>system\n{system_message}",
        system_message="You are an AI assistant.",
        roles=("<|im_start|>user", "<|im_start|>assistant"),
        sep_style=SeparatorStyle.CHATML,
        sep="<|im_end|>",
        stop_str=["<|im_end|>", "<|endoftext|>"],
    )
)

# other init code omitted
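To see what prompt string a template like the one above would produce, here is a minimal pure-Python sketch. It assumes fastchat's CHATML separator style joins each turn as role, newline, message, separator, newline, and leaves the final assistant turn open; that matches my reading of fastchat but is not verified against a specific version. `render_chatml` is a hypothetical helper, not part of fastchat, and the constants are copied from the template fields in the post.

```python
# Sketch only: mimics how a ChatML-style separator is assumed to join turns.
# render_chatml is a hypothetical helper, not a fastchat function.

SYSTEM_TEMPLATE = "<bos><|im_start|>system\n{system_message}"
SYSTEM_MESSAGE = "You are an AI assistant."
ROLES = ("<|im_start|>user", "<|im_start|>assistant")
SEP = "<|im_end|>"


def render_chatml(messages):
    """Render (role, text) pairs; text=None leaves that turn open for generation."""
    # System block first, closed with the separator.
    prompt = SYSTEM_TEMPLATE.format(system_message=SYSTEM_MESSAGE) + SEP + "\n"
    for role, text in messages:
        if text:
            # Completed turn: role header, message, separator.
            prompt += role + "\n" + text + SEP + "\n"
        else:
            # Open turn: only the role header, so the model continues from here.
            prompt += role + "\n"
    return prompt


prompt = render_chatml([(ROLES[0], "Hello!"), (ROLES[1], None)])
print(prompt)
```

Printing this string next to the output of the tokenizer's own chat template for the same messages is one way to spot the kind of mismatch described in this thread.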


fanconic commented on June 2, 2024

May I ask where this template originates from?

