ChatFish is an open-source Chinese chatbot trained by fine-tuning Bloom on blended conversation datasets. This repo provides a Gradio-powered Web UI for ChatFish.
git clone https://github.com/LZYSaltedFish/ChatFish-Chatbot.git
cd ChatFish-Chatbot
pip install -r requirements.txt
cd inference
sh chat.sh
Trained with DeepSpeed-Chat on 8 V100 GPUs (16 GB each). All parameters were fine-tuned with ZeRO stage 2; LoRA was not used.
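
For orientation, a DeepSpeed configuration matching this description might look roughly like the sketch below. Only the ZeRO stage comes from this README. Mapping `batch_size` 1 to the per-GPU micro-batch size and enabling fp16 (V100 GPUs do not support bf16) are assumptions, and the configuration actually used for training may differ.

```python
import json

# Sketch of a DeepSpeed config, NOT the one shipped with this repo.
ds_config = {
    # Assumption: the README's batch_size of 1 is the per-GPU micro-batch size.
    "train_micro_batch_size_per_gpu": 1,
    # From the README: ZeRO stage 2 (optimizer states and gradients are sharded).
    "zero_optimization": {"stage": 2},
    # Assumption: fp16 mixed precision, since V100 has no bf16 support.
    "fp16": {"enabled": True},
}

print(json.dumps(ds_config, indent=2))
```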
- Bloom-1b1: base model of ChatFish
- chatfish-1b1-sft: finetuned chatbot model
The training data are extracted from the following open-source datasets.
Dataset | Size | Avg turns | Used |
---|---|---|---|
Guanaco | 200K | 2.7 | 66K |
Vicuna-ShareGPT | 6K | 5.9 | 3.5K |
GPT4-LLM | 49K | 1 | 33K |
MOSS-002-SFT | 590K | 2.9 | 211K |
InstructWild | 51K | 1 | 45K |
Instances are filtered with simple rules so that they meet the following requirements:
- Response length is at least 5 tokens.
- Total length of query and response is at least 128 tokens.
- Each query has exactly one response.
- Chinese data only.
- Multi-turn conversations are split into multiple instances, with the history context prepended to the query.
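
The rules above can be sketched in Python as follows. This is an illustration, not the preprocessing script used for ChatFish: `count_tokens` is a hypothetical stand-in for the real tokenizer (it counts non-whitespace characters), the Chinese check is an assumed heuristic based on the share of CJK characters, and the newline-joined history format is also an assumption.

```python
import re

# CJK Unified Ideographs; used as an assumed proxy for "Chinese data".
CJK = re.compile(r"[\u4e00-\u9fff]")

MIN_RESPONSE_TOKENS = 5
MIN_TOTAL_TOKENS = 128


def count_tokens(text):
    """Hypothetical tokenizer stand-in: one token per non-whitespace character."""
    return sum(1 for ch in text if not ch.isspace())


def is_chinese(text, min_ratio=0.3):
    """Assumed heuristic: at least min_ratio of non-space characters are CJK."""
    n = count_tokens(text)
    return n > 0 and len(CJK.findall(text)) / n >= min_ratio


def split_turns(turns):
    """Turn [(query, response), ...] into one instance per turn.

    Earlier turns are prepended to the query as history context, so every
    instance has exactly one query and one response.
    """
    instances, history = [], ""
    for query, response in turns:
        instances.append({"query": history + query, "response": response})
        history += f"{query}\n{response}\n"
    return instances


def keep(instance):
    """Apply the length and language rules to a single instance."""
    q, r = instance["query"], instance["response"]
    return (
        count_tokens(r) >= MIN_RESPONSE_TOKENS
        and count_tokens(q) + count_tokens(r) >= MIN_TOTAL_TOKENS
        and is_chinese(q + r)
    )
```

A conversation would then be processed with `[inst for inst in split_turns(turns) if keep(inst)]`.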
name | value |
---|---|
batch_size | 1 |
max_seq_len | 1024 |
lr | 9.65e-6 |
epoch | 15 |
lr_scheduler | cosine |
warm_up | 1000 |
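
To illustrate how `lr`, `lr_scheduler`, and `warm_up` interact, here is a small sketch of a cosine schedule with warm-up. It assumes that `warm_up` is measured in optimizer steps, that the warm-up is linear, and that the rate decays to zero at the end of training. The total number of steps is not stated in this README, so `total_steps` is left as a parameter.

```python
import math

PEAK_LR = 9.65e-6      # "lr" from the table
WARMUP_STEPS = 1000    # "warm_up" from the table, assumed to be steps


def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given step: linear warm-up, then cosine decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Fraction of the post-warm-up phase that has elapsed, clipped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
```

With these assumptions, the rate rises from 0 to 9.65e-6 over the first 1000 steps and then follows half a cosine period down to 0.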
- Lacks mathematical and complex reasoning ability.
- Lacks truthfulness and is prone to hallucinations.
- Lacks harmlessness.
- Chinese only.
- Lacks coding ability; generated code usually contains errors.