
xtuner's Introduction



🚀 Speed Benchmark

  • Llama2 7B Training Speed
  • Llama2 70B Training Speed

🎉 News

  • [2024/04] LLaVA-Phi-3-mini is released! Click here for details!
  • [2024/04] LLaVA-Llama-3-8B and LLaVA-Llama-3-8B-v1.1 are released! Click here for details!
  • [2024/04] Support Llama 3 models!
  • [2024/04] Support Sequence Parallel for enabling highly efficient and scalable LLM training with extremely long sequence lengths! [Usage] [Speed Benchmark]
  • [2024/02] Support Gemma models!
  • [2024/02] Support Qwen1.5 models!
  • [2024/01] Support InternLM2 models! The latest VLM LLaVA-Internlm2-7B / 20B models are released, with impressive performance!
  • [2024/01] Support DeepSeek-MoE models! 20GB GPU memory is enough for QLoRA fine-tuning, and 4x80GB for full-parameter fine-tuning. Click here for details!
  • [2023/12] 🔥 Support multi-modal VLM pretraining and fine-tuning with LLaVA-v1.5 architecture! Click here for details!
  • [2023/12] 🔥 Support Mixtral 8x7B models! Click here for details!
  • [2023/11] Support ChatGLM3-6B model!
  • [2023/10] Support MSAgent-Bench dataset, and the fine-tuned LLMs can be applied by Lagent!
  • [2023/10] Optimize the data processing to accommodate system context. More information can be found on Docs!
  • [2023/09] Support InternLM-20B models!
  • [2023/09] Support Baichuan2 models!
  • [2023/08] XTuner is released, with multiple fine-tuned adapters on Hugging Face.

📖 Introduction

XTuner is an efficient, flexible and full-featured toolkit for fine-tuning large models.

Efficient

  • Support LLM and VLM pre-training / fine-tuning on almost all GPUs. XTuner can fine-tune a 7B LLM on a single 8GB GPU, and also supports multi-node fine-tuning of models exceeding 70B.
  • Automatically dispatch high-performance operators such as FlashAttention and Triton kernels to increase training throughput.
  • Compatible with DeepSpeed 🚀, easily utilizing a variety of ZeRO optimization techniques.
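
As a rough illustration of why a 7B model can fit on a small GPU, the sketch below estimates weight-only memory at two precisions. It assumes 4-bit base weights (as in QLoRA) versus 16-bit half-precision weights. The function name `weights_gb` is hypothetical and not part of XTuner. The estimate ignores activations, adapter weights, optimizer state, and quantization overhead, so real usage will be higher.

```shell
#!/bin/sh
# Rough weight-only estimate: params (in billions) * bits per weight / 8 = GB.
# Assumption: activations, LoRA adapters, optimizer state, and quantization
# constants are NOT counted, so this is a lower bound rather than a measurement.
weights_gb() {
    awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

echo "7B at 4-bit : $(weights_gb 7 4) GB"
echo "7B at 16-bit: $(weights_gb 7 16) GB"
```

Under these assumptions the 4-bit weights take roughly a quarter of the space of the 16-bit weights, which is what leaves headroom for the rest of the training state on a small card.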

Flexible

  • Support various LLMs (InternLM, Mixtral-8x7B, Llama 2, ChatGLM, Qwen, Baichuan, ...).
  • Support VLM (LLaVA). The performance of LLaVA-InternLM2-20B is outstanding.
  • Well-designed data pipeline, accommodating datasets in any format, including but not limited to open-source and custom formats.
  • Support various training algorithms (QLoRA, LoRA, full-parameter fine-tuning), allowing users to choose the most suitable solution for their requirements.

Full-featured

  • Support continuous pre-training, instruction fine-tuning, and agent fine-tuning.
  • Support chatting with large models with pre-defined templates.
  • The output models integrate seamlessly with the deployment and serving toolkit (LMDeploy) and with large-scale evaluation toolkits (OpenCompass, VLMEvalKit).

🔥 Supports

Models SFT Datasets Data Pipelines Algorithms

๐Ÿ› ๏ธ Quick Start

Installation

  • It is recommended to build a Python-3.10 virtual environment using conda

    conda create --name xtuner-env python=3.10 -y
    conda activate xtuner-env
  • Install XTuner via pip

    pip install -U xtuner

    or with DeepSpeed integration

    pip install -U 'xtuner[deepspeed]'
  • Install XTuner from source

    git clone https://github.com/InternLM/xtuner.git
    cd xtuner
    pip install -e '.[all]'
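
After installing, it can help to confirm that the `xtuner` command is actually on your `PATH` (for example, that the right conda environment is active). The following is a minimal sketch; the function name `check_xtuner` is hypothetical, and it only checks for the executable, not for a working GPU setup.

```shell
#!/bin/sh
# Report whether the `xtuner` executable is visible in the current environment.
# This only checks PATH; it does not verify CUDA, DeepSpeed, or model downloads.
check_xtuner() {
    if command -v xtuner >/dev/null 2>&1; then
        echo "xtuner found at $(command -v xtuner)"
    else
        echo "xtuner not found; try: pip install -U xtuner"
        return 1
    fi
}

check_xtuner || true
```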

Fine-tune

XTuner supports efficient fine-tuning (e.g., QLoRA) of LLMs. Dataset preparation guides can be found in dataset_prepare.md.

  • Step 0, prepare the config. XTuner provides many ready-to-use configs and we can view all configs by

    xtuner list-cfg

    Or, if the provided configs cannot meet the requirements, please copy the provided config to the specified directory and make specific modifications by

    xtuner copy-cfg ${CONFIG_NAME} ${SAVE_PATH}
    vi ${SAVE_PATH}/${CONFIG_NAME}_copy.py
  • Step 1, start fine-tuning.

    xtuner train ${CONFIG_NAME_OR_PATH}

    For example, we can start the QLoRA fine-tuning of InternLM2-Chat-7B with oasst1 dataset by

    # On a single GPU
    xtuner train internlm2_chat_7b_qlora_oasst1_e3 --deepspeed deepspeed_zero2
    # On multiple GPUs
    (DIST) NPROC_PER_NODE=${GPU_NUM} xtuner train internlm2_chat_7b_qlora_oasst1_e3 --deepspeed deepspeed_zero2
    (SLURM) srun ${SRUN_ARGS} xtuner train internlm2_chat_7b_qlora_oasst1_e3 --launcher slurm --deepspeed deepspeed_zero2
    • --deepspeed means using DeepSpeed 🚀 to optimize the training. XTuner comes with several integrated strategies including ZeRO-1, ZeRO-2, and ZeRO-3. If you wish to disable this feature, simply remove this argument.

    • For more examples, please see finetune.md.

  • Step 2, convert the saved PTH model (if using DeepSpeed, it will be a directory) to a Hugging Face model by

    xtuner convert pth_to_hf ${CONFIG_NAME_OR_PATH} ${PTH} ${SAVE_PATH}
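
The three steps above can be chained in a small script. The sketch below is one possible arrangement, not an official workflow: the helper names `run` and `finetune_workflow`, the `DRY_RUN` switch, and all paths are hypothetical. It defaults to a dry run that only prints the commands, and the checkpoint path is a placeholder you would replace with the file or directory your training run actually saved.

```shell
#!/bin/sh
# Sketch of the fine-tuning workflow: copy a config, train, convert to Hugging Face format.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

finetune_workflow() {
    config_name=$1; cfg_dir=$2; pth=$3; hf_dir=$4
    # `xtuner copy-cfg` saves the copy as ${SAVE_PATH}/${CONFIG_NAME}_copy.py
    cfg_path="${cfg_dir}/${config_name}_copy.py"
    run xtuner copy-cfg "$config_name" "$cfg_dir"
    run xtuner train "$cfg_path" --deepspeed deepspeed_zero2
    run xtuner convert pth_to_hf "$cfg_path" "$pth" "$hf_dir"
}

# Hypothetical paths; the checkpoint location depends on your own training run.
finetune_workflow internlm2_chat_7b_qlora_oasst1_e3 ./my_configs ./path/to/checkpoint.pth ./hf_adapter
```

Passing the copied config path to both `train` and `convert pth_to_hf` keeps the two steps consistent if you edited the copy.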

Chat

XTuner provides tools to chat with pretrained / fine-tuned LLMs.

xtuner chat ${NAME_OR_PATH_TO_LLM} --adapter ${NAME_OR_PATH_TO_ADAPTER} [optional arguments]

For example, we can start the chat with

InternLM2-Chat-7B with adapter trained from oasst1 dataset:

xtuner chat internlm/internlm2-chat-7b --adapter xtuner/internlm2-chat-7b-qlora-oasst1 --prompt-template internlm2_chat

LLaVA-InternLM2-7B:

xtuner chat internlm/internlm2-chat-7b --visual-encoder openai/clip-vit-large-patch14-336 --llava xtuner/llava-internlm2-7b --prompt-template internlm2_chat --image $IMAGE_PATH

For more examples, please see chat.md.
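
Since the adapter is optional (the LLaVA example above does not pass one), a small helper can assemble the command line for either case. This is a minimal sketch: the function name `build_chat_cmd` is hypothetical, and it only prints the command rather than launching a chat session.

```shell
#!/bin/sh
# Build an `xtuner chat` command line, adding --adapter only when one is given.
# Arguments: <llm> <prompt-template> [adapter]
build_chat_cmd() {
    llm=$1; template=$2; adapter=${3:-}
    set -- xtuner chat "$llm" --prompt-template "$template"
    if [ -n "$adapter" ]; then
        set -- "$@" --adapter "$adapter"
    fi
    echo "$*"
}

# With a fine-tuned adapter
build_chat_cmd internlm/internlm2-chat-7b internlm2_chat xtuner/internlm2-chat-7b-qlora-oasst1
# Base model only
build_chat_cmd internlm/internlm2-chat-7b internlm2_chat
```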

Deployment

  • Step 0, merge the Hugging Face adapter into the pretrained LLM by

    xtuner convert merge \
        ${NAME_OR_PATH_TO_LLM} \
        ${NAME_OR_PATH_TO_ADAPTER} \
        ${SAVE_PATH} \
        --max-shard-size 2GB
  • Step 1, deploy the fine-tuned LLM with any other framework, such as LMDeploy 🚀.

    pip install lmdeploy
    python -m lmdeploy.pytorch.chat ${NAME_OR_PATH_TO_LLM} \
        --max_new_tokens 256 \
        --temperature 0.8 \
        --top_p 0.95 \
        --seed 0

    🔥 Seeking efficient inference with less GPU memory? Try 4-bit quantization from LMDeploy! For more details, see here.
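
The two deployment steps connect through the merged model directory. The sketch below makes that explicit, under the assumption that the `${SAVE_PATH}` written by the merge step is what the deployment tool should load. The helper names `run` and `merge_and_chat`, the `DRY_RUN` switch, and the `./merged_llm` path are hypothetical, and the script defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch: merge an adapter into the base LLM, then point the chat tool at the result.
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

merge_and_chat() {
    base_llm=$1; adapter=$2; merged_dir=$3
    run xtuner convert merge "$base_llm" "$adapter" "$merged_dir" --max-shard-size 2GB
    # Assumption: the merged directory from the previous step is the model to deploy.
    run python -m lmdeploy.pytorch.chat "$merged_dir"
}

merge_and_chat internlm/internlm2-chat-7b xtuner/internlm2-chat-7b-qlora-oasst1 ./merged_llm
```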

Evaluation

  • We recommend using OpenCompass, a comprehensive and systematic LLM evaluation library, which currently supports 50+ datasets with about 300,000 questions.

๐Ÿค Contributing

We appreciate all contributions to XTuner. Please refer to CONTRIBUTING.md for the contributing guideline.

๐ŸŽ–๏ธ Acknowledgement

๐Ÿ–Š๏ธ Citation

@misc{2023xtuner,
    title={XTuner: A Toolkit for Efficiently Fine-tuning LLM},
    author={XTuner Contributors},
    howpublished = {\url{https://github.com/InternLM/xtuner}},
    year={2023}
}

License

This project is released under the Apache License 2.0. Please also adhere to the Licenses of models and datasets being used.
