abci-llm-distributed-training-hackathon-01's Introduction

Repository setup

git clone https://github.com/shunk031/abci-llm-distributed-training-hackathon-01
cd /path/to/abci-llm-distributed-training-hackathon-01

Building the Python environment

  • Load the modules preinstalled on ABCI
module load python/3.10 cuda/11.7 cudnn/8.6

module list
# Currently Loaded Modulefiles:
#  1) python/3.10/3.10.10   2) cuda/11.7/11.7.1   3) cudnn/8.6/8.6.0
  • Create the Python virtual environment
python3 -m venv .venv
source .venv/bin/activate

pip install -U pip wheel setuptools
pip install ruff black mypy
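
Before installing further packages, it can help to confirm that the virtual environment is actually active in the current shell. The following is a minimal sketch: `in_venv` is a hypothetical helper name, not part of this repository, and it relies only on the `VIRTUAL_ENV` variable that `source .venv/bin/activate` sets.

```shell
# Hypothetical helper: succeeds only when a virtual environment is active.
# The activate script of a venv exports VIRTUAL_ENV, so an empty value
# means the environment has not been activated in this shell.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "venv active: ${VIRTUAL_ENV}"
else
    echo "no venv active; run: source .venv/bin/activate" >&2
fi
```

If the check reports no active environment, the later `pip install` commands would install into whichever Python the loaded modules provide rather than into `.venv`.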

Installing mosaicml/llm-foundry

  • Clone mosaicml/llm-foundry
git clone https://github.com/mosaicml/llm-foundry
cd llm-foundry

# Check the commit hash at clone time
git show --format="%H" --no-patch
# ef350d9e64d13cb1db35ab7941bf9039b1b499fd
  • Install mosaicml/llm-foundry
pip install cmake packaging torch
pip install -e ".[gpu]" # this takes quite a while
pip install git+https://github.com/mosaicml/composer.git@dev
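
The commit hash recorded above only documents which revision was current at clone time; a fresh clone today will be at a newer commit. To reproduce that revision, one option is to check it out before running the install step. This is a sketch under assumptions: `pin_llm_foundry` and `LLM_FOUNDRY_COMMIT` are hypothetical names introduced here, and whether the `dev` branch of composer is still compatible with that revision is not verified.

```shell
# Commit hash recorded in this README at clone time.
LLM_FOUNDRY_COMMIT="ef350d9e64d13cb1db35ab7941bf9039b1b499fd"

# Hypothetical helper: check out the recorded commit in an existing clone
# and print the resulting HEAD hash for confirmation.
# $1 is the path to the llm-foundry clone (defaults to ./llm-foundry).
pin_llm_foundry() {
    repo_dir="${1:-llm-foundry}"
    git -C "$repo_dir" checkout --quiet "$LLM_FOUNDRY_COMMIT" || return 1
    git -C "$repo_dir" show --format="%H" --no-patch
}

# Usage (run from the directory that contains the clone):
#   pin_llm_foundry llm-foundry
```

Checking out a bare hash leaves the clone in a detached HEAD state, which is fine for an editable install that is not going to be modified.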

Submitting a job

export GROUP=XXXXXXXXXX
export WANDB_API_KEY=XXXXXXXXXX

cd /path/to/abci-llm-distributed-training-hackathon-01

qsub -g $GROUP scripts/exp03.sh
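
Since the `XXXXXXXXXX` values above are placeholders, a small pre-flight check can catch a forgotten substitution before the job enters the queue. The sketch below is illustrative only: `check_job_env` is a hypothetical helper, and it assumes that `GROUP` and `WANDB_API_KEY` are the only variables that need to be set.

```shell
# Hypothetical helper: fail if GROUP or WANDB_API_KEY is unset, empty,
# or still equal to the XXXXXXXXXX placeholder used in this README.
check_job_env() {
    missing=0
    for name in GROUP WANDB_API_KEY; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        case "$value" in
            ""|XXXXXXXXXX)
                echo "error: $name is not set" >&2
                missing=1
                ;;
        esac
    done
    return "$missing"
}

# Usage:
#   check_job_env && qsub -g "$GROUP" scripts/exp03.sh
```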

Model types

  • exp02.sh: for MPT-7B
  • exp03.sh: for MPT-30B
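
The mapping above can be wrapped in a small lookup so that a job is selected by model name rather than by experiment number. This is a sketch, not part of the repository: `script_for_model` is a hypothetical helper, and it assumes `exp02.sh` lives under `scripts/` alongside `exp03.sh`.

```shell
# Hypothetical helper: print the job script for a given model name.
# Assumes both scripts are located in the scripts/ directory.
script_for_model() {
    case "$1" in
        MPT-7B)  echo "scripts/exp02.sh" ;;
        MPT-30B) echo "scripts/exp03.sh" ;;
        *)
            echo "unknown model: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   qsub -g "$GROUP" "$(script_for_model MPT-7B)"
```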

Training results

The training results can be viewed on wandb:

