kyegomez / andromeda

An all-new language model that processes ultra-long sequences of 100,000+ tokens, ultra fast.

Home Page: https://discord.gg/qUtxnK2NMf

License: GNU General Public License v3.0

Python 98.27% Dockerfile 1.73%
artificial-intelligence deep-learning gpt-4 language-model large-language-models neural-networks transformer agi artificial-general-intelligence artificial-intelligence-algorithms multi-modality multimodal

andromeda's Introduction


Andromeda: Ultra-Fast and Ultra-Intelligent SOTA Language Model 🚀🌌


Welcome to Andromeda, the fastest, most creative, and most reliable language model ever built. Train your own version, run inference, and fine-tune your own version with simple plug-and-play scripts; get started in 10 seconds:

Features

  • 💼 Handle ultra-long sequences (32,000-200,000+ context lengths)
  • ⚡ Ultra-fast processing (32,000+ tokens in under 100ms)
  • 🎓 Superior reasoning capabilities

🎯 Principles

  • Efficiency: Optimized with techniques like flash attention, rotary positional embeddings, and deep normalization.
  • Flexibility: Adapt to various tasks and domains for wide applications.
  • Scalability: Designed to scale with resources and data sizes.
  • Community-Driven: Thrives on contributions from the open-source community.

💻 Install

python3.11 -m pip install --upgrade andromeda-torch

Usage

  • Forward pass with random inputs
import torch

from andromeda_torch.configs import Andromeda1Billion

# Instantiate the 1B-parameter configuration
model = Andromeda1Billion()

# A batch of one sequence of 1024 random token ids
# (note: the model may also need .cuda() so it sits on the same device as x)
x = torch.randint(0, 256, (1, 1024)).cuda()

out = model(x)  # logits of shape (1, 1024, 20000)
print(out)
  • Tokenized inputs
from andromeda_torch import Tokenizer
from andromeda_torch.configs import Andromeda1Billion

# Model plus the tokenizer shipped with the package
model = Andromeda1Billion()
tokenizer = Tokenizer()

# Encode a prompt and run a forward pass over the resulting token ids
encoded_text = tokenizer.encode("Hello world!")
out = model(encoded_text)
print(out)

📚 Training

  1. Set the environment variables:

    • ENTITY_NAME: Your wandb project name
    • OUTPUT_DIR: Directory to save the weights (e.g., ./weights)
    • MASTER_ADDR: Master node address for distributed training
    • MASTER_PORT: Master node port for distributed training
    • RANK: Rank of the current node
    • WORLD_SIZE: Total number of GPUs
  2. Configure the training (a minimal sketch of the environment setup follows this list):

    • Run accelerate config
    • Enable DeepSpeed ZeRO stage 3
    • Launch with accelerate launch train_distributed_accelerate.py
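
A minimal sketch of the environment setup, assuming the training script reads these variables via os.environ (the variable names match the list above; the default values are illustrative, not the repository's actual configuration):

import os

# Illustrative defaults only; set real values before launching.
os.environ.setdefault("ENTITY_NAME", "andromeda")   # wandb project name
os.environ.setdefault("OUTPUT_DIR", "./weights")    # where checkpoints are written
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")   # master node address
os.environ.setdefault("MASTER_PORT", "29500")       # master node port
os.environ.setdefault("RANK", "0")                  # rank of this node
os.environ.setdefault("WORLD_SIZE", "1")            # total number of GPUs

# With the variables exported, step 2 applies: run `accelerate config`,
# then `accelerate launch train_distributed_accelerate.py`.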

For more information, refer to the Training SOP.


Todo

  • Add Yarn Embeddings from zeta

📈 Benchmarks

Speed

  • Andromeda uses flash attention 2.0 with Triton kernels, one of the most reliable attention implementations available. It consumes 50x less memory than GPT-3 and 10x less than LLaMA (a small measurement sketch follows this list).


  • We can speed this up even more with dynamic sparse flash attention 2.0.
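
The memory effect of fused attention can be sanity-checked locally with PyTorch's built-in scaled_dot_product_attention. The sketch below is an illustration against stock PyTorch 2.x APIs on a CUDA GPU with fp16 support, not the repository's benchmarking.py; it compares peak CUDA memory for the fused flash kernel versus the unfused math path.

import torch
import torch.nn.functional as F
from torch.backends.cuda import sdp_kernel

def peak_mem_mb(use_flash: bool, seq_len: int = 4096, heads: int = 16, dim: int = 64) -> float:
    # One attention call; returns peak CUDA memory in MB.
    q, k, v = (torch.randn(1, heads, seq_len, dim, device="cuda", dtype=torch.float16) for _ in range(3))
    torch.cuda.reset_peak_memory_stats()
    with sdp_kernel(enable_flash=use_flash, enable_math=not use_flash, enable_mem_efficient=False):
        F.scaled_dot_product_attention(q, k, v, is_causal=True)
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**20

if torch.cuda.is_available():
    print(f"flash: {peak_mem_mb(True):.1f} MB   math: {peak_mem_mb(False):.1f} MB")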

License

Apache License

andromeda's People

Contributors

kyegomez


andromeda's Issues

[BUG] [BENCHMARKING.PY] RuntimeError: No available kernel. Aborting execution.

: /root/.cache/pip/wheels/20/7b/3f/2807682bad2fba40ed888e6309597a5fda545ab30964c835aa
Successfully built deepspeed
Installing collected packages: tokenizers, SentencePiece, safetensors, ninja, hjson, bitsandbytes, xxhash, rouge, einops, dill, multiprocess, huggingface-hub, transformers, datasets, lion-pytorch, deepspeed, accelerate
Successfully installed SentencePiece-0.1.99 accelerate-0.21.0 bitsandbytes-0.40.2 datasets-2.13.1 deepspeed-0.10.0 dill-0.3.6 einops-0.6.1 hjson-3.1.0 huggingface-hub-0.16.4 lion-pytorch-0.1.2 multiprocess-0.70.14 ninja-1.11.1 rouge-1.0.1 safetensors-0.3.1 tokenizers-0.13.3 transformers-4.30.2 xxhash-3.2.0
[2023-07-17 22:42:48,068] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-07-17 22:42:50.272490: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
A100 GPU detected, using flash attention if input tensor is on cuda
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Memory efficient kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:545.)
  out = F.scaled_dot_product_attention(
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Memory Efficient attention has been runtime disabled. (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:338.)
  out = F.scaled_dot_product_attention(
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Flash attention kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:547.)
  out = F.scaled_dot_product_attention(
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Both fused kernels do not support non-null attn_mask. (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:191.)
  out = F.scaled_dot_product_attention(
Traceback (most recent call last):
  File "/content/Andromeda/benchmarking.py", line 237, in <module>
    forward_pass_time = speed_metrics.forward_pass_time()
  File "/content/Andromeda/benchmarking.py", line 66, in forward_pass_time
    model_input = self.model.decoder.forward(torch.randint(0, 50304, (1, 8192), device=device, dtype=torch.long))[0]
  File "/content/Andromeda/Andromeda/optimus_prime/autoregressive_wrapper.py", line 141, in forward
    logits = self.net(inp, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 1422, in forward
    x = self.attn_layers(x, mask = mask, mems = mems, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 1155, in forward
    out, inter = block(x, mask = mask, context_mask = self_attn_context_mask, attn_mask = attn_mask, rel_pos = self.rel_pos, rotary_pos_emb = rotary_pos_emb, prev_attn = prev_attn, mem = layer_mem)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 581, in forward
    return self.fn(x, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 863, in forward
    out, intermediates = self.attend(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/attend.py", line 198, in forward
    return self.flash_attn(q, k, v, mask = mask, attn_bias = attn_bias)
  File "/content/Andromeda/Andromeda/optimus_prime/attend.py", line 168, in flash_attn
    out = F.scaled_dot_product_attention(
RuntimeError: No available kernel.  Aborting execution.
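
A sketch of one possible mitigation, assuming the failure comes from all fused scaled_dot_product_attention kernels being rejected for this input (the warnings above note that the fused kernels do not support the non-null attn_mask): re-enable PyTorch's math fallback so the call always has a kernel to dispatch to. This is an illustration against stock PyTorch APIs, not a change in the Andromeda code itself.

import torch

# Keep the fused kernels enabled for inputs they support, but allow the
# unfused math path as a fallback so F.scaled_dot_product_attention never
# aborts with "No available kernel".
torch.backends.cuda.enable_flash_sdp(True)
torch.backends.cuda.enable_mem_efficient_sdp(True)
torch.backends.cuda.enable_math_sdp(True)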


ValueError: You cannot create a DummyScheduler without specifying a scheduler in the config file

Traceback (most recent call last):
  File "train_distributed_accelerate.py", line 664, in <module>
    main()
  File "train_distributed_accelerate.py", line 569, in main
    optim, train_loader, lr_scheduler = accelerator.prepare(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 1139, in prepare
    result = self._prepare_deepspeed(*args)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 1381, in _prepare_deepspeed
    raise ValueError(
ValueError: You cannot create a DummyScheduler without specifying a scheduler in the config file.
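
accelerate raises this when the script asks for a DummyScheduler but the DeepSpeed config it was launched with has no scheduler section. A hedged sketch of one way to resolve it, assuming a standard DeepSpeed JSON config (the scheduler type and learning-rate numbers below are placeholders, not the repository's values):

# Fragment to merge into the DeepSpeed config used by `accelerate launch`;
# with a scheduler defined there, accelerate's DummyScheduler has something to bind to.
scheduler_block = {
    "scheduler": {
        "type": "WarmupLR",
        "params": {
            "warmup_min_lr": 0,
            "warmup_max_lr": 3e-4,
            "warmup_num_steps": 1000,
        },
    }
}
# Alternatively, construct a real torch.optim.lr_scheduler scheduler in
# train_distributed_accelerate.py and pass it to accelerator.prepare(...)
# instead of a DummyScheduler.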

Problem installing

Just installing with Python 3.11:

Collecting andromeda-torch
Downloading andromeda_torch-0.0.4-py3-none-any.whl.metadata (6.6 kB)
Collecting SentencePiece (from andromeda-torch)
Using cached sentencepiece-0.1.99-cp311-cp311-macosx_11_0_arm64.whl (1.2 MB)
Collecting accelerate (from andromeda-torch)
Using cached accelerate-0.25.0-py3-none-any.whl.metadata (18 kB)
Collecting datasets (from andromeda-torch)
Using cached datasets-2.15.0-py3-none-any.whl.metadata (20 kB)
Collecting deepspeed (from andromeda-torch)
Downloading deepspeed-0.12.5.tar.gz (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 2.1 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/setup.py", line 38, in
from op_builder.all_ops import ALL_OPS
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/op_builder/all_ops.py", line 29, in
builder = get_accelerator().create_op_builder(member_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/accelerator/mps_accelerator.py", line 223, in create_op_builder
builder_class = self.get_op_builder(op_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/accelerator/mps_accelerator.py", line 230, in get_op_builder
from deepspeed.ops.op_builder.cpu import NotImplementedBuilder
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/init.py", line 21, in
from . import ops
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/ops/init.py", line 6, in
from . import adam
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/ops/adam/init.py", line 6, in
from .cpu_adam import DeepSpeedCPUAdam
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/ops/adam/cpu_adam.py", line 7, in
from cpuinfo import get_cpu_info
ModuleNotFoundError: No module named 'cpuinfo'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
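
The traceback bottoms out in ModuleNotFoundError: No module named 'cpuinfo', raised while deepspeed builds from source on macOS. A possible workaround (an assumption, not a confirmed fix) is to install the py-cpuinfo package that provides that module before retrying the install:

python3.11 -m pip install py-cpuinfo
python3.11 -m pip install --upgrade andromeda-torch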


Adding Lightning Attention 2 Support

🚀 Feature Request

Use the newly proposed Lightning Attention 2 mechanism to increase context size and inference speed.

Motivation

Looks promising and easy to implement; it only requires Triton and an NVIDIA GPU.


Build Dataset Script Fails

python3 Andromeda/build_dataset.py --seed 42 --seq_len 8192 --hf_account "" --tokenizer "EleutherAI/gpt-neox-20b" --dataset_name "EleutherAI/the_pile_deduplicated"

Traceback (most recent call last):
  File "/home/ubuntu/Andromeda/Andromeda/build_dataset.py", line 70, in <module>
    built_dataset(args)
  File "/home/ubuntu/Andromeda/Andromeda/build_dataset.py", line 17, in built_dataset
    tokenizer = AutoTokenizer.from_pretrained(CFG.Tokenizer)
AttributeError: type object 'CFG' has no attribute 'Tokenizer'
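
The failure is an attribute-name mismatch: the script reads CFG.Tokenizer while the config object presumably defines the field with different casing. A minimal sketch of the pattern, with the class and field names assumed for illustration rather than taken from the repository:

from dataclasses import dataclass
from transformers import AutoTokenizer

@dataclass
class CFG:
    tokenizer: str = "EleutherAI/gpt-neox-20b"  # assumed lowercase field name

# Matching the field's actual casing avoids the AttributeError raised by CFG.Tokenizer
tokenizer = AutoTokenizer.from_pretrained(CFG.tokenizer)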

Module not found error when running Andromeda.ipynb

ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting einops
Downloading einops-0.6.1-py3-none-any.whl (42 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.2/42.2 kB 3.5 MB/s eta 0:00:00
Installing collected packages: einops
Successfully installed einops-0.6.1


ModuleNotFoundError Traceback (most recent call last)

in <cell line: 19>()
17 from torch.serialization import load
18 import torch
---> 19 from x_transformers import TransformerWrapper, Decoder, AutoregressiveWrapper
20
21 #training

ModuleNotFoundError: No module named 'x_transformers'
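
The imported names (TransformerWrapper, Decoder, AutoregressiveWrapper) exist in the published x-transformers package, so one likely fix, assuming the notebook is meant to use that package rather than the repository's vendored Andromeda/optimus_prime/x_transformers.py, is to install it in the Colab runtime and re-run the cell:

# In a Colab cell, install the PyPI package that provides these names,
# then retry the failing import.
!pip install x-transformers
from x_transformers import TransformerWrapper, Decoder, AutoregressiveWrapper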


NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
