kyegomez / andromeda

An all-new language model that processes ultra-long sequences of 100,000+ tokens, ultra fast.

Home Page: https://discord.gg/qUtxnK2NMf

License: GNU General Public License v3.0

Python 98.27% Dockerfile 1.73%
artificial-intelligence deep-learning gpt-4 language-model large-language-models neural-networks transformer agi artificial-general-intelligence artificial-intelligence-algorithms multi-modality multimodal

andromeda's Introduction


Andromeda: Ultra-Fast and Ultra-Intelligent SOTA Language Model 🚀🌌


Welcome to Andromeda, the fastest, most creative, and most reliable language model ever built. Train your own version, run inference, and fine-tune your own version with simple plug-and-play scripts; get started in 10 seconds:

Features

  • 💼 Handle ultra-long sequences (32,000-200,000+ context lengths)
  • ⚡ Ultra-fast processing (32,000+ tokens in under 100ms)
  • 🎓 Superior reasoning capabilities

🎯 Principles

  • Efficiency: Optimized with techniques like flash attention, rotary positional embeddings, and deep normalization.
  • Flexibility: Adapt to various tasks and domains for wide applications.
  • Scalability: Designed to scale with resources and data sizes.
  • Community-Driven: Thrives on contributions from the open-source community.

💻 Install

python3.11 -m pip install --upgrade andromeda-torch

Usage

  • Forward pass with random inputs
import torch

from andromeda_torch.configs import Andromeda1Billion

# Instantiate the 1B-parameter configuration
model = Andromeda1Billion()

# A batch of one sequence of 1024 random token ids
# (note: the model may also need .cuda() so it sits on the same device as x)
x = torch.randint(0, 256, (1, 1024)).cuda()

out = model(x)  # logits of shape (1, 1024, 20000)
print(out)
  • Tokenized inputs
from andromeda_torch import Tokenizer
from andromeda_torch.configs import Andromeda1Billion

# Model plus the tokenizer shipped with the package
model = Andromeda1Billion()
tokenizer = Tokenizer()

# Encode a prompt and run a forward pass over the resulting token ids
encoded_text = tokenizer.encode("Hello world!")
out = model(encoded_text)
print(out)

📚 Training

  1. Set the environment variables:

    • ENTITY_NAME: Your wandb project name
    • OUTPUT_DIR: Directory to save the weights (e.g., ./weights)
    • MASTER_ADDR: Master node address for distributed training
    • MASTER_PORT: Master node port for distributed training
    • RANK: Rank of the current node
    • WORLD_SIZE: Total number of GPUs
  2. Configure the training (a minimal sketch of the environment setup follows this list):

    • Run accelerate config
    • Enable DeepSpeed ZeRO stage 3
    • Launch with accelerate launch train_distributed_accelerate.py
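
A minimal sketch of the environment setup, assuming the training script reads these variables via os.environ (the variable names match the list above; the default values are illustrative, not the repository's actual configuration):

import os

# Illustrative defaults only; set real values before launching.
os.environ.setdefault("ENTITY_NAME", "andromeda")   # wandb project name
os.environ.setdefault("OUTPUT_DIR", "./weights")    # where checkpoints are written
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")   # master node address
os.environ.setdefault("MASTER_PORT", "29500")       # master node port
os.environ.setdefault("RANK", "0")                  # rank of this node
os.environ.setdefault("WORLD_SIZE", "1")            # total number of GPUs

# With the variables exported, step 2 applies: run `accelerate config`,
# then `accelerate launch train_distributed_accelerate.py`.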

For more information, refer to the Training SOP.


Todo

  • Add Yarn Embeddings from zeta

📈 Benchmarks

Speed

  • Andromeda uses flash attention 2.0 with Triton kernels, one of the most reliable attention implementations available. It consumes 50x less memory than GPT-3 and 10x less than LLaMA (a small measurement sketch follows this list).


  • We can speed this up even more with dynamic sparse flash attention 2.0.
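
The memory effect of fused attention can be sanity-checked locally with PyTorch's built-in scaled_dot_product_attention. The sketch below is an illustration against stock PyTorch 2.x APIs on a CUDA GPU with fp16 support, not the repository's benchmarking.py; it compares peak CUDA memory for the fused flash kernel versus the unfused math path.

import torch
import torch.nn.functional as F
from torch.backends.cuda import sdp_kernel

def peak_mem_mb(use_flash: bool, seq_len: int = 4096, heads: int = 16, dim: int = 64) -> float:
    # One attention call; returns peak CUDA memory in MB.
    q, k, v = (torch.randn(1, heads, seq_len, dim, device="cuda", dtype=torch.float16) for _ in range(3))
    torch.cuda.reset_peak_memory_stats()
    with sdp_kernel(enable_flash=use_flash, enable_math=not use_flash, enable_mem_efficient=False):
        F.scaled_dot_product_attention(q, k, v, is_causal=True)
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**20

if torch.cuda.is_available():
    print(f"flash: {peak_mem_mb(True):.1f} MB   math: {peak_mem_mb(False):.1f} MB")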

License

Apache License

andromeda's People

Contributors

kyegomez


andromeda's Issues

[BUG] [BENCHMARKING.PY] RuntimeError: No available kernel. Aborting execution.

: /root/.cache/pip/wheels/20/7b/3f/2807682bad2fba40ed888e6309597a5fda545ab30964c835aa
Successfully built deepspeed
Installing collected packages: tokenizers, SentencePiece, safetensors, ninja, hjson, bitsandbytes, xxhash, rouge, einops, dill, multiprocess, huggingface-hub, transformers, datasets, lion-pytorch, deepspeed, accelerate
Successfully installed SentencePiece-0.1.99 accelerate-0.21.0 bitsandbytes-0.40.2 datasets-2.13.1 deepspeed-0.10.0 dill-0.3.6 einops-0.6.1 hjson-3.1.0 huggingface-hub-0.16.4 lion-pytorch-0.1.2 multiprocess-0.70.14 ninja-1.11.1 rouge-1.0.1 safetensors-0.3.1 tokenizers-0.13.3 transformers-4.30.2 xxhash-3.2.0
[2023-07-17 22:42:48,068] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-07-17 22:42:50.272490: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
A100 GPU detected, using flash attention if input tensor is on cuda
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Memory efficient kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:545.)
  out = F.scaled_dot_product_attention(
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Memory Efficient attention has been runtime disabled. (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:338.)
  out = F.scaled_dot_product_attention(
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Flash attention kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:547.)
  out = F.scaled_dot_product_attention(
/content/Andromeda/Andromeda/optimus_prime/attend.py:168: UserWarning: Both fused kernels do not support non-null attn_mask. (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.h:191.)
  out = F.scaled_dot_product_attention(
Traceback (most recent call last):
  File "/content/Andromeda/benchmarking.py", line 237, in <module>
    forward_pass_time = speed_metrics.forward_pass_time()
  File "/content/Andromeda/benchmarking.py", line 66, in forward_pass_time
    model_input = self.model.decoder.forward(torch.randint(0, 50304, (1, 8192), device=device, dtype=torch.long))[0]
  File "/content/Andromeda/Andromeda/optimus_prime/autoregressive_wrapper.py", line 141, in forward
    logits = self.net(inp, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 1422, in forward
    x = self.attn_layers(x, mask = mask, mems = mems, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 1155, in forward
    out, inter = block(x, mask = mask, context_mask = self_attn_context_mask, attn_mask = attn_mask, rel_pos = self.rel_pos, rotary_pos_emb = rotary_pos_emb, prev_attn = prev_attn, mem = layer_mem)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 581, in forward
    return self.fn(x, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/x_transformers.py", line 863, in forward
    out, intermediates = self.attend(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/Andromeda/Andromeda/optimus_prime/attend.py", line 198, in forward
    return self.flash_attn(q, k, v, mask = mask, attn_bias = attn_bias)
  File "/content/Andromeda/Andromeda/optimus_prime/attend.py", line 168, in flash_attn
    out = F.scaled_dot_product_attention(
RuntimeError: No available kernel.  Aborting execution.
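
A sketch of one possible mitigation, assuming the failure comes from all fused scaled_dot_product_attention kernels being rejected for this input (the warnings above note that the fused kernels do not support the non-null attn_mask): re-enable PyTorch's math fallback so the call always has a kernel to dispatch to. This is an illustration against stock PyTorch APIs, not a change in the Andromeda code itself.

import torch

# Keep the fused kernels enabled for inputs they support, but allow the
# unfused math path as a fallback so F.scaled_dot_product_attention never
# aborts with "No available kernel".
torch.backends.cuda.enable_flash_sdp(True)
torch.backends.cuda.enable_mem_efficient_sdp(True)
torch.backends.cuda.enable_math_sdp(True)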


ValueError: You cannot create a DummyScheduler without specifying a scheduler in the config file

Traceback (most recent call last):
  File "train_distributed_accelerate.py", line 664, in <module>
    main()
  File "train_distributed_accelerate.py", line 569, in main
    optim, train_loader, lr_scheduler = accelerator.prepare(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 1139, in prepare
    result = self._prepare_deepspeed(*args)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 1381, in _prepare_deepspeed
    raise ValueError(
ValueError: You cannot create a DummyScheduler without specifying a scheduler in the config file.
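
accelerate raises this when the script asks for a DummyScheduler but the DeepSpeed config it was launched with has no scheduler section. A hedged sketch of one way to resolve it, assuming a standard DeepSpeed JSON config (the scheduler type and learning-rate numbers below are placeholders, not the repository's values):

# Fragment to merge into the DeepSpeed config used by `accelerate launch`;
# with a scheduler defined there, accelerate's DummyScheduler has something to bind to.
scheduler_block = {
    "scheduler": {
        "type": "WarmupLR",
        "params": {
            "warmup_min_lr": 0,
            "warmup_max_lr": 3e-4,
            "warmup_num_steps": 1000,
        },
    }
}
# Alternatively, construct a real torch.optim.lr_scheduler scheduler in
# train_distributed_accelerate.py and pass it to accelerator.prepare(...)
# instead of a DummyScheduler.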

Problem installing

Just installing with Python 3.11:

Collecting andromeda-torch
Downloading andromeda_torch-0.0.4-py3-none-any.whl.metadata (6.6 kB)
Collecting SentencePiece (from andromeda-torch)
Using cached sentencepiece-0.1.99-cp311-cp311-macosx_11_0_arm64.whl (1.2 MB)
Collecting accelerate (from andromeda-torch)
Using cached accelerate-0.25.0-py3-none-any.whl.metadata (18 kB)
Collecting datasets (from andromeda-torch)
Using cached datasets-2.15.0-py3-none-any.whl.metadata (20 kB)
Collecting deepspeed (from andromeda-torch)
Downloading deepspeed-0.12.5.tar.gz (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 2.1 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/setup.py", line 38, in
from op_builder.all_ops import ALL_OPS
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/op_builder/all_ops.py", line 29, in
builder = get_accelerator().create_op_builder(member_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/accelerator/mps_accelerator.py", line 223, in create_op_builder
builder_class = self.get_op_builder(op_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/accelerator/mps_accelerator.py", line 230, in get_op_builder
from deepspeed.ops.op_builder.cpu import NotImplementedBuilder
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/init.py", line 21, in
from . import ops
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/ops/init.py", line 6, in
from . import adam
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/ops/adam/init.py", line 6, in
from .cpu_adam import DeepSpeedCPUAdam
File "/private/var/folders/t5/559pccz52qx4d_x668tmzy1c0000gn/T/pip-install-n07wfbss/deepspeed_7e6217180e9a410bb375809b1bef1abb/deepspeed/ops/adam/cpu_adam.py", line 7, in
from cpuinfo import get_cpu_info
ModuleNotFoundError: No module named 'cpuinfo'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
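
The traceback bottoms out in ModuleNotFoundError: No module named 'cpuinfo', raised while deepspeed builds from source on macOS. A possible workaround (an assumption, not a confirmed fix) is to install the py-cpuinfo package that provides that module before retrying the install:

python3.11 -m pip install py-cpuinfo
python3.11 -m pip install --upgrade andromeda-torch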


Adding Lightning Attention 2 Support

🚀 Feature Request

Use the newly proposed Lightning Attention 2 mechanism to increase context size and inference speed.

Motivation

Looks promising and easy to implement; it only requires Triton and an NVIDIA GPU.


Build Dataset Script Fails

python3 Andromeda/build_dataset.py --seed 42 --seq_len 8192 --hf_account "" --tokenizer "EleutherAI/gpt-neox-20b" --dataset_name "EleutherAI/the_pile_deduplicated"

Traceback (most recent call last):
  File "/home/ubuntu/Andromeda/Andromeda/build_dataset.py", line 70, in <module>
    built_dataset(args)
  File "/home/ubuntu/Andromeda/Andromeda/build_dataset.py", line 17, in built_dataset
    tokenizer = AutoTokenizer.from_pretrained(CFG.Tokenizer)
AttributeError: type object 'CFG' has no attribute 'Tokenizer'
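
The failure is an attribute-name mismatch: the script reads CFG.Tokenizer while the config object presumably defines the field with different casing. A minimal sketch of the pattern, with the class and field names assumed for illustration rather than taken from the repository:

from dataclasses import dataclass
from transformers import AutoTokenizer

@dataclass
class CFG:
    tokenizer: str = "EleutherAI/gpt-neox-20b"  # assumed lowercase field name

# Matching the field's actual casing avoids the AttributeError raised by CFG.Tokenizer
tokenizer = AutoTokenizer.from_pretrained(CFG.tokenizer)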

Module not found error when running Andromeda.ipynb

ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting einops
Downloading einops-0.6.1-py3-none-any.whl (42 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.2/42.2 kB 3.5 MB/s eta 0:00:00
Installing collected packages: einops
Successfully installed einops-0.6.1


ModuleNotFoundError Traceback (most recent call last)

in <cell line: 19>()
17 from torch.serialization import load
18 import torch
---> 19 from x_transformers import TransformerWrapper, Decoder, AutoregressiveWrapper
20
21 #training

ModuleNotFoundError: No module named 'x_transformers'
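
The imported names (TransformerWrapper, Decoder, AutoregressiveWrapper) exist in the published x-transformers package, so one likely fix, assuming the notebook is meant to use that package rather than the repository's vendored Andromeda/optimus_prime/x_transformers.py, is to install it in the Colab runtime and re-run the cell:

# In a Colab cell, install the PyPI package that provides these names,
# then retry the failing import.
!pip install x-transformers
from x_transformers import TransformerWrapper, Decoder, AutoregressiveWrapper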


NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
