eleutherai / cookbook

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Home Page: https://www.eleuther.ai/

License: Apache License 2.0

Python 86.23% Makefile 0.04% C++ 11.76% C 0.80% Cuda 1.04% Jupyter Notebook 0.13% Shell 0.01%

cookbook's Issues

requirements.txt or versions?

Can we get the versions of the libraries used to run the benchmarks?

Improve Handling of Llama-style Models

While the calc scripts are correct for llama-style models, their implementation is inflexible (see #36 and #35)

It'd be nice to clean this up a bit.

Fix webpage hosting

Currently the model directory webpage at https://github.com/EleutherAI/cookbook/tree/main/model-directory isn't live and is entirely undocumented.

Make model directory webpage live
Add model hparam setting html page and make it live
Add minimal documentation and contributor guidelines to both

The FLOP calculation based on hidden_size and num_layers seems baseless to me. It feels like it just maps a certain value to a certain number, like permutations and combinations.
I want to calculate this for LLMs based on the chosen model, at least somewhat appropriately!

Add Inference FLOP Calculation

As recently pointed out in https://arxiv.org/abs/2401.00448, inference FLOPs are also important and it would be easy to add a flag to https://github.com/EleutherAI/cookbook/blob/main/calc/calc_transformer_flops.py for the inference and training+inference cases.

[Question] Does the model param include the `lm-head`?

calc_transformer_mem.py is inaccurate for most popular open models

Running calc_transformer_mem.py with the parameters for Qwen1.5-72B prints that this model has 56.19 billion parameters, while the real number is around 72 billion:

python calc_transformer_mem.py --infer --high-prec-bytes-per-val 4 --low-prec-bytes-per-val 1 --num-gpus 2 --zero-stage 3 -ca -b 1 -s 1024 -v 152064 -hs 8192 -a 64 -l 80 -kv 1 -ff 3

My guess is that this is because the script assumes two linear layers per MLP block, while most popular open source models like Llama, Mixtral, Qwen, etc. have three:

https://github.com/huggingface/transformers/blob/6e584070d4f86964c4268baed08a5a5da8f82633/src/transformers/models/llama/modeling_llama.py#L240

(Also, the --ffn-expansion-factor flag requires an integer, while for Llama-2-70B I believe it's 3.5? --low-prec-bytes-per-val will also be less than 1 for quantized models.)
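The guess above can be checked with a back-of-the-envelope count. The sketch below is not the logic of calc_transformer_mem.py; `transformer_params` is a hypothetical helper that counts only the attention, MLP, and embedding weight matrices, ignores norms and biases, assumes full multi-head attention and untied input/output embeddings, and reuses the values from the command above.

```python
def transformer_params(hidden_size, num_layers, vocab_size,
                       ffn_expansion_factor, mlp_matrices,
                       tied_embeddings=False):
    """Rough weight count (hypothetical helper, simplified assumptions)."""
    # A float factor such as 3.5 is accepted here, unlike an integer-only flag.
    intermediate_size = int(ffn_expansion_factor * hidden_size)
    # Q, K, V and output projections, each hidden_size x hidden_size.
    attention = 4 * hidden_size * hidden_size
    # 2 matrices for a classic MLP, 3 for a gated (Llama-style) MLP.
    mlp = mlp_matrices * hidden_size * intermediate_size
    # Input embedding plus a separate output projection unless tied.
    embeddings = vocab_size * hidden_size * (1 if tied_embeddings else 2)
    return num_layers * (attention + mlp) + embeddings


# Values taken from the command line above (-hs 8192 -l 80 -v 152064 -ff 3).
two_matrix = transformer_params(8192, 80, 152064, 3, mlp_matrices=2)
three_matrix = transformer_params(8192, 80, 152064, 3, mlp_matrices=3)

print(f"two-matrix MLP:   {two_matrix / 1e9:.2f}B")    # 56.18B
print(f"three-matrix MLP: {three_matrix / 1e9:.2f}B")  # 72.28B
```

Under these assumptions the two-matrix count lands close to the 56.19 billion the script prints, and the three-matrix count lands close to 72 billion, which is consistent with the explanation given in this issue.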

I/O Benchmarking

Would be good to add I/O benchmarks in the style of existing communication and computation benchmarks.

Benchmarking

So to benchmark LLMs with a huge number of parameters, we need to have the file locally so that we can pass it as a hostfile.
Is there any other way, so that it can fetch the model automatically from Hugging Face and predict the latency?

Inference FLOPs

Would like to add an arg to determine FLOPs to infer on t tokens for calc_transformer_flops.py

Should be as simple as just turning off the bwd pass
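As a rough illustration of "turning off the bwd pass", here is a minimal sketch based on the common rule of thumb that a forward pass costs about 2 FLOPs per parameter per token and a backward pass about twice that. This is an assumption for illustration, not the formula used in calc_transformer_flops.py; it ignores the sequence-length-dependent attention term and activation recomputation, and `approx_flops` is a hypothetical name.

```python
def approx_flops(num_params, num_tokens, include_backward=True):
    """Rule-of-thumb FLOP estimate (hypothetical helper, not the script's formula)."""
    # Forward pass: roughly one multiply and one add per parameter per token.
    forward = 2 * num_params * num_tokens
    # Backward pass: roughly twice the forward cost; skipped for inference.
    backward = 2 * forward if include_backward else 0
    return forward + backward


params = 7_000_000_000  # illustrative model size
tokens = 1_000          # illustrative token count t

print(f"training : {approx_flops(params, tokens):.3e} FLOPs")
print(f"inference: {approx_flops(params, tokens, include_backward=False):.3e} FLOPs")
```

With this approximation, inference on t tokens costs one third of training on the same t tokens, and a training+inference total is just the sum of the two calls with their respective token counts.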

Add Repo Citation

Hopefully people end up finding us useful enough to cite? Need to add that.

Add HuggingFace arg so that arch is automatic

Stas Bekman had the idea of supporting a HuggingFace model as input so that all the model architecture settings don't need to be manually dug up. We'd like something like:

python transformer_mem.py --hf_model_name_or_path meta-llama/Llama-2-7b-hf --num-gpus 8 --zero-stage 3 --batch-size-per-gpu 2 --sequence-length 4096
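One way this could work is to translate the fields of a model's config.json into the architecture settings the calc scripts need. The sketch below assumes Llama-style config keys (hidden_size, num_hidden_layers, num_attention_heads, num_key_value_heads, vocab_size, intermediate_size); the function name and the output key names are hypothetical, and the example dictionary is typed in by hand rather than downloaded.

```python
def hparams_from_hf_config(config):
    """Map a Llama-style HF config dict to calculator settings (hypothetical names)."""
    hidden = config["hidden_size"]
    heads = config["num_attention_heads"]
    return {
        "hidden_size": hidden,
        "num_layers": config["num_hidden_layers"],
        "num_attention_heads": heads,
        # Models without grouped-query attention may omit num_key_value_heads.
        "kv_size_ratio": config.get("num_key_value_heads", heads) / heads,
        "vocab_size": config["vocab_size"],
        # Kept as a float, since the ratio is often not an integer.
        "ffn_expansion_factor": config["intermediate_size"] / hidden,
    }


# Hand-written example in the shape of a Llama-style config.json.
example_config = {
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 32,
    "vocab_size": 32000,
}

hparams = hparams_from_hf_config(example_config)
print(hparams)
```

In practice the dictionary could come from a local config.json or from `transformers.AutoConfig.from_pretrained(...).to_dict()`; architectures that use different key names would need their own mapping.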

Add communication volume calculation script

Would be good to model the communication volume in bytes of a given parallelism setup. Situations to model:

Different parallelism schemes
- ZeRO-1/2/3, ZeRO++
- 3D parallelism
Activation checkpointing
Different dtypes

Bonus points:

% volume breakdown separated by collective
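A minimal sketch of what such a script might start from, covering only the ZeRO stages and dtypes. It assumes the per-GPU, per-step volumes from the analysis in the ZeRO paper: stages 1 and 2 move about 2x the parameter count (one reduce-scatter of gradients, one all-gather of updated parameters), and stage 3 about 3x (one extra all-gather of parameters). ZeRO++, 3D parallelism, and activation checkpointing are not modeled, and the function name is hypothetical.

```python
def zero_comm_volume(num_params, bytes_per_value, zero_stage):
    """Approximate per-GPU bytes communicated per step (hypothetical helper)."""
    if zero_stage not in (1, 2, 3):
        raise ValueError("this sketch only covers ZeRO stages 1, 2 and 3")
    param_bytes = num_params * bytes_per_value
    # Gradients are reduce-scattered once per step in every stage.
    reduce_scatter = param_bytes
    # Stages 1/2: one all-gather of updated parameters.
    # Stage 3: parameters are gathered for the forward and the backward pass.
    all_gather = param_bytes * (2 if zero_stage == 3 else 1)
    total = reduce_scatter + all_gather
    return {
        "reduce_scatter": reduce_scatter,
        "all_gather": all_gather,
        "total": total,
        # Percentage breakdown by collective.
        "percent": {
            "reduce_scatter": 100 * reduce_scatter / total,
            "all_gather": 100 * all_gather / total,
        },
    }


# Illustrative case: 1B parameters communicated in a 2-byte dtype.
volume = zero_comm_volume(1_000_000_000, bytes_per_value=2, zero_stage=3)
print(volume)
```

Changing `bytes_per_value` models a different communication dtype; the other parallelism schemes listed above would each need their own terms.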

Improve Quantization

The quantization support I've added through --low-prec-bytes-per-val is a bit barebones. It'd be nice to add enough flexibility to handle per-block quantization (e.g. some only quantize the linears to int4) and some of the new formats that aren't a multiple of a byte (e.g. int4, fp6, etc)
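One possible direction is to track bits per value instead of bytes per value and to sum over groups of weights that use different formats. The sketch below is only an illustration of that idea: the function names and the group sizes are hypothetical, and it ignores quantization metadata such as scales and zero points.

```python
def weight_bytes(num_values, bits_per_value):
    """Bytes needed to store num_values at bits_per_value, rounded up."""
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (num_values * bits_per_value + 7) // 8


def mixed_precision_bytes(groups):
    """Total bytes for (num_values, bits_per_value) groups of weights."""
    return sum(weight_bytes(n, bits) for n, bits in groups)


# Illustrative split: linear layers in int4, everything else in fp16.
groups = [
    (6_000_000_000, 4),   # hypothetical count of linear-layer weights
    (500_000_000, 16),    # hypothetical count of remaining weights
]

total = mixed_precision_bytes(groups)
print(f"{total / 1e9:.2f} GB")

# Formats that are not a multiple of a byte work the same way, e.g. fp6.
print(weight_bytes(1_000, 6))  # 750
```

A per-block option on the command line could then be a list of such groups rather than a single low-precision byte count.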

Relevant: #36
