imoneoi / openchat
OpenChat: Advancing Open-source Language Models with Imperfect Data
Home Page: https://openchat.team
License: Apache License 2.0
Can someone please help me with running https://huggingface.co/openchat/openchat locally?
I am not able to make use of the conversation template; the process always exits with "Killed".
Can anyone give me an example of usage, or perhaps a Python script to use it?
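As a hedged, minimal sketch of local usage (not an official script): it assumes the " Human: ... <|end_of_turn|> Assistant: " template quoted later in these issues, and that "Killed" means the OS out-of-memory killer stopped the process while the ~13B weights were loading, so float16 and sufficient RAM are assumed here.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("openchat/openchat")
model = AutoModelForCausalLM.from_pretrained(
    "openchat/openchat",
    torch_dtype=torch.float16,     # half precision: ~2 bytes per parameter
    low_cpu_mem_usage=True,        # avoid materializing the weights twice in RAM
).to("cuda:0")
model.eval()

# Conversation template for openchat v1, as quoted in the alpaca_eval discussion below.
prompt = "Human: What are the planets of the solar system?<|end_of_turn|>Assistant: "
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
with torch.inference_mode():
    output_ids = model.generate(**inputs, max_new_tokens=256,
                                do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))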
Hi, could you provide the detailed hyperparameters used when training llama-13b? For example, how many and what kind of GPUs did you use, and what were the gradient accumulation steps and batch size per GPU? Moreover, when I directly use your DeepSpeed config to deepspeed-initialize a llama-7b on an 80G A100, the server reports a CUDA OOM error.
Looking forward to your reply.
Thank you so much!
When I load the model and perform inference using the Hugging Face framework, I noticed that although the model is loaded into GPU memory, the GPU usage remains at 0% while the CPU usage is at 100%. Here is the code:
import torch
from transformers import LlamaForCausalLM

def load_openchat_model(model_path: str, device_map=None):
    # Note: device_map is accepted but unused; the model is moved to cuda:0 below.
    model = LlamaForCausalLM.from_pretrained(
        model_path,
        load_in_8bit=False,
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
    )
    model.to("cuda:0")
    model.eval()
    return model
inference code:
def infer_hf(input_text: str, model, tokenizer, device):
    generation_config = dict(
        temperature=0.8,
        top_k=40,
        top_p=0.9,
        do_sample=True,
        num_beams=1,
        repetition_penalty=1.3,
        max_new_tokens=400,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )
    with torch.inference_mode():
        input_ids = tokenizer(input_text, return_tensors="pt")
        generation_output = model.generate(
            input_ids=input_ids["input_ids"].to(device),
            attention_mask=input_ids["attention_mask"].to(device),
            **generation_config,
        )
    s = generation_output[0]
    output = tokenizer.decode(s)
    print(output)
I set device to "cuda:0".
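As a hedged debugging sketch (my own suggestion, not from the issue): verify that the weights actually live on the GPU, since 100% CPU with 0% GPU can mean the model silently stayed on the CPU.

# Hypothetical sanity check for device placement.
print(next(model.parameters()).device)  # expect: cuda:0
print(next(model.parameters()).dtype)   # expect: torch.float16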
Tried running the sample training script on 8x A100 GPUs, using the sharegpt_v3.2 dataset recommended in your README.
I got this error: CUDA out of memory. Tried to allocate 688.00 MiB (GPU 1; 39.39 GiB total capacity; 37.95 GiB already allocated; 633.12 MiB free; 38.19 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
What settings did you use to train? As a test, I tried setting PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512, but still ran into the same out-of-memory error.
When loading the checkpoint, this error comes out:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 15.90 GiB total capacity; 15.30 GiB already allocated; 31.81 MiB free; 15.30 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
The logs print token IDs, which are useless and unreadable for humans.
Congrats on the V3.5 release!
May I ask if there are plans to release your finetuning data, as you have always done with your previous releases?
I used the demo yesterday. But I can't use it now. Is it down?
Hi! Really appreciate your work and open source effort! And openchat is a really great model.
However, I cannot reproduce the alpaca_eval results of OpenChat V3.1 13B. I downloaded the model_outputs.json you uploaded to the alpaca_eval repo and evaluated it with my own GPT-4 API; however, the win rate is 84.41, which is not as high as you claimed on the leaderboard. Could you reveal more details about your evaluation?
Many thanks!
Before training the model, do I need to run the llama_convert_and_add_eot_token.py script first? And how should I use it?
Thanks!
Hi @imoneoi, I want my assistant to have different emotions, or to act as a persona, based on a system prompt. When training, can I use data samples of the form {system_prompt} Human: {human_message} <|end_of_turn|>Assistant: {assistant_message} ..., or are such prompts placed inside {human_message}, like in the data samples you trained on? Thank you!
Example: <s>Human: Act as SEO expert. I want you to create the best meta descriptions among my competitors.\n\nHere are the list of our competitor's meta descriptions. \n\n\nHere is my meta description. Revise it. I don't want NFT in my description. I do not offer any staking service. \n\nBuy and sell the world's largest selection of 10,000+ Cryptocurrencies<|end_of_turn|>Assistant: ....
Convert to <s>Act as SEO expert. I want you to create the best meta descriptions among my competitors <|end_of_turn|>Human: Here are the list of our competitor's meta descriptions. \n\n\nHere is my meta description. Revise it. I don't want NFT in my description. I do not offer any staking service. \n\nBuy and sell the world's largest selection of 10,000+ Cryptocurrencies<|end_of_turn|>Assistant: ....
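For concreteness, a small illustrative helper (my own hypothetical code, not from the OpenChat repo) that renders the two variants above so the difference is explicit:

# Hypothetical formatter contrasting the two sample layouts from the question.
def render_with_system_prefix(system_prompt, human_message, assistant_message):
    # Variant 1: system prompt prepended to the first human turn.
    return (f"<s>{system_prompt} Human: {human_message} <|end_of_turn|>"
            f"Assistant: {assistant_message}<|end_of_turn|>")

def render_as_leading_segment(system_prompt, human_message, assistant_message):
    # Variant 2: system prompt as its own leading segment.
    return (f"<s>{system_prompt} <|end_of_turn|>Human: {human_message}"
            f"<|end_of_turn|>Assistant: {assistant_message}<|end_of_turn|>")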
openchat-openchat-server-1 | Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/ochat/serving/openai_api_server.py", line 29, in <module>
    from ochat.config.model_config import MODEL_CONFIG_MAP
  File "/ochat/config/model_config.py", line 7, in <module>
    import ochat.models
  File "/ochat/models/__init__.py", line 1, in <module>
    from ochat.models.unpadded_llama import LlamaForCausalLM
  File "/ochat/models/unpadded_llama.py", line 31, in <module>
    from transformers.modeling_utils import PreTrainedModel
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 88, in <module>
    from accelerate import dispatch_model, infer_auto_device_map, init_empty_weights
  File "/usr/local/lib/python3.10/dist-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 35, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/usr/local/lib/python3.10/dist-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/__init__.py", line 136, in <module>
    from .launch import (
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/launch.py", line 33, in <module>
    from ..utils.other import is_port_in_use, merge_dicts
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/other.py", line 27, in <module>
    from .transformer_engine import convert_model
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/transformer_engine.py", line 21, in <module>
    import transformer_engine.pytorch as te
  File "/usr/local/lib/python3.10/dist-packages/transformer_engine/pytorch/__init__.py", line 6, in <module>
    from .module import LayerNormLinear
  File "/usr/local/lib/python3.10/dist-packages/transformer_engine/pytorch/module.py", line 20, in <module>
    import transformer_engine_extensions as tex
ImportError: /usr/local/lib/python3.10/dist-packages/transformer_engine_extensions.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Seems like there is some sort of CUDA/PyTorch version mismatch?
I downloaded the provided openchat v1 model from Hugging Face under the model name 'openchat/openchat' and used it to predict the 805 evaluation queries of alpaca-eval, but I can only get a win rate over davinci-003 of around 70, which is far from your reported number. alpaca-eval itself is verified to have no bugs, since I can reproduce the scores of other LLMs.
FYI, I set the query template to " Human: {query} <|end_of_turn|> Assistant: " and I am using top_p sampling with top_p=1.0, temperature=0.7, and a maximum overall token length of 2048, consistent with the configs from this link: https://github.com/tatsu-lab/alpaca_eval/blob/main/src/alpaca_eval/models_configs/openchat-13b/configs.yaml
I also found that the prompt from https://github.com/tatsu-lab/alpaca_eval/blob/main/src/alpaca_eval/models_configs/openchat-13b/prompt.txt is not consistent with the Hugging Face model and the training data provided.
Can you kindly explain this performance discrepancy? Or perhaps provide a script for openchat inference?
Hi! Great work :)
I have a question regarding the loss weighting implemented in the repository. Do I understand correctly that you assign lower weights to tokens from longer sequences, so that each sequence contributes roughly the same to the training loss, irrespective of its length?
Regards
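For concreteness, a minimal sketch of such per-sequence length normalization (my own illustration of the idea in the question, not the repository's actual implementation; the normalization constant is one choice among several):

import torch
import torch.nn.functional as F

def length_weighted_loss(logits, labels, seq_lens):
    # logits: (total_tokens, vocab); labels: (total_tokens,);
    # seq_lens: (total_tokens,) length of the sequence each token belongs to.
    per_token = F.cross_entropy(logits, labels, reduction="none")
    weights = 1.0 / seq_lens.float()   # longer sequence -> smaller per-token weight
    return (per_token * weights).sum() / weights.sum()

With this weighting, every sequence sums to the same total weight, so a 2000-token conversation no longer dominates a 20-token one.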
Do you have a pipeline script with which you reduced the 90K data to 6K based on LIMA?
I tried training with sharegpt_v3.2 dataset, and it gives lots of weird errors.
Even though the English is understandable, there are still a lot of ** users.
[2023/07] We released the OpenLLMs model series. Among them, OpenChat obtains 80.9% win-rate on AlpacaEval and 105% ChatGPT performance on Vicuna GPT-4 evaluation.
Are you saying your model is generally better than a 10x bigger model?
If not, what is the plan to fix metrics so they show the expected ranking?
I wanted to ask about the tokenizer. I quantized the model with the MLC framework and noticed that the model never generates token 32000 to indicate end of turn; rather, it generates the string <|end_of_turn|> as a sequence of ordinary tokens. That is not what I expected. I don't know if it's a usage issue on my part.
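A quick diagnostic sketch (a hedged suggestion of mine, not from the repo): check whether <|end_of_turn|> survived conversion as a single added token, since a dropped added-tokens table would produce exactly this behavior.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openchat/openchat")
print(tokenizer.convert_tokens_to_ids("<|end_of_turn|>"))            # expect 32000
print(tokenizer.encode("<|end_of_turn|>", add_special_tokens=False)) # expect [32000], not a multi-token split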
Thank you for releasing the new model. I would like to know what improvements have been made in this Super model. Thanks!
Congrats to the authors on the great achievement!
Trying to understand your great work a bit more. In the inference examples, there are prompts like "GPT4 Correct User" and "Code User". What other conditional prompts were used in training? What does "Correct" mean here? Thanks!
Great job! I have a few questions:
I'm using the following script to test OpenChat, but even with the correct prompt template, the output is not very accurate. How should I modify the testing code?
from transformers import AutoModelForCausalLM, LlamaTokenizer, GenerationConfig

# args and create_hf_model come from the poster's own script.
tokenizer = LlamaTokenizer.from_pretrained(args.model_name_or_path, fast_tokenizer=False)
model = create_hf_model(AutoModelForCausalLM, args.model_name_or_path, tokenizer, None)

prompt = "<s>Human: What are all the pairs of natural numbers which sum to 6?<|end_of_turn|>Assistant: "
inputs = tokenizer(prompt, return_tensors="pt")  # tokenize the prompt
generation_config = GenerationConfig(max_new_tokens=2048, num_beams=1, do_sample=True, temperature=0.7, top_p=0.9)
generate_ids = model.generate(input_ids=inputs.input_ids, generation_config=generation_config)
response = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
print(response)
Hi team,
I would like to deploy the new model to AWS SageMaker with the code below, and I am getting RuntimeError: weight model.layers.0.self_attn.rotary_emb.inv_freq does not exist.
It seems something is missing in the model index; at least, I couldn't find that weight in https://huggingface.co/openchat/openchat_v3.2_super/blob/main/pytorch_model.bin.index.json
Thanks in advance for your help!
Here is deploy.py:
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='role-name')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'openchat/openchat_v3.2_super',
    'SM_NUM_GPUS': json.dumps(4)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.9.3"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    endpoint_name="openchat-v3-2-super",
    container_startup_health_check_timeout=600,
)

# send request
predictor.predict({
    "inputs": "My name is Julien and I like to",
})
It would be great to set up a Gradio demo for this on Hugging Face, similar to https://huggingface.co/spaces/mosaicml/mpt-30b-chat; this is the guide: https://huggingface.co/docs/hub/spaces-sdks-gradio
ChatGPT helped remove all the swear words.
I've encountered difficulties trying to set this up on Ubuntu, macOS, and Windows. I've noticed some inconsistencies in the instructions, and it seems some tools and libraries might be either too new or outdated. It would be greatly appreciated if these issues could be addressed to make the project more user-friendly for everyone. Thank you.
To reproduce the training data, we need the ShareGPT HTML files, as stated in:
The input folder should contain a ShareGPT folder with .html files for each ShareGPT conversation page inside.
It seems that the best ShareGPT source I can find online is here. However, it doesn't give model information and thus we have no way to filter for GPT4 responses.
Any pointers or hints on how to get GPT4 responses would be appreciated!
I am getting different errors. For example:
AssertionError: pydantic.dataclasses.dataclass only supports init=False
I had to downgrade the pydantic version. It would be great if you could pin all package versions.
Although your file splits the dataset into a training set and a validation set, it seems that training for 5 epochs on 6k samples leads to overfitting.
I tried to launch OpenCoderPlus with the latest code of this repo and vLLM:
python -m ochat.serving.openai_api_server --model-type opencoder --model openchat/opencoderplus
It can work, but the outputs never stop until hitting the max_tokens limit, even if I pass the stop parameter:
import requests

requests.post(
    "http://localhost:18888/v1/chat/completions",
    json={
        "model": "opencoder",
        "messages": [{"role": "user", "content": "Write a bubble sort."}],
        "stop": ["<|end_of_turn|>"]
    }
)
I referred to OpenCoderPlus's training data; it seems that this model was trained on data containing the <|end_of_turn|> token.
So does anyone know how to stop this model's outputs? Any help would be appreciated.
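One client-side workaround sketch (a hedged suggestion of mine, assuming the server returns the marker verbatim in the completion text): truncate at the first <|end_of_turn|> occurrence.

# Hypothetical post-processing if the server generates past the end-of-turn marker.
def truncate_at_end_of_turn(text: str) -> str:
    return text.split("<|end_of_turn|>", 1)[0]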
Hello,
I am reaching out to inquire about the data source used for training Openchat-v3.2-super. Could you please clarify if the dataset openchat/openchat_sharegpt_v3 that was used for its training originates from RyokoAI/ShareGPT52K? Additionally, I would like to know the approximate time frame for the data collection, i.e., up to which date was the data collected?
Thank you for your time and consideration. I look forward to your response.
Hi, I got this error and I can't find information about it:
C:\Users\xxxxxxxxxx\xxxxxxxxx\xxxxxxxx\openchat\openchat-ui>npm run dev
[email protected] dev
next dev
▲ Next.js 13.5.6
✓ Ready in 3.3s
✓ Compiled / in 1387ms (1682 modules)
✓ Compiled in 325ms (1682 modules)
✓ Compiled /api/models in 138ms (70 modules)
[TypeError: fetch failed] { cause: [Error: AggregateError] }
[TypeError: fetch failed] { cause: [Error: AggregateError] }
[TypeError: fetch failed] { cause: [Error: AggregateError] }
[TypeError: fetch failed] { cause: [Error: AggregateError] }
[TypeError: fetch failed] { cause: [Error: AggregateError] }
[TypeError: fetch failed] { cause: [Error: AggregateError] }
[TypeError: fetch failed] { cause: [Error: AggregateError] }
As described in the title
Great work!
It seems that openchat.train.json does not use a prompt template like alpaca-lora does.
Did you run experiments using a prompt template? Would that be better or not?
Thank you!
Thanks for sharing. Is there a WeChat discussion group?
Is there any way to contact you? I want to work with you and I have a proposal.
I tried everything, from using Docker (it gives an error about vLLM) to venv and conda environments; this is the last error I get. Do you have any idea what I should do?
File "/home/user/miniconda3/envs/venv/lib/python3.11/site-packages/pydantic/dataclasses.py", line 139, in dataclass
assert init is False, 'pydantic.dataclasses.dataclass only supports init=False'
^^^^^^^^^^^^^
AssertionError: pydantic.dataclasses.dataclass only supports init=False
How is the performance on Chinese?
And can you describe the details of the conditioning strategy and weighted loss?
Thanks!
Trying to install it in NVIDIA's PyTorch container, I'm getting this while running it.
Same issue when trying to install it on Lambda GPU Cloud on an H100 instance (all defaults).
root@0971a018b7ec:/workspace/openchat# python -m ochat.serving.openai_api_server --model_type openchat_v2 --model openchat/openchat_v2_w --engine-use-ray --worker-use-ray
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/workspace/openchat/ochat/serving/openai_api_server.py", line 21, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs
  File "/usr/local/lib/python3.10/dist-packages/vllm/__init__.py", line 4, in <module>
    from vllm.engine.async_llm_engine import AsyncLLMEngine
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 7, in <module>
    from vllm.engine.llm_engine import LLMEngine
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 16, in <module>
    from vllm.worker.worker import Worker
  File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 8, in <module>
    from vllm.model_executor import get_model, InputMetadata, set_random_seed
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/__init__.py", line 2, in <module>
    from vllm.model_executor.model_loader import get_model
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 9, in <module>
    from vllm.model_executor.models import * # pylint: disable=wildcard-import
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/__init__.py", line 1, in <module>
    from vllm.model_executor.models.bloom import BloomForCausalLM
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/bloom.py", line 31, in <module>
    from vllm.model_executor.layers.activation import get_act_fn
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/activation.py", line 5, in <module>
    from vllm import activation_ops
ImportError: /usr/local/lib/python3.10/dist-packages/vllm/activation_ops.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
I tried running the training script and got "flash_attn_unpadded_func is not defined". Doing some digging, it was apparently renamed in 2.0: https://github.com/Dao-AILab/flash-attention/blob/d30f2e1cd50185c98ed88c0684b4a603f15bee37/README.md?plain=1#L127
Is upgrading to flash-attn 2.0 trivial (simply renaming some functions)? I'm not familiar with this project, so I can't say. If it's difficult, perhaps adding documentation somewhere specifying that flash-attn 1.x is required would be helpful for newcomers.
I have installed flash-attn using pip3 install --no-build-isolation "flash-attn<2", but an error emerges:
  File "openchat/ochat/models/unpadded_llama.py", line 184, in forward
    attn_output = flash_attn_varlen_func(
                  ^^^^^^^^^^^^^^^^^^^^^^
NameError: name 'flash_attn_varlen_func' is not defined
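A hedged compatibility sketch, assuming the only relevant difference is the 1.x to 2.x rename (flash_attn_unpadded_func became flash_attn_varlen_func); the two functions' default arguments differ slightly, so treat this as a starting point rather than a fix endorsed by the repo:

# Import whichever name the installed flash-attn provides.
try:
    # flash-attn >= 2.0
    from flash_attn.flash_attn_interface import flash_attn_varlen_func
except ImportError:
    # flash-attn 1.x uses the old name
    from flash_attn.flash_attn_interface import flash_attn_unpadded_func as flash_attn_varlen_func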
Hi, I am trying to install your requirements.txt but am getting this error message:
Getting requirements to build wheel ... error
ERROR: Command errored out with exit status 1:
  command: /root/miniconda3/bin/python /root/miniconda3/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py get_requires_for_build_wheel /tmp/tmpt4c_a3i6
  cwd: /tmp/pip-install-_op0fkvy/flash-attn_32ecdb534ca149cebac1b8d1956665eb
Complete output (15 lines):
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 280, in <module>
    main()
  File "/root/miniconda3/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 263, in main
    json_out['return_val'] = hook(**hook_input['kwargs'])
  File "/root/miniconda3/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 114, in get_requires_for_build_wheel
    return hook(config_settings)
  File "/tmp/pip-build-env-0d5h4yvd/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 341, in get_requires_for_build_wheel
    return self._get_build_requires(config_settings, requirements=['wheel'])
  File "/tmp/pip-build-env-0d5h4yvd/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 323, in _get_build_requires
    self.run_setup()
  File "/tmp/pip-build-env-0d5h4yvd/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 338, in run_setup
    exec(code, locals())
  File "<string>", line 13, in <module>
ModuleNotFoundError: No module named 'torch'
It seems to be unable to find the module 'torch'. However, I have verified that torch is installed in my environment, with version 1.11.0+cu113 and torchvision version 0.12.0+cu113.
I have also tried to install the requirements on a different machine where torch version 2.0.0+cu117 is installed, but the error persists.
Any assistance to resolve this issue would be greatly appreciated. Thank you.
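For what it's worth, a plausible cause (my assumption, not confirmed by the maintainers) is pip's isolated build environment: flash-attn's setup.py runs in a fresh environment that cannot see the torch already installed in yours. The usual workaround is to install torch first, then build with isolation disabled, e.g. pip3 install --no-build-isolation "flash-attn<2", the same flag used elsewhere in these issues.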
When I use the following request, how do I let the model remember the chat history or the context?
curl http://localhost:18888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openchat_v3.2",
    "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}]
  }'
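Since this is an OpenAI-compatible endpoint, the usual pattern (a hedged sketch; the server itself is stateless, and the second user message here is made up) is to resend the whole conversation each turn, appending the assistant's previous reply:

import requests

URL = "http://localhost:18888/v1/chat/completions"
history = [{"role": "user", "content": "Write a poem to describe yourself"}]

r = requests.post(URL, json={"model": "openchat_v3.2", "messages": history})
history.append(r.json()["choices"][0]["message"])          # keep the assistant turn
history.append({"role": "user", "content": "Now make it rhyme"})

r = requests.post(URL, json={"model": "openchat_v3.2", "messages": history})
print(r.json()["choices"][0]["message"]["content"])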
Thank you for your amazing work! I have some questions below:
In alpaca eval leaderboard, there are 5 versions: OpenChatV3.1, OpenChatV2-W, OpenChatV2, OpenChat, OpenChat8192.
What is the difference?
And what is the datasets used?
What is the difference between openchat_shareGPT_v3 and openchat_shareGPT4?
Which datasets do you use for OpenChatV3.1?
Looking forward to your reply.
When I click on this link: https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset
it says it is not able to show the data:
Is it possible to further instruction-tune OpenChat with domain-specific data? If so, is there any boilerplate that can be used as a starting point? I had earlier fine-tuned LLaMA-2 on my dataset with the trl SFT script, and made another attempt with the llama-recipes boilerplate. The time taken by the two scripts varied greatly (3x), including the tokenization process and other parameters. The final model, however, didn't perform well at all, giving weird and abrupt answers. Therefore, I'm hoping to get some insight into whether using OpenChat (or Vicuna/WizardLM/Llama2-chat) might make a difference.
Thank you for your response.
I'm using a Windows machine, and I've been following the instructions outlined in this answer: #41 (comment)
Everything went smoothly until I reached the step of running pip3 install ochat, where I encountered an error.
Here's the error message I'm getting:
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [15 lines of output]
    test.c
    LINK : fatal error LNK1181: cannot open input file 'aio.lib'
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "C:\Users\hasans\AppData\Local\Temp\pip-install-22ncvin1\deepspeed_d1d81ae59ce344d3a308adf94757a6b8\setup.py", line 165, in <module>
      File "C:\Users\hasans\AppData\Local\Temp\pip-install-22ncvin1\deepspeed_d1d81ae59ce344d3a308adf94757a6b8\setup.py", line 51, in abort
        assert False, msg
    AssertionError: Unable to pre-compile async_io
    DS_BUILD_OPS=1
    [WARNING] async_io requires the dev libaio .so object and headers but these were not found.
    [WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
    [WARNING] One can disable async_io with DS_BUILD_AIO=0
    [ERROR] Unable to pre-compile async_io
    [end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
PS C:\Users\hasans\Documents\openchat> pip install libaio-devel
ERROR: Could not find a version that satisfies the requirement libaio-devel (from versions: none)
ERROR: No matching distribution found for libaio-devel
Could someone please guide me on how to resolve this issue? Your assistance would be greatly appreciated!
Thank you for your time and effort in maintaining this amazing project.
Is this something you think would be valuable to add to your project?
I modified the script to support open llama models (only 3B is supported at the moment): https://gist.github.com/l3utterfly/9f5a2d7d6415d20bf3d89d915f1661bb
If you think it's worthwhile, I can clean up the code and submit a pull request.
I feel it's a very valuable script for helping people get started training their own models using the openchat method.