Comments (6)
For what it's worth, I think people might want to use a causal LM to generate embeddings of just the prompt; at least that's the use case I currently have.
mistralai/Mistral-7B-Instruct-v0.2 is an XXXForCausalLM model. CausalLM means that it generates text. It should not be used for embeddings. --> see the config:
{
"architectures": [
"MistralForCausalLM" # << this tells us its a generation model
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.36.0",
"use_cache": true,
"vocab_size": 32000
}
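For contrast, a model like this is driven through generate; a minimal sketch (the prompt and sampling settings are illustrative):

# Minimal generation sketch: model name taken from this thread;
# prompt and sampling settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["[INST] What is the capital of France? [/INST]"], params)
print(outputs[0].outputs[0].text)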
intfloat/e5-mistral-7b-instruct is an XXXModel. This means that the model just generates embeddings. It should be used for embeddings --> see the config:
{
"_name_or_path": "mistralai/Mistral-7B-v0.1",
"architectures": [
"MistralModel" # <<< this tells us its an embedding model
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pad_token_id": 2,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"sliding_window": 4096,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.34.0",
"use_cache": false,
"vocab_size": 32000
}
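This is the kind of model that .encode is meant for. A minimal sketch based on the embedding example that ships with vLLM (the prompt is illustrative):

# Minimal embedding sketch: model name taken from this thread.
from vllm import LLM

llm = LLM(model="intfloat/e5-mistral-7b-instruct")
outputs = llm.encode(["Hello, my name is"])
# Each output carries one embedding vector (hidden_size floats).
print(len(outputs[0].outputs.embedding))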
We automatically detect whether the model is an embedding model or a generation model based on these configs (a rough sketch of the idea follows below). Supporting embedding models is a new feature. Thank you for bringing this bad UX to my attention.
I am going to update vLLM to:
- log a better error message
- add documentation to help users understand how to use this better
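For the curious, the detection boils down to the architectures field in config.json. A rough illustrative sketch of the idea, not vLLM's actual code (the helper name is made up):

# Illustrative sketch only -- not vLLM's implementation.
# An architecture ending in "Model" (e.g. "MistralModel") is treated as an
# embedding model; "...ForCausalLM" is treated as a generation model.
import json

def is_embedding_model(config_path: str) -> bool:  # hypothetical helper
    with open(config_path) as f:
        architectures = json.load(f).get("architectures", [])
    return any(arch.endswith("Model") for arch in architectures)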
I get it. Thanks for explaining this!
I just ran the example and did not see this issue.
What model are you using? This error can occur if you call .encode on an XXXForCausalLM.
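For reference, the failure mode being described is along these lines (a hypothetical reproduction, not an exact trace of the reporter's script):

# Hypothetical reproduction: calling encode() on a generation model.
from vllm import LLM

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
llm.encode(["some prompt"])  # misuse: an XXXForCausalLM has no embedding path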
Interestingly enough, the example works fine for me, and I actually see the expected results (a list of numbers) in my CLI.
Moreover, your error message states:
...
[rank0]: File "home/lib/python3.10/site-packages/vllm/model_executor/sampling_metadata.py", line 210, in _prepare_seq_groups
[rank0]: if sampling_params.seed is not None:
[rank0]: AttributeError: 'NoneType' object has no attribute 'seed'
The problem is that if sampling_params.seed is not None: is on line 208 (not 210) in the current version of the file. It seems like you may have modified the file somehow, which would explain why it stopped working.
Hope this helps.
Thanks for all of your help!
I had indeed modified the source code after encountering this error. I've since changed it back to the original (with no change in behavior).
Interestingly, the script works well with intfloat/e5-mistral-7b-instruct. After changing the model to mistralai/Mistral-7B-Instruct-v0.2, I got the error mentioned earlier. Do you have any suggestions for how I can generate embeddings with this specific model? Really appreciate your help!