Comments (5)
Sometimes it works with landscape images of certain sizes, and sometimes it crashes. Do image sizes have to be multiples of 336?
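For context, LLaVA-NeXT does not require image sizes to be multiples of 336. The image processor snaps each image onto the best-fitting entry in the model's `image_grid_pinpoints` (all of which are multiples of the 336 px vision-tower input) and resizes/pads it into that grid, so arbitrary sizes are accepted in principle. Below is a minimal sketch of that selection step, paraphrased from the transformers LLaVA-NeXT image-processing logic rather than the exact library code:

```python
# Paraphrased select_best_resolution: choose the grid pinpoint that keeps
# the most effective pixels while wasting the least padding area.
def select_best_resolution(original_size, possible_resolutions):
    original_height, original_width = original_size
    best_fit, max_effective, min_wasted = None, 0, float("inf")
    for height, width in possible_resolutions:
        scale = min(width / original_width, height / original_height)
        downscaled = int(original_width * scale) * int(original_height * scale)
        effective = min(downscaled, original_width * original_height)
        wasted = height * width - effective
        if effective > max_effective or (
            effective == max_effective and wasted < min_wasted
        ):
            best_fit = (height, width)
            max_effective, min_wasted = effective, wasted
    return best_fit

# Pinpoints as in llava-v1.6-mistral-7b's config.json (all 336-multiples):
PINPOINTS = [[336, 672], [672, 336], [672, 672], [1008, 336], [336, 1008]]
print(select_best_resolution((531, 800), PINPOINTS))  # -> (672, 672)
```

Since every size is mapped onto some pinpoint, the crashes reported below look more like a token-accounting mismatch than a hard size restriction.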
Same problem: `Method Prefill encountered an error`.
It seems that the current implementation counts the tokens generated from the encoded image as part of the prompt length.
It might be better to extract the image features first and then calculate the prompt token length separately. I'm not sure whether TGI supports this approach, as it could be quite involved.
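To make that concrete, here is a rough sketch of estimating how many tokens an image expands to, so a client could budget the prompt length separately: the anyres count is the base 24 × 24 patch grid of the 336/14 vision tower, plus the unpadded high-resolution patches, plus one newline feature per remaining row. This paraphrases the accounting rather than reproducing TGI's exact implementation:

```python
def estimate_image_tokens(height, width, grid_h, grid_w, npatches=24):
    """Rough LLaVA-NeXT image-token estimate: unpadded high-res grid
    + one newline feature per row + the 24 x 24 base grid.
    npatches = 336 // 14 for the CLIP ViT-L/14-336 vision tower."""
    cur_h, cur_w = npatches * grid_h, npatches * grid_w
    # Undo the padding that was added to fit the pinpoint's aspect ratio.
    if width / height > cur_w / cur_h:
        cur_h = (height * cur_w) // width
    else:
        cur_w = (width * cur_h) // height
    return cur_h * cur_w + cur_h + npatches * npatches

# An 800 x 531 image snaps to the (672, 672) pinpoint, i.e. a 2 x 2 grid:
print(estimate_image_tokens(531, 800, 2, 2))  # -> 2095, cf. the log below
```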
Same issue; only images with width == height work.
I have the same issue, and it seems to be linked to image size: I found that some sizes work in TGI v2.0.1 but not in TGI v2.0.2, and vice versa.
Here is a recap of the image sizes I tested. Note that image 2-bis is image 2 cropped, to confirm that the dimensions are what causes the issue.
Image | Dimensions (W x H) | Ratio (W/H) | Works in v2.0.1 | Works in v2.0.2 |
---|---|---|---|---|
1 | 450 x 299 | 1.505 | No | Yes |
2 | 800 x 531 | 1.506 | Yes | No |
2-bis | 450 x 299 | 1.505 | No | Yes |
3 | 300 x 168 | 1.785 | No | Yes |
4 | 640 x 480 | 1.333 | Yes | Yes |
5 | 934 x 934 (square) | 1 | Yes | Yes |
When the image doesn't have the right dimensions, the server encounters an error and crashes. Here are the logs I get:
v2.0.1 (image 1 crash)
ERROR text_generation_launcher: Method Prefill encountered an error.
...
RuntimeError: shape mismatch: value tensor of shape [1464, 4096] cannot be broadcast to indexing result of shape [1376, 4096]
...
ERROR batch{batch_size=1}:prefill:prefill{id=0 size=1}:prefill{id=0 size=1}: text_generation_client: router/client/src/lib.rs:33: Server error: CANCELLED
ERROR batch{batch_size=1}:prefill:clear_cache{batch_id=Some(0)}:clear_cache{batch_id=Some(0)}: text_generation_client: router/client/src/lib.rs:33: Server error: transport error
ERROR chat_completions:generate:generate_stream:infer:send_error: text_generation_router::infer: router/src/infer.rs:866: Request failed during generation: Server error: CANCELLED
...
ERROR text_generation_launcher: Shard 0 crashed
v2.0.2 (image 2 crash; does not happen at warmup)
INFO text_generation_launcher: Found 2095 in image of resolution 531x800
ERROR text_generation_launcher: Method Prefill encountered an error.
...
RuntimeError: shape mismatch: value tensor of shape [2144, 4096] cannot be broadcast to indexing result of shape [2095, 4096]
...
RuntimeError: Cannot fill images right now. If error happens at warmup, make sure you have enough `--max-input-tokens` to handle images. If error happens at regular runtime, please fill in an issue: shape mismatch: value tensor of shape [2144, 4096] cannot be broadcast to indexing result of shape [2095, 4096]
...
ERROR batch{batch_size=1}:prefill:prefill{id=0 size=1}:prefill{id=0 size=1}: text_generation_client: router/client/src/lib.rs:33: Server error: CANCELLED
ERROR batch{batch_size=1}:prefill:clear_cache{batch_id=Some(0)}:clear_cache{batch_id=Some(0)}: text_generation_client: router/client/src/lib.rs:33: Server error: transport error
ERROR chat_completions:generate:generate_stream:infer:send_error: text_generation_router::infer: router/src/infer.rs:866: Request failed during generation: Server error: CANCELLED
...
ERROR text_generation_launcher: Shard 0 crashed
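The v2.0.2 failure above is consistent with two different roundings of the same unpadding step. For an 800 x 531 image on the (672, 672) pinpoint, the feature map is 48 x 48: flooring the rescaled height directly gives 31 rows (hence 2095 expected tokens), while removing symmetric padding, as the model's `unpad_image` appears to do, trims (48 - 31) // 2 = 8 rows from each side and keeps 32 (hence 2144 actual features). A small sketch of the discrepancy, using the same paraphrased arithmetic as above:

```python
# 800 x 531 -> (672, 672) pinpoint -> 2 x 2 grid of 24 x 24 patches,
# i.e. a 48 x 48 feature map before unpadding.
height, width = 531, 800
cur_h = cur_w = 48
base = 24 * 24

# Rounding A: floor the rescaled height directly (matches the 2095 count).
scaled_h = (height * cur_w) // width          # 25488 // 800 = 31 rows
count_a = scaled_h * cur_w + scaled_h + base  # 1488 + 31 + 576 = 2095

# Rounding B: trim symmetric padding instead; (48 - 31) // 2 = 8 rows
# removed from each side keeps 32 rows (matches the 2144 count).
pad = (cur_h - scaled_h) // 2                 # 8
kept_h = cur_h - 2 * pad                      # 32 rows
count_b = kept_h * cur_w + kept_h + base      # 1536 + 32 + 576 = 2144

print(count_a, count_b)  # 2095 2144 -> the two shapes in the error above
```

If that is the cause, resolutions whose rescaled feature height or width is exact (640 x 480, squares) should work everywhere, while fractional ones can disagree, which matches the pattern in the table above.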
My model info:
{
model_id: "llava-hf/llava-v1.6-mistral-7b-hf",
validation_workers: 2,
trust_remote_code: false,
max_concurrent_requests: 128,
max_best_of: 2,
max_stop_sequences: 4,
max_top_n_tokens: 5,
max_input_tokens: Some(4000),
max_total_tokens: Some(5000),
waiting_served_ratio: 0.3,
max_waiting_tokens: 20,
hostname: "0.0.0.0",
port: 80,
shard_uds_path: "/tmp/text-generation-server",
master_addr: "localhost",
master_port: 29500,
huggingface_hub_cache: Some("/data"),
disable_custom_kernels: false,
cuda_memory_fraction: 1.0,
json_output: false,
cors_allow_origin: [],
ngrok: false,
disable_grammar_support: false,
env: false,
max_client_batch_size: 4,
}