Comments (1)
Hmm, I'm getting this error for #5276:
https://buildkite.com/vllm/ci-aws/builds/3678#0190710a-4a20-4bb1-8f85-8efe1a7615a1
The stack trace suggests that `import torch` inside `conftest.py` is to blame, but I'm pretty sure the import was there from the beginning, so that can't be why. When I try to log the traceback in `torch.cuda.device_count()`, I get this:
  File "/home/cyrusleung/miniconda3/envs/vllm/bin/pytest", line 8, in <module>
    sys.exit(console_main())
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/config/__init__.py", line 197, in console_main
    code = main()
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/config/__init__.py", line 174, in main
    ret: Union[ExitCode, int] = config.hook.pytest_cmdline_main(
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/main.py", line 332, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/main.py", line 285, in wrap_session
    session.exitstatus = doit(config, session) or 0
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/main.py", line 339, in _main
    config.hook.pytest_runtestloop(session=session)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/main.py", line 364, in pytest_runtestloop
    item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/runner.py", line 115, in pytest_runtest_protocol
    runtestprotocol(item, nextitem=nextitem)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/runner.py", line 134, in runtestprotocol
    reports.append(call_and_report(item, "call", log))
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/runner.py", line 239, in call_and_report
    call = CallInfo.from_call(
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/runner.py", line 340, in from_call
    result: Optional[TResult] = func()
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/runner.py", line 240, in <lambda>
    lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/runner.py", line 172, in pytest_runtest_call
    item.runtest()
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/python.py", line 1772, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/_pytest/python.py", line 195, in pytest_pyfunc_call
    result = testfunction(**testargs)
  File "/home/cyrusleung/vllm-rocm/tests/distributed/test_multimodal_broadcast.py", line 43, in test_models
    run_test(
  File "/home/cyrusleung/vllm-rocm/tests/models/test_llava.py", line 113, in run_test
    with vllm_runner(model_id,
  File "/home/cyrusleung/vllm-rocm/tests/conftest.py", line 439, in __init__
    self.model = LLM(
  File "/home/cyrusleung/vllm-rocm/vllm/entrypoints/llm.py", line 144, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
  File "/home/cyrusleung/vllm-rocm/vllm/engine/llm_engine.py", line 405, in from_engine_args
    engine = cls(
  File "/home/cyrusleung/vllm-rocm/vllm/engine/llm_engine.py", line 238, in __init__
    self.model_executor = executor_class(
  File "/home/cyrusleung/vllm-rocm/vllm/executor/distributed_gpu_executor.py", line 25, in __init__
    super().__init__(*args, **kwargs)
  File "/home/cyrusleung/vllm-rocm/vllm/executor/executor_base.py", line 41, in __init__
    self._init_executor()
  File "/home/cyrusleung/vllm-rocm/vllm/executor/multiproc_gpu_executor.py", line 68, in _init_executor
    self.driver_worker = self._create_worker(
  File "/home/cyrusleung/vllm-rocm/vllm/executor/gpu_executor.py", line 67, in _create_worker
    wrapper.init_worker(**self._get_worker_kwargs(local_rank, rank,
  File "/home/cyrusleung/vllm-rocm/vllm/worker/worker_base.py", line 311, in init_worker
    self.worker = worker_class(*args, **kwargs)
  File "/home/cyrusleung/vllm-rocm/vllm/worker/worker.py", line 87, in __init__
    self.model_runner: GPUModelRunnerBase = ModelRunnerClass(
  File "/home/cyrusleung/vllm-rocm/vllm/worker/model_runner.py", line 196, in __init__
    self.attn_backend = get_attn_backend(
  File "/home/cyrusleung/vllm-rocm/vllm/attention/selector.py", line 45, in get_attn_backend
    backend = which_attn_to_use(num_heads, head_size, num_kv_heads,
  File "/home/cyrusleung/vllm-rocm/vllm/attention/selector.py", line 151, in which_attn_to_use
    if torch.cuda.get_device_capability()[0] < 8:
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/torch/cuda/__init__.py", line 430, in get_device_capability
    prop = get_device_properties(device)
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/torch/cuda/__init__.py", line 444, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/torch/cuda/__init__.py", line 306, in _lazy_init
    queued_call()
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/torch/cuda/__init__.py", line 173, in _check_capability
    for d in range(device_count()):
  File "/home/cyrusleung/miniconda3/envs/vllm/lib/python3.10/site-packages/torch/cuda/__init__.py", line 745, in device_count
    import traceback; traceback.print_stack()
But I think this is supposed to happen, right?
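The `import traceback; traceback.print_stack()` line at the bottom of the stack is the instrumentation that produced the output. A minimal standalone sketch of the same technique (the function names below are placeholders standing in for `torch.cuda.device_count` and vLLM's attention selector, not the real code):

```python
import traceback

def device_count():
    # Debugging aid: print the call stack each time this function is
    # entered, mirroring the `import traceback; traceback.print_stack()`
    # line added to torch.cuda.device_count above.
    # (Placeholder body -- not the real torch function.)
    print("".join(traceback.format_stack()[:-1]))  # frames leading here
    return 0

def which_attn_to_use():
    # Placeholder caller standing in for vllm.attention.selector.
    return device_count()

which_attn_to_use()
```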
Edit: The traceback is from a local version of the PR which has some additional changes compared to the CI build. I'll push it once its dependency has been merged so I can check whether the failure persists.
Update: Using a lazy import in `vllm.transformers_utils.image_processor` seems to fix the problem.
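For reference, the lazy-import pattern just means moving a module-level import into the function that needs it, so that `import vllm...` no longer pulls in the heavy dependency (and its CUDA initialization) at import time. A runnable sketch of the pattern, using the stdlib `decimal` module as a stand-in for the heavy dependency (the real fix touches `vllm.transformers_utils.image_processor`, whose contents are not shown here):

```python
import sys

# Before (eager): a top-level import of the heavy dependency runs as
# soon as the package is imported, e.g. during pytest collection.
# After (lazy): the import is deferred into the function body below.

def get_image_processor(name: str):
    # Deferred import: the dependency loads on first call, not at module
    # import time. `decimal` is a stand-in so this sketch stays runnable.
    import decimal
    return f"{name}: {decimal.Decimal('2') ** 10}"

loaded_before = "decimal" in sys.modules  # may already be True elsewhere
result = get_image_processor("demo")
loaded_after = "decimal" in sys.modules   # True once the function ran
print(loaded_before, loaded_after, result)
```

The import statement is cheap on repeat calls because Python caches modules in `sys.modules`, so the deferral only costs anything the first time.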