Comments (3)
In lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py, lines 608-609: swap the annotation.
from vila.
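A minimal sketch of applying that change programmatically, assuming "swap the annotation" means exchanging the two lines in place. The path and line numbers come from the comment above; the `swap_lines` helper is hypothetical, and both the path and the exact line numbers will vary with your environment and installed transformers version, so verify them before patching:

```python
from pathlib import Path

def swap_lines(path, a, b):
    """Swap two 1-indexed lines in a text file, preserving line endings."""
    lines = Path(path).read_text().splitlines(keepends=True)
    lines[a - 1], lines[b - 1] = lines[b - 1], lines[a - 1]
    Path(path).write_text("".join(lines))

# Path and line numbers taken from the comment; adjust for your install.
# swap_lines(
#     "lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py",
#     608, 609,
# )
```

Editing files inside site-packages is fragile (the change is lost on reinstall); a fork of the file inside your own project, or a version pin, is usually the more durable fix.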
Thanks, it works on my GPU now! However, the output is really weird: a meaningless string of spaces and commas. I hit the same issue with another Vision Language Model, while some others work fine, so I suspect it comes down to the transformers library version. For what it's worth, running VILA on the CPU produced correct output.
Me too — I've had similar issues with redundant commas and spaces. Oddly, when I use the VILA1.5-3B model on a video plus some questions, it actually performs better than the 8B model: sometimes it generates coherent responses, but other times it only replies with one to three words.
Related Issues (20)
- How to convert model to gguf HOT 3
- Deployment to SageMaker and/or HuggingFace Inference Endpoints Fails With Error HOT 5
- Whether the visual encoder participates in training HOT 1
- Support for multi-video captioning with multiple grid image inputs? HOT 1
- Multi-Image or Multi-Video Inference Example HOT 2
- question: what does 'repack_multimodal_data' function do? HOT 1
- release schedule for the "VILA1.5-34b-4bit-AWQ" model. HOT 1
- Is there any way to increase the context window? HOT 4
- Question re. LanguageModel vs LanguageModelForCausalLM functionalies HOT 2
- Llama2 or Llama3
- What is the conv_mode for VILA-1.5-3B ? HOT 1
- AttributeError: 'Image' object has no attribute 'shape' HOT 6
- Support VILA with lmdeploy
- Image text retrieval support
- COYO-700M Dataset Download Script Error
- [Help] Using VILA1.5-40b model for Video Descriptions
- [HELP] Do we have any docker image for Jetson platform ?
- No training scripts in scripts/v1_5/paper/
- About sharegpt_video. How do you make video file from jpeg images?