
Comments (9)

hktk07 commented on September 28, 2024

I have the same problem. The only code I wrote myself is:
model = AutoModelForCausalLM.from_pretrained('llama2')
c_loss, m_loss = model(examples, labels, observations)
The error is triggered by the line "c_loss, m_loss = model(examples, labels, observations)", which is the only line of my own code in the traceback. Every other frame that reports an error is inside library functions.

from sparsegpt.

phind-justin commented on September 28, 2024

same


phind-justin commented on September 28, 2024

wondering if this is a transformers version issue


algorithmexplorer commented on September 28, 2024

outs[j] = layer(inps[j].unsqueeze(0), attention_mask=attention_mask)[0]
This line does not pass the positional encoding (position_ids) to the layer. After the change it looks like this:
outs[j] = layer(inps[j].unsqueeze(0), attention_mask=attention_mask, position_ids=cache_position)[0]
@phind-justin @thistleknot
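
To make the point above concrete, here is a minimal, framework-free sketch of the replay loop. Everything in it is a stand-in: toy_layer is a hypothetical function, not the transformers decoder layer, and plain lists replace tensors. In the real script position_ids would typically be a tensor of token positions captured from the model's own forward pass. The sketch only shows that a layer which uses position_ids unconditionally fails when the argument is omitted and works once it is passed through.

```python
# Toy stand-in for a decoder layer (hypothetical, not the transformers class).
def toy_layer(hidden, attention_mask=None, position_ids=None):
    # Uses position_ids unconditionally, so None fails here, much like
    # position_ids.shape fails inside the rotary embedding.
    shifted = [h + p for h, p in zip(hidden, position_ids)]
    return (shifted,)  # layers return a tuple; index 0 is the hidden state


inps = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]  # two toy calibration samples
attention_mask = None                       # ignored by the toy layer
position_ids = list(range(len(inps[0])))    # positions 0 .. seq_len - 1

# Replay every sample through the layer, forwarding position_ids explicitly.
outs = [None] * len(inps)
for j in range(len(inps)):
    outs[j] = toy_layer(
        inps[j], attention_mask=attention_mask, position_ids=position_ids
    )[0]

# Omitting position_ids reproduces the failure mode (a TypeError in this toy).
try:
    toy_layer(inps[0], attention_mask=attention_mask)
    failed_without_positions = False
except TypeError:
    failed_without_positions = True
```

In this toy the missing argument surfaces as a TypeError rather than the AttributeError seen with the real model, but the cause is the same: the layer receives None where it expects positions.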


SHUSHENGQIGUI commented on September 28, 2024

wondering if this is a transformers version issue

Hi, you can refer to the wanda code: https://github.com/locuslab/wanda/blob/main/lib/prune.py , and modify the function in llama.py to follow prune_sparsegpt().



time-less-ness commented on September 28, 2024

I get this same error trying locally.

(venv)$ python llama.py  /data/Llama3-ChatQA-1.5-8B   c4 --sparsity 0.5
...
  File "sparsegpt/venv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 110, in forward
    inv_freq_expanded = self.inv_freq[None, :, None].float().expand(position_ids.shape[0], -1, 1)
                                                                    ^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'shape'


minions0816 commented on September 28, 2024

@algorithmexplorer What is cache_position?


SimWangArizona commented on September 28, 2024

Hi! If you are running SqueezeLLM and hit the same issue, this may be useful.
The issue comes from position_ids being None.
First, in llama.py, look at the Catcher class ("class Catcher(nn.Module):"). In the original code it looks like this:
[image]
And the position ids are obtained by:
[image]
But the GPTQ code for LLaMA, which runs fine with LLaMA, is written like this:
[image]
And
[image]
After making these changes, I finally obtain position_ids, which is a torch.Tensor of size 2048. Please also make sure that the "try" block runs correctly, because it affects the size of position_ids (if it does not run, you get a size mismatch error, 4096 vs. 2048).
[image]
I do not know the real reason, but this works for me.
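
As a rough illustration of the capture pattern described above (not the code from the images), here is a framework-free toy sketch. It assumes the usual shape of such a wrapper: it replaces the first layer, records the input and the keyword arguments it receives (including position_ids), and raises ValueError to abort the rest of the forward pass, which the caller swallows in a "try" block. All names here (toy_model, identity_layer, the cache keys) are hypothetical. In the real llama.py the wrapper subclasses nn.Module and the values are tensors.

```python
class Catcher:
    """Toy wrapper: records the first layer's inputs, then aborts the forward."""

    def __init__(self, module, cache):
        self.module = module  # the wrapped layer, restored after capture
        self.cache = cache

    def __call__(self, inp, **kwargs):
        self.cache["inps"].append(inp)
        # Index directly so a missing key fails loudly (KeyError) instead of
        # silently storing None and breaking later layers.
        self.cache["attention_mask"] = kwargs["attention_mask"]
        self.cache["position_ids"] = kwargs["position_ids"]
        raise ValueError  # stop here: later layers are not needed for capture


def identity_layer(hidden, attention_mask=None, position_ids=None):
    """Hypothetical layer that passes its input through unchanged."""
    return hidden


def toy_model(layers, tokens):
    """Hypothetical model forward: builds position_ids and calls each layer."""
    position_ids = list(range(len(tokens)))
    hidden = [float(t) for t in tokens]
    for layer in layers:
        hidden = layer(hidden, attention_mask=None, position_ids=position_ids)
    return hidden


cache = {"inps": [], "attention_mask": None, "position_ids": None}
layers = [identity_layer, identity_layer]
layers[0] = Catcher(layers[0], cache)  # wrap the first layer

for batch in ([5, 6, 7, 8], [1, 2, 3, 4]):
    try:
        toy_model(layers, batch)
    except ValueError:
        pass  # expected: raised by Catcher after recording the inputs

layers[0] = layers[0].module  # unwrap, restoring the original first layer
```

If the "try" block never runs, the cache keeps its initial None for position_ids, and the later per-layer calls then receive None, which is the situation the traceback earlier in the thread shows.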
