eric-mitchell / mend
MEND: Fast Model Editing at Scale
License: MIT License
Hi @eric-mitchell,
Thank you for the repo and the great paper. I was looking at your code and I have a question about this line: why is there a plus? Shouldn't it be a minus?
As you also wrote in the paper, the final edited weights should be:
Thanks for your help.
Hi, I'm trying to run python -m run +alg=mend +experiment=gen +model=distilgpt2 data.wiki_webtext=False, but I get a file not found error for data/10token/data/self_sample/train.json. I downloaded 10token from the linked Google Drive folder and unzipped it to data/10token. However, when I unzip it, all I get is a single 10token file, no train.json. Not sure if I'm missing something here. Thanks!
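One way to narrow this down is to check whether the single extracted 10token file is itself another archive that still needs unpacking. The sketch below is only a diagnostic under that assumption; probe_archive is a hypothetical helper, not part of the repository, and it uses only the standard library.

```python
import tarfile
import zipfile
from pathlib import Path


def probe_archive(path):
    """Classify a path as 'missing', 'directory', 'zip', 'tar', or 'unknown'."""
    p = Path(path)
    if not p.exists():
        return "missing"
    if p.is_dir():
        return "directory"
    # is_zipfile looks for the zip end-of-central-directory record.
    if zipfile.is_zipfile(str(p)):
        return "zip"
    # is_tarfile also recognises compressed tarballs (gz, bz2, xz).
    if tarfile.is_tarfile(str(p)):
        return "tar"
    return "unknown"
```

Calling probe_archive("data/10token") and getting "zip" or "tar" would suggest a second extraction step is needed; "unknown" would point to a different problem, such as an incomplete download.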
Hi,
In your paper, must the model edited with MEND be fine-tuned first?
How can MEND be applied to a model without fine-tuning?
When I try to do this, acc_train is always 0.
Thanks.
Hello, just curious whether there is any documentation on how to use the MEND framework on a custom LM (e.g., GPT-2) trained with custom data?
Hi, I downloaded the zip files from Google Drive but failed to unzip them:
Archive: 10token.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of 10token.zip or
10token.zip.zip, and cannot find 10token.zip.ZIP, period.
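The zip end-of-central-directory record sits at the end of the file, so this message usually means the download is truncated or is not a zip at all. A quick check of the leading bytes can tell the two apart. The sketch below assumes, without confirmation from the thread, that the download may have saved an HTML page instead of the archive; sniff_download is a hypothetical helper.

```python
def sniff_download(path):
    """Return 'zip-header', 'html', or 'other' based on the first bytes of a file."""
    with open(path, "rb") as f:
        head = f.read(512)
    # Local file header signature that starts a non-empty zip archive.
    if head.startswith(b"PK\x03\x04"):
        return "zip-header"
    # A saved web page (for example a download confirmation page) instead of data.
    stripped = head.lstrip().lower()
    if stripped.startswith(b"<!doctype html") or stripped.startswith(b"<html"):
        return "html"
    return "other"
```

A result of "zip-header" together with the unzip error above points to an incomplete download, so downloading again is the first thing to try; "html" means the saved file is a web page rather than the archive.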
Could you kindly provide the code for editing T5? Or could you point out the names of the layers to be edited?
I think the names of the edited layers are registered here:
Lines 256 to 263 in e04fdb9
@eric-mitchell
Thanks for your great repo. I have written the following brief post to introduce it:
Fast Model Editing at Scale via Model Editor Networks with Gradient Decomposition (MEND)
Best
Hi
I am getting the following error when installing the requirements with pip install -r requirements.txt
Collecting git+git://github.com/eric-mitchell/higher@master (from -r requirements.txt (line 7))
Cloning git://github.com/eric-mitchell/higher (to revision master) to /tmp/pip-req-build-z1v0erey
Running command git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey
fatal: unable to connect to github.com:
github.com[0: 192.30.255.113]: errno=Connection timed out
error: subprocess-exited-with-error
× git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey did not run successfully.
│ exit code: 128
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× git clone --filter=blob:none --quiet git://github.com/eric-mitchell/higher /tmp/pip-req-build-z1v0erey did not run successfully.
│ exit code: 128
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
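GitHub no longer serves the unauthenticated git:// protocol, which is consistent with the connection timeout in this log. A likely workaround, assuming line 7 of requirements.txt is the git+git:// URL shown above, is to switch that requirement to HTTPS. The helper below is a hypothetical illustration of the rewrite, not part of the repository.

```python
import re


def use_https(requirement_line):
    """Rewrite a 'git+git://github.com/...' requirement to 'git+https://github.com/...'."""
    line = requirement_line.strip()
    # Only the URL scheme changes; the repository path and @revision are kept.
    return re.sub(r"^git\+git://github\.com/", "git+https://github.com/", line)


def rewrite_requirements(text):
    """Apply use_https to every line of a requirements file given as a string."""
    return "\n".join(use_https(line) for line in text.splitlines())
```

Applied to the failing requirement, this yields git+https://github.com/eric-mitchell/higher@master, which pip can clone over HTTPS. Editing the line in requirements.txt by hand has the same effect.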
Editing tasks fall into three categories: binary classification, QA, and generation. For the batched edits, why was QA chosen?
And what is the result under fine-tuning (FT)?
Thanks!
Hi Eric,
I found this code:
param_idx = lambda n, p: self.shape_dict[self.get_shape(p)].index(n) if self.config.mend.shared else None  # noqa: E731
transformed_factors = {
    n: self.mend[str(tuple(self.get_shape(p)))](p.__x__, p.__delta__, param_idx(n, p))
    for n, p in _inner_params(self.model.named_parameters(), self.config.model.inner_params)
}
Where can I find the definitions of __x__ and __delta__ in the code?
Are they defined in torch or in Python?
Thank you!
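Neither Python nor PyTorch defines attributes named __x__ or __delta__. Names that both start and end with double underscores are not name-mangled, so they behave like ordinary attributes that any code can attach to an object at runtime. Presumably the repository sets them on each parameter during the forward and backward pass, for example through hooks registered on the layers. The dependency-free sketch below illustrates that pattern only; Weight and Linear are hypothetical stand-ins, not the repository's classes.

```python
class Weight:
    """Stand-in for a parameter: a plain object that accepts new attributes."""

    def __init__(self, rows):
        self.rows = rows  # weight matrix as a list of rows, shape (out, in)


class Linear:
    """Toy linear layer whose 'hooks' cache values on the weight object."""

    def __init__(self, weight):
        self.weight = weight

    def forward(self, x):
        # Forward "hook": remember the layer input on the weight itself.
        self.weight.__x__ = list(x)
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weight.rows]

    def backward(self, grad_out):
        # Backward "hook": remember the gradient with respect to the layer output.
        self.weight.__delta__ = list(grad_out)
        # The weight gradient is the outer product of delta and x.
        return [[d * xi for xi in self.weight.__x__] for d in grad_out]
```

With real PyTorch modules the same idea would typically be wired up with register_forward_hook and a backward hook, so searching the repository for assignments to __x__ and __delta__ should locate where they are set.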