Min-K%++: Improved baseline for detecting pre-training data of LLMs https://arxiv.org/abs/2404.02936
Home Page: https://zjysteven.github.io/mink-plus-plus/
License: MIT License
mink-plus-plus's Issues
Hi, thank you for the great work! Does the result in Table 2 of the paper come from the MIMIR 13-gram split here?
Hi, I am running the code for the reference attack on the MIMIR benchmark, and I don't get the results reported in the paper; they are not even close:
your paper: Pythia 6.9b on Wikipedia -> 61.8 (with Pythia 70m as the reference)
the result that I get -> 54.7
Do you have any idea what the problem might be?
Do you know the best threshold for solving the MIA problem?
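For context, membership-inference scores like Min-K%++ are usually evaluated threshold-free (AUROC), so there is no universal "best" cutoff; a threshold is typically tuned on held-out labeled member/non-member data. Below is a minimal sketch of one common tuning rule, maximizing Youden's J (TPR - FPR). The function name and data setup are my own illustration, not part of the repo:

```python
import numpy as np

def best_threshold(scores, labels):
    """Pick the score cutoff maximizing Youden's J = TPR - FPR.
    scores: higher = more likely a training member; labels: 1 = member."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_t, best_j = None, -1.0
    for t in np.unique(scores):  # every distinct score is a candidate cutoff
        pred = scores >= t
        tpr = (pred & (labels == 1)).sum() / max((labels == 1).sum(), 1)
        fpr = (pred & (labels == 0)).sum() / max((labels == 0).sum(), 1)
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, t
    return float(best_t)
```

Note that a cutoff tuned this way on one dataset/model pair will generally not transfer to another.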
Hi authors,
Congrats on this great work. I tried to run your code with "python run.py --model meta-llama/Llama-2-13b-hf", and I get:
method auroc fpr95 tpr05
0 loss 54.9% 91.5% 3.9%
1 zlib 56.1% 89.2% 5.9%
2 mink_0.1 51.6% 92.8% 2.3%
3 mink_0.2 52.4% 93.6% 4.7%
4 mink_0.3 53.5% 92.8% 4.4%
5 mink_0.4 54.1% 92.0% 4.1%
6 mink_0.5 54.5% 91.5% 3.9%
7 mink_0.6 54.7% 91.0% 3.9%
8 mink_0.7 54.8% 90.7% 3.9%
9 mink_0.8 54.9% 91.3% 3.9%
10 mink_0.9 54.8% 92.3% 3.9%
11 mink_1.0 54.9% 91.5% 3.9%
12 mink++_0.1 60.8% 87.4% 6.2%
13 mink++_0.2 61.6% 84.1% 6.5%
14 mink++_0.3 61.5% 84.8% 5.4%
15 mink++_0.4 61.7% 83.5% 4.7%
16 mink++_0.5 61.5% 85.3% 5.4%
17 mink++_0.6 61.5% 85.9% 6.5%
18 mink++_0.7 61.7% 84.3% 7.2%
19 mink++_0.8 61.8% 85.3% 6.2%
20 mink++_0.9 61.7% 85.6% 5.2%
21 mink++_1.0 60.8% 84.6% 6.2%
In the paper, the AUROC is more than 80%. I am not sure if I did something wrong. Thank you.
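For anyone checking numbers like the table above: the three columns (AUROC, FPR at 95% TPR, TPR at 5% FPR) can be recomputed from raw scores with a few lines of numpy. This is my own sketch of the standard definitions, not the repo's evaluation code, and it ignores score ties for brevity:

```python
import numpy as np

def mia_metrics(scores, labels):
    """Return (auroc, fpr@95%tpr, tpr@5%fpr) from raw membership scores.
    scores: higher = predicted member; labels: 1 = member, 0 = non-member.
    Assumes at least one positive and one negative label."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores)            # sweep thresholds high -> low
    labels = labels[order]
    tpr = np.cumsum(labels) / labels.sum()
    fpr = np.cumsum(1 - labels) / (1 - labels).sum()
    tpr = np.concatenate([[0.0], tpr])     # start the ROC curve at (0, 0)
    fpr = np.concatenate([[0.0], fpr])
    # Trapezoidal area under the ROC curve.
    auroc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
    fpr95 = float(fpr[np.searchsorted(tpr, 0.95)])
    tpr05 = float(tpr[np.searchsorted(fpr, 0.05, side="right") - 1])
    return auroc, fpr95, tpr05
```

A perfectly separating score gives AUROC 1.0 with FPR@95 of 0 and TPR@5 of 1, while random scores hover around AUROC 0.5.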
Hi,
do you mean -np.mean(topk_prob).item()
here:
scores[f'mink_{ratio}'].append(np.mean(topk).item())
and
scores[f'mink++_{ratio}'].append(np.mean(topk).item())
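For readers following this thread: the sign convention matters because the score is ranked with "higher = more likely a member". As a point of comparison, here is a numpy-only sketch of the Min-K%++ score as the paper describes it (z-normalize each token's log-prob by the mean and standard deviation of log p over the vocabulary at that position, then average the lowest k% of tokens). The function name and array shapes are my own, not the repo's API:

```python
import numpy as np

def mink_pp_score(logits, target_ids, ratio=0.2):
    """Min-K%++ score sketch.
    logits: (seq_len, vocab_size) next-token logits; target_ids: (seq_len,)."""
    # Numerically stable log-softmax over the vocabulary axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    # Log-prob of each observed next token.
    token_lp = log_probs[np.arange(len(target_ids)), target_ids]
    # Mean and std of log p under the model's distribution at each position.
    probs = np.exp(log_probs)
    mu = (probs * log_probs).sum(axis=-1)
    sigma = np.sqrt((probs * log_probs ** 2).sum(axis=-1) - mu ** 2)
    z = (token_lp - mu) / sigma           # per-token normalized score
    k = max(1, int(len(z) * ratio))       # keep the lowest k% of tokens
    return float(np.sort(z)[:k].mean())
```

With this convention the score is already "higher = more member-like", so no extra negation is needed when feeding it to an AUROC computation.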
Hey, I would like to reproduce some of the results from the paper. Is the code to train the models available somewhere?