yuxinwenrick / diffusion_memorization


Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)

Python 100.00%

diffusion_memorization's Issues

List of non-memorized prompts

Hey,

Thank you for providing the code to reproduce your experiments. In addition to the list of memorized samples, could you please also provide the prompts of the non-memorized samples you used in your experiments? The paper states that the experiments were conducted on 2,000 prompts from COCO, LAION, Lexica, and randomly generated strings. Sharing these prompts would improve the reproducibility of the method.

Best,
Lukas

SSCD score of memorized samples

Dear Yuxin,

A question came up regarding the experimental setup. We used the prompts provided in sdv1_500_memorized.jsonl to generate images with SDv1.4. We then computed the SSCD scores between the generated images and the real images. However, the SSCD scores range from quite high to quite low. The paper states the following:

To evaluate our detection method, we use 500 memorized prompts identified in Webster (2023) for Stable Diffusion v1 (Rombach et al., 2022), where the SSCD similarity score (Pizzi et al., 2022) between the memorized and the generated images exceeds 0.7.

Does this mean that all images generated from the 500 prompts in the JSON file achieved an SSCD score > 0.7 in your experiments? Or did you apply additional filtering based on the computed SSCD score to keep only the strongly memorized samples? In our experiments, only 100-120 (depending on the SSCD model) of the 500 prompts achieve a maximum SSCD score > 0.7. All SSCD scores were computed across 10 generations with different seeds. We also manually inspected the images, and some generated images show only slight memorization, so the assigned SSCD scores do seem to match the degree of memorization.

Best,
Lukas
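The counting procedure described in this issue (maximum SSCD score over 10 generations per prompt, then a 0.7 threshold) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the SSCD descriptors have already been extracted and that the score is the cosine similarity between a generated image's descriptor and the training image's descriptor. The function names, array shapes, and random toy data are all hypothetical.

```python
import numpy as np


def max_similarity_per_prompt(gen_emb, ref_emb):
    """Return the best cosine similarity over generations for each prompt.

    gen_emb: (num_prompts, num_gens, dim) descriptors of generated images.
    ref_emb: (num_prompts, dim) descriptors of the reference images.
    """
    gen = gen_emb / np.linalg.norm(gen_emb, axis=-1, keepdims=True)
    ref = ref_emb / np.linalg.norm(ref_emb, axis=-1, keepdims=True)
    # Cosine similarity of every generation with its prompt's reference.
    sims = np.einsum("pgd,pd->pg", gen, ref)
    return sims.max(axis=1)


def count_above(max_sims, threshold=0.7):
    """Count prompts whose best generation exceeds the threshold."""
    return int((max_sims > threshold).sum())


# Toy data standing in for real descriptors (assumption: 512-d vectors).
rng = np.random.default_rng(0)
ref = rng.normal(size=(5, 512))
gens = rng.normal(size=(5, 10, 512))
gens[0, 3] = ref[0]  # pretend one generation of prompt 0 is an exact copy

max_sims = max_similarity_per_prompt(gens, ref)
print(count_above(max_sims))
```

Reporting the maximum over seeds, as done here, gives the most favorable count per prompt; a mean over seeds would give a lower number for prompts that are only occasionally reproduced.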

Question About Gustavosta/Stable-Diffusion-Prompts

May I ask about the dataset Gustavosta/Stable-Diffusion-Prompts: is it the one at this link, Gustavosta/Stable-Dif? I used train.parquet and converted it to JSONL, but the result seems incorrect. Thanks for your reply. :)
When I run det_mem_viz.ipynb, I get the following error:

IndexError Traceback (most recent call last)
in <cell line: 11>()
21
22 curr_data = np.array(row[key])
---> 23 curr_data = curr_data[:num_gens, start:end]
24 curr_data = np.mean(curr_data, axis=0)
25

IndexError: too many indices for array: array is 0-dimensional, but 2 were indexed

I think I may have reproduced the non_memorized_prompts dataset incorrectly.
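For context on the error message itself: NumPy raises this IndexError when a 0-dimensional array (for example, one built from a single number, a string, or None) is indexed with two indices. The slice in the traceback expects a 2-D array, so the value stored under that key in the JSONL row is probably not a nested list. The sketch below reproduces the same two steps from the traceback with an explicit shape check; the function name and the generations-by-steps interpretation of the two axes are assumptions for illustration, not the notebook's code.

```python
import numpy as np


def mean_over_generations(values, num_gens, start, end):
    """Slice a (generations x steps) array and average over generations."""
    arr = np.asarray(values)
    if arr.ndim != 2:
        # A scalar, string, or None becomes a 0-d array and cannot be
        # indexed with [:num_gens, start:end].
        raise ValueError(
            f"expected a 2-D array (generations x steps), got shape {arr.shape}"
        )
    return arr[:num_gens, start:end].mean(axis=0)


good = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(mean_over_generations(good, num_gens=2, start=0, end=2))

try:
    mean_over_generations("a plain prompt string", num_gens=2, start=0, end=2)
except ValueError as err:
    print(err)
```

Printing `np.asarray(row[key]).shape` for a few rows of the loaded file is a quick way to check whether it has the structure the notebook expects.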

Question about dataset

Detect memorization
You may first download the memorized prompts dataset from this link and unzip it.

May I ask where the link is? It does not seem to appear on the page.

Also, in det_mem_viz.ipynb:
nonmem_data = list(read_jsonlines(f"det_outputs/non_memorized_prompts.jsonl"))

May I ask where this data file comes from? Thanks.
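For readers unfamiliar with the file format: a .jsonl file stores one JSON object per line. The sketch below writes and reads such a file using only the standard library. The `read_jsonlines` here is a stand-in written for illustration, not the repository's helper of the same name, and the `caption` field and the example prompts are hypothetical; the fields the notebook actually expects in non_memorized_prompts.jsonl are not stated in this thread.

```python
import json
import os
import tempfile


def write_jsonlines(records, path):
    """Write an iterable of dicts to a file, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonlines(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


# Hypothetical prompts and field name, used only to show the round trip.
prompts = ["a photo of a cat", "a painting of a lighthouse"]
records = [{"caption": p} for p in prompts]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "prompts.jsonl")
    write_jsonlines(records, path)
    loaded = list(read_jsonlines(path))

print(loaded)
```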

Question about implementation details of the detection method

Dear Yuxin,

I have a question about the implementation details of the detection method. The implementation seems to differ from the equation in Sec. 3.3 of your paper. The paper says that the norm of the difference between the conditional and unconditional noise predictions is used as the detection metric, ||\epsilon(x,c) - \epsilon(x, \pi)||. However, in lines 190~195 of local_sd_pipeline.py, it seems that the norm of each noise prediction is calculated first and then passed to detect_mem.py. (In aug_prompt, however, the computation seems to align with the paper.) May I ask which one I should use as the detection method? Or is there something I have misunderstood?

Best,
Chunsan.
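The two quantities contrasted in this issue are generally not equal, which a small numerical example makes concrete. This is only an illustration of the mathematical distinction with hand-picked 2-D vectors; it makes no claim about what local_sd_pipeline.py actually computes, and the function and variable names are hypothetical.

```python
import numpy as np


def norm_of_difference(eps_cond, eps_uncond):
    """||eps_cond - eps_uncond||: the form of the metric quoted from the paper."""
    return float(np.linalg.norm(eps_cond - eps_uncond))


def difference_of_norms(eps_cond, eps_uncond):
    """||eps_cond|| - ||eps_uncond||: each norm is taken before subtracting."""
    return float(np.linalg.norm(eps_cond) - np.linalg.norm(eps_uncond))


# Two predictions with the same magnitude but different directions.
eps_cond = np.array([3.0, 0.0])
eps_uncond = np.array([0.0, 3.0])

print(norm_of_difference(eps_cond, eps_uncond))   # sqrt(18), about 4.243
print(difference_of_norms(eps_cond, eps_uncond))  # 0.0
```

By the reverse triangle inequality, the absolute difference of the norms never exceeds the norm of the difference, so taking the norms first can hide a large directional gap between the conditional and unconditional predictions.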
