The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline (ICML 2024)

🔔 News

🚀 [2024-06-13]: ICML 2024 Oral Presentation!

🔥 [2024-07-18]: Code Released!


If you find our project helpful, please star our repo on GitHub ⭐ to stay updated with our latest features and improvements!

📝 ToDo

  • Release the SilentBadDiffusion Code:

    • Make the SilentBadDiffusion code publicly available.
  • Detailed Step-by-Step Instructions:

    • Provide comprehensive, step-by-step instructions for setting up and running all aspects of the project to ensure easy reproducibility.
  • Result Collector:

    • Develop a script to automatically collect and organize all experimental results for easier analysis and comparison.
  • t-SNE Visualization:

    • Include a module to perform t-SNE visualization of the results, making it simpler to reproduce the visualizations presented in the paper.
  • Extended Documentation:

    • Expand the documentation to cover advanced usage scenarios and troubleshooting tips.
  • Performance Metrics:

    • Implement additional metrics to evaluate the model performance more thoroughly.

📖 Overview

The commercialization of text-to-image diffusion models (DMs) brings forth potential copyright concerns. Despite numerous attempts to protect DMs from copyright issues, the vulnerabilities of these solutions are underexplored. In this study, we formalized the Copyright Infringement Attack on generative AI models and proposed a backdoor attack method, SilentBadDiffusion, to induce copyright infringement without requiring access to or control over training processes.


©️ Copyright Infringement Attack

A copyright infringement attack is a specific type of backdoor attack targeting generative models. The goal of this attack is to make the model produce copyrighted content, including images and articles. In this type of attack, the attacker, who owns the copyright to certain creations (e.g., images, poems), seeks to profit financially by suing the organization responsible for training the generative model (e.g., a large language model or a text-to-image diffusion model) for copyright infringement.


🌟 SilentBadDiffusion

Our method strategically embeds connections between pieces of copyrighted information and text references in the poisoning data while carefully dispersing that information, so the poisoning data remains inconspicuous when mixed into a clean dataset. Our experiments demonstrate both the stealth and the efficacy of the poisoning data: given specific text prompts, DMs trained with a poisoning ratio of 0.20% can produce copyrighted images. The results also show that the more sophisticated the DM, the more easily the attack succeeds.

These findings underline potential pitfalls in the prevailing copyright protection strategies and underscore the necessity for increased scrutiny to prevent the misuse of DMs.
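To put the 0.20% poisoning ratio in concrete terms, the following minimal sketch converts a dataset size into a poisoned-sample count. It assumes the ratio is defined as poisoned samples divided by the total number of training samples after mixing; the paper may define it differently (for example, relative to the clean set only). The function name `poisoned_sample_count` is hypothetical and not part of this repository.

```python
def poisoned_sample_count(total_samples: int, poisoning_ratio: float = 0.002) -> int:
    """Return how many samples are poisoned for a given dataset size.

    Assumption: poisoning_ratio = poisoned samples / total samples,
    so 0.20% corresponds to poisoning_ratio = 0.002.
    """
    if total_samples < 0:
        raise ValueError("total_samples must be non-negative")
    if not 0.0 <= poisoning_ratio <= 1.0:
        raise ValueError("poisoning_ratio must be within [0, 1]")
    # Round to the nearest whole sample to absorb floating-point error.
    return round(total_samples * poisoning_ratio)


if __name__ == "__main__":
    # Illustrative dataset sizes only; these are not the sizes used in the paper.
    for size in (10_000, 100_000, 1_000_000):
        print(f"{size} samples -> {poisoned_sample_count(size)} poisoned")
```

Under this assumption, a dataset of 10,000 samples would contain about 20 poisoned samples.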

🔧 Installation

  1. Install required packages:

    pip install xformers==0.0.23 torchvision==0.16.1
    pip install -r requirements.txt
  2. Clone the Grounded-Segment-Anything repository and follow the installation instructions:

    git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git

    Alternatively, you can try the following steps:

    cd Grounded-Segment-Anything
    export AM_I_DOCKER=False
    export BUILD_WITH_CUDA=True
    export CUDA_HOME=/path/to/cuda-11.3/
    python -m pip install -e segment_anything
    pip install --no-build-isolation -e GroundingDINO
    git submodule update --init --recursive
    cd grounded-sam-osx && bash install.sh
    pip install opencv-python pycocotools matplotlib onnxruntime onnx ipykernel
  3. Download the required checkpoints into the checkpoints folder:

    mkdir checkpoints
    cd checkpoints
    wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth
    wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
    wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
    wget https://dl.fbaipublicfiles.com/sscd-copy-detection/sscd_disc_mixup.torchscript.pt
    wget https://dl.fbaipublicfiles.com/sscd-copy-detection/sscd_disc_large.torchscript.pt
    wget https://dl.fbaipublicfiles.com/sscd-copy-detection/sscd_imagenet_mixup.torchscript.pt
  4. Set your OpenAI API key:

    export OPENAI_API_KEY='yourkey'
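Before moving on, it can help to confirm that the setup above is complete. The following is a minimal sketch, not part of this repository: the helper name `find_setup_problems` is hypothetical, the checkpoint file names are taken from the `wget` commands in step 3, and the location of the `checkpoints` folder is an assumption that you should adjust to wherever you created it.

```python
import os
from pathlib import Path

# File names taken from the wget commands in step 3.
EXPECTED_CHECKPOINTS = (
    "groundingdino_swinb_cogcoor.pth",
    "groundingdino_swint_ogc.pth",
    "sam_vit_h_4b8939.pth",
    "sscd_disc_mixup.torchscript.pt",
    "sscd_disc_large.torchscript.pt",
    "sscd_imagenet_mixup.torchscript.pt",
)


def find_setup_problems(checkpoint_dir="checkpoints", env=None):
    """Return a list of human-readable setup problems (empty if none found).

    checkpoint_dir: assumed location of the checkpoints folder.
    env: mapping of environment variables; defaults to os.environ.
    """
    env = os.environ if env is None else env
    directory = Path(checkpoint_dir)
    problems = []
    # Report every expected checkpoint that is not present as a file.
    for name in EXPECTED_CHECKPOINTS:
        if not (directory / name).is_file():
            problems.append(f"missing checkpoint: {directory / name}")
    # Step 4 exports OPENAI_API_KEY; an empty value counts as unset.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in find_setup_problems():
        print(problem)
```

This only checks that the files exist and that the variable is non-empty; it does not verify that the downloads are intact or that the key is valid.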

🧾 Usage

  1. Download the Datasets:

    • Run download.py located in the ./datasets folder to download necessary datasets.
  2. Generate Poisoning Data:

    • Execute src/poisoning_data_generation.py to create the poisoning data required for the experiment.
  3. Run the Attack Experiment:

    • Use src/target_model_training.py to carry out the attack experiment.
    • Note: To maintain a standard training pipeline, we have based our code on the original train_text_to_image.py from diffusers 0.27.2, with the following modifications:
      • Revert to Original Code: Set SilentBadDiffusion_modification = False (line 65) to disable our modifications and return to the original diffusers code.
      • Added Code Snippets:
        1. Loading Data (Lines 490-527): Additional code for loading data.
        2. Visualization (Lines 828-840): Added visualization steps.
        3. Saving Model (Lines 870-893): Code for saving the trained model.

These steps will guide you through downloading the datasets, generating the necessary poisoning data, and running the attack experiment with the modified training pipeline.
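The three steps can be chained in a small driver script. The sketch below is an illustration only: it assumes the scripts are launched from the repository root and run without extra command-line arguments, which may not hold (check each script for required options). The names `PIPELINE_SCRIPTS`, `build_commands`, and `run_pipeline` are hypothetical and not part of this repository.

```python
import subprocess
import sys

# Script paths as named in the Usage steps, assumed relative to the repo root.
PIPELINE_SCRIPTS = (
    "datasets/download.py",              # step 1: download the datasets
    "src/poisoning_data_generation.py",  # step 2: generate poisoning data
    "src/target_model_training.py",      # step 3: run the attack experiment
)


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per pipeline step, in order."""
    return [[python, script] for script in PIPELINE_SCRIPTS]


def run_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops the pipeline at the first failing step.
    """
    for command in build_commands():
        print("$", " ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default so nothing is downloaded or trained by accident.
    run_pipeline(dry_run=True)
```

Running the steps in sequence with `check=True` means a failed download or data-generation step prevents training from starting on incomplete inputs.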

✉️ Contact

🖋️ Citation

BibTeX:

@article{wang2024stronger,
  title={The stronger the diffusion model, the easier the backdoor: Data poisoning to induce copyright breaches without adjusting finetuning pipeline},
  author={Wang, Haonan and Shen, Qianli and Tong, Yao and Zhang, Yang and Kawaguchi, Kenji},
  journal={arXiv preprint arXiv:2401.04136},
  year={2024}
}
