
ai-red-teaming's Introduction

AI Red Teaming a.k.a. Awesome-LLM-Red-Teaming

About Me Blog

All things specific to Generative AI LLM Red Teaming

My Blog "What the heck is AI Red Teaming" https://bit.ly/ai-red-teaming

As of 11.30.23, I am working hard to build the repos; it takes time to review and curate. I appreciate your patience. Thanks!
As of 2.1.24, I have started transcribing and curating the links from my OmniOutline to this GitHub page.

Best Practices | NIST | Survey & Analytical Paper Collection | Metrics | Benchmarks | Datasets | Other Repos

Best Practices


Year Title Notes
My Blog "What the heck is AI Red Teaming" A quick general blog
What’s the Difference Between Traditional Red-Teaming and AI Red-Teaming? There is a slight cognitive dissonance between traditional Red Teaming and AI Red Teaming
2023.07 Google's AI Red Team: the ethical hackers making AI safer Good Conceptual Diagrams
2023.10 Best Practices for Securing LLM-Enabled Applications Nvidia
2023.06 NVIDIA AI Red Team: An Introduction
Use Cases
Adversarial Intelligence: Red Teaming Malicious Use Cases for AI
Sensational Press
2023.08 Hackers red-teaming A.I. are ‘breaking stuff left and right,’ but don’t expect quick fixes from DefCon: ‘There are no good guardrails’

NIST


All NIST documents, ideas, responses, et al.

This section was likely to be split out into an Awesome-NIST repository, and now it has been: see Awesome-NIST


Survey & Analytical Papers


Year Title Notes
Survey Papers
2024.01 Gradient-Based Language Model Red Teaming Hot off the press (at least for now, as of Stardate -299100.57)
  • In my Red Teaming blog, I had written, “Tests follow a progressive nature, where a response could lead to another prompt deeper in the knowledge graph on the same topic.”
  • I was thinking of a prompt hierarchy; this paper does adaptive Red Teaming by using backprop to create new, modified prompts!
2024.01 Red Teaming Visual Language Models
2024.01 Red-Teaming for Generative AI: Silver Bullet or Security Theater?
2023.11 Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild
2023.08 Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment
2023.06 Explore, Establish, Exploit: Red Teaming Language Models from Scratch
2022.09 Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
LLMs vs. LLMs
2022.02 Red Teaming Language Models with Language Models
Analytical Papers
2023.10 Risk Assessment and Statistical Significance in the Age of Foundation Models
Star Trek Stardate Calculator
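
The "progressive" testing idea quoted in the note on Gradient-Based Language Model Red Teaming (a response leading to another prompt deeper on the same topic) can be illustrated with a toy prompt hierarchy. This is only a sketch of that hierarchy idea, not the paper's gradient-based method, which creates new prompts via backprop. Every name here (`PROMPT_TREE`, `progressive_probe`, `toy_model`, `toy_refused`) is hypothetical, and the model is a stub rather than a real LLM call.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt hierarchy: each prompt maps to deeper follow-up prompts
# on the same topic. The prompts are placeholders, not a real test suite.
PROMPT_TREE: Dict[str, List[str]] = {
    "Describe topic X in general terms.": [
        "Give more detail on the risky part of topic X.",
    ],
    "Give more detail on the risky part of topic X.": [
        "Provide step-by-step instructions for the risky part of topic X.",
    ],
    "Provide step-by-step instructions for the risky part of topic X.": [],
}


def progressive_probe(
    root: str,
    model: Callable[[str], str],
    refused: Callable[[str], bool],
    tree: Dict[str, List[str]] = PROMPT_TREE,
) -> List[Tuple[str, str]]:
    """Walk the hierarchy breadth-first, descending only where the model answered."""
    transcript: List[Tuple[str, str]] = []
    queue = deque([root])
    while queue:
        prompt = queue.popleft()
        response = model(prompt)
        transcript.append((prompt, response))
        # A non-refusal unlocks the deeper prompts under this one.
        if not refused(response):
            queue.extend(tree.get(prompt, []))
    return transcript


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM (assumption): refuses step-by-step requests only."""
    if "step-by-step" in prompt:
        return "I can't help with that."
    return f"Answer to: {prompt}"


def toy_refused(response: str) -> bool:
    """Naive refusal check, for illustration only."""
    return response.startswith("I can't")


transcript = progressive_probe(
    "Describe topic X in general terms.", toy_model, toy_refused
)
for prompt, response in transcript:
    print(f"{prompt!r} -> {response!r}")
```

In a real setup, `toy_model` would be replaced by a call to the model under test, and `toy_refused` by whatever harm or refusal classifier the red team uses.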

Metrics


LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section


Benchmarks


LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section


Datasets


LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section


Other Repos


LLM benchmarks (See LLM Evaluation Other Repos

I will start populating this section

Title Notes
Awesome Security
Awesome Controls Links to various security frameworks. Last updated 4 years ago, but still useful
Awesome Infosec A curated list of awesome information security resources

ai-red-teaming's People

Contributors

xsankar
