@AI-secure

AI Secure

UIUC Secure Learning Lab

Popular repositories

  1. DecodingTrust

    A Comprehensive Assessment of Trustworthiness in GPT Models

    Python · 267 stars · 57 forks

  2. DBA

    DBA: Distributed Backdoor Attacks against Federated Learning (ICLR 2020)

    Python · 181 stars · 45 forks

  3. Certified-Robustness-SoK-Oldver

    This repo tracks popular provable training and verification approaches for robust neural networks, including leaderboards on popular datasets and a categorization of papers.

    100 stars · 10 forks

  4. VeriGauge

    A unified toolbox for running major robustness verification approaches for DNNs. [S&P 2023]

    C · 88 stars · 7 forks

  5. InfoBERT

    [ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

    Python · 84 stars · 7 forks

  6. AgentPoison

    [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"

    Python · 78 stars · 7 forks

Repositories

Showing 10 of 55 repositories
  • AgentPoison

    [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"

    Python · 78 stars · MIT license · 7 forks · 0 open issues · 0 open pull requests · Updated Dec 26, 2024
  • RedCode

    [NeurIPS 2024] RedCode: Risky Code Execution and Generation Benchmark for Code Agents

    Python · 21 stars · 3 forks · 1 open issue · 0 open pull requests · Updated Dec 20, 2024
  • aug-pe

    [ICML 2024 Spotlight] Differentially Private Synthetic Data via Foundation Model APIs 2: Text

    Python · 31 stars · Apache-2.0 license · 7 forks · 0 open issues · 0 open pull requests · Updated Nov 12, 2024
  • AdvWeb
    Jupyter Notebook · 7 stars · 0 forks · 1 open issue · 0 open pull requests · Updated Oct 30, 2024
  • FedGame

    Official implementation of the paper "FedGame: A Game-Theoretic Defense against Backdoor Attacks in Federated Learning" (NeurIPS 2023).

    Python · 6 stars · MIT license · 0 forks · 1 open issue · 0 open pull requests · Updated Oct 25, 2024
  • VFL-ADMM

    Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM (SaTML 2024)

    Python · 0 stars · Apache-2.0 license · 0 forks · 0 open issues · 0 open pull requests · Updated Oct 21, 2024
  • DecodingTrust

    A Comprehensive Assessment of Trustworthiness in GPT Models

    Python · 267 stars · CC-BY-SA-4.0 license · 57 forks · 11 open issues · 2 open pull requests · Updated Sep 16, 2024
  • MMDT

    Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models

    Jupyter Notebook · 7 stars · 2 forks · 0 open issues · 0 open pull requests · Updated Aug 13, 2024
  • helm (forked from stanford-crfm/helm)

    Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).

    Python · 0 stars · Apache-2.0 license · 266 forks · 0 open issues · 2 open pull requests · Updated Jun 12, 2024
  • DPFL-Robustness

    [CCS 2023] Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks

    Python · 6 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Feb 15, 2024