Repositories
Showing 10 of 18 repositories
- centerforaisafety/cluster-docs Public
- centerforaisafety/safetywashing Public
  Measuring correlations between safety benchmarks and general AI capabilities benchmarks.
- centerforaisafety/course.mlsafety.org Public
- centerforaisafety/tdc2023-starter-kit Public
  The starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.
- centerforaisafety/wmdp Public
  WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method that reduces LLM performance on WMDP while retaining general capabilities.
- centerforaisafety/safety_challenge Public