Pinned

  1. Intro_to_ML_Safety Public

    63 20

  2. trojan-dc-2023 Public

    JavaScript 1

Repositories

Showing 10 of 18 repositories
  • cerberus-cluster Public

    HPC cluster code and configurations for running on OCI

    Python 4 UPL-1.0 0 68 0 Updated Nov 9, 2024
  • AISES Public
    CSS 0 1 0 0 Updated Nov 2, 2024
  • cluster-docs
    CSS 0 MIT 2 4 0 Updated Oct 18, 2024
  • safetywashing Public

    Measuring correlations between safety benchmarks and general AI capabilities benchmarks.

    Python 2 MIT 0 0 0 Updated Oct 2, 2024
  • course.mlsafety.org
    HTML 3 MIT 0 0 0 Updated Sep 20, 2024
  • forecasting Public

    Forecasting.

    TypeScript 28 9 1 0 Updated Sep 11, 2024
  • HarmBench Public

    HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

    Jupyter Notebook 322 MIT 56 19 4 Updated Aug 16, 2024
  • tdc2023-starter-kit Public

    This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.

    Python 78 MIT 26 0 0 Updated May 19, 2024
  • wmdp Public

    WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method that reduces LLM performance on WMDP while retaining general capabilities.

    Jupyter Notebook 80 MIT 22 5 1 Updated Apr 27, 2024
  • safety_challenge
    HTML 0 MIT 0 0 0 Updated Mar 28, 2024

People

This organization has no public members. You must be a member to see who’s a part of this organization.
