BillChan226/README.md

Hi there, I'm Zhaorun (Personal Website) 👋

Connect with me:

HZ | GoogleScholar | Twitter


πŸ–οΈ My Research Interests

  • Trustworthy deployment of, and safe interaction with, large foundation models and agents, from both theoretical and empirical perspectives.
  • Enhancing the trustworthiness of LLMs via retrieval-augmented generation (RAG) and robustness certificates for hallucination, alignment, jailbreaks, and privacy.


Pinned

  1. HALC (Public)

    [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"

    Python · 73 stars · 1 fork

  2. AI-secure/AgentPoison (Public)

    [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"

    Python · 78 stars · 7 forks

  3. MJ-Bench/MJ-Bench (Public)

    Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"

    Jupyter Notebook · 40 stars · 5 forks

  4. SafeWatch (Public)

    Official implementation for "SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations"

    4 stars