Grad student in NLP. I make contributions here sporadically.
Pinned
- ContextualAI/HALOs (Public): A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
- dataset_difficulty (Public): "Understanding Dataset Difficulty with V-Usable Information" (ICML 2022, outstanding paper)
- huggingface/trl (Public): Train transformer language models with reinforcement learning.