This is a simple implementation of RLHF based on the paper "Learning to summarize from human feedback", developed for the DS8008 Natural Language Processing course at Toronto Metropolitan University as part of its Data Science Master's (MSc) program.
The original raw data used for this experiment comes from Reddit posts, available from the links below.
Due to infrastructure limitations, we used the small version of the preprocessed data provided by Google:
- Preference dataset: gs://vertex-ai/generative-ai/rlhf/text_small/summarize_from_feedback_tfds/comparisons/train/*.jsonl (Stored as datasets/preference_dataset.jsonl)
- Prompt dataset: gs://vertex-ai/generative-ai/rlhf/text_small/reddit_tfds/train/*.jsonl (Stored as datasets/prompt_dataset.jsonl)
- Test/Validation dataset: gs://vertex-ai/generative-ai/rlhf/text_small/reddit_tfds/val/*.jsonl (Stored as datasets/validate_dataset.jsonl)
These datasets are downloaded and stored under the datasets/ folder, as sketched below.
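The following is a minimal sketch of how the three splits could be pulled from the bucket into datasets/. It assumes the google-cloud-storage package is installed and the environment is authenticated with GCP (or the bucket is publicly readable); the local file naming here is illustrative, while the notebook stores them as the single files listed above.

```python
# Sketch: download the small preprocessed datasets from the GCS bucket
# into the local datasets/ folder. Assumes google-cloud-storage is installed
# and credentials are available (e.g. via GOOGLE_APPLICATION_CREDENTIALS).
import os
from google.cloud import storage

BUCKET = "vertex-ai"
PREFIXES = {
    "generative-ai/rlhf/text_small/summarize_from_feedback_tfds/comparisons/train/": "datasets/preference",
    "generative-ai/rlhf/text_small/reddit_tfds/train/": "datasets/prompt",
    "generative-ai/rlhf/text_small/reddit_tfds/val/": "datasets/validate",
}

client = storage.Client()
os.makedirs("datasets", exist_ok=True)

for prefix, local_stub in PREFIXES.items():
    for blob in client.list_blobs(BUCKET, prefix=prefix):
        if blob.name.endswith(".jsonl"):
            local_path = f"{local_stub}_{os.path.basename(blob.name)}"
            blob.download_to_filename(local_path)
            print(f"Downloaded gs://{BUCKET}/{blob.name} -> {local_path}")
```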
This project is not implemented to run on a local machine. It is implemented for Google Cloud Platform (GCP), specifically to run on Vertex AI. Follow the steps below to execute this project.
- Place the GCP key file under the keys/ folder (this is required to authenticate with the GCP project where the experiment will run; a minimal authentication sketch follows this list).
- Open the nlp_rlhf_project.ipynb file and follow the instructions.
- Please note that running this notebook will incur cost (budget approximately 400-600 CAD) and will take approximately 1 day 4 hours to complete the pipeline run with the current settings.
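As a rough sketch of the authentication step above, the key file placed under keys/ can be used to initialize the Vertex AI SDK as shown below. The key filename, project ID, and region are placeholders, not values from this repository; replace them with your own.

```python
# Sketch: authenticate with the service-account key under keys/ and
# initialize the Vertex AI SDK. Filename, project ID, and region are
# placeholders to be replaced with your own values.
import os
from google.cloud import aiplatform

KEY_PATH = "keys/gcp_key.json"      # placeholder key filename
PROJECT_ID = "your-gcp-project-id"  # placeholder project ID
REGION = "us-central1"              # placeholder region

# Point the Google client libraries at the key file, then initialize Vertex AI.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = KEY_PATH
aiplatform.init(project=PROJECT_ID, location=REGION)
print("Vertex AI initialized for project:", PROJECT_ID)
```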
- Learning to summarize from human feedback (base paper) link
- Secrets of RLHF in Large Language Models, Part I: PPO link
- Secrets of RLHF in Large Language Models, Part II: Reward Modeling link
- Tutorial: Reinforcement Learning from Human Feedback (code implementation) link
- Google Cloud RLHF link
- Wangchunshu Zhou, Ke Xu, "Learning to compare for better training and evaluation of open domain natural language generation models", 2020, link
- Daniel M. Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, Geoffrey Irving, "Fine-tuning language models from human preferences", 2020, link