Empathetic Dialogue System

About this project

We developed this project for the course "Efficient Methods in Machine Learning" at the University of Hamburg. We trained a small language model from scratch on our local machines, experimented with training on different datasets, and evaluated and compared the results using BLEU, BERTScore, GLEU and perplexity.
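
Of the metrics listed above, perplexity is the easiest to state in a few lines: it is the exponential of the average negative log-likelihood that the model assigns to the reference tokens. The sketch below is illustrative only and is not taken from the project's evaluation code. The function name `perplexity` and its input format (a list of natural-log token probabilities) are assumptions made for this example.

```python
import math


def perplexity(token_log_probs):
    """Return the perplexity for a sequence of per-token natural-log probabilities.

    `token_log_probs` is a hypothetical input format: one log-probability
    per reference token, as a language model would assign it.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood over the sequence.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# A model that gives every token probability 0.25 is as uncertain as a
# uniform choice among four options, so its perplexity is approximately 4.
uniform_guess = [math.log(0.25)] * 4
print(perplexity(uniform_guess))
```

Lower values mean the model is less "surprised" by the reference text. BLEU, GLEU and BERTScore instead compare generated responses against reference responses.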

The model

Our model is nanoGPT. We experimented with different position embeddings: RoPE (rotary position embeddings), relative positional embeddings, and absolute positional embeddings.
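
To give a feel for how rotary position embeddings differ from learned absolute ones, here is a minimal pure-Python sketch of the rotation RoPE applies to a single query or key vector. It is not the implementation used in this repository. The function names `rope` and `dot`, the interleaved pairing of dimensions, and the base of 10000 are assumptions that follow the common formulation.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector dimension must be even")
    out = []
    for i in range(0, d, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# Both pairs of positions are 2 apart, so the two scores agree
# up to floating-point rounding.
print(dot(rope(q, 3), rope(k, 1)))
print(dot(rope(q, 7), rope(k, 5)))
```

The point of the rotation is that the attention score between a query and a key depends only on their relative offset, whereas an absolute embedding adds a separate learned vector for each position.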

Please check our model readme for the code and detailed information.

The dataset

We use the Empathetic Dialogues (Facebook AI) 25k dataset and augmented it with data generated by ChatGPT-4o-mini.

Please check our dataset readme for the code and detailed information.

| Data | Description | Trained Model |
| --- | --- | --- |
| 59k_eachconv_eot | Under the no_additional_tag folder. Facebook dataset with endOfText inserted after every 2 sentences. | 3 modified models, single_conversation |
| 59k_wholeconv_eot | Under the no_additional_tag folder. Facebook dataset with endOfText inserted at the end of the whole conversation. | whole_conversation |
| 59k_eachconv_eot_with_context | Under the context_tag folder. Facebook dataset with endOfText inserted after every 2 sentences, including context. | single_conversation_withcontext |
| 59k_eachconv_eot_with_emotion | Under the emotion_file folder. Facebook dataset with endOfText inserted after every 2 sentences, including emotion. | single_conversation_withemotion |
| with_gpt_data | Under the with_gpt_data folder. For each question in 59k_eachconv_eot, we generated an additional answer with ChatGPT-4o-mini, giving 118k conversation pairs in total. | single_conversation_withGPTdata_bs256, single_conversation_withGPTdata |
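
The difference between the "eachconv" and "wholeconv" variants can be sketched as follows. This is an illustration, not the project's preprocessing script. The token string `<|endoftext|>`, the function names, the toy conversation, and the choice to treat each utterance as one "sentence" are all assumptions made for the example.

```python
EOT = "<|endoftext|>"  # assumed GPT-2 style token string; the project's may differ


def each_pair_eot(utterances):
    """Insert EOT after every two utterances, one training sample per line."""
    samples = []
    for i in range(0, len(utterances), 2):
        # A trailing odd utterance ends up in a sample of its own.
        samples.append(" ".join(utterances[i:i + 2]) + EOT)
    return "\n".join(samples)


def whole_conv_eot(utterances):
    """Insert a single EOT at the end of the whole conversation."""
    return " ".join(utterances) + EOT


# Made-up conversation, used only to show the two layouts.
conversation = [
    "I failed my driving test today.",
    "I'm sorry to hear that. Will you try again?",
    "Yes, next month.",
    "Good luck, I'm sure you'll pass!",
]

print(each_pair_eot(conversation))
print(whole_conv_eot(conversation))
```

With the first layout the model sees each exchange as an independent sample. With the second, later turns can condition on earlier turns of the same conversation.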

Run the project

Environment Setup

We mainly used VS Code as our IDE during development.

python -m venv env
source env/bin/activate
pip install -r requirements.txt
# Use the absolute path to the src folder on your machine. If you see an error
# such as "No module named 'nanoGPT'", repeat this step.
export PYTHONPATH=/Users/Project-ML/src

Hugging Face Space

The trained model can be queried in our Hugging Face Space.

You can also run it locally by first getting a Hugging Face token and then running:

export HF_TOKEN="HF_XXXXXXXXXXXXX"
cd src/app
gradio App.py
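
If the app fails at startup, a missing or empty `HF_TOKEN` is one likely cause. The helper below is a hypothetical sketch of an early check and is not part of `App.py`. The function name `get_hf_token` and its optional `env` parameter are assumptions introduced for this example.

```python
import os


def get_hf_token(env=None):
    """Return the Hugging Face token from the environment, or fail with a clear message.

    `env` defaults to os.environ; passing a plain dict makes the check easy to try out.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it before starting the app")
    return token


# Demonstration with an explicit mapping and a placeholder value (not a real token).
print(get_hf_token({"HF_TOKEN": "HF_XXXXXXXXXXXXX"}))
```

Failing early with an explicit message is usually easier to debug than an authentication error raised later, when the model is first requested.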

Locally train the model

See the training section in our model readme.

Evaluate the model

Check the section in our model evaluation.