DCAPI Chat Evaluation

This repo contains two notebooks to help run evaluations of DCAPI's chat functionality.

PrepareEvaluationData - Takes a list of questions and gets answers from DCAPI. Outputs a spreadsheet to be used as input for the ScoreAnswers notebook (or for Azure evaluations).
ScoreAnswers - Takes a spreadsheet with question, answer, and ground_truths columns produced by PrepareEvaluationData and scores the responses using AWS Bedrock.
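
Before handing a file to ScoreAnswers, it can help to confirm that it has the expected columns. The following is a minimal sketch, not code from either notebook. It assumes the spreadsheet has been exported as CSV; `missing_columns` and the sample rows are hypothetical, and only the column names come from the description above.

```python
import csv
import io

# Column names taken from the ScoreAnswers description above.
REQUIRED_COLUMNS = ("question", "answer", "ground_truths")


def missing_columns(csv_file):
    """Return the required columns absent from the header row of a CSV file object."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Hypothetical sample standing in for a PrepareEvaluationData output file.
sample = io.StringIO(
    "question,answer,ground_truths\n"
    "example question,example answer,example ground truth\n"
)
print(missing_columns(sample))  # prints []
```

For a real file, pass an open file handle instead, for example `open(path, newline="")`, where `path` is wherever the spreadsheet was saved.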

AWS and DCAPI authorization

PrepareEvaluationData - Requires you to obtain a DCAPI authorization token and set up environment variables.
ScoreAnswers - Requires you to be logged in to AWS as either a staging or production user (log in from your terminal before launching your Jupyter notebook).
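
A notebook cell can fail early with a clear message when a required environment variable is missing. This is a minimal sketch under assumptions: the variable name `DCAPI_TOKEN` and the helper `require_env` are hypothetical, so substitute whatever names the notebook actually reads.

```python
import os

# Hypothetical variable name; use the name the notebook actually expects.
TOKEN_VAR = "DCAPI_TOKEN"


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env(TOKEN_VAR)` at the top of the notebook surfaces a missing token before any requests are sent to DCAPI.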

Environment Setup (optional)

Python virtual environments can be a great way to bundle a collection of libraries for a specific research area or project and keep it separate from other activities. There are two steps: First, you must create the virtual environment; second, you must install the virtual environment as a Jupyter kernel.
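
The two steps can be sketched in Python as follows. This is one possible approach, not the repo's prescribed setup. It uses the standard library `venv` module and the `ipykernel install` command; the function names, and any environment directory or kernel name passed to them, are hypothetical.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir):
    """Return the path to the Python executable inside a virtual environment."""
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def kernel_install_command(env_dir, kernel_name):
    """Build the command that registers the environment as a Jupyter kernel."""
    return [
        str(env_python(env_dir)),
        "-m", "ipykernel", "install",
        "--user", "--name", kernel_name,
    ]


def create_env_and_kernel(env_dir, kernel_name):
    """Create a virtual environment, then install it as a Jupyter kernel."""
    # Step 1: create the virtual environment, with pip available inside it.
    venv.create(env_dir, with_pip=True)
    # ipykernel must be installed in the environment before it can be registered.
    subprocess.run(
        [str(env_python(env_dir)), "-m", "pip", "install", "ipykernel"],
        check=True,
    )
    # Step 2: install the environment as a Jupyter kernel.
    subprocess.run(kernel_install_command(env_dir, kernel_name), check=True)
```

For example, `create_env_and_kernel(".venv", "chat-eval")` would create `.venv` in the current directory and register a kernel named `chat-eval`, which can then be selected from the Jupyter kernel menu.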

Here are some resources describing how to do this:
