This repo helps you visualize sql-eval results, making it easier to see where the model performs suboptimally and which tokens it loses confidence on.
- Make sure that this repo is contained within the `sql-eval` folder. `cd` into the folder containing this repo, and install dependencies with `npm i`.
- Create a `.env` file with the same variables as `.env.template`, then fill them in with your own values.
- Run the repo with `npm run dev` (see the consolidated sketch after this list).
- Profit!
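Put together, setup looks something like the following. Note that `sql-eval-visualizer` is a hypothetical folder name used for illustration; substitute the actual directory name of this repo:

```bash
# From inside the folder containing your sql-eval checkout;
# "sql-eval-visualizer" is a placeholder for this repo's directory name.
cd sql-eval/sql-eval-visualizer

# Install dependencies
npm i

# Copy the env template and fill in your own values
cp .env.template .env

# Start the dev server
npm run dev
```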
Note: Currently, only the custom Defog implementation of the vLLM API server is supported. We hope to expand this to other runners in the future.
Run sql-eval using the vLLM API runner, with the `--logprobs` command-line flag enabled, like below.
```bash
python main.py \
  -db postgres \
  -q "data/questions_gen_postgres.csv" "data/instruct_basic_postgres.csv" "data/instruct_advanced_postgres.csv" \
  -o results/classic.csv results/basic.csv results/advanced.csv \
  -g api \
  -b 1 \
  -f prompts/prompt.md \
  --api_url "YOUR_API_URL" \
  --api_type "vllm" \
  -p 20 \
  -c 0 \
  --logprobs
```
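After the run completes, you can spot-check that the log probabilities actually made it into the output by peeking at the first rows of one of the result files. The exact column layout of sql-eval's output is an assumption here, so treat this as a rough sanity check rather than a documented format:

```bash
# Peek at the header row and first result; a logprobs-related column
# should appear if the flag took effect (the column name may differ).
head -n 2 results/classic.csv
```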
Some improvements we plan to make:

- Let users execute the SQL queries they see with a single click, and view the resulting tables
- Compare the results of two different runs, instead of just looking at results from a single run
- Let users manually mark some squares as "almost correct" or "partially correct" in the UI, to differentiate responses that are almost correct from those that are far off