
KeyError: 0 by following the Get Started "Run ragas metrics for evaluating RAG" #1770

Open
parkerzf opened this issue Dec 18, 2024 · 1 comment
Labels: bug (Something isn't working), module-metrics (this is part of metrics module)

Comments

parkerzf commented Dec 18, 2024

[*] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
The Get Started code for RAG evaluation raises a runtime exception: KeyError: 0. The cause is that faithfulness comes back as nan for some evaluation cases. If I remove the faithfulness metric, it runs smoothly.

Ragas version: 0.28.0
Python version: python 3.9

Code to Reproduce
The code is from the Get Started guide: https://docs.ragas.io/en/stable/getstarted/rag_evaluation/
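For reference, that page boils down to roughly the following (a sketch reconstructed from the docs; the evaluator model, sample contents, and variable names are assumptions, and Faithfulness() is the metric that, when removed, makes the run succeed):

```python
from langchain_openai import ChatOpenAI  # evaluator model choice is an assumption
from ragas import EvaluationDataset, evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness

# Wrap an evaluator LLM for ragas
evaluator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini"))

# Placeholder samples; the real data comes from the Get Started guide
samples = [
    {
        "user_input": "example question",
        "retrieved_contexts": ["example retrieved context"],
        "response": "example answer",
        "reference": "example ground-truth answer",
    },
]
eval_dataset = EvaluationDataset.from_list(samples)

metrics = [
    LLMContextRecall(llm=evaluator_llm),
    FactualCorrectness(llm=evaluator_llm),
    Faithfulness(llm=evaluator_llm),  # dropping this metric avoids the KeyError for me
]
results = evaluate(dataset=eval_dataset, metrics=metrics)  # <-- raises KeyError: 0
```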

Error trace
While running this line: results = evaluate(dataset=eval_dataset, metrics=metrics), I get the exception:

outputs={'faithfulness': nan},
KeyError: 0

Expected behavior
The evaluation results should be computed and output for all metrics.


parkerzf added the bug label on Dec 18, 2024
dosubot added the module-metrics label on Dec 18, 2024
trish11953 commented Dec 18, 2024

I hit a similar error:

```
Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries.
Prompt claim_decomposition_prompt failed to parse output: The output parser failed to parse the output including retries.
Exception raised in Job[17]: RagasOutputParserException(The output parser failed to parse the output including retries.)
/home/trisha/Desktop/.venv/lib/python3.10/site-packages/ragas/metrics/_answer_similarity.py:88: RuntimeWarning: invalid value encountered in divide
  embedding_2_normalized = embedding_2 / norms_2
..........
    "output": prompt_trace.outputs.get("output", {})[0],
KeyError: 0
```
The error is raised while running: result = evaluate(dataset=dataset, metrics=metrics)

Code to reproduce the error:

```python
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness, SemanticSimilarity
from ragas import evaluate
import pandas as pd
from datasets import Dataset
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

# Wrap the local Ollama model and embeddings for ragas
langchain_llm = ChatOllama(model="llama3")
langchain_embeddings = OllamaEmbeddings(model="llama3")
ragas_llm = LangchainLLMWrapper(langchain_llm=langchain_llm)
ragas_emb = LangchainEmbeddingsWrapper(embeddings=langchain_embeddings)

# Load the evaluation data from CSV
file_path = 'rag_dataset.csv'
df = pd.read_csv(file_path, encoding='utf-8')

user_input = df.iloc[:, 0].tolist()
response = df.iloc[:, 1].tolist()
retrieved_contexts = df.iloc[:, 2].tolist()
reference_contexts = df.iloc[:, 2].tolist()
ground_truths = df.iloc[:, 3].tolist()
reference = df.iloc[:, 4].tolist()

# Each sample's contexts must be a list of strings
retrieved_contexts = [[item] for item in retrieved_contexts]
reference_contexts = [[item] for item in reference_contexts]

data = {
    "question": user_input,
    "answer": response,
    "retrieved_contexts": retrieved_contexts,
    "reference_contexts": reference_contexts,
    "reference": reference,
}

dataset = Dataset.from_dict(data)

metrics = [
    LLMContextRecall(llm=ragas_llm),
    FactualCorrectness(llm=ragas_llm),
    Faithfulness(llm=ragas_llm),
    SemanticSimilarity(embeddings=ragas_emb),
]

result = evaluate(dataset=dataset, metrics=metrics)
print(result)
```
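If it helps to narrow down which rows trigger the failure, one option (a sketch, reusing the dataset and metrics from the snippet above) is to evaluate one sample at a time and see which indices fail:

```python
# Debugging sketch: evaluate one sample at a time to find the rows that fail.
# Reuses `dataset` and `metrics` from the script above (assumption: same session).
for i in range(len(dataset)):
    single = dataset.select([i])  # one-row datasets.Dataset
    try:
        res = evaluate(dataset=single, metrics=metrics)
        print(i, res)
    except Exception as exc:  # e.g. the KeyError: 0 / RagasOutputParserException above
        print(i, "failed:", exc)
```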
