Testset generation feedback #1568

Open
tuan3w opened this issue Oct 24, 2024 · 2 comments
Labels
module-testsetgen Module testset generation question Further information is requested

Comments

tuan3w commented Oct 24, 2024

Hello,

I would like to open this issue to discuss tips and guidance related to test set generation. If you don't think this is the right place, feel free to close this issue.

Recently, I tested Ragas 0.2 with some modifications and generated test sets for Vietnamese. My setup is very simple: I use PyPDFLoader to load the PDF documents.

Here are some observations:

  1. It generates questions even for short, uninformative contexts. Perhaps some rule-based filters could help address this issue.

image
image

  2. I noticed that short, simple questions were generated even for long contexts. Tweaking the prompt slightly might improve the quality of the questions.

image

  3. I feel that the generated questions are too generic. Perhaps adding a global theme or topic could enhance the question generation process.
  4. Ultimately, I believe that having the ability to use custom prompts to control the question generation would be very beneficial.
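
The rule-based filter suggested in item 1 could look roughly like the sketch below. It is not part of Ragas: `Chunk`, `is_informative`, `filter_chunks`, and both thresholds are hypothetical names and values chosen for illustration, and the idea is simply to drop very short or mostly non-textual chunks before they reach the generator.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a loaded page or chunk of a PDF."""
    text: str


def is_informative(text: str, min_words: int = 30, min_alpha_ratio: float = 0.6) -> bool:
    """Heuristic check that a chunk is worth generating a question from.

    Both thresholds are illustrative assumptions and need tuning per corpus.
    """
    if len(text.split()) < min_words:
        return False  # too short: headers, page numbers, captions
    visible = [c for c in text if not c.isspace()]
    if not visible:
        return False
    # str.isalpha is Unicode-aware, so Vietnamese letters count as letters.
    alpha_ratio = sum(c.isalpha() for c in visible) / len(visible)
    return alpha_ratio >= min_alpha_ratio


def filter_chunks(chunks, **thresholds):
    """Keep only the chunks that pass the heuristic."""
    return [c for c in chunks if is_informative(c.text, **thresholds)]
```

A filter like this would run between loading the documents and building the test set, so that page headers or number-heavy tables never become question contexts.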

Thanks

@tuan3w tuan3w added the question Further information is requested label Oct 24, 2024
@dosubot dosubot bot added the module-testsetgen Module testset generation label Oct 24, 2024
shahules786 (Member) commented

Hey @tuan3w,
thanks for your feedback. Can you also share how many documents you used and describe the nature of the data?
Some of these are documentation issues. For example, for item 4 you can already customise prompts using the set_prompts and get_prompts methods, as in here, but this is not documented for testset generation.
Others would need a bit more polishing. Test generation v3 is very capable, but it still needs work to make it better.

Also, I just created #1577 to track feedback for it. Feel free to join the conversation and post your feedback there.

@shahules786 shahules786 changed the title from "Testset generation v3 discussion" to "Testset generation feedback" Oct 25, 2024
@shahules786 shahules786 self-assigned this Oct 25, 2024

ableiweiss commented Nov 18, 2024

@shahules786 can you share an example of how this would be done for testset? I used set_prompts to modify the prompt for SpecificQuerySynthesizer and verified that it changes the instruction and examples, but it has no effect when I generate.
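
The set-then-verify step described in this thread can be sketched as below. This is not the Ragas implementation. It assumes only that the component exposes `get_prompts()` returning a name-to-prompt dict whose values have an `instruction` attribute, and `set_prompts()` accepting prompts as keyword arguments. `StubPrompt`, `StubSynthesizer`, `override_instruction`, and the prompt name `generate_query` are hypothetical stand-ins so the sketch runs on its own.

```python
import copy


def override_instruction(component, prompt_name: str, new_instruction: str) -> None:
    """Replace one prompt's instruction and read it back to confirm the change."""
    prompts = component.get_prompts()
    if prompt_name not in prompts:
        raise KeyError(f"unknown prompt {prompt_name!r}; available: {sorted(prompts)}")
    prompt = copy.deepcopy(prompts[prompt_name])  # avoid mutating a shared object
    prompt.instruction = new_instruction
    component.set_prompts(**{prompt_name: prompt})
    # Confirms only what the component reports, not what generation later uses.
    assert component.get_prompts()[prompt_name].instruction == new_instruction


class StubPrompt:
    """Hypothetical prompt object; only the attribute used above is modelled."""
    def __init__(self, instruction: str):
        self.instruction = instruction


class StubSynthesizer:
    """Hypothetical stand-in for a synthesizer; not a Ragas class."""
    def __init__(self):
        self._prompts = {"generate_query": StubPrompt("Write a question.")}

    def get_prompts(self):
        return dict(self._prompts)

    def set_prompts(self, **prompts):
        self._prompts.update(prompts)


synth = StubSynthesizer()
override_instruction(synth, "generate_query", "Write a detailed question about the given theme.")
```

A passing read-back only shows that the object handed to `set_prompts` was stored. If generation builds or copies its synthesizers elsewhere, the override would not reach them, which would match the behaviour reported above.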
