Testset generation feedback #1568

tuan3w · 2024-10-24T03:18:15Z

Hello,

I would like to open this issue to discuss tips and guidance related to test set generation. If you don't think this is the right place, feel free to close this issue.

Recently, I tested Ragas 0.2 with some modifications and generated test sets for Vietnamese. My setup is very simple: I use PyPDFLoader to load PDF documents.

Here are some observations:

1. It generates questions for short, uninformative contexts. Perhaps some rule-based filters could help address this issue.
2. I noticed that short, simple questions were generated even for long contexts. Tweaking the prompt slightly might improve the quality of the questions.
3. I feel that the generated questions are too generic. Perhaps adding a global theme or topic could enhance the question generation process.
4. Ultimately, I believe that having the ability to use custom prompts to control the question generation would be very beneficial.
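
To make the rule-based filter idea from the first observation concrete, here is a minimal sketch in plain Python. It is not part of Ragas: the function names `is_informative` and `filter_chunks` and the two thresholds are hypothetical and would need tuning on real data. It drops chunks that are too short or that consist mostly of digits and symbols (page numbers, table residue) before they reach question generation.

```python
def is_informative(text: str, min_words: int = 50, min_alpha_ratio: float = 0.6) -> bool:
    """Heuristic check: is this chunk long and textual enough to ask about?

    Both thresholds are illustrative assumptions, not values taken from Ragas.
    """
    # Whitespace splitting is a rough word count; for Vietnamese it counts syllables.
    words = text.split()
    if len(words) < min_words:
        return False

    # Share of alphabetic characters among non-whitespace characters.
    # str.isalpha() is Unicode-aware, so accented Vietnamese letters count.
    non_space = [ch for ch in text if not ch.isspace()]
    if not non_space:
        return False
    alpha = sum(ch.isalpha() for ch in non_space)
    return alpha / len(non_space) >= min_alpha_ratio


def filter_chunks(chunks, **kwargs):
    """Keep only the chunks that pass the heuristic, preserving their order."""
    return [chunk for chunk in chunks if is_informative(chunk, **kwargs)]
```

With PyPDFLoader, the strings passed to `filter_chunks` would typically be the `page_content` of each loaded document, and the surviving documents would then be handed to the test set generator.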

Thanks

shahules786 · 2024-10-25T04:56:34Z

Hey @tuan3w,
thanks for your feedback. Can you also share how many documents you used and describe the nature of the data?
Some of these are documentation issues. For example, item 4: you can already customise prompts using the set_prompts and get_prompts methods, as shown here, but this is not documented for testset generation.
The others will need a bit more work. Test generation v3 is very capable, but it still needs polishing.

Also, I just created #1577 to track feedback on it. Feel free to join the conversation and post your feedback there.

ableiweiss · 2024-11-18T08:22:55Z

@shahules786 can you share an example of how this would be done for testset generation? I used set_prompts to modify the prompt for SpecificQuerySynthesizer and verified that the instruction and examples changed, but the change has no effect when I generate the test set.

tuan3w added the "question" label (Further information is requested) Oct 24, 2024

dosubot bot added the "module-testsetgen" label (Module testset generation) Oct 24, 2024

shahules786 mentioned this issue Oct 25, 2024: [R-310] Test set generation improvements #1577 (open, 10 tasks)

shahules786 changed the title from "Testset generation v3 discussion" to "Testset generation feedback" Oct 25, 2024

shahules786 self-assigned this Oct 25, 2024
