Hey @yusufsyaifudin, thanks for sharing this. Which language are you planning to use? Other than pySBD, which other tools do you work with that have the support you mentioned?
@shahules786 should be able to provide you with better information too.
I work with Bahasa Indonesia and have tried running RAGAS with the default settings (which I assume are English) on three proprietary models: claude-3-haiku-20240307, claude-3-5-haiku-20241022, and claude-3-5-sonnet-20241022.
claude-3-haiku-20240307 always returns a faithfulness score of 1.0 (I only tested with two samples), although I can confirm the score should be near 0. The other two models return 0.0, so at this point I am starting to think the older Haiku version is simply "bad" at reasoning.
Still, I think that using the same language for the prompts as for the test data (Bahasa Indonesia in my case) would probably lead to better reasoning.
> Other than pySBD, which other tools do you work with that have the support you mentioned?
Actually, I don't know of any alternative. Maybe we are still at the point where no single package supports every language for sentence boundary detection.
But, imho, if we create some "abstraction" around sentence segmentation and prompts, could we achieve multi-language support more easily? Probably using nltk or another package.
If I want to extend this or create such an abstraction, which file and line of code should I start reading from? Maybe @shahules786 can help point this out.
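To make the "abstraction" idea a bit more concrete, here is a rough sketch of what I have in mind. It is not RAGAS code: the names `register_segmenter`, `get_segmenter`, `segment`, and `regex_segmenter` are all hypothetical, and the regex splitter is only a naive placeholder, not a real sentence boundary detector. It only assumes that a segmenter is any function that takes a string and returns a list of sentences.

```python
import re
from typing import Callable, Dict, List

# Hypothetical type: a segmenter is any callable mapping text to sentences.
Segmenter = Callable[[str], List[str]]

# Hypothetical registry keyed by lowercase language name.
_SEGMENTERS: Dict[str, Segmenter] = {}


def register_segmenter(language: str, segmenter: Segmenter) -> None:
    """Register a sentence segmenter for a language name."""
    _SEGMENTERS[language.lower()] = segmenter


def regex_segmenter(text: str) -> List[str]:
    """Naive fallback: split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def get_segmenter(language: str) -> Segmenter:
    """Return the registered segmenter, or the regex fallback if none exists."""
    return _SEGMENTERS.get(language.lower(), regex_segmenter)


def segment(text: str, language: str) -> List[str]:
    """Split text into sentences using the segmenter chosen for the language."""
    return get_segmenter(language)(text)


# Example: plug in a segmenter for a language that has no built-in support.
register_segmenter("indonesian", regex_segmenter)
sentences = segment("Saya suka kopi. Apakah kamu suka teh? Ya!", "indonesian")
print(sentences)
```

With something like this, a pySBD-backed or nltk-backed function could be registered in the same way, and the metrics would only depend on the `segment` call rather than on one specific library.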
[x] I checked the documentation and related resources and couldn't find an answer to my question.
Your Question
I want to use RAGAS as my RAG evaluation framework, but I cannot find any supported languages other than those in RAGAS_SUPPORTED_LANGUAGE_CODES, defined at this line: https://github.com/explodinggradients/ragas/blob/v0.2.8/src/ragas/metrics/base.py#L707
After tracing the code, I found that this list comes from pySBD.
pySBD's last commit was 3 years ago, which also makes me wonder: why is that library preferred?
My ultimate question is: how can I add support for a language that is not supported by pySBD (and therefore not by RAGAS)?
The list looks too limited; not a single language from Southeast Asia is supported.
Additional context
If I can extend the language support to languages that are not "natively" supported by RAGAS, where can I find an example of creating a language adapter?
Thank you!