Limiting Description and Commit Messages Length #187
pr_agent/algo/pr_processing.py
Outdated
"""
# We'll estimate the number of tokens by heuristically assuming 2.5 tokens per word
words = re.finditer(r'\S+', text)
max_words = max_tokens // 2.5
I really don't like the 2.5 factor.
It would be far better to encode the original text and estimate the tokens-to-chars ratio from that.
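One way to read this suggestion, as a minimal sketch rather than the code that was eventually merged: count the tokens of the full text once, derive the average characters per token, and cut the string at the matching character offset. The `count_tokens` parameter and the `whitespace_token_count` helper are assumptions made for illustration; in practice a real tokenizer's counting function would be passed in instead.

```python
from typing import Callable


def clip_tokens(text: str, max_tokens: int, count_tokens: Callable[[str], int]) -> str:
    """Clip `text` to roughly `max_tokens`, using the measured chars-per-token ratio.

    `count_tokens` is a hypothetical hook: any callable that returns the
    number of tokens in a string (for example, one backed by a real tokenizer).
    """
    if not text or max_tokens <= 0:
        # Assumption for this sketch: a non-positive budget yields an empty string.
        return ""

    num_tokens = count_tokens(text)
    if num_tokens <= max_tokens:
        # Already within budget, nothing to clip.
        return text

    # Estimate how many characters one token covers in *this* text,
    # instead of assuming a fixed tokens-per-word factor.
    chars_per_token = len(text) / num_tokens
    max_chars = int(chars_per_token * max_tokens)
    return text[:max_chars]


def whitespace_token_count(text: str) -> int:
    """Stand-in tokenizer for the example only: one token per whitespace-separated word."""
    return len(text.split())


long_text = "aaaa " * 100  # 500 characters, 100 "tokens" under the stand-in tokenizer
clipped = clip_tokens(long_text, 10, whitespace_token_count)
```

Because the ratio is measured on the actual input, the estimate adapts to the content (code, prose, non-English text) rather than relying on a constant. The result is still approximate, since tokens are not evenly sized.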
pr_agent/tools/pr_reviewer.py
Outdated
@@ -62,6 +62,8 @@ def __init__(self, pr_url: str, is_answer: bool = False, args: list = None):
"extra_instructions": get_settings().pr_reviewer.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(),
}
self.vars["description"] = clip_tokens(self.vars["description"], 500)
Why only for the reviewer? What about the other tools?
It would be better to apply the clipping inside 'self.git_provider.get_pr_description()' and
'self.git_provider.get_commit_messages()', so every tool gets it.
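The idea behind this comment can be sketched as follows, under stated assumptions: clipping is applied where the text is fetched, so each tool that reads the description or commit messages receives the already-limited version. `FakeGitProvider` and `ClippedGitProvider` are hypothetical names used only for this illustration, and the character-based `clip` lambda stands in for whatever token-aware clipping function the project uses.

```python
from typing import Callable


class FakeGitProvider:
    """Hypothetical stand-in for a git provider; returns oversized text."""

    def get_pr_description(self) -> str:
        return "word " * 1000  # 5000 characters

    def get_commit_messages(self) -> str:
        return "fix typo\n" * 1000  # 9000 characters


class ClippedGitProvider:
    """Hypothetical wrapper that clips text at the source, once, for all callers."""

    def __init__(self, provider, clip: Callable[[str], str]):
        self._provider = provider
        self._clip = clip

    def get_pr_description(self) -> str:
        return self._clip(self._provider.get_pr_description())

    def get_commit_messages(self) -> str:
        return self._clip(self._provider.get_commit_messages())


# The lambda is a placeholder; a token-based clip function would be used in practice.
provider = ClippedGitProvider(FakeGitProvider(), clip=lambda s: s[:2000])
description = provider.get_pr_description()
commit_messages = provider.get_commit_messages()
```

With this arrangement, no individual tool needs its own `clip_tokens` call. The same effect could also be achieved by editing the provider methods directly, which is closer to what the comment proposes.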
Limiting Description and Commit Messages Length
PR Type:
Enhancement
PR Description:
This PR introduces a method to limit the length of the description and commit messages to a maximum number of tokens. This is achieved by adding a new function 'clip_tokens' that clips the number of tokens in a string to a maximum number. The function is then used to limit the length of 'description' and 'commit_messages_str' in the PR reviewer.
PR Main Files Walkthrough:
pr_agent/algo/pr_processing.py: A new function 'clip_tokens' is added. This function takes a string and a maximum number of tokens as input and returns the string clipped to the maximum number of tokens.
pr_agent/tools/pr_reviewer.py: The 'clip_tokens' function is imported and used to limit the length of 'description' and 'commit_messages_str'.