feat: omegaPRM reproduced by openR: Process-supervision Data Generation (PRM) #1280
Thanks @zjrwtx, could we refactor this PR to move the core modules under the camel folder? Just like you did for the O1 data gen PR.
Yeah, sure, thanks!
Please follow Camel code style by adding comments to all the classes and functions. For functions, please add types to the parameters and indicate the returned type as well. Thanks!
def load_config(config_path):
    """
Please update all the docstrings to use the r""" format.
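A minimal sketch of the style the two comments above ask for: typed parameters, an annotated return type, and an r""" docstring with Args and Returns sections. The body of the original function is not shown in the diff, so the JSON loader below is only a stand-in for illustration.

```python
import json
from typing import Any, Dict


def load_config(config_path: str) -> Dict[str, Any]:
    r"""Load a configuration file into a dictionary.

    Args:
        config_path (str): Path to the configuration file. JSON is
            assumed here purely for illustration.

    Returns:
        Dict[str, Any]: The parsed configuration.
    """
    # Stand-in implementation; the PR's actual loader may differ.
    with open(config_path, "r", encoding="utf-8") as f:
        return json.load(f)
```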
load_dotenv()


class LM:
Please add a comment explaining the purpose of the class, following the same format as any existing CAMEL class.
For each step:
1. Write down what you're calculating
2. Show the calculation
3. Explain the result
Always show your work, even for simple calculations.
End your solution with the final numerical answer.''',
Please format the code using indents.
Always show your work, even for simple calculations.
End your solution with the final numerical answer.''',
model=self.model,
message_window_size=10,
Why do you limit the window size to 10?
Current solution steps:
{partial_answer}
Continue the solution, showing all steps and calculations.
Make sure to explain each step:"""
else:
    prompt = f"""Problem: {question}
Please solve this step by step, showing all calculations and
explaining each step.
Remember to:
1. Break down the problem
2. Show all calculations
3. Explain each step
4. End with the final numerical answer."""
Please format the code using indents.
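One possible way to address the indentation comment, sketched with the standard library's `textwrap.dedent`: the prompt text can then be indented along with the surrounding code while the string sent to the model keeps no leading whitespace. The names `_INITIAL_PROMPT` and `build_initial_prompt` are hypothetical and are not part of the PR.

```python
import textwrap

# Dedent the template once, before substituting values, so that a
# multi-line question cannot change how much whitespace is removed.
_INITIAL_PROMPT = textwrap.dedent(
    """\
    Problem: {question}
    Please solve this step by step, showing all calculations and
    explaining each step.
    Remember to:
    1. Break down the problem
    2. Show all calculations
    3. Explain each step
    4. End with the final numerical answer."""
)


def build_initial_prompt(question: str) -> str:
    r"""Fill the question into the dedented prompt template.

    Args:
        question (str): The problem statement to solve.

    Returns:
        str: The prompt text with no leading indentation on any line.
    """
    return _INITIAL_PROMPT.format(question=question)
```

Calling `.format` on an already dedented template, rather than dedenting an f-string, keeps the result stable even when the substituted text spans several lines.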
from omegaprm_v2 import OmegaPRMV2


class Node:
Please add comments for the class and all the functions.
Please fix all the failing tests. You could run
def main():
    # Direct configuration instead of loading from yaml
    config = {
        'input': {'json_file_path': 'example_problems.json'},
        'output': {
            'file_prefix': 'example',
            'log_file_path': 'example_processing.log',
        },
        'processing': {
            'initial_rollouts': 30,  # 增加初始rollouts数量 (increase the number of initial rollouts)
            'num_rollouts': 25,  # 增加每次迭代的rollouts数量 (increase the number of rollouts per iteration)
            'max_iterations': 150,  # 增加最大迭代次数 (increase the maximum number of iterations)
        },
        'model': {
            'model_type': 'camel',
            'model_name': 'deepseek-chat',
            'model_args': {
                'max_tokens': 300,  # 增加最大token数 (increase the maximum number of tokens)
                'temperature_range': [0.6, 0.9],  # 调整温度范围,增加多样性 (adjust the temperature range to increase diversity)
            },
        },
    }
Hi Yifeng, please comment in English.
Thanks Yifeng! Shall we include a reference for code migrated from the openR repo?
@willshang76 @harryeqs thanks! I will refactor this.
self.latest_id_per_rollout[rollout_key] = unique_id
heapq.heappush(self.heap, entry)

def pop(self) -> Tuple[Optional[State], Optional[str]]:
It seems this checks whether each unique_id in the heap is still valid, which might add computational overhead when the heap is accessed and modified repeatedly. Any chance we could optimize this? Maybe, instead of deleting invalid entries immediately, mark them as "invalid" and skip them during pop()? We could probably also periodically clean up invalidated entries from entry_finder to prevent memory bloat.
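The suggestion above can be sketched as a heap with lazy invalidation. This is a minimal illustration under stated assumptions, not the PR's implementation: the class `LazyPriorityQueue`, its `compact_ratio` parameter, and the max-priority ordering are all hypothetical. Superseded entries stay in the heap, `pop()` skips them, and the heap is rebuilt only once stale entries exceed a fraction of its size.

```python
import heapq
import itertools
from typing import Dict, Hashable, List, Optional, Tuple


class LazyPriorityQueue:
    r"""Max-priority queue with lazy invalidation (illustrative sketch).

    Args:
        compact_ratio (float): Rebuild the heap once stale entries exceed
            this fraction of its size. (default: :obj:`0.5`)
    """

    def __init__(self, compact_ratio: float = 0.5) -> None:
        self._heap: List[Tuple[float, int, Hashable]] = []
        self._latest_id: Dict[Hashable, int] = {}  # key -> id of live entry
        self._counter = itertools.count()
        self._stale = 0
        self._compact_ratio = compact_ratio

    def push(self, key: Hashable, priority: float) -> None:
        r"""Insert or update a key; any older entry becomes stale."""
        if key in self._latest_id:
            self._stale += 1  # the previous entry is superseded, not removed
        entry_id = next(self._counter)
        self._latest_id[key] = entry_id
        # Negate the priority because heapq is a min-heap; the unique
        # entry_id breaks ties so keys are never compared.
        heapq.heappush(self._heap, (-priority, entry_id, key))
        if self._stale > self._compact_ratio * len(self._heap):
            self._compact()

    def pop(self) -> Optional[Tuple[Hashable, float]]:
        r"""Return the live key with the highest priority, or None."""
        while self._heap:
            neg_priority, entry_id, key = heapq.heappop(self._heap)
            if self._latest_id.get(key) == entry_id:
                del self._latest_id[key]
                return key, -neg_priority
            self._stale -= 1  # skipped a stale entry
        return None

    def _compact(self) -> None:
        r"""Drop all stale entries and restore the heap invariant."""
        self._heap = [
            e for e in self._heap if self._latest_id.get(e[2]) == e[1]
        ]
        heapq.heapify(self._heap)
        self._stale = 0

    def __len__(self) -> int:
        return len(self._latest_id)
```

With this layout, `push` and `pop` stay at amortized O(log n), and the occasional O(n) rebuild bounds how much memory stale entries can hold.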
…models||first experimental version
Description
Add example: omegaPRM reproduced by openR: Process-supervision Data Generation (PRM)
Reference: https://github.com/openreasoner/openr/tree/main/data
Motivation and Context
Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax close #15213 if this solves the issue #15213.
Types of changes
What types of changes does your code introduce? Put an x in all the boxes that apply:
Implemented Tasks
Checklist
Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!