Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BC] Add an interactive only mode for env.py #377

Merged
merged 26 commits into from
Oct 2, 2024
Merged
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
a06d452
Combine two tf_agents policies with timestep spec
tvmarino Sep 3, 2024
15876f7
Merge branch 'main' of https://github.com/google/ml-compiler-opt
tvmarino Sep 3, 2024
9bc8c05
Added licence.
tvmarino Sep 6, 2024
86e4d12
yapf . -ir
tvmarino Sep 6, 2024
27dee69
yapf . -ir
tvmarino Sep 6, 2024
47f5efc
Fixed pylint errors.
tvmarino Sep 6, 2024
bf12cee
Merge branch 'policy_combiner' of https://github.com/tvmarino/ml-comp…
tvmarino Sep 6, 2024
f5b6b6f
yapf . -ir
tvmarino Sep 6, 2024
35d9e8c
Fixed super without arguments pylint error.
tvmarino Sep 6, 2024
5d6783d
yapf . -ir
tvmarino Sep 6, 2024
7997f14
Fixing pytype annotations.
tvmarino Sep 6, 2024
59d3677
Fixed pytype errors. Addressed comments.
tvmarino Sep 6, 2024
6d8c0c7
Addressed comments.
tvmarino Sep 9, 2024
3b0cefd
Resolved _distribution and common.gin comments.
tvmarino Sep 9, 2024
78460ce
Fixed Aiden's nits.
tvmarino Sep 9, 2024
5b5d67b
Merge branch 'google:main' into behavior_cloning
tvmarino Sep 30, 2024
6be7186
Patch to env.py and compilation_runner.py which adds working_dir to
tvmarino Oct 1, 2024
6342dda
Fixed comments.
tvmarino Oct 1, 2024
3082ae7
Fixed pylint.
tvmarino Oct 1, 2024
2e26243
Fixed a nit
tvmarino Oct 2, 2024
56fa72a
Added interactive only mode for env.py which
tvmarino Oct 2, 2024
463d813
Merge branch 'google:main' into behavior_cloning
tvmarino Oct 2, 2024
5568aaf
Improved _get_clang_generator documentation in env.py.
tvmarino Oct 2, 2024
0c20688
Merge branch 'behavior_cloning' of https://github.com/tvmarino/ml-com…
tvmarino Oct 2, 2024
ac307fc
Address a nit.
tvmarino Oct 2, 2024
07a77ce
Fixed pylint.
tvmarino Oct 2, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 17 additions & 8 deletions compiler_opt/rl/env.py
Original file line number Diff line number Diff line change
Expand Up @@ -290,18 +290,22 @@ def _get_scores() -> dict[str, float]:
def _get_clang_generator(
clang_path: str,
task_type: Type[MLGOTask],
interactive_only: bool = False,
) -> Generator[Optional[Tuple[ClangProcess, InteractiveClang]],
Optional[corpus.LoadedModuleSpec], None]:
"""Returns a generator for creating InteractiveClang objects.

TODO: fix this docstring
"""Returns a tuple of generators for creating InteractiveClang objects.

Args:
clang_path: Path to the clang binary to use within InteractiveClang.
task_type: Type of the MLGO task to use.
interactive_only: If set to true the returned tuple of generators is
iclang, iclang instead of iclang, clang

Returns:
The generator for InteractiveClang objects.
A tuple of generators created with clang_session. First argument of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think the return is a generator of tuples.

the tuple is always an interactive clang session. The second argumnet
is a default clang session if interactive_only is False and otherwise
the exact same interactive clang session object as the first argument.
"""
while True:
# The following line should be type-hinted as follows:
Expand All @@ -311,9 +315,12 @@ def _get_clang_generator(
module = yield
with clang_session(
clang_path, module, task_type, interactive=True) as iclang:
with clang_session(
clang_path, module, task_type, interactive=False) as clang:
yield iclang, clang
if interactive_only:
yield iclang, iclang
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we return iclang here, i.e. not a tuple, and detect that on the receiving end? or (iclang, None)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relative reward is computed, that is, the compute_relative_rewards function of env.py always expects two scores. I found this to be the cleanest change. Returning only iclang and None will end up being identical but would require multiple changes and I didn't want to break anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - can you update the documentation string explaining this, too? (i.e. it's not quite "interactive_only: If set to true only iclang is yielded" - looking at the docstring, the return isn't quite well specified actually, what the generator generates is what's interesting - could you basically make the doc "right"?)

else:
with clang_session(
clang_path, module, task_type, interactive=False) as clang:
yield iclang, clang


class MLGOEnvironmentBase:
Expand All @@ -332,8 +339,10 @@ def __init__(
task_type: Type[MLGOTask],
obs_spec,
action_spec,
interactive_only: bool = False,
):
self._clang_generator = _get_clang_generator(clang_path, task_type)
self._clang_generator = _get_clang_generator(
clang_path, task_type, interactive_only=interactive_only)
self._obs_spec = obs_spec
self._action_spec = action_spec

Expand Down
27 changes: 27 additions & 0 deletions compiler_opt/rl/env_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -214,6 +214,33 @@ def test_env(self, mock_popen):

step = test_env.step(np.array([1], dtype=np.int64))
self.assertEqual(step.step_type, env.StepType.LAST)
self.assertNotEqual(test_env._iclang, test_env._clang) # pylint: disable=protected-access

@mock.patch('subprocess.Popen')
def test_env_interactive_only(self, mock_popen):
mock_popen.side_effect = mock_interactive_clang

test_env = env.MLGOEnvironmentBase(
clang_path=_CLANG_PATH,
task_type=MockTask,
obs_spec={},
action_spec={},
interactive_only=True,
)

for env_itr in range(3):
del env_itr
step = test_env.reset(_MOCK_MODULE)
self.assertEqual(step.step_type, env.StepType.FIRST)

for step_itr in range(_NUM_STEPS - 1):
del step_itr
step = test_env.step(np.array([1], dtype=np.int64))
self.assertEqual(step.step_type, env.StepType.MID)

step = test_env.step(np.array([1], dtype=np.int64))
self.assertEqual(step.step_type, env.StepType.LAST)
self.assertEqual(test_env._iclang, test_env._clang) # pylint: disable=protected-access


if __name__ == '__main__':
Expand Down