Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent register_launch_plan from re-registering already registered workflow #3049

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

wild-endeavor
Copy link
Contributor

@wild-endeavor wild-endeavor commented Jan 11, 2025

Tracking issue

Linking this to flyteorg/flyte#6062 which is related.

Why are the changes needed?

Please see comment in the linked issue, copied here:

Assume in the past my_wf has already been registered with version ksD2LBYDu5PrNtNnnpd95Q

Now if you try to do:

from wfs import my_wf

lp = LaunchPlan.get_or_create(workflow=my_wf, name="other_lp_for_hello2", default_inputs={})
remote = FlyteRemote(...)
remote.register_launch_plan(entity=lp, version="ksD2LBYDu5PrNtNnnpd95Q")
...

The issue is that register_launch_plan will currently try to register the underlying workflow again. This would be fine, except that the serialization settings that are used may be completely different. Even if it's exactly the same, this is a waste of time because the best case scenario is that Admin returns already exists for the workflow and every underlying task (and subwf, etc.).

We should check first to see if the workflow already exists with that version, and only register the launch plan if the workflow already exists.

Skipping re-registration of the workflow does mean that users who were relying on this re-registration (and were somehow reconstructing serialization settings to be identical) to detect changes in workflow/task structure will no longer see errors, but that seems too far-fetched.

What changes were proposed in this pull request?

  • Check if the underlying workflow exists, and if so only register the launch plan.
  • Also pass through an argument (create_default_launchplan) in _serialize_and_register which for some reason wasn't used. Just a nit, default was always just True.

How was this patch tested?

Tested with local sandbox.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

This PR streamlines the launch plan registration process by implementing checks to prevent redundant workflow registrations. The changes optimize the registration flow by adding existence verification and proper handling of the create_default_launchplan parameter. The implementation focuses on efficiency and maintainability through improved registration logic.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 1

…w if it already exists

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
@flyte-bot
Copy link
Contributor

Code Review Agent Run Status

  • Limitations and other issues: ❌ Failure - The AI Code Review Agent skipped reviewing this change because it is configured to exclude certain pull requests based on the source/target branch or the pull request status. You can change the settings here, or contact the agent instance creator at eduardo@union.ai.

Copy link

codecov bot commented Jan 11, 2025

Codecov Report

Attention: Patch coverage is 45.45455% with 6 lines in your changes missing coverage. Please review.

Project coverage is 73.43%. Comparing base (dfa8f04) to head (2641b3e).

Files with missing lines Patch % Lines
flytekit/remote/remote.py 45.45% 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #3049       +/-   ##
===========================================
+ Coverage   47.23%   73.43%   +26.19%     
===========================================
  Files         202      202               
  Lines       21355    21365       +10     
  Branches     2744     2746        +2     
===========================================
+ Hits        10088    15689     +5601     
+ Misses      10776     4898     -5878     
- Partials      491      778      +287     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
@wild-endeavor wild-endeavor changed the title register_launch_plan shouldn't try to register the underlying workflo… Prevent register_launch_plan from re-registering already registered workflow Jan 11, 2025
@wild-endeavor wild-endeavor marked this pull request as ready for review January 11, 2025 01:46
@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 11, 2025

Code Review Agent Run #345323

Actionable Suggestions - 1
  • flytekit/remote/remote.py - 1
Additional Suggestions - 1
  • flytekit/remote/remote.py - 1
Review Details
  • Files reviewed - 2 · Commit Range: 2641b3e..84b90fc
    • flytekit/remote/remote.py
    • tests/flytekit/unit/remote/test_remote.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@flyte-bot
Copy link
Contributor

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Optimize Launch Plan Registration Process

remote.py - Added workflow existence check before registration and improved launch plan handling

test_remote.py - Added test cases for optimized launch plan registration

Comment on lines +1290 to +1305
if remote_wf is None:
ident = run_sync(
self._serialize_and_register,
entity,
serialization_settings,
version,
options,
False,
)
else:
launch_plan_model = get_serializable(
OrderedDict(), settings=serialization_settings, entity=entity, options=options
)
ident = self.raw_register(
launch_plan_model, serialization_settings, version, create_default_launchplan=False
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider extracting workflow registration logic

Consider extracting the workflow existence check and registration logic into a separate helper method to improve code organization and reusability. The current implementation has duplicate logic for handling workflow registration.

Code suggestion
Check the AI-generated fix before applying
Suggested change
if remote_wf is None:
ident = run_sync(
self._serialize_and_register,
entity,
serialization_settings,
version,
options,
False,
)
else:
launch_plan_model = get_serializable(
OrderedDict(), settings=serialization_settings, entity=entity, options=options
)
ident = self.raw_register(
launch_plan_model, serialization_settings, version, create_default_launchplan=False
)
ident = self._register_workflow_if_needed(remote_wf, entity, serialization_settings, version, options)

Code Review Run #345323


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants