Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kedro CLI startup time made shorter #1196

Merged
merged 1 commit into from
Dec 16, 2022
Merged

Conversation

znfgnu
Copy link
Contributor

@znfgnu znfgnu commented Dec 7, 2022

  • server.py: Moved DEFAULT_HOST and DEFAULT_PORT to constants.py
  • launchers/cli.py: Moved from kedro_viz.server ... statement to viz function

Signed-off-by: Konrad Sikorski znfgnu@gmail.com

Description

Some kedro-viz users experience very long startup time of kedro CLI script right after installing kedro-viz.
On my computer (Intel® Core™ i7-5600U CPU @ 2.60GHz × 4 w/ some SSD drive), running kedro command took:

  • without kedro-viz installed: 0,61-0,75s (10 tries)
  • with kedro-viz installed: 2,39-3,76s (10 tries)

On slower machines the problem is even more disturbing - I experienced almost ~10s delay each kedro execution.
Every time one runs the pipeline via kedro CLI, kedro_viz and its dependencies are imported, even if they're not used.

Time measurements
$ for i in {1..10}; do time kedro >/dev/null; done
kedro > /dev/null  0,68s user 0,09s system 99% cpu 0,771 total
kedro > /dev/null  0,65s user 0,06s system 99% cpu 0,713 total
kedro > /dev/null  0,68s user 0,07s system 99% cpu 0,753 total
kedro > /dev/null  0,67s user 0,07s system 99% cpu 0,739 total
kedro > /dev/null  0,67s user 0,04s system 99% cpu 0,717 total
kedro > /dev/null  0,71s user 0,08s system 99% cpu 0,797 total
kedro > /dev/null  0,64s user 0,06s system 99% cpu 0,706 total
kedro > /dev/null  0,61s user 0,10s system 99% cpu 0,710 total
kedro > /dev/null  0,65s user 0,05s system 99% cpu 0,697 total
kedro > /dev/null  0,75s user 0,06s system 99% cpu 0,803 total
$ yes | pip install kedro-viz >/dev/null       
$ for i in {1..10}; do time kedro >/dev/null; done
kedro > /dev/null  3,76s user 0,42s system 98% cpu 4,236 total
kedro > /dev/null  3,60s user 0,36s system 97% cpu 4,072 total
kedro > /dev/null  2,64s user 0,37s system 109% cpu 2,741 total
kedro > /dev/null  2,44s user 0,37s system 110% cpu 2,534 total
kedro > /dev/null  2,48s user 0,34s system 110% cpu 2,540 total
kedro > /dev/null  2,42s user 0,36s system 111% cpu 2,506 total
kedro > /dev/null  2,39s user 0,37s system 110% cpu 2,494 total
kedro > /dev/null  2,44s user 0,33s system 111% cpu 2,495 total
kedro > /dev/null  2,54s user 0,32s system 110% cpu 2,581 total
kedro > /dev/null  2,72s user 0,37s system 108% cpu 2,839 total

After small changes I made it's 0,67-0,79s, preserving usability.

Time measurements after rearranging imports
$ yes | pip uninstall kedro-viz >/dev/null
$ pip install -e ./package/ >/dev/null   
$ for i in {1..10}; do time kedro >/dev/null; done
kedro > /dev/null  0,79s user 0,04s system 99% cpu 0,834 total
kedro > /dev/null  0,70s user 0,06s system 99% cpu 0,760 total
kedro > /dev/null  0,71s user 0,04s system 99% cpu 0,752 total
kedro > /dev/null  0,68s user 0,07s system 99% cpu 0,755 total
kedro > /dev/null  0,69s user 0,06s system 99% cpu 0,755 total
kedro > /dev/null  0,67s user 0,08s system 99% cpu 0,757 total
kedro > /dev/null  0,72s user 0,05s system 99% cpu 0,770 total
kedro > /dev/null  0,70s user 0,06s system 99% cpu 0,756 total
kedro > /dev/null  0,70s user 0,07s system 99% cpu 0,768 total
kedro > /dev/null  0,68s user 0,06s system 99% cpu 0,746 total

The problem was also noted in kedro-org/kedro#1476

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

@noklam
Copy link
Contributor

noklam commented Dec 12, 2022

I see the unit tests are failing, would you be able to have a look at it?

@znfgnu
Copy link
Contributor Author

znfgnu commented Dec 12, 2022

Sure, I'll take a look at it in the evening. Unit tests mostly fail because of broken mocks - kedro_viz.launchers.cli.run_server is patched instead of kedro_viz.server.run_server. When I tried to run tests locally some other tests failed as well (but didn't fail in CI) but I didn't have enough time to fix it.

What about linter rules? Should I do something about them? Local import is needed for this kind of behavior but linter reports it as an error.

@noklam
Copy link
Contributor

noklam commented Dec 12, 2022

@znfgnu Thanks! For linter I think in this case it's reasonable to disable them. You can add inline comment to avoid the checking. For example if it is Pylint then you will do something like # pylint: disable=xxx, you can look up the syntax from the repository.

@znfgnu
Copy link
Contributor Author

znfgnu commented Dec 13, 2022

@noklam I think it's ready to review, tests are passing

def viz(host, port, browser, load_file, save_file, pipeline, env, autoreload, params):
"""Visualise a Kedro pipeline using Kedro viz."""
from kedro_viz import server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason to import server module instead of
from kedro_viz.server import run_server?

Copy link
Contributor Author

@znfgnu znfgnu Dec 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to fulfill requirements of linter: there are two functions imported from this module: run_server and is_localhost. In case of importing them separately, linter tests report error that there are more than 15 local variables (16). Importing whole server fulfills this requirement and doesn't require to bypass checks.

I'm pretty convinced that in case of performance it doesn't make a big difference. In case of code style - I leave this decision to maintainers. My the only one intent was to pass linter tests without bypassing them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why this increases the number of variables, as I understand the changes are moved 2 of them to kedro.constants and moving some of the import from the top level to the function lazy imports. So the total number of local imports should be unchanged.

I would prefer to keep the top-level module and suppress the listings in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why this increases the number of variables, as I understand the changes are moved 2 of them to kedro.constants and moving some of the import from the top level to the function lazy imports. So the total number of local imports should be unchanged.

When you import names inside the function, they are being put in locals() dictionary, as opposed to top-level imports when names are being put into globals().

Number of local imports differ between from kedro_viz.server import is_localhost, run_server and from kedro_viz import server because in the first case you import two names (is_localhost and run_server) and in the second case only one (server) is imported.

I would prefer to keep the top-level module and suppress the listings in this case.

What exactly do you mean by keeping the top-level module?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for explaining! I thought it is complaining about the imported objects instead of the function locals.

I mean this keeping this from kedro_viz.server import run_server

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change this in the evening

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is a nice small change as I often need to uninstall viz for development purposes! I dropped a small comment but it works nicely, I tested it on my Windows machine as well.

Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution ⭐️ @znfgnu

Copy link
Member

@tynandebold tynandebold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this @znfgnu!

Can you please add a line to the release notes file?

@znfgnu
Copy link
Contributor Author

znfgnu commented Dec 15, 2022

Thanks so much for this @znfgnu!

No problem, it was fun to find and fix this bottleneck :) As I started using kedro recently, this problem used to affect my work too ;)

Can you please add a line to the release notes file?

Sure, I'll do it in the evening.

- server.py: Moved `DEFAULT_HOST` and `DEFAULT_PORT` to `constants.py`
- launchers/cli.py: Moved `from kedro_viz.server ...` statement to
  viz function
- modified tests so they follow new code structure

Signed-off-by: Konrad Sikorski <znfgnu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants