Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add debugging dashboard url to workload creation output #48

Merged
merged 4 commits into from
Dec 14, 2023

Conversation

SurbhiJainUSC
Copy link
Collaborator

Fixes / Features

  • This PR outputs the link of the debugging dashboard deployed in Google Cloud project using Terraform resource in https://github.com/google/cloud-tpu-monitoring-debugging. The debugging dashboard will display the stack traces collected for the workload using cloud-tpu-diagnostics package.
  • Refactor the code for both outlier and debugging dashboard

Testing / Documentation

Testing details.

  • [ y ] Tests pass
  • [ n ] Appropriate changes to documentation are included in the PR

xpk.py Show resolved Hide resolved
xpk.py Outdated

Args:
args: user provided arguments for running the command.

Returns:
str:
identifier of outlier dashbord if deployed in project,
identifier of dashbord if deployed in project,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit dashbord -> dashboard

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be misspelled in the other function descriptions as well

xpk.py Show resolved Hide resolved
xpk.py Outdated
None otherwise.
"""
outlier_dashboard_filter = "displayName:'GKE - TPU Monitoring Dashboard'"
dashboard_id = get_gke_dashboard(args, outlier_dashboard_filter)
Copy link
Collaborator

@Obliviour Obliviour Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great!

I see that we check the dashboard_id in two ways before erroring:

  1. if it is None which behind the scenes means an error in the command
  2. If not dashboard_id which means the dashboard id is empty and there are no dashboards available.

I think we can make this easier to read by having get_gke_dashboard return a tuple with (return_error, return_val) syntax? Otherwise if dashboard_id is None and if not dashboard_id are hard to tell what they are checking. Like for example if not dashboard_id would be True if dashboard_id is None. What do you think about this?

xpk.py Outdated
args: user provided arguments for running the command.

Returns:
bool:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think get_gke_outlier_dashboard only returns a str

@SurbhiJainUSC SurbhiJainUSC merged commit 64de81e into main Dec 14, 2023
3 checks passed
@SurbhiJainUSC SurbhiJainUSC deleted the debugging-dashboard branch December 14, 2023 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants