UI Group Delete of Task Executions causes deadlock #1935

Closed
cppwfs opened this issue Jul 27, 2023 · 0 comments · Fixed by #1936

cppwfs commented Jul 27, 2023

Issue:
A group delete of task executions fails: the server responds with a 500 error stating that a deadlock has occurred.
Cause:
A CTR definition that contains both boot3 and boot2 applications has been launched multiple times and restarted after a child task failed. When the user attempts a group delete of the task executions, the UI passes in a list of task executions to be removed without the associated schema type. This causes deadlocks while task execution and job execution data is being deleted.

Resolution:
Currently, the code at https://github.com/spring-cloud/spring-cloud-dataflow-ui/blob/main/ui/src/app/shared/api/task.service.ts#L76-L78 deletes all ids in one request.
The solution is to change that implementation to one that does the following:

  1. Create a list of the child task executions (task executions with a parent id)
    a. Group the child task execution ids by schemaTarget and invoke destroy with the collected ids for each schemaTarget
  2. Create a list of the parent task executions (task executions without a parent id)
    a. Group the parent task execution ids by schemaTarget and invoke destroy with the collected ids for each schemaTarget
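
The grouping described above could be sketched roughly as follows. This is only an illustration, not the change that landed in the fix: `schemaTarget` comes from this issue, while the `TaskExecutionRef` shape, the `executionId` and `parentExecutionId` field names, and the `planGroupDelete` helper are hypothetical names chosen for the example. The sketch only builds the ordered list of batches (children first, then parents); how each batch is passed to destroy is left to the existing service code.

```typescript
// Hypothetical minimal shape of a task execution; the real UI model has more fields,
// and the field names other than schemaTarget are assumptions for this sketch.
interface TaskExecutionRef {
  executionId: number;
  schemaTarget: string;
  parentExecutionId?: number | null;
}

// One destroy call: all ids that share a schemaTarget.
interface DestroyBatch {
  schemaTarget: string;
  ids: number[];
}

// Collect execution ids per schemaTarget, preserving first-seen order of targets.
function groupBySchemaTarget(executions: TaskExecutionRef[]): DestroyBatch[] {
  const groups: Record<string, number[]> = {};
  const order: string[] = [];
  for (const execution of executions) {
    if (!groups[execution.schemaTarget]) {
      groups[execution.schemaTarget] = [];
      order.push(execution.schemaTarget);
    }
    groups[execution.schemaTarget].push(execution.executionId);
  }
  return order.map((schemaTarget) => ({ schemaTarget, ids: groups[schemaTarget] }));
}

// Build the batches in the order proposed in the issue:
// child executions (with a parent id) first, then parent executions (without one).
function planGroupDelete(executions: TaskExecutionRef[]): DestroyBatch[] {
  const hasParent = (e: TaskExecutionRef): boolean =>
    e.parentExecutionId !== undefined && e.parentExecutionId !== null;
  const children = executions.filter(hasParent);
  const parents = executions.filter((e) => !hasParent(e));
  return groupBySchemaTarget(children).concat(groupBySchemaTarget(parents));
}
```

Each returned batch would then correspond to one destroy invocation for a single schemaTarget, in place of the single request that carries every id.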

Steps to reproduce:

  1. Create a composed task definition such as taskBoot2 && taskBoot3Fail && taskBoot3 and name it foo. (taskBoot3Fail should fail.)
  2. Launch foo.
  3. Once foo completes in a failed state, go to the job execution page and restart it.
  4. Update taskBoot3Fail so that it no longer fails, and compile it to create a new binary or image.
  5. Go to the job execution page and restart again.
  6. Once it completes successfully, go to the Task Execution page and do a group delete of all task executions.

If you have any questions, feel free to reach out to me or @corneil.

@cppwfs cppwfs added the type/bug Is a bug report label Jul 27, 2023
@claudiahub claudiahub self-assigned this Jul 28, 2023
@markpollack markpollack moved this to In Progress in SCDF 2.11 RC1 Aug 1, 2023
@markpollack markpollack moved this from In Progress to Done in SCDF 2.11 RC1 Aug 3, 2023
@onobc onobc added this to the 3.4.0-RC1 milestone Aug 16, 2023