
feature: add a generic slurm cluster submit option #283

Merged
14 commits merged into default on May 2, 2024

Conversation

@jaamarks (Collaborator) commented on May 1, 2024

This PR adds support for submitting jobs to a Slurm cluster using the `cgr submit --slurm` command. It leverages a generic Slurm cluster template provided within the codebase. It also:

  • Updates relevant docs
  • Updates unit tests to cover the added functionality
  • Fixes a few minor typos throughout the codebase.

Fixes #268
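
A minimal usage sketch of the new workflow (the partition name is a placeholder, and other required `cgr config` options are omitted):

```bash
# Record the Slurm partition (queue) in config.yml; other required
# `cgr config` options (e.g. the sample sheet) are omitted here.
cgr config --slurm-partition <partition_name>

# Submit all non-local jobs to that partition via the generic Slurm profile.
cgr submit --slurm
```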

@kliao12 (Contributor) commented on May 1, 2024

LGTM

jaamarks added 14 commits May 1, 2024 22:01
- Incorrect `timedelta` handling caused Slurm job submission failures
  for workflows exceeding 24 hours. Python's `timedelta` format
  (e.g., "1 day, 1:00:00") was incompatible with Slurm.
- Introduced `formatted_time` to convert `timedelta` objects to Slurm's
  required format (https://slurm.schedmd.com/sbatch.html#OPT_time); a
  sketch of this conversion appears below.
- This fix resolves errors caused by the incompatible time format in
  Slurm job submissions.
- Applies the same time format fix for Slurm compatibility as
  commit af39a8d
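
A minimal sketch of the kind of conversion `formatted_time` performs (the signature and implementation here are assumptions, not necessarily the PR's exact code):

```python
from datetime import timedelta


def formatted_time(delta: timedelta) -> str:
    """Convert a timedelta to Slurm's D-HH:MM:SS time-limit format.

    str(timedelta(hours=25)) yields "1 day, 1:00:00", which sbatch rejects;
    Slurm expects e.g. "1-01:00:00" (days-hours:minutes:seconds).
    """
    total_seconds = int(delta.total_seconds())
    days, remainder = divmod(total_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


print(formatted_time(timedelta(hours=25)))  # -> "1-01:00:00"
```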
Users specify the name of the Slurm partition (queue) within the
config.yml file in the directory they submit from. The queue name
provided is the Slurm partition to which all non-local jobs are
submitted.
- Requires users of `--slurm` to specify the Slurm partition in their
  config.yml file (see the sketch below).
- Removed the option to specify the Slurm partition via the `--queue`
  option when submitting directly with `--slurm`.
- Renamed `--slurm-generic` flag to the simpler `--slurm`.
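
A hypothetical config.yml excerpt illustrating the required entry (the `slurm_partition` key comes from the commits above; whether it sits at the top level, and the placeholder value, are assumptions):

```yaml
# config.yml (hypothetical excerpt): partition targeted by `cgr submit --slurm`
slurm_partition: <partition_name>  # name of your cluster's Slurm queue
```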
This commit improves the error message displayed when a user runs the
`--slurm` option without specifying the required `slurm_partition`
key-value pair in their `config.yml` file.

The changes include:

* Clearer error message highlighting the missing key-value pair and its purpose.
* Color-coded output (if the terminal supports it) for better visual
  distinction between the error and the solution, as sketched below.

These improvements make it easier for users to understand the error and
take corrective action.
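
A generic sketch of tty-aware, color-coded error output (the helper name and message wording are hypothetical, not the project's actual implementation):

```python
import sys


def missing_partition_error() -> str:
    """Build an error message about the missing `slurm_partition` key,
    coloring the output only when stderr is an interactive terminal."""
    use_color = sys.stderr.isatty()
    red = "\033[31m" if use_color else ""
    green = "\033[32m" if use_color else ""
    reset = "\033[0m" if use_color else ""
    return (
        f"{red}ERROR: `--slurm` requires a `slurm_partition` entry in config.yml.{reset}\n"
        f"{green}Fix: add `slurm_partition: <partition_name>` and re-run `cgr submit --slurm`.{reset}"
    )


if __name__ == "__main__":
    print(missing_partition_error(), file=sys.stderr)
```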
Users can now specify the desired Slurm partition in their config.yml
when building with `cgr config --slurm-partition <partition_name>`.
This partition is used for jobs submitted with `cgr submit --slurm`.
Add RTI International contributors to the author list.
    - Add documentation about the `--slurm-partition` option for the
      `cgr config` command.
    - Add documentation about the `--slurm` option for the `cgr submit`
      command.
    - Fix a typo in a warning message about the sample sheet requirement
      in `cgr_gwas_qc/cli/config.py`.
    - Fix a typo in a message to Biowulf users in the
      `docs/getting_started/running_pipeline.rst` file.
In commit bd74c72 we increased the
`ibd_pi_hat_min` from 0.05 to 0.12 to focus on closer relatives.
This commit aligns our documentation with that new default.
We can now submit to a generic Slurm cluster profile with
`cgr submit --slurm`, so this commit updates the tests to cover this
new feature.
@jaamarks merged commit 0591e84 into default on May 2, 2024
2 checks passed
@jaamarks deleted the issue-268-generic-slurm branch on May 21, 2024
@jaamarks (Collaborator, Author) commented on Jun 5, 2024

Fixes #230

@jaamarks linked an issue on Jun 5, 2024 that may be closed by this pull request.
Development

Successfully merging this pull request may close these issues.

Test and debug the generic slurm cluster submit option
Create installation documentation for Slurm cluster
2 participants