S3, Add runImportsOnWorkers parameter to execute imports from Worker #5025

Closed
Guigzai opened this issue Jul 19, 2024 · 2 comments · Fixed by #5098

Comments


Guigzai commented Jul 19, 2024

Hello,

Toil version 6.1
Python 3.9

We use S3 URIs as inputs to our workflows.

Our platform drives the workflows from a VM that mounts the shared space with NFS.

The Toil leader downloads the files, which makes this step very slow because it does not run on the compute infrastructure (the workers), which is tuned for high network performance.

Would it be possible to implement a --runImportsOnWorkers parameter, analogous to --runLocalJobsOnWorkers, so that the S3 copies are performed from the processing resources?
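For illustration only, a rough sketch through Toil's Python API of where such an option would take effect; the runImportsOnWorkers attribute is the name proposed in this issue, not an existing option, and the job store URI is a placeholder:

```python
from toil.common import Toil
from toil.job import Job

# Standard way to build a Toil options object (existing API).
options = Job.Runner.getDefaultOptions("aws:us-west-2:my-jobstore")
options.runImportsOnWorkers = True  # hypothetical: the option requested here

with Toil(options) as toil:
    # Today this import runs in the leader process; the request is for it
    # to be dispatched to a worker when the option above is enabled.
    reads = toil.import_file("s3://my-bucket/inputs/reads.fastq.gz")
```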

Thanks

Issue is synchronized with this Jira Story
Issue Number: TOIL-1619

@unito-bot
➤ Adam Novak commented:

Do we need a way to run a workflow with mixed inputs, where some inputs are local file paths only available on the leader filesystem while others are URLs we can fetch from the workers?

I guess we could in that case fetch all local files from the leader and everything else from the worker.
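A minimal sketch of that dispatch rule, just to make the idea concrete (the plan_import helper is hypothetical, not an existing Toil function):

```python
from urllib.parse import urlparse

def plan_import(uri: str) -> str:
    """Decide where an input import should run under the idea above."""
    scheme = urlparse(uri).scheme
    if scheme in ("", "file"):
        # A bare path or file:// URL may only exist on the leader's
        # filesystem, so fetch it from the leader.
        return "leader"
    # Anything addressable by URL (s3://, http://, ...) can be fetched
    # from a worker, which has the fast network path.
    return "worker"

assert plan_import("/mnt/shared/reads.fastq.gz") == "leader"
assert plan_import("s3://my-bucket/reads.fastq.gz") == "worker"
```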


Guigzai commented Jul 29, 2024

No, all the files are stored either on shared GPFS (POSIX) or in S3, both reachable from the workers.
So all files are available from both the leader and the workers.
