Roadmap

Maintenance:

Implementation time estimates are in engineering days and are rough, back-of-the-envelope figures.

🎉DONE More efficient Kubernetes support
- One of:
  - Host-path caching
  - Toil-integrated within-pod scheduler
- Time estimate: not scoped yet
🎉DONE More robust Kubernetes support
- Handle Kubernetes communication timeouts without restarting the leader
- Time estimate: not scoped yet
🎉DONE Rework SingleMachineBatchSystem to eliminate thread limit exhaustion issues; reduce thread usage in singleMachine mode
- Time estimate: not scoped yet
🚧WIP CWL 1.1+ Support
- 13 CWL 1.1 conformance tests currently failing
- Conditional support (CWL 1.2)
- Time estimate: not scoped yet
Move away from Mesos before/as Ubuntu 16.04 goes out of support
- Probably in favor of auto-deployed Kubernetes (mechanism to be determined)
- Time estimate: not scoped yet
Running in-house VG WDL workflows
- Time estimate: not scoped yet
WDL compliance test suite
- Time estimate: not scoped yet
Increase test coverage
- Time estimate: not scoped yet
Automatic idle worker termination and fixes to ignored nodes
- Time estimate: not scoped yet
Caching updates (should caching be enabled by default?)
- Time estimate: not scoped yet
Add a Cactus integration test to better support Cactus
- Time estimate: not scoped yet
📋TODO Improved ease of debugging
- This is more of an ongoing task than a clearly defined project.
- Adjust log levels based on experience.
- Time estimate: not scoped yet
More/better HPC Queueing System Support
- Time estimate: not scoped yet

Google Job Store Support
- Time estimate: not scoped yet
Update boto libraries to boto3
- Time estimate: not scoped yet
Move from SimpleDB to a better-supported service
- Time estimate: not scoped yet
More scalable Kubernetes support
- Moving to watches
- Handling more pods in the queue than we can loop over before our continue tokens expire
- Time estimate: not scoped yet
Restart/recovery improvements
- Changing CWL parameters
- Managing a failed task that cannot be recovered in a large pipeline
- Checkpointing
- Time estimate: not scoped yet
Better support for heterogeneous tasks (e.g. customizing disk size per instance type, and possibly FPGA support for DRAGEN)
- Time estimate: not scoped yet
AWS custom/multi security group support
- Time estimate: not scoped yet
AWS multi-zone support
- Time estimate: not scoped yet