Refactor Looper's integration of Pipestat #492
Comments
Here is Lines 505 to 599 in 3899672
|
Looper creates pipestat managers using: Lines 450 to 469 in 3899672
This function creates PipestatManager objects based on the configuration made within |
After today's discussion:
|
Question: What about when more than one pipeline interface is specified? Create a pipestat instance for each, so that each would have its own config file, like so:
pipeline_interfaces:
  sample:
    - ../pipeline/pipeline_interface1_sample.yaml
    - ../pipeline/pipeline_interface2_sample.yaml
  project:
    - ../pipeline/pipeline_interface1_project.yaml
    - ../pipeline/pipeline_interface2_project.yaml
|
That probably makes sense. The alternative would be to update the pipeline interface of an existing PipestatManager object. But it may make more sense to attach the PipestatManager to the pipeline interface object, so there would be one per pipeline interface. |
Yeah, I'm leaning towards this and will investigate this approach. |
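A minimal sketch of the "one manager per pipeline interface" idea discussed above. `ManagerStub` is a hypothetical stand-in, not the real PipestatManager, and the per-interface config file naming is an assumption made only for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List


@dataclass
class ManagerStub:
    """Hypothetical stand-in for a pipestat manager (not the real class)."""
    pipeline_interface: str
    config_file: str


def managers_by_interface(
    pipeline_interfaces: Dict[str, List[str]], output_dir: str
) -> Dict[str, ManagerStub]:
    """Build one manager per pipeline interface path, each with its own config file."""
    managers: Dict[str, ManagerStub] = {}
    for level, paths in pipeline_interfaces.items():  # level is "sample" or "project"
        for path in paths:
            # Assumed naming scheme: derive the config file name from the interface file name.
            config_file = f"{output_dir}/pipestat_config_{Path(path).stem}.yaml"
            managers[path] = ManagerStub(pipeline_interface=path, config_file=config_file)
    return managers
```

Keying the mapping by interface path keeps the lookup simple when a sample is matched to one of several interfaces.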
Another question/issue I've run into this afternoon: What's the priority for sourcing the pipeline name? Pipestat looks at the one supplied in the output_schema first. However, the pipeline_interface (either sample or project) used with Looper also has a pipeline_name and points to a pipestat output schema. These pipeline_names could be different if the user wishes to use one output_schema for two different interfaces. I just modified the order in which Pipestat can determine the pipeline_name:
So the order is user_supplied -> config_supplied -> schema_supplied; otherwise the default is used. |
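The precedence described above could be sketched roughly as follows. This is not Pipestat's actual code: the function name, parameter names, and the default value are assumptions chosen for illustration.

```python
from typing import Optional

# Hypothetical default; the real default name may differ.
DEFAULT_PIPELINE_NAME = "default_pipeline_name"


def resolve_pipeline_name(
    user_supplied: Optional[str] = None,
    config_supplied: Optional[str] = None,
    schema_supplied: Optional[str] = None,
    default: str = DEFAULT_PIPELINE_NAME,
) -> str:
    """Return the first non-empty name in priority order: user, config, schema, default."""
    for candidate in (user_supplied, config_supplied, schema_supplied):
        if candidate:
            return candidate
    return default
```

With this ordering, a name given in the pipeline interface (passed through the config) wins over the one in a shared output schema, which is what allows one schema to serve two interfaces.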
Follow up issue: |
Some notes for next steps:
The above will be fixes for 1.9.0 that should work for the majority of cases until we do more work on the system for the 2.0 release. |
For Looper 2.0, we will consolidate sample and project level pipeline interfaces under a single interface. This will break backwards compatibility.
pipeline_name: example_pipestat_pipeline
output_schema: pipestat_output_schema.yaml
sample_interface:
  command_template: >
    python {looper.piface_dir}/count_lines.py {sample.file} {sample.sample_name} {pipestat.results_file}
project_interface:
  command_template: >
    python {looper.piface_dir}/count_lines_project.py {sample.file} {pipestat.results_file}
The output schema's pipeline name must match the pipeline interface's pipeline_name. We will enforce this by raising an exception if they do not match.
|
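The name-matching rule could be enforced along these lines. This is a sketch under assumptions: both files are taken to be already parsed into dictionaries, the schema is assumed to carry its name under a top-level `pipeline_name` key, and the exception class is hypothetical.

```python
class PipelineNameMismatchError(Exception):
    """Hypothetical exception for mismatched pipeline names."""


def check_pipeline_names_match(pipeline_interface: dict, output_schema: dict) -> None:
    """Raise if the interface and the output schema disagree on pipeline_name."""
    piface_name = pipeline_interface.get("pipeline_name")
    # Assumption: the schema stores its name under a top-level "pipeline_name" key.
    schema_name = output_schema.get("pipeline_name")
    if piface_name != schema_name:
        raise PipelineNameMismatchError(
            f"pipeline_name mismatch: interface has {piface_name!r}, "
            f"output schema has {schema_name!r}"
        )
```

Failing early, before any sample is submitted, avoids results being reported under an unexpected pipeline name.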
For sample vs project level commands, see this related issue: #360 |
Ok, the refactoring is nearly complete and I've also added the |
Currently, Looper checks if Pipestat has been configured for each sample before adding the sample to the submission conductor.
If pipestat can be successfully configured, looper generates a configuration file for pipestat called looper_pipestat_config.yaml, which looks something like this:
Currently, the user adds a pipestat field to the .looper.yaml file with the relevant info:
After setting everything up, looper creates a pipestat config file, which the pipeline author can use to configure pipestat by passing it to a pipestat instance within a pipeline:
For example: the pipeline interface author (pipeline author) can pass these values to the pipeline:
pipeline_interface (for a pipeline.py):
pipeline_interface (for a shell pipeline):
How looper checks for pipestat configuration:
looper/looper/project.py
Lines 336 to 352 in 3899672
The main functions for this are _check_if_pipestat_configured and _get_pipestat_configuration.
Code moves through _check_if_pipestat_configured first, which returns True or False. If any exception is raised during the next step, for either a single sample or a project, it returns False.
looper/looper/project.py
Lines 471 to 503 in 3899672
If this function returns False, looper continues, assuming the user does not wish to use pipestat.
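The control flow described above amounts to a try/except wrapper that turns any configuration failure into False. The sketch below is a simplified illustration, not the code from looper/looper/project.py: the function bodies, the `looper_settings` dictionary, and its `pipestat` key layout are assumptions.

```python
def get_pipestat_configuration(looper_settings: dict) -> dict:
    """Hypothetical stand-in for the configuration step; raises on any problem."""
    section = looper_settings["pipestat"]  # KeyError if the section is absent
    if not isinstance(section, dict) or not section:
        raise ValueError("pipestat section is empty or malformed")
    return dict(section)


def check_if_pipestat_configured(looper_settings: dict) -> bool:
    """Return True if configuration succeeds, False if any exception is raised."""
    try:
        get_pipestat_configuration(looper_settings)
    except Exception:
        # Any failure is treated as "the user does not wish to use pipestat".
        return False
    return True
```

One consequence of this pattern is that a genuine misconfiguration is indistinguishable from pipestat being intentionally absent, which may be part of what the refactor aims to address.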
Related Issues:
#411
#413
#425
#459
#471