Analysis level metadata #30

Open
rvosa opened this issue Aug 31, 2024 · 0 comments

For downstream analysis of how different pipelines and their parameters perform, uploads should include a YAML file listing those parameters. The contents of the YAML file will then be attached to the result table. For example, YAML contents like this:

name: value

Such a file would be joined with the result table so that the table gains a column called name whose value is value in every row. The plan is to treat these columns as factors (in the statistical sense), so that we can see whether different values of name lead to different results. For this to work, it is key to agree on a small vocabulary for these terms and to look for parameters that the pipelines have in common. The simplest one would be pipeline: MGE versus pipeline: skim2mito. We can then merge the tables produced by the different pipelines and see, for example, whether MGE on average has more or fewer ambiguities than skim2mito.
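
To make the join concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the helper names (parse_flat_yaml, attach_metadata, mean_by_factor), the sample and ambiguities columns, and the toy numbers are hypothetical and not taken from any pipeline output. The tiny parser only handles flat key: value lines and stands in for a real YAML loader.

```python
from statistics import mean


def parse_flat_yaml(text):
    """Parse flat `key: value` lines only (stand-in for a real YAML loader)."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        params[key.strip()] = value.strip()
    return params


def attach_metadata(rows, params):
    """Return new rows with each parameter added as a constant column.

    If a parameter name clashes with an existing column, the parameter wins.
    """
    return [{**row, **params} for row in rows]


def mean_by_factor(rows, factor, measure):
    """Average `measure` within each level of `factor`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[factor], []).append(row[measure])
    return {level: mean(values) for level, values in groups.items()}


# Hypothetical result tables, one per pipeline (toy values, not real results).
mge_rows = [
    {"sample": "s1", "ambiguities": 2},
    {"sample": "s2", "ambiguities": 4},
]
skim_rows = [
    {"sample": "s1", "ambiguities": 1},
    {"sample": "s2", "ambiguities": 2},
]

# Attach each upload's YAML parameters, then merge the tables.
merged = (
    attach_metadata(mge_rows, parse_flat_yaml("pipeline: MGE"))
    + attach_metadata(skim_rows, parse_flat_yaml("pipeline: skim2mito"))
)

# Compare the factor levels on one measure.
summary = mean_by_factor(merged, "pipeline", "ambiguities")
print(summary)
```

Because every row carries its own pipeline value after the join, tables from different uploads can be concatenated without losing track of where each row came from, which is why a shared vocabulary for the keys matters.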
