Collating results is slow for large datasets (>1500 genomes) #14

Open
widdowquinn opened this issue Nov 9, 2015 · 1 comment

Comments

@widdowquinn
Owner

Currently, the code writes out each result individually and defers processing of that output (calculation of ANI etc.) until the end. This leaves a long, uninformative lag before the results are presented to the user.

It may be possible to collate/summarise intermediate results in a file as we go. The total analysis time will be no shorter, but it might avoid that 'dead time' after the alignments are done.
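
A minimal sketch of what collating "as we go" might look like, assuming each finished comparison can report a query, a subject, an identity and a coverage value. The function names (`append_result`, `load_matrix`), the column names and the tab-separated layout are all hypothetical, chosen for illustration; none of this is existing pyani code.

```python
import csv
from pathlib import Path


def append_result(summary_path, query, subject, identity, coverage):
    """Append one pairwise result to a running summary file.

    Hypothetical helper: intended to be called once per finished
    comparison, so that collation happens during the run, not after it.
    """
    path = Path(summary_path)
    is_new = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        if is_new:
            # Write the header only when the file is first created
            writer.writerow(["query", "subject", "identity", "coverage"])
        writer.writerow([query, subject, identity, coverage])


def load_matrix(summary_path):
    """Read the summary file back as {query: {subject: identity}}."""
    matrix = {}
    with open(summary_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            matrix.setdefault(row["query"], {})[row["subject"]] = float(
                row["identity"]
            )
    return matrix
```

With something like this, the final step only has to read one already-collated file, not parse every individual output. If comparisons run in parallel, the appends would need to go through a single writer (or be locked), which this sketch does not attempt.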

@widdowquinn widdowquinn added the enhancement something we'd like pyani to do that it doesn't already label Nov 9, 2015
@widdowquinn widdowquinn self-assigned this Nov 9, 2015
@widdowquinn
Owner Author

This could be implemented as cached matrix and/or dataframe results in the pyani database, with one table/matrix type for each run. Then, when pulling down the complete dataset for a run, we need only make one SQL request, rather than one for each result.
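
As a rough illustration of the caching idea, here is a sketch using the standard-library `sqlite3` and `json` modules. The table name `run_matrices`, its columns, and the choice to serialise each matrix as JSON are assumptions made for this example; they do not describe the actual pyani database schema.

```python
import json
import sqlite3


def init_cache(conn):
    """Create the (hypothetical) per-run matrix cache table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_matrices ("
        "run_id INTEGER NOT NULL, "
        "matrix_type TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "PRIMARY KEY (run_id, matrix_type))"
    )


def store_matrix(conn, run_id, matrix_type, matrix):
    """Cache one matrix (e.g. identity or coverage) for a run as JSON."""
    conn.execute(
        "INSERT OR REPLACE INTO run_matrices (run_id, matrix_type, payload) "
        "VALUES (?, ?, ?)",
        (run_id, matrix_type, json.dumps(matrix)),
    )
    conn.commit()


def fetch_matrices(conn, run_id):
    """Retrieve every cached matrix for a run with a single SQL query."""
    rows = conn.execute(
        "SELECT matrix_type, payload FROM run_matrices WHERE run_id = ?",
        (run_id,),
    ).fetchall()
    return {mtype: json.loads(payload) for mtype, payload in rows}
```

For example, after `store_matrix(conn, 1, "identity", {"A": {"B": 0.98}})`, a call to `fetch_matrices(conn, 1)` returns all matrices for run 1 in one request, so the cost no longer scales with the number of individual results.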

@widdowquinn widdowquinn added this to the 0.3.1 milestone May 29, 2020
@widdowquinn widdowquinn added the performance the issue relates to making pyani more efficient label May 29, 2020