-
-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Ersilia test command #1224
Conversation
* WIP Add session management utils * Create a session at ersilia cli initialization * Remove unused code * start an ersilia session in a dedicated sessions dir that is mapped to the parent process' id which ran the given ersilia command, this will generally be a shell process inside a terminal, but it can also be a process from a bash script * declare session specific defaults * Run all ersilia commands within a single process during standard example run, otherwise ersilia run command does not find a served model bec of running in a different process and therefore in a different session * Move the currently served model's pid to its dedicated session directory * WIP Logging * Redirect tmp logs to model's session logs * catch permission error * Redirect tmp logs to model's session logs * don't use ersilia exception management because we don't exactly want the ersilia process to exit
Updated PR @miquelduranfrigola |
Hi @kurysauce - I've gone through your refactor of the test command and I think this is very much headed in the right direction. I have a small concern - I am unsure if we are accounting for output types within the check for consistent outputs - the code seems to be written only for dealing with numeric output types, as I don't see any check on the output type anywhere. Is that indeed the case? |
@kurysauce I've another request for you for this PR. The previous contributors left a lot of print statements within the code, which is generally not a good practice. Would it be possible for you to replace them with debug and info logs? :) |
@DhanshreeA here is the link to the completed Google Doc. Additionally, I replaced the print statements with debug and logging. I don't think I see anything else on my end to update on the test script for now. |
@DhanshreeA I believe the test commands tests |
Generally speaking, there should be no print statements and everything that needs to be logged onto the terminal should essentially be done through a logger and not through print. :) |
Yes indeed, generally speaking we want to log information instead of printing. Note that logging is activated only with the In any case, there are plenty of examples within the Ersilia codebase as to how we should be logging messages. Usually, they are accessible via the I hope this helps! |
Thank you for taking your time to contribute to Ersilia, just a few checks before we proceed
Description
Refactor the Ersilia test command to run in 100% completion for correctly incorporated models. Currently stops prematurely at the
check_example_input
test method.Changes to be made
Ensure correct
return
values and flags are appropriately passed to methods inModel Tester
class. Debug other methods as test command runs closer to completion.EDIT: Completed
Status
run_bash
methodget_directories_sizes()
method and associated helper functions.check_consistent_output
method by calculating Spearman's Correlation & Mean Relative Absolute Value thresholds, replacing 5% difference threshold due to false positive failed test casesread_csv
method to handle differences in column parsing with Ersilia output and Bash outputTo do
Must test updated command on multiple models to ensure robustness. Ensure diversified outputs are tested. Test command may reveal hidden flaws from previous model incorporations (non-existent run.sh files, inconsistent outputs comparisons, execution errors in run.sh files that reveal problems in model's
main.py
script, etc...), open issues for specific models as needed. Seeking help from community members to test models!Models must be fetched using
--from_github
or--from_s3
flag, or else test script will fail.Is this pull request related to any open issue? If yes, replace issueID below with the issue ID
Related to #1203 , #1222, #1221