Unparallel Runner #65

Merged 7 commits on Oct 15, 2024

@jpn-- jpn-- commented Oct 15, 2024

This pull request makes three main changes to the sharrow project: it updates the GitHub Actions workflow, adds new templates for array operations, and adds parameters that control parallel computation.

Workflow Enhancements

  • .github/workflows/run-tests.yml: Added a new job activitysim-examples to test updates to sharrow against ActivitySim examples. This job includes steps for setting up the environment, caching dependencies, and running tests.

New Templates and Functions

  • sharrow/flows.py: Introduced COLUMN_FILLER_TEMPLATE and ARRAY_MAKER_TEMPLATE for generating functions that fill arrays and create arrays, respectively. These templates support different dimensions and improve code modularity. The array_maker approach avoids jit-compiling one large function in favor of compiling multiple smaller functions; this trades a modest loss in compiler optimization for potentially very large improvements in compile time. [1] [2]
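
The "many small functions" idea can be illustrated with a minimal pure-Python sketch. It is not sharrow's actual code: FILLER_TEMPLATE, make_fillers, and make_array are hypothetical names, plain lists stand in for arrays, and no jit compiler is involved. It only shows how a text template can generate one small filler function per expression, with a separate maker function calling each of them in turn.

```python
# Hypothetical template for illustration only; this is NOT sharrow's
# COLUMN_FILLER_TEMPLATE. Each rendering defines one small function
# that fills a single output column.
FILLER_TEMPLATE = """
def fill_column_{idx}(rows, out):
    # Write the value of one expression into column {idx} of every output row.
    for i, row in enumerate(rows):
        out[i][{idx}] = {expr}
"""


def make_fillers(expressions):
    """Render and define one small filler function per expression."""
    fillers = []
    for idx, expr in enumerate(expressions):
        namespace = {}
        exec(FILLER_TEMPLATE.format(idx=idx, expr=expr), namespace)
        fillers.append(namespace[f"fill_column_{idx}"])
    return fillers


def make_array(rows, fillers):
    """Allocate the output and let each small function fill its own column."""
    out = [[0.0] * len(fillers) for _ in rows]
    for fill in fillers:
        fill(rows, out)
    return out


# Example data and expressions are made up for this sketch.
rows = [{"income": 10.0, "distance": 2.0}, {"income": 30.0, "distance": 5.0}]
fillers = make_fillers(["row['income'] * 2", "row['distance'] + 1"])
result = make_array(rows, fillers)
```

In a jit-compiled setting, each small function would be compiled on its own, so the compiler never sees the whole specification at once. That is the source of the trade-off described above: less cross-expression optimization, but much smaller compilation units.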

Parallel Computation Enhancements

  • sharrow/flows.py: Added parameters parallel_irunner and parallel_idotter to control parallel computation in various functions. The default value for parallel_irunner is now False. The runner is the simplest sharrow function: it just assembles the flow into an array of data. Parallelism in this function typically provides only modest benefits unless data structures are very large, while compile time with parallelism enabled has been observed to grow worse than linearly with specification size (i.e. doubling the number of spec rows much more than doubles the compile time). ActivitySim users often disable parallel computation anyway, in which case the increased compile time provides no benefit at all. The function templates now use these parameters, allowing more granular control over parallel execution. [1] [2] [3]
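
As a rough illustration of how a flag such as parallel_irunner could feed into a function template, the sketch below renders source text for a numba-style runner. RUNNER_TEMPLATE and render_runner are hypothetical names, and the template body is an assumption rather than sharrow's real template; the sketch only produces source strings and does not compile anything.

```python
# Hypothetical template for illustration; sharrow's real runner template
# differs. The rendered text targets numba (`nb.njit`, `nb.prange`), but
# this sketch only builds the source string and never compiles it.
RUNNER_TEMPLATE = """\
@nb.njit(parallel={parallel}, cache=True)
def runner(n_rows, out):
    for i in {loop}(n_rows):
        out[i] = i
"""


def render_runner(parallel_irunner=False):
    """Render runner source; the flag picks the decorator option and loop type."""
    return RUNNER_TEMPLATE.format(
        parallel=bool(parallel_irunner),
        loop="nb.prange" if parallel_irunner else "range",
    )


serial_source = render_runner()  # default mirrors parallel_irunner=False
parallel_source = render_runner(parallel_irunner=True)
```

With the flag off, the rendered function uses a plain range loop and parallel=False, so the compiler skips the parallel transformation that was observed to inflate compile times on large specifications.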

Array Maker Integration

  • sharrow/flows.py: Added the option to use the new array_maker functions in load, load_dataframe, and load_dataarray methods. This allows for more flexible and efficient array creation. [1] [2] [3] [4]

Code Refactoring

  • sharrow/flows.py: Refactored various parts of the code to integrate the new templates and improve readability. This includes changes to initialization functions and the addition of new methods for filling and creating arrays. [1] [2] [3]

@jpn-- jpn-- merged commit 86bd7e3 into ActivitySim:main Oct 15, 2024
15 checks passed
@jpn-- jpn-- deleted the unparallel-runner branch October 15, 2024 22:30