feat: Push ArrayGet instructions backwards through IfElse instructions to avoid expensive array merges #5570

Open
wants to merge 20 commits into
base: master

Collaborator

@asterite asterite commented Jul 19, 2024

Description

Problem

Resolves #5501

Summary

Implements #5501

Additional Context

None.

Documentation

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@@ -84,6 +84,7 @@ pub(crate) fn optimize_into_acir(
// This pass must come immediately following `mem2reg` as the succeeding passes
// may create an SSA which inlining fails to handle.
.run_pass(Ssa::inline_functions_with_no_predicates, "After Inlining:")
.run_pass(Ssa::array_get_optimization, "After Array Get Optimizations:")
Collaborator Author

@asterite asterite Jul 19, 2024

This name is temporary; we should find one that better describes this pass.

Comment on lines 27 to 30
// This should match the check in flatten_cfg
if let crate::ssa::ir::function::RuntimeType::Brillig = function.runtime() {
    continue;
}
Collaborator Author

@asterite asterite Jul 19, 2024

I copied this from other passes; I'm not sure what it means, though.

Member


This just means that we only want to apply this SSA pass to functions which are being compiled to constrained ACIR rather than unconstrained Brillig. We do this as some optimisations are specific to each runtime.

This optimisation would benefit both runtimes, however, so we should remove this check.
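To make the effect of that check concrete, here is a minimal sketch of a pass driver that optionally skips Brillig functions. The `RuntimeType`, `Function`, and `functions_to_optimize` names below are hypothetical stand-ins written for this illustration; they are not the types or functions in noirc_evaluator.

```rust
// Hypothetical stand-ins for illustration only; not noirc_evaluator's real types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RuntimeType {
    Acir,    // constrained
    Brillig, // unconstrained
}

struct Function {
    name: &'static str,
    runtime: RuntimeType,
}

/// Returns the names of the functions a pass would visit.
/// With `acir_only` set, Brillig functions are skipped, which mirrors the
/// `continue` in the snippet under review. Without it, every function is visited.
fn functions_to_optimize(functions: &[Function], acir_only: bool) -> Vec<&'static str> {
    functions
        .iter()
        .filter(|f| !(acir_only && f.runtime == RuntimeType::Brillig))
        .map(|f| f.name)
        .collect()
}
```

Dropping the check, as suggested above, corresponds to calling this with `acir_only = false`, so the pass would run on both runtimes.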

Member

@TomAFrench TomAFrench left a comment

Quick weekend phone review.

compiler/noirc_evaluator/src/ssa/opt/array_get.rs (three outdated review threads, resolved)
Contributor

github-actions bot commented Jul 22, 2024

Changes to circuit sizes

Generated at commit: cb28e016cff69b322802f4f8be4b23ecd1ffba4d, compared to commit: 8bb3908281d531160db7d7898c67fb2647792e6e

🧾 Summary (10% most significant diffs)

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| regression_5252 | +321 ❌ | +0.98% | +1,495 ❌ | +3.36% |
| sha256_regression | +2,773 ❌ | +6.55% | +4,027 ❌ | +1.94% |

Full diff report 👇
| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| regression_5252 | 33,237 (+321) | +0.98% | 46,014 (+1,495) | +3.36% |
| sha256_regression | 45,136 (+2,773) | +6.55% | 211,827 (+4,027) | +1.94% |
| sha256_var_size_regression | 22,242 (+625) | +2.89% | 80,273 (+929) | +1.17% |
| sha256_var_witness_const_regression | 1,723 (+21) | +1.23% | 17,319 (+88) | +0.51% |
| sha256 | 2,085 (+21) | +1.02% | 25,426 (+88) | +0.35% |

@asterite asterite marked this pull request as ready for review July 22, 2024 15:00
Comment on lines +77 to +78
// Only if the array isn't of a tuple type (or a composite type)
if element_types.len() != 1 {
Collaborator Author


I'm not sure how to handle this case. Without this, insert_instruction_and_results below returns something that has multiple results and I don't know how to move that on to the next instruction. But maybe this optimization wasn't intended for composite types?

Contributor


We should be able to perform this on composite types as well but we can worry about that in a follow-up. Before merging this PR we should make an issue to add this though.

@asterite
Collaborator Author

I don't get that gate diff on my machine. On master:

+-----------------+----------+----------------------+--------------+
| Package         | Function | Expression Width     | ACIR Opcodes |
+-----------------+----------+----------------------+--------------+
| regression_5252 | main     | Bounded { width: 4 } | 81786        |
+-----------------+----------+----------------------+--------------+

In this PR:

+-----------------+----------+----------------------+--------------+
| Package         | Function | Expression Width     | ACIR Opcodes |
+-----------------+----------+----------------------+--------------+
| regression_5252 | main     | Bounded { width: 4 } | 81786        |
+-----------------+----------+----------------------+--------------+

//
// and a later ArrayGet instruction is this:
//
// v11 = array_get v4, index v4
Contributor


Suggested change
- // v11 = array_get v4, index v4
+ // v11 = array_get v10, index v4

Do we mean v10 here?
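For readers following along, the rewrite named in the PR title can be sketched on a toy instruction set. Everything below is an assumption made for illustration: the `Instr` enum, the function name, and the value ids other than v4, v10 and v11 (v0, v8, v9, v12, v13) are hypothetical and do not come from noirc_evaluator. The idea is that `v11 = array_get v10, index v4`, where v10 is the result of an IfElse over two arrays, becomes two ArrayGets on the branch arrays followed by an IfElse over the two elements.

```rust
use std::collections::HashMap;

// Toy SSA model for illustration only; NOT the types used in noirc_evaluator.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    /// result = if cond then then_value else else_value
    IfElse { result: u32, cond: u32, then_value: u32, else_value: u32 },
    /// result = array_get array, index index
    ArrayGet { result: u32, array: u32, index: u32 },
}

/// Rewrites `array_get (if c then a else b), index i` into
/// `if c then (array_get a, index i) else (array_get b, index i)`.
/// `next_id` is the first unused value id, used for the two new ArrayGet results.
fn push_array_gets_through_if_else(instrs: &[Instr], mut next_id: u32) -> Vec<Instr> {
    // Maps each original IfElse result to its (cond, then_value, else_value).
    let mut merges: HashMap<u32, (u32, u32, u32)> = HashMap::new();
    let mut out = Vec::new();

    for instr in instrs {
        match instr {
            Instr::IfElse { result, cond, then_value, else_value } => {
                merges.insert(*result, (*cond, *then_value, *else_value));
                // The array merge is kept here; a later dead-instruction pass
                // could drop it if nothing else uses its result.
                out.push(instr.clone());
            }
            Instr::ArrayGet { result, array, index } => {
                if let Some(&(cond, then_value, else_value)) = merges.get(array) {
                    let then_get = next_id;
                    let else_get = next_id + 1;
                    next_id += 2;
                    // Read one element from each branch's array...
                    out.push(Instr::ArrayGet { result: then_get, array: then_value, index: *index });
                    out.push(Instr::ArrayGet { result: else_get, array: else_value, index: *index });
                    // ...and merge the two elements instead of the two arrays.
                    out.push(Instr::IfElse { result: *result, cond, then_value: then_get, else_value: else_get });
                } else {
                    out.push(instr.clone());
                }
            }
        }
    }
    out
}
```

Given `v10 = if v0 then v8 else v9` followed by `v11 = array_get v10, index v4`, and 12 as the next free id, this toy pass emits `v12 = array_get v8, index v4`, `v13 = array_get v9, index v4`, and `v11 = if v0 then v12 else v13`. The sketch ignores the guards discussed elsewhere in this review (constant indices and composite element types).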

};

// Don't optimize if the index is a constant (this is optimized later on in a different way)
if let Value::NumericConstant { .. } = &dfg[dfg.resolve(*index)] {
Contributor


Suggested change
- if let Value::NumericConstant { .. } = &dfg[dfg.resolve(*index)] {
+ if dfg.is_constant(*index) {
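The two guards discussed in this review (skip constant indices, skip arrays whose elements are composite) can be summarised in a small sketch. The `Value` enum and both helper functions here are hypothetical simplifications written for this note; they only mimic the shape of the checks quoted above and are not the noirc_evaluator API.

```rust
// Simplified, hypothetical value model; not the real `Value` from noirc_evaluator.
#[allow(dead_code)]
enum Value {
    NumericConstant(u128),
    InstructionResult(u32),
}

/// Stand-in for the `is_constant` style of check suggested above.
fn is_constant(value: &Value) -> bool {
    matches!(value, Value::NumericConstant(_))
}

/// Mirrors the two early exits quoted in the review:
/// constant indices are left for a later optimization, and arrays whose
/// element is made of more than one type (tuples, composites) are skipped.
fn should_push_through_if_else(index: &Value, element_type_count: usize) -> bool {
    !is_constant(index) && element_type_count == 1
}
```

Wrapping the pattern match in a helper keeps the call site short, which is the point of the suggested change.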

//
// and the ArrayGet instruction is this:
//
// v11 = array_get v4, index v4
Contributor


Suggested change
- // v11 = array_get v4, index v4
+ // v11 = array_get v10, index v4

?

@vezenovm
Contributor

vezenovm commented Jul 22, 2024

In this PR:

I would make sure to run with nargo info --force, as we cache build artifacts.
I got the following, which is in line with the diff action:

+-----------------+----------+----------------------+--------------+
| Package         | Function | Expression Width     | ACIR Opcodes |
+-----------------+----------+----------------------+--------------+
| regression_5252 | main     | Bounded { width: 4 } | 82053        |
+-----------------+----------+----------------------+--------------+

@TomAFrench
Member

This is just blocked on the regression_5252 regression, right?

@jfecher
Contributor

jfecher commented Aug 20, 2024

@TomAFrench I'd say it's also blocked on not showing any other circuit improvements.

@TomAFrench
Member

True.

@TomAFrench
Member

I've pushed an example program which drops from 305 opcodes to 9 with this PR.

TomAFrench and others added 2 commits September 30, 2024 22:19
* master: (313 commits)
  chore: Do not print entire functions when running debug trace (#6814)
  chore(ci): Active rollup circuits in compilation report (#6813)
  feat(ssa): Bring back tracking of RC instructions during DIE (#6783)
  feat: add `nargo test --format json` (#6796)
  chore: Change Id to use a u32 (#6807)
  feat(ssa): Hoist MakeArray instructions during loop invariant code motion  (#6782)
  feat: add `(x | 1)` optimization for booleans (#6795)
  feat: `nargo test -q` (or `nargo test --format terse`) (#6776)
  fix: disable failure persistance in nargo test fuzzing (#6777)
  feat(cli): Verify `return` against ABI and `Prover.toml` (#6765)
  chore(ssa): Activate loop invariant code motion on ACIR functions (#6785)
  fix: use extension in docs link so it also works on GitHub (#6787)
  fix: optimizer to keep track of changing opcode locations (#6781)
  fix: Minimal change to avoid reverting entire PR #6685 (#6778)
  feat: several `nargo test` improvements (#6728)
  chore: Try replace callstack with a linked list (#6747)
  chore: Use `NumericType` not `Type` for casts and numeric constants (#6769)
  chore(ci): Extend compiler memory report to external repos (#6768)
  chore(ci): Handle external libraries in compilation timing report (#6750)
  feat(ssa): Implement missing brillig constraints SSA check (#6658)
  ...
Contributor

Peak Memory Sample

Program Peak Memory
keccak256 78.71M
workspace 122.04M
regression_4709 286.66M
ram_blowup_regression 1.62G
private-kernel-tail 209.11M
private-kernel-reset 848.11M
private-kernel-inner 304.33M
parity-root 174.64M

Contributor

github-actions bot commented Dec 14, 2024

Compilation Report

Program Compilation Time %
sha256_regression 1.402s 4%
regression_4709 0.778s 0%
ram_blowup_regression 15.120s 1%
rollup-root 3.680s -12%
rollup-block-merge 3.740s -2%
rollup-base-public 30.400s 3%
rollup-base-private 13.100s 6%
private-kernel-tail 0.997s -7%
private-kernel-reset 6.820s 2%
private-kernel-inner 2.136s -5%

TomAFrench and others added 3 commits December 20, 2024 00:17
* master: (51 commits)
  feat!: type-check trait default methods (#6645)
  feat: `--pedantic-solving` flag (#6716)
  feat!: update `aes128_encrypt` to return an array (#6973)
  fix: wrong module to lookup trait when using crate or super (#6974)
  fix: Start RC at 1 again (#6958)
  feat!: turn TypeIsMorePrivateThenItem into an error (#6953)
  fix: don't fail parsing macro if there are parser warnings (#6969)
  fix: error on missing function parameters (#6967)
  feat: don't report warnings for dependencies (#6926)
  chore: simplify boolean in a mul of a mul (#6951)
  feat(ssa): Immediately simplify away RefCount instructions in ACIR functions (#6893)
  chore: Move comment as part of #6945 (#6959)
  chore: Separate unconstrained functions during monomorphization (#6894)
  feat!: turn CannotReexportItemWithLessVisibility into an error (#6952)
  feat: lock on Nargo.toml on several nargo commands (#6941)
  feat: don't simplify SSA instructions when creating them from a string (#6948)
  chore: add reproduction case for bignum test failure (#6464)
  chore: bump `noir-gates-diff` (#6949)
  feat(test): Enable the test fuzzer for Wasm (#6835)
  chore: also print test output to stdout in CI (#6930)
  ...
Contributor

github-actions bot commented Jan 8, 2025

Execution Report

Program Execution Time %
sha256_regression 0.106s 7%
regression_4709 0.001s 0%
ram_blowup_regression 0.579s -1%
rollup-root 0.107s 1%
rollup-block-merge 0.105s -1%
rollup-base-public 1.450s 0%
rollup-base-private 0.643s -1%
private-kernel-tail 0.023s 0%
private-kernel-reset 0.386s -1%
private-kernel-inner 0.117s -1%

Contributor

github-actions bot commented Jan 8, 2025

Compilation Memory Report

Program Peak Memory
keccak256 78.50M
workspace 123.14M
regression_4709 422.91M
ram_blowup_regression 1.58G
rollup-base-public 2.56G
rollup-base-private 1.25G
private-kernel-tail 201.98M
private-kernel-reset 716.58M
private-kernel-inner 292.25M

Contributor

github-actions bot commented Jan 8, 2025

Execution Memory Report

Program Peak Memory
keccak256 74.59M
workspace 123.69M
regression_4709 315.93M
ram_blowup_regression 512.47M
rollup-base-public 773.35M
rollup-base-private 424.88M
private-kernel-tail 182.01M
private-kernel-reset 255.59M
private-kernel-inner 215.20M

Development

Successfully merging this pull request may close these issues.

Push ArrayGet instructions backwards through IfElse instructions to avoid expensive array merges
4 participants