enable size optimization #16

Open
jyao1 opened this issue Jun 7, 2021 · 1 comment

@jyao1
Owner

jyao1 commented Jun 7, 2021

ref: https://docs.rust-embedded.org/book/unsorted/speed-vs-size.html

@kailun-qin
Collaborator

Enable Size Optimization

Adjust opt-level

To optimize release binaries for size, set profile.release.opt-level in Cargo.toml as shown below.

[profile.release]
# or "z"
opt-level = "s"

Note:
If opt-level is not specified, the release build will default to opt-level = 3.

Flags

This opt-level flag controls the optimization level.

  • 0: no optimizations
  • 1: basic optimizations
  • 2: some optimizations
  • 3: all optimizations
  • s: optimize for binary size
  • z: optimize for binary size, but also turn off loop vectorization.

Results

Based on commit id: aa833b38f6c16b7023035c78cf72a5126db71486.

| opt-level   | rust-ipl    | rust-uefi-payload |
| ----------- | ----------- | ----------------- |
| 0           | 311K        | 320K              |
| 1           | 145K        | 158K              |
| 2           |  77K        | 113K              |
| 3           |  78K        | 118K              |
| s           |  75K        | 105K              |
| z           |  80K        | 105K              |

Use strip

Use the strip tool to remove symbols and/or sections.

strip -s <binary>

Not much effect has been seen on release build binaries.
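
Newer toolchains can also strip as part of the build, so no separate strip step is needed. A minimal sketch, assuming Cargo 1.59 or later (the profile setting was stabilized after this issue was opened) and a target for which stripping is supported. Its effect on these binaries has not been measured here:

```toml
[profile.release]
# Assumption: Cargo 1.59 or later.
# true is equivalent to "symbols"; "debuginfo" strips only debug info.
strip = true
```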

Enable LTO

The lto setting controls the -C lto flag which controls LLVM's link time optimizations. LTO can produce better optimized code, using whole-program analysis, at the cost of longer linking time.

To enable, change the profile.release.lto setting in Cargo.toml as shown below.

[profile.release]
lto = true

Valid Options

  • false: Performs "thin local LTO" which performs "thin" LTO on the local crate only across its codegen units. No LTO is performed if codegen units is 1 or opt-level is 0.
  • true or "fat": Performs "fat" LTO which attempts to perform optimizations across all crates within the dependency graph.
  • "thin": Performs "thin" LTO. This is similar to "fat", but takes substantially less time to run while still achieving performance gains similar to "fat".
  • "off": Disables LTO.

Results

Based on commit id: aa833b38f6c16b7023035c78cf72a5126db71486.
Note:
LTO is disabled at opt-level = 0, so no result is reported for that row (NA).

| opt-level   | rust-ipl    | rust-uefi-payload |
| ----------- | ----------- | ----------------- |
| 0           |   NA        |   NA              |
| 1           | 148K        | 159K              |
| 2           |  51K        | 106K              |
| 3           |  54K        | 110K              |
| s           |  46K        |  91K              |
| z           |  67K        |  93K              |
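
To compare this table with the opt-level results earlier, the relative saving from LTO at opt-level = "s" can be worked out from the reported sizes (in K, rounded to one decimal place):

```latex
\begin{align*}
\text{rust-ipl:}\quad & \frac{75 - 46}{75} = \frac{29}{75} \approx 38.7\% \\
\text{rust-uefi-payload:}\quad & \frac{105 - 91}{105} = \frac{14}{105} \approx 13.3\%
\end{align*}
```

The gain is not uniform across levels: at opt-level = 1 both binaries are slightly larger with LTO (145K to 148K and 158K to 159K).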

Disable stack unwinding upon panic

This option lets you control what happens when the code panics.

To disable stack unwinding, change the profile.release.panic setting in Cargo.toml as shown below.

[profile.release]
panic = "abort"

Options

  • "abort": terminate the process upon panic
  • "unwind": unwind the stack upon panic

If not specified, the default depends on the target.

Result

Not much effect has been seen on release build binaries.

Adjust codegen-units

This codegen-units flag controls how many code generation units the crate is split into. It takes an integer greater than 0.

When a crate is split into multiple codegen units, LLVM is able to process them in parallel. Increasing parallelism may speed up compile times, but may also produce slower code. Setting this to 1 may improve the performance of generated code, but may be slower to compile.

The default value, if not specified, is 16 for non-incremental builds. For incremental builds the default is 256 which allows caching to be more granular.

To limit the number of codegen-units to 1, change the profile.release.codegen-units setting in Cargo.toml as shown below.

[profile.release]
codegen-units = 1

Results

Based on commit id: aa833b38f6c16b7023035c78cf72a5126db71486, with lto enabled and codegen-units = 1.
Note:
LTO is disabled at opt-level = 0, so no result is reported for that row (NA).

| opt-level   | rust-ipl    | rust-uefi-payload |
| ----------- | ----------- | ----------------- |
| 0           |   NA        |   NA              |
| 1           | 143K        | 157K              |
| 2           |  57K        | 107K              |
| 3           |  52K        | 110K              |
| s           |  46K        |  92K              |
| z           |  68K        |  94K              |
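
Putting the settings from the sections above together, a size-oriented release profile could look like the sketch below. It uses opt-level = "s" with LTO because that combination gave the smallest binaries in the tables above. codegen-units = 1 and panic = "abort" made little measurable difference and are marked optional.

```toml
[profile.release]
opt-level = "s"     # smallest results for both binaries in the tables above
lto = true          # "fat" LTO across the whole dependency graph
codegen-units = 1   # optional: nearly identical sizes to LTO alone at "s"
panic = "abort"     # optional: little effect on size was observed
```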

Minimize Dependencies

Remove Unnecessary Dependencies

  1. Install cargo-deps and graphviz for building dependency graphs.

  2. Next, cd into the Rust project and run:

cargo deps | dot -Tpng > dep.png
  3. Analyze the dependency graph and try to remove large crates or replace them with smaller, more basic ones.

Disable Unnecessary Features
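
One common way to do this is to switch off a dependency's default features in Cargo.toml and re-enable only what is needed. A minimal sketch; the crate and feature shown are illustrative and not taken from this project's dependency list:

```toml
[dependencies]
# Hypothetical example dependency: drop the default feature set,
# then opt back in to only the feature that is actually used.
serde = { version = "1", default-features = false, features = ["derive"] }
```

Running cargo tree -e features in the project lists which features are currently enabled for each dependency, which helps in spotting candidates to turn off.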

References

[1] https://docs.rust-embedded.org/book/unsorted/speed-vs-size.html
[2] https://doc.rust-lang.org/cargo/reference/profiles.html
[3] https://doc.rust-lang.org/rustc/codegen-options/index.html

kailun-qin added a commit that referenced this issue Jun 25, 2021
Related to #16.

Signed-off-by: Kailun Qin <kailun.qin@intel.com>