Results, statistics and graphs for this week.
- WP2
  - Added optimizations to the `vle.v` and `vse.v` helper functions that improve the performance of `memcpy` with odd numbers of bytes.
  - Started optimization of `TranslationBlock` (TB) access functions.
    - RVV calls to TB access functions are in `do_vsetvl` and `do_vsetivli`. `cpu_get_tb_cpu_state` accounts for 3-24% of execution time with `memcpy`, and `helper_lookup_tb_ptr` accounts for 6-46% of execution time with `memcpy`, with the larger values being for small data sizes and small vectors (all data for the latest optimized QEMU).
    - Our initial approach of inlining these calls didn't produce significant improvements; we continue to work on this.
  - Created an initial optimization for `vsetvl` instructions, given that profiling shows that its helper accounts for 2-12% of execution time in `memcpy`.
    - No performance data yet.
- WP2
  - Explore new approaches to optimizing `TranslationBlock` access.
  - Measure effectiveness of the `vsetvl` optimization and extend this work.
  - Continue optimization of the `vle.v` and `vse.v` helper functions:
- Other
  - Following discussion with RISE, we will switch to weekly calls.
  - Paul Walmsley to send a follow-up email about proposed tasks/priorities.
  - Jeremy Bennett will be away at conferences 12-16 September.
    - Includes GNU Tools Cauldron 14-16 September.
  - Paolo Savini will be on vacation 20-24 and 27 September.
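
For readers unfamiliar with the `memcpy` workload profiled above, the following is a minimal sketch in plain C of a strip-mined vector copy. It is not QEMU code and not the optimization described in this report. The names `sketch_vsetvl`, `sketch_copy_chunk` and `sketch_rvv_memcpy` are hypothetical, and `sketch_vsetvl` assumes a simplified `vsetvl` that returns the smaller of the remaining byte count and the maximum vector length. The sketch only illustrates two points: each pass of the loop issues one `vsetvl` however few bytes it moves, and a byte count that is not a multiple of the word size leaves a tail that must be copied byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for vsetvl with 8-bit elements (simplified):
 * returns how many bytes the next vector operation will process. */
static size_t sketch_vsetvl(size_t remaining, size_t vlmax_bytes)
{
    return remaining < vlmax_bytes ? remaining : vlmax_bytes;
}

/* Hypothetical stand-in for a unit-stride load/store pair:
 * copies whole 64-bit words first, then the leftover bytes. */
static void sketch_copy_chunk(uint8_t *dst, const uint8_t *src, size_t vl)
{
    size_t i = 0;

    /* Bulk part: one 64-bit word per iteration. */
    for (; i + sizeof(uint64_t) <= vl; i += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, src + i, sizeof word);
        memcpy(dst + i, &word, sizeof word);
    }

    /* Tail part: reached when vl is not a multiple of 8 bytes. */
    for (; i < vl; i++) {
        dst[i] = src[i];
    }
}

/* Strip-mined copy. Returns the number of vsetvl calls issued, which
 * is also the number of load/store pairs executed. */
static size_t sketch_rvv_memcpy(void *dst, const void *src, size_t n,
                                size_t vlmax_bytes)
{
    uint8_t *d = dst;
    const uint8_t *s = src;
    size_t vsetvl_count = 0;

    assert(vlmax_bytes > 0);

    while (n > 0) {
        size_t vl = sketch_vsetvl(n, vlmax_bytes);
        vsetvl_count++;
        sketch_copy_chunk(d, s, vl);
        d += vl;
        s += vl;
        n -= vl;
    }
    return vsetvl_count;
}
```

Under these assumptions, copying 37 bytes with a 16-byte maximum vector length takes three passes (16, 16 and 5 bytes), and the last pass is handled entirely by the byte tail loop.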