Commit
Optimize CPU and Memory performance for Resize linear mode parser
Rewrite calc_neighbor_points() to compose each index from binary bits instead of using recursion. With the optimized calc_neighbor_points(), CPU time is reduced by roughly 90% and peak memory utilization is significantly reduced.

Perf. comparison on a VM with a 12-core EPYC 9V64 and 128 GB of memory (t-CPU Ratio = New / Old):

n_dim  out_elements  New t-CPU (us)  Old t-CPU (us)  t-CPU Ratio
-----  ------------  --------------  --------------  -----------
    4       786,432         170,377       1,878,299       0.0907
    4     1,572,864         383,125       4,009,335       0.0956
    4     3,145,728         784,388       7,670,960       0.1023
    4     6,291,456       1,567,753      15,095,017       0.1039
    4    12,582,912       3,139,452      29,622,921       0.1060
    4    25,165,824       6,266,153      58,332,233       0.1074
    4    50,331,648      12,517,674     116,923,368       0.1071
    4   100,663,296      25,011,425        OOM Kill          N/A

Signed-off-by: Colin Xu <Colin.Xu@amd.com>
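The bit-composition idea can be illustrated with a minimal sketch. This is not the code from this commit: the function name `corner_offsets`, its parameters, and the corner ordering are all hypothetical and chosen only for illustration. It assumes each output element has a lower and an upper input coordinate per dimension, and treats bit `d` of a corner number as the choice between them along dimension `d`, so all 2^n_dim neighbors are enumerated with a flat loop and no recursion.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the actual calc_neighbor_points()):
// returns the flat input offsets of the 2^n_dim corner points
// surrounding one output element.
//   lo[d], hi[d] : lower/upper input coordinate along dimension d
//   strides[d]   : element stride of the input tensor along dimension d
std::vector<std::size_t> corner_offsets(const std::vector<std::size_t>& lo,
                                        const std::vector<std::size_t>& hi,
                                        const std::vector<std::size_t>& strides)
{
    const std::size_t n_dim     = lo.size();
    const std::size_t n_corners = std::size_t{1} << n_dim;

    std::vector<std::size_t> offsets(n_corners);
    for(std::size_t corner = 0; corner < n_corners; ++corner)
    {
        std::size_t offset = 0;
        for(std::size_t d = 0; d < n_dim; ++d)
        {
            // Bit d of the corner number selects hi (1) or lo (0) on axis d.
            const bool use_hi = ((corner >> d) & 1u) != 0;
            offset += (use_hi ? hi[d] : lo[d]) * strides[d];
        }
        offsets[corner] = offset;
    }
    return offsets;
}
```

For example, with `lo = {0, 1}`, `hi = {1, 2}`, and `strides = {4, 1}`, the sketch yields the four offsets 1, 5, 2, 6. Because the result is written into one preallocated vector, no intermediate per-level containers are built, which is one plausible reason a flat enumeration of this kind uses less memory than a recursive one.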