make inductor work with new triton compile interface (#115878)
Summary:
Two recent triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface of triton.compile. This PR adds the necessary changes on the inductor side to work with both the old and the new compile API.

It also simplifies the differences between the compilation call in a subprocess and the one in the main process:
- Previously we passed warm_cache_only=True if the compilation happened in a subprocess. Triton never uses that argument in the currently used pin, so I removed it.
- Previously we only passed compute_capability if the compilation happened in a subprocess. This PR changes that to always pass compute_capability to triton.compile, whether the compilation happens in the main process or a subprocess.

Updated:
There are more interface changes on the triton side. E.g.
- tl.math.{min, max} now require a propagate_nan argument.
- JITFunction.run now requires a warmup argument. This affects the benchmarking phase of matmul max-autotune. On the other hand, JITFunction.run now forbids the stream argument. Simply not passing stream when benchmarking the matmul triton kernel works for both old and new versions of triton.
- The triton Autotuner renames its attributes from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This caused dynamo to fail to handle triton Autotuner objects, since dynamo's TritonKernelVariable makes assumptions about the attribute names. This is exercised by some test cases in which a model calls a triton Autotuner directly.

X-link: pytorch/pytorch#115878
Approved by: https://github.com/jansel

Reviewed By: jeanschmidt

Differential Revision: D52390214

Pulled By: shunting314

fbshipit-source-id: aca5d42e5977373869719564dc570774d5db1642
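
For illustration only, here is a minimal sketch of how code can tolerate the kind of interface drift described above: an attribute that was renamed, and a callable whose accepted keyword arguments differ between versions. It does not reproduce the actual inductor or dynamo changes in this commit, and it does not import triton. The helpers `get_first_attr` and `call_with_supported_kwargs`, the classes `OldAutotuner` and `NewAutotuner`, and the functions `old_run` and `new_run` are all hypothetical stand-ins invented for this example.

```python
import inspect


def get_first_attr(obj, *names):
    """Return the value of the first attribute in `names` that exists on `obj`."""
    for name in names:
        if hasattr(obj, name):
            return getattr(obj, name)
    raise AttributeError(f"{type(obj).__name__} has none of {names}")


def call_with_supported_kwargs(fn, *args, **kwargs):
    """Call `fn`, dropping keyword arguments that its signature does not accept."""
    params = inspect.signature(fn).parameters
    accepts_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not accepts_var_kw:
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return fn(*args, **kwargs)


# Hypothetical stand-ins for an autotuner before and after the rename.
class OldAutotuner:
    warmup, rep = 25, 100


class NewAutotuner:
    num_warmup, num_rep = 25, 100


# Hypothetical stand-ins for a run entry point before and after the change.
def old_run(*args, grid, stream=None):
    return ("old", args, grid)


def new_run(*args, grid, warmup):
    return ("new", args, grid, warmup)


# Prefer the new attribute name and fall back to the old one.
warmup_old = get_first_attr(OldAutotuner(), "num_warmup", "warmup")
warmup_new = get_first_attr(NewAutotuner(), "num_warmup", "warmup")

# Pass `warmup` only where the callee accepts it; never pass `stream`.
result_old = call_with_supported_kwargs(old_run, 1, grid=(1,), warmup=False)
result_new = call_with_supported_kwargs(new_run, 1, grid=(1,), warmup=False)
```

Signature inspection is only one option. Where an argument can simply be omitted for every supported version, as this commit does for stream, leaving it out is the simpler choice.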