This issue lists all feature requests and improvements slated for the Nov 2024 Tkw release.
Ensure that mappings modify the index sequence
Paper Submission
Masking & Mapping Section
IGEMM Performance Results
Paper - Language (operators & semantics, constraints)
Paper - Compiler & optimizations
Paper - Runtime
Paper - Sample kernels
Paper - Shape & type propagation
Broadcast Support (thread-shape analysis)
Reduction on non-accumulator (type fixes)
Chained Matmul
MMA + Reduction (handling mma and vector shapes)
Flash Attention Implementation
Flash Attention Performance Improvements
IGEMM SDXL Shapes Functionality
IGEMM Shared Memory Optimizations
IGEMM Performance Parity with Tuned IREE on SDXL Shapes
Compare IGEMM Performance with CK on SDXL Shapes
Obtain GEMM shapes (~20 shapes from SDXL and LLAMA)
Dynamic shapes for GEMMs
Add LLVM scheduling intrinsics for GEMMs
Add shared memory bank conflict resolution with padding
GEMM Performance Parity with Tuned IREE
Compare GEMM Performance with hipBLASLt
GEMM & IGEMM Tuning Capability
GEMM Non-temporal loads
GEMM + SiLU fusion kernel
Performance Dashboard
Auto-tuning Capability
Batch GEMM support
Unaligned shapes for GEMMs
GEMM with fused elementwise operations
MoE Kernel
Temporary File API for benchmarking
Use DLPack instead of NumPy to copy data between torch and IREE
Debugger support (add breakpoints and inspect stack on GPU)
Profiling support
================================================
Week 1 (Oct 5th)
First version of paper with description of language (mapping & masking), operators in the language, and compiler optimizations.
Performance comparisons in IGEMM, GEMM.
Flash Attention working.
MI250 & MI300.
Week 2 (Oct 12th)
Paper complete with sections on language, compiler.
FA working.
FP8 GEMM working.
IGEMM fixes landed.
Implement prefill, extend, and decode attention and get them functional.
Week 3 (Oct 18th)
Week 4 (Oct 25th)
Deadline (Oct 31)
harsh-nod changed the title from "November 2024 Release" to "November 2024 Tkw Release Notes" on Oct 1, 2024