oneDNN v3.7 release notes #2481
base: rls-v3.7
* [experimental] Extended microkernel API:
    * Introduced int4 quantization support.
    * Fpmath mode API
I don't think we did that in the external microkernel API (and I cannot find the related commits).
However, we did add a new query for the B matrix packing type.
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
# Usability
* With SYCL runtime, memory objects on CPU engine are now reference-counted and no longer need to be explicitly kept alive by the user for the duration of the primitive execution. This aligns memory object lifetime behavior on CPU and GPU engines.
* Improved verbose diagnostics to better identify issues during dispatching, primitive creation, and kernel creation for CPU primitive implementations and GPU primitive implementations (OpenCL-based implementations only).
* Improve verbose diagnostic to simplify debugging of nGEN fallbacks.
nGEN is an implementation detail. I would suggest calling out the specific situations in which the new diagnostics are triggered.
Suggested change:
- * Improve verbose diagnostic to simplify debugging of nGEN fallbacks.
+ * Improved verbose diagnostics for Intel GPU driver compatibility issues.
@@ -0,0 +1,44 @@
# Performance Optimizations
## Intel Architecture Processors
@tprimak, I think we had a bit more improvements for x64. Please review and update this section.
* Improved fp16/bf16 softmax performance with relaxed [accumulation mode](https://oneapi-src.github.io/oneDNN/dev_guide_attributes_accumulation_mode.html#doxid-dev-guide-attributes-accumulation-mode).
* Added support and improved performance for fp8 matmul with bf16/fp16.
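
For reviewers who want a feel for the trade-off behind the relaxed accumulation mode mentioned above, here is a minimal, self-contained sketch. It is not oneDNN code and does not reproduce how oneDNN implements softmax. The helper names `to_fp16` and `softmax` and the `low_precision_accumulator` flag are hypothetical, and rounding the running sum to fp16 after every addition is an assumption chosen to make the effect visible.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def softmax(logits, low_precision_accumulator=False):
    """Softmax whose denominator is optionally accumulated in fp16.

    Illustration only: the flag is a hypothetical stand-in for a
    narrower accumulator, not an oneDNN parameter.
    """
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    denom = 0.0
    for e in exps:
        denom += e
        if low_precision_accumulator:
            # Round the running sum after every addition.
            denom = to_fp16(denom)
    return [e / denom for e in exps]


logits = [0.0] * 4096
wide = softmax(logits)
narrow = softmax(logits, low_precision_accumulator=True)
print(sum(wide), sum(narrow))
```

With 4096 equal logits, the fp16 running sum stops growing once it reaches 2048, because 2049 is not representable in half precision, so the narrow variant produces probabilities that sum to 2 rather than 1. A narrower accumulator can be faster, but the result may drift for long reductions, which is why the mode is opt-in.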
## Intel Graphics Products
@karturov, please review and update this section.
* Improved performance of the following subgraphs with Graph API:
    * Scaled Dot-Product Attention (SDPA) [with causal mask](https://oneapi-src.github.io/oneDNN/dev_guide_graph_sdpa.html#doxid-dev-guide-graph-sdpa)
    * Scaled Dot-Product Attention (SDPA) [with compressed key and value](https://oneapi-src.github.io/oneDNN/dev_guide_graph_sdpa_compressed_kv.html#doxid-dev-guide-graph-sdpa-compressed-kv)
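
As a reminder of what the SDPA-with-causal-mask pattern computes, the following is a plain reference sketch of the math for a single attention head. It is not the Graph API and says nothing about how oneDNN fuses or optimizes the subgraph. The function name `sdpa_causal` and the list-of-rows layout are assumptions made for illustration.

```python
import math


def sdpa_causal(q, k, v):
    """Reference scaled dot-product attention with a causal mask.

    q, k, v are lists of rows (seq_len x head_dim for q and k).
    Hypothetical helper for illustration, not an oneDNN API.
    """
    seq_len, head_dim = len(q), len(q[0])
    scale = 1.0 / math.sqrt(head_dim)
    out = []
    for i in range(seq_len):
        # Causal mask: query i may only attend to keys 0..i.
        scores = [
            scale * sum(a * b for a, b in zip(q[i], k[j]))
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the unmasked scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        denom = sum(exps)
        weights = [e / denom for e in exps]
        # Weighted sum of the value rows that are visible to query i.
        out.append([
            sum(w * v[j][d] for j, w in enumerate(weights))
            for d in range(len(v[0]))
        ])
    return out


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(sdpa_causal(q, k, v))
```

The first output row equals the first value row, because the mask leaves only one visible key for the first query. Later rows are convex combinations of the value rows seen so far.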
## AArch64-based Processors
@jondea, @theComputeKid, could you please help summarize the AArch64 improvements?
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
Co-authored-by: Vadim Pirogov <vadim.o.pirogov@intel.com>
This PR includes a release notes draft based on the information from the PRs for the contributors to review. Your additions and corrections are highly appreciated.