oneDNN v3.7 release notes #2481

Open: wants to merge 14 commits into base: rls-v3.7

vgvozdeva

This PR includes a release notes draft based on the information from the PRs for the contributors to review. Your additions and corrections are highly appreciated.

@vgvozdeva vgvozdeva requested review from a team January 22, 2025 18:35
@vgvozdeva vgvozdeva requested a review from a team as a code owner January 22, 2025 18:36
@github-actions github-actions bot added the documentation and backport labels Jan 22, 2025
@vgvozdeva vgvozdeva force-pushed the vgvozdeva/release-notes branch from 17acd19 to 23e00b1 Compare January 22, 2025 18:55
Comment on lines +18 to +20
* [experimental] Extended microkernel API:
  * Introduced int4 quantization support.
  * Fpmath mode API
Contributor

I don't think we did that in the external microkernel API (and I cannot find the related commits).
However, we did add a new query for the B matrix packing type.

vgvozdeva and others added 6 commits January 23, 2025 12:01
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
# Usability
* With the SYCL runtime, memory objects on the CPU engine are now reference-counted and no longer need to be explicitly kept alive by the user for the duration of the primitive execution. This aligns memory object lifetime behavior on CPU and GPU engines.
* Improved verbose diagnostics to better identify issues during dispatching, primitive creation, and kernel creation for CPU primitive implementations and GPU (OpenCL-based) primitive implementations.
* Improve verbose diagnostic to simplify debugging of nGEN fallbacks.
Member

nGEN is an implementation detail. I would suggest calling out the specific situations in which the new diagnostics are triggered.

+@avmanerikar

Suggested change
* Improve verbose diagnostic to simplify debugging of nGEN fallbacks.
* Improved verbose diagnostics for Intel GPU driver compatibility issues.

@@ -0,0 +1,44 @@
# Performance Optimizations
## Intel Architecture Processors
Member

@tprimak, I think we had a few more improvements for x64. Please review and update this section.

* Improved fp16/bf16 softmax performance with relaxed [accumulation mode](https://oneapi-src.github.io/oneDNN/dev_guide_attributes_accumulation_mode.html#doxid-dev-guide-attributes-accumulation-mode).
* Added support for and improved performance of fp8 matmul with bf16/fp16.

## Intel Graphics Products
Member

@karturov, please review and update this section.

* Improved performance of the following subgraphs with Graph API:
  * Scaled dot-product Attention (SDPA) [with causal mask](https://oneapi-src.github.io/oneDNN/dev_guide_graph_sdpa.html#doxid-dev-guide-graph-sdpa)
  * Scaled dot-product Attention (SDPA) [with compressed key and value](https://oneapi-src.github.io/oneDNN/dev_guide_graph_sdpa_compressed_kv.html#doxid-dev-guide-graph-sdpa-compressed-kv)
## AArch64-based Processors
Member

@jondea, @theComputeKid, could you please help summarize the AArch64 improvements?

vgvozdeva and others added 3 commits January 23, 2025 21:28
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
Co-authored-by: Vadim Pirogov <vadim.o.pirogov@intel.com>
vgvozdeva and others added 4 commits January 23, 2025 21:31
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
Co-authored-by: Vadim Pirogov <vadim.o.pirogov@intel.com>