forked from openxla/xla
Rocm jaxlib v0.4.30 qa nccl maxnchannels #75
Closed
hsharsha wants to merge 46 commits into rocm-jaxlib-v0.4.30 from rocm-jaxlib-v0.4.30-qa_nccl_maxnchannels
Imported from GitHub PR openxla#15311
@xla-rotation
Copybara import of the project:
-- 2c4cee2 by Chao Chen <cchen104@amd.com>: unified memory for rocm
Merging this change closes openxla#15311
COPYBARA_INTEGRATE_REVIEW=openxla#15311 from ROCm:ci_rocm_unify_mem 2c4cee2
PiperOrigin-RevId: 657168704
PR openxla#15311: [ROCm] GPU/CPU unified memory for rocm
…-copy Let the other stream wait for the main stream before issuing memcpy d2h
Main changes include:
* Added support for fp8 matmul with fp8 and bf16 output data types.
* Added buffer comparators for fp8e4m3fnuz and fp8e5m2fnuz.
…factoring, added verbose flag
…factoring, added verbose flag
Rocm jaxlib v0.4.30 qa autotuning
Rocm jaxlib v0.4.30 qa cleanup
Replace "Navi" with corresponding public product names
…on unit tests
Imported from GitHub PR openxla#16938
This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment: google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats: openxla#10488
Copybara import of the project:
-- 0fc74cc by Wen Chen <Wen.Chen@amd.com>: [AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.
-- d247af5 by scxfjiang <sc.xfjiang@gmail.com>: refactor tests for collective comm ops
-- 6f8c418 by scxfjiang <sc.xfjiang@gmail.com>: rafactor collective comm e2e tests
-- 8ecb6ec by scxfjiang <sc.xfjiang@gmail.com>: update: replace str
-- 338d3af by scxfjiang <sc.xfjiang@gmail.com>: get rid of macros
Merging this change closes openxla#16938
COPYBARA_INTEGRATE_REVIEW=openxla#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af
PiperOrigin-RevId: 676615012
Add NANOO FP8 support for collaborative communication unit tests
[ROCm] Include clang-19 and clang-20 headers
* reset blas stream used by gemm_algorithm_picker
* small refactoring
* fixing clang format
* fixing clang format
* fixing clang format
---------
Co-authored-by: Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>
…ocblas_get_version_string (#52)
…ble-triton Add multigpu script and disable triton tests
[ROCm] Added include of hipblas.h in hipblaslt_wrapper.h
buffer init fix and gpu_hlo_runner test
* PR openxla#14605: [ROCm] Switch on Triton feature for ROCm.
Imported from GitHub PR openxla#14605
Last in series of commits to switch on Triton in XLA for ROCm. This is new version of: openxla#13003
Changes in third_party/triton/temporary/amd_pr7.patch are already merged on: triton-lang/triton#4238
Copybara import of the project:
-- c2ce7e0 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Switch on Triton feature for ROCm.
-- 563b303 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Fixed an issue with test cases from ir_emitter_triton_test.cc
-- a4d2ad8 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Fixed an issue with gpu_compiler_test.cc
-- a1b9260 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Applied comments from code review.
-- c694a95 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Fixed failed tests because of openxla@19c11ba
-- 7359619 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Fixed compilation issue with latest rebase.
-- 82f58ce by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Skip SplitLHSInputOutputIsFused test in ir_emitter_triton_test.cc untill issue is fixed.
-- 57e776b by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Triton related changes merged thus removed amd_pr7.patch
-- 0d09d0e by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Applied comments from code review.
-- 7b11147 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Applied comments from code review.
-- 9e7e0c7 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Modified TestNoAutotuner test case.
Merging this change closes openxla#14605
COPYBARA_INTEGRATE_REVIEW=openxla#14605 from ROCm:rocm_triton_backend_8 9e7e0c7
PiperOrigin-RevId: 652449567
* Fixed test issues.
[ROCm] Fixed linker issues related to fp8 buffer_comparator functions
Passing amdgpu targets to the crosstool wrapper, which calls hipcc, can restrict the generated kernels to a specific set of supported amdgpu architectures.
Merge fixes to 31 QA
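The commit above about passing amdgpu targets to the crosstool wrapper could be illustrated roughly as follows. This is a minimal sketch, not the wrapper from this PR: the function names `offload_arch_flags` and `hipcc_command`, the comma-separated input format, and the example architecture names are all assumptions made for illustration. It relies only on the `--offload-arch=` option that hipcc accepts for selecting target GPU architectures.

```python
from typing import List


def offload_arch_flags(targets: str) -> List[str]:
    """Turn a comma-separated target list into hipcc --offload-arch flags.

    Hypothetical helper: the input format is an assumption, not taken
    from the PR. Empty entries and duplicates are dropped, order is kept.
    """
    flags: List[str] = []
    seen = set()
    for raw in targets.split(","):
        arch = raw.strip()
        if not arch or arch in seen:
            continue
        seen.add(arch)
        flags.append(f"--offload-arch={arch}")
    return flags


def hipcc_command(source: str, output: str, targets: str) -> List[str]:
    """Build (but do not run) a hipcc compile command limited to `targets`."""
    return ["hipcc", *offload_arch_flags(targets), "-c", source, "-o", output]


if __name__ == "__main__":
    # Example architectures are placeholders chosen for illustration.
    print(" ".join(hipcc_command("kernel.hip", "kernel.o", "gfx90a,gfx942")))
```

Restricting the list this way means device code is only generated for the named architectures, which is the effect the commit message describes.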
No description provided.