Releases: JuliaGPU/Metal.jl

v0.2.0

03 Mar 13:00

Metal v0.2.0

Diff since v0.1.2

Closed issues:

  • Threadgroup memory breaks on small datatypes (#26)
  • Int64 not supported on AMD GPUs? (#38)
  • Base.unsafe_convert is ambiguous (#42)
  • Support for multiple devices (#44)
  • Add CITATION file (#55)
  • XGBoost on Metal.jl (#82)
  • first try at metal (#84)
  • Copysign intrinsic possibly wrong (#89)
  • Metal.jl fails to precompile on Linux (#97)
  • Silent failure with unsupported(?) Intel Iris Graphics (#109)
  • I have 2 question about Metal.jl and Flux.jl (#110)

Merged pull requests:

  • Update manifest (#57) (@github-actions[bot])
  • Add GPU profiling capabilities (#58) (@max-Hawkins)
  • Automatically detect if we need cmt build from source. (#59) (@maleadt)
  • Update manifest (#60) (@github-actions[bot])
  • Add queue kernel launch argument (#61) (@tgymnich)
  • Update manifest (#63) (@github-actions[bot])
  • Switch pipeline to juliaecosystem (#64) (@vchuravy)
  • Update manifest (#65) (@github-actions[bot])
  • Add a function for setting the current device (#66) (@maxwindiff)
  • Add documentation webpage (#67) (@max-Hawkins)
  • Wrap simdgroup matrix functions (#70) (@maxwindiff)
  • Support loading/saving simdgroup matrix from threadgroup memory (#71) (@maxwindiff)
  • Conditionalize the MtlDeviceArray element-type workaround. (#72) (@maleadt)
  • Add basic SIMD shuffle up/down (#73) (@max-Hawkins)
  • Update manifest (#74) (@github-actions[bot])
  • Optimize warp reduction for mapreduce (#75) (@max-Hawkins)
  • Specialize GPUArrays.global_index() to improve broadcast performance (#76) (@maxwindiff)
  • Update manifest (#78) (@github-actions[bot])
  • Add initial performance shader support (matmul) (#80) (@max-Hawkins)
  • Use Ninja to build cmt. (#81) (@maleadt)
  • Update manifest (#83) (@github-actions[bot])
  • Support Julia 1.9 (#85) (@maleadt)
  • Add queue parameter to unsafe_copyto (#88) (@tgymnich)
  • Update manifest (#91) (@github-actions[bot])
  • Add MPS tests. (#92) (@maleadt)
  • Support for writing binary archives (#94) (@maleadt)
  • Support precompilation and loading on non-Apple hardware (#98) (@maleadt)
  • Update manifest (#99) (@github-actions[bot])
  • Improve reduce performance by passing CartesianIndices and length statically (#100) (@maxwindiff)
  • Do not release objects that are autoreleased. (#102) (@habemus-papadum)
  • Fix path to cmt in the Hacking section of the README (#105) (@habemus-papadum)
  • Add example showing Metal and Gtk4 integration (#106) (@habemus-papadum)
  • Fix memory leak. (#107) (@habemus-papadum)
  • Add a mtl function for simple recursive data conversions. (#114) (@maleadt)
  • Write profile trace in the current folder. (#115) (@maleadt)

v0.1.2

03 Oct 13:24
18da14d

Metal v0.1.2

Diff since v0.1.1

Closed issues:

  • installation issue (libz.1.dylib not found) [+workaround] (#51)
  • Optimally choosing threads and grid (#54)

Merged pull requests:

  • Use Base.active_project. (#43) (@maleadt)
  • Update manifest (#45) (@github-actions[bot])
  • Add aliases MtlVector and MtlMatrix (#48) (@amontoison)
  • Update manifest (#49) (@github-actions[bot])
  • Wrap at-metal's output in a let block. (#50) (@maleadt)
  • Update manifest (#52) (@github-actions[bot])
  • Update manifest (#56) (@github-actions[bot])

v0.1.1

10 Jul 12:27
71f05d9

Metal v0.1.1

Diff since v0.1.0

Closed issues:

  • Super slow broadcast (#39)

v0.1.0

24 Jun 13:25
7e8bb53

Metal v0.1.0

Diff since v0.0.1

v0.0.1

24 Jun 08:13

Metal v0.0.1

Closed issues:

  • error when using (#1)
  • Argument buffer encoding is fragile (#5)
  • LLVMType of MtlDeviceArray needs changing/manipulation (#6)
  • Errors running on M1 Max (#14)
  • I get this, my name isn't Tim (#16)
  • Thanks for the previous fix - had a go (#18)
  • Custom IR verification (#25)
  • cmt: Release build fails install (#27)
