15 Oct 15:20

shi-eric

v1.4.1

Changelog

[1.4.1] - 2024-10-15

Fixed

Fix iter_reverse() not working as expected for ranges with steps other than 1 (GH-311).
Fix potential out-of-bounds memory access when a wp.sparse.BsrMatrix object is reused for storing matrices of different shapes.
Fix robustness to very low desired tolerance in wp.fem.utils.symmetric_eigenvalues_qr.
Fix invalid code generation error messages when nesting dynamic and static for-loops.
Fix caching of kernels with static expressions.
Fix ModelBuilder.add_builder(builder) to correctly update articulation_start and thereby articulation_count when builder contains more than one articulation.
Re-introduce wp.rand*(), wp.sample*(), and wp.poisson() to the Python scope, reverting a breaking change.
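
The iter_reverse() fix concerns reversing ranges whose step is not 1. As a plain-Python illustration of the expected ordering (not Warp's implementation; reverse_range is a hypothetical helper), the reversed range has to start from the last value actually visited rather than from stop - 1:

```python
def reverse_range(start, stop, step):
    """Return a range visiting the same values as range(start, stop, step), in reverse."""
    n = len(range(start, stop, step))  # number of values; 0 for an empty range
    last = start + (n - 1) * step      # last value the forward range visits
    return range(last, start - step, -step)


forward = range(0, 10, 4)                 # 0, 4, 8
correct = list(reverse_range(0, 10, 4))   # [8, 4, 0]
naive = list(range(10 - 1, 0 - 1, -4))    # [9, 5, 1]: starts at stop - 1, wrong values

print(correct, naive)
```

The naive variant only agrees with the correct one when the step divides the range length evenly, which is why the problem shows up for steps other than 1.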

01 Oct 07:23

shi-eric

v1.4.0

CHANGELOG

[1.4.0] - 2024-10-01

Added

Support for a new wp.static(expr) function that allows arbitrary Python expressions to be evaluated at the time of function/kernel definition (docs).
Support for stream priorities to hint to the device that it should process pending work in high-priority streams over pending work in low-priority streams when possible (docs).
Adaptive sparse grid geometry to warp.fem (docs).
Support for defining wp.kernel and wp.func objects from within closures.
Support for defining multiple versions of kernels, functions, and structs without manually assigning unique keys.
Support for default argument values for user functions decorated with wp.func.
Allow passing custom launch dimensions to jax_kernel() (GH-310).
JAX interoperability examples for sharding and matrix multiplication (docs).
Interoperability support for the PaddlePaddle ML framework (GH-318).
Support wp.mod() for vector types (GH-282).
Expose the modulo operator % to Python's runtime scalar and vector types.
Support for fp64 atomic_add, atomic_max, and atomic_min (GH-284).
Support for quaternion indexing (e.g. q.w).
Support shadowing builtin functions (GH-308).
Support for redefining function overloads.
Add an ocean sample to the omni.warp extension.
warp.sim.VBDIntegrator now supports body-particle collision.
Add a contributing guide to the Sphinx docs.
Add documentation for dynamic code generation (docs).

Changed

wp.sim.Model.edge_indices now includes boundary edges.
Unexposed wp.rand*(), wp.sample*(), and wp.poisson() from the Python scope.
Skip unused functions in module code generation, improving performance.
Avoid reloading modules if their content does not change, improving performance.
wp.Mesh.points is now a property instead of a raw data member; its reference can be changed after the mesh is initialized.
Improve error message when invalid objects are referenced in a Warp kernel.
if/else/elif statements with constant conditions are resolved at compile time with no branches being inserted in the generated code.
Include all non-hidden builtins in the stub file.
Improve accuracy of symmetric eigenvalues routine in warp.fem.
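
The compile-time resolution of constant conditions can be pictured with Python's own ast module. This is only a sketch of the idea, assuming conditions that are literal constants; it is not Warp's code generator, and FoldConstantIf and fold are hypothetical names (ast.unparse needs Python 3.9 or newer):

```python
import ast


class FoldConstantIf(ast.NodeTransformer):
    """Replace `if <literal>:` statements with the branch that would run."""

    def visit_If(self, node):
        self.generic_visit(node)  # fold nested if/elif statements first
        if isinstance(node.test, ast.Constant):
            taken = node.body if node.test.value else node.orelse
            # A missing else branch folds to `pass` so the enclosing block stays valid.
            return taken or [ast.Pass()]
        return node


def fold(source):
    """Return the source with all constant-condition if statements resolved."""
    tree = FoldConstantIf().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


src = """
def f(x):
    if False:
        y = x - 1
    elif True:
        y = x + 1
    else:
        y = 0
    return y
"""
folded = fold(src)
print(folded)
```

After folding, no branch remains in the function body; only the statements of the taken branch are emitted.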

Fixed

Fix for wp.func erroring out when defining a Tuple as a return type hint (GH-302).
Fix array in-place op (+=, -=) adjoints to compute gradients correctly in the backward pass.
Fix vector and matrix in-place assignment adjoints to compute gradients correctly in the backward pass (e.g. v[1] = x).
Fix a bug in which Python docstrings would be created as local function variables in generated code.
Fix a bug with autograd array access validation in functions from different modules.
Fix a rare crash during error reporting on some systems due to glibc mismatches.
Handle --num_tiles 1 in example_render_opengl.py (GH-306).
Fix the computation of body contact forces in FeatherstoneIntegrator when bodies and particles collide.
Fix bug in FeatherstoneIntegrator where eval_rigid_jacobian could give incorrect results or reach an infinite loop when the body and joint indices were not in the same order. Added Model.joint_ancestor to fix the indexing from a joint to its parent joint in the articulation.
Fix wrong vertex index passed to add_edges() called from ModelBuilder.add_cloth_mesh() (GH-319).
Add a workaround for uninitialized memory read warning in the compute-sanitizer initcheck tool when using wp.Mesh.
Fix name clashes when Warp functions and structs are returned from Python functions multiple times.
Fix name clashes between Warp functions and structs defined in different modules.
Fix code generation errors when overloading generic kernels defined in a Python function.
Fix issues with unrelated functions being treated as overloads (e.g., closures).
Fix handling of stream argument in array.__dlpack__().
Fix a bug related to reloading CPU modules.
Fix a crash when kernel functions are not found in CPU modules.
Fix conditions not being evaluated as expected in while statements.
Fix printing Boolean and 8-bit integer values.
Fix array interface type strings used for Boolean and 8-bit integer values.
Fix initialization error when setting struct members.
Fix Warp not being initialized upon entering a wp.Tape context.
Use kDLBool instead of kDLUInt for DLPack interop of Booleans.
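
The Model.joint_ancestor entry above describes a mapping from each joint to its parent joint. A minimal plain-Python sketch of such a mapping follows, assuming each joint is described by a parent body index (-1 for the world) and a child body index; joint_ancestors is a hypothetical helper, not Warp's code:

```python
def joint_ancestors(joint_parent, joint_child):
    """For each joint, find the joint whose child body is this joint's parent body."""
    child_to_joint = {child: j for j, child in enumerate(joint_child)}
    # -1 marks joints attached to the world, which have no ancestor joint.
    return [child_to_joint.get(parent, -1) for parent in joint_parent]


# Body and joint indices deliberately not in the same order:
# joint 0: world -> body 1, joint 1: body 2 -> body 0, joint 2: body 1 -> body 2
joint_parent = [-1, 2, 1]
joint_child = [1, 0, 2]

ancestors = joint_ancestors(joint_parent, joint_child)
print(ancestors)  # [-1, 2, 0]
```

Using the parent body index directly as a joint index would give [-1, 2, 1] here, which is wrong for joint 2; that kind of mismatch only shows up when bodies and joints are ordered differently.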

04 Sep 20:54

shi-eric

v1.3.3

[1.3.3] - 2024-09-04

Bug fixes
- Fix an aliasing issue with zero-copy array initialization from NumPy introduced in Warp 1.3.0.
- Fix wp.Volume.load_from_numpy() behavior when bg_value is a sequence of values.

30 Aug 15:32

shi-eric

v1.3.2

[1.3.2] - 2024-08-30

Bug fixes
- Fix accuracy of 3x3 SVD wp.svd3 with fp64 numbers (GH-281).
- Fix module hashing when a kernel argument contained a struct array (GH-287).
- Fix a bug in wp.bvh_query_ray() where the direction instead of the reciprocal direction was used
  (GH-288).
- Fix errors when launching a CUDA graph after a module is reloaded. Modules that were used during graph capture
  will no longer be unloaded before the graph is released.
- Fix a bug in wp.sim.collide.triangle_closest_point_barycentric() where the returned barycentric coordinates may be
  incorrect when the closest point lies on an edge.
- Fix 32-bit overflow when array shape is specified using np.int32.
- Fix handling of integer indices in the input_output_mask argument to autograd.jacobian and
  autograd.jacobian_fd (GH-289).
- Fix ModelBuilder.collapse_fixed_joints() to correctly update the body centers of mass and the
  ModelBuilder.articulation_start array.
- Fix precedence of closure constants over global constants.
- Fix quadrature point indexing in wp.fem.ExplicitQuadrature (regression from 1.3.0).
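
For context on the wp.bvh_query_ray() fix: ray/box tests of the slab kind divide by the ray direction, usually by multiplying with its reciprocal. The following plain-Python sketch is a textbook slab test, not Warp's BVH code, and ray_hits_aabb is a hypothetical name; it shows where the reciprocal enters:

```python
import math


def ray_hits_aabb(origin, direction, lower, upper):
    """Slab test: does origin + t * direction (t >= 0) intersect the box?"""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, lower, upper):
        if d == 0.0:
            # Ray is parallel to this slab: it must already lie between the planes.
            if o < lo or o > hi:
                return False
            continue
        rcp = 1.0 / d  # reciprocal direction; multiplying by d instead gives wrong t values
        t0, t1 = (lo - o) * rcp, (hi - o) * rcp
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


box_lower, box_upper = (4.0, -1.0, -1.0), (6.0, 1.0, 1.0)
print(ray_hits_aabb((0.0, 0.0, 0.0), (0.5, 0.0, 0.0), box_lower, box_upper))   # True
print(ray_hits_aabb((0.0, 0.0, 0.0), (-0.5, 0.0, 0.0), box_lower, box_upper))  # False
```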
Documentation improvements
- Add missing return types for built-in functions.
- Clarify that atomic operations also return the previous value.
- Clarify that wp.bvh_query_aabb() returns parts that overlap the bounding volume.
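
The 32-bit overflow listed under Bug fixes is easy to reproduce in isolation. The sketch below only models the arithmetic with ctypes (size_int32 and size_exact are hypothetical helpers, not Warp functions): multiplying dimensions in a 32-bit signed integer wraps around, while converting each dimension to a Python int first does not.

```python
import ctypes
from math import prod


def size_int32(shape):
    """Element count accumulated in a 32-bit signed integer (wraps on overflow)."""
    total = 1
    for dim in shape:
        total = ctypes.c_int32(total * dim).value
    return total


def size_exact(shape):
    """Element count using arbitrary-precision Python ints."""
    return prod(int(dim) for dim in shape)


shape = (65536, 65536)
print(size_int32(shape))  # 0
print(size_exact(shape))  # 4294967296
```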

28 Jul 05:18

c0d1f1ed

v1.3.1

[1.3.1] - 2024-07-27

Remove wp.synchronize() from PyTorch autograd function example
Tape.check_kernel_array_access() and Tape.reset_array_read_flags() are now private methods.
Fix reporting unmatched argument types

26 Jul 04:43

c0d1f1ed

v1.3.0

[1.3.0] - 2024-07-25

Warp Core improvements:
- Update to CUDA 12.x by default (requires NVIDIA driver 525 or newer). See README.md for commands to install CUDA 11.x binaries for older drivers.
- Add information to the module load print outs to indicate whether a module was
  compiled (compiled), loaded from the cache (cached), or was unable to be
  loaded (error).
- wp.config.verbose = True now also prints out a message upon the entry to a wp.ScopedTimer.
- Add wp.clear_kernel_cache() to the public API. This is equivalent to wp.build.clear_kernel_cache().
- Add code-completion support for wp.config variables.
- Remove usage of a static task (thread) index for CPU kernels to address multithreading concerns (GH-224)
- Improve error messages for unsupported Python operations such as sequence construction in kernels
- Update wp.matmul() CPU fallback to use dtype explicitly in np.matmul() call
- Add support for PEP 563's from __future__ import annotations (GH-256).
- Allow passing external arrays/tensors to wp.launch() directly via __cuda_array_interface__ and __array_interface__, up to 2.5x faster conversion from PyTorch
- Add faster Torch interop path using return_ctype argument to wp.from_torch()
- Handle incompatible CUDA driver versions gracefully
- Add wp.abs() and wp.sign() for vector types
- Expose scalar arithmetic operators to Python's runtime (e.g.: wp.float16(1.23) * wp.float16(2.34))
- Add support for creating volumes with anisotropic transforms
- Allow users to pass function arguments by keyword in a kernel using standard Python calling semantics
- Add additional documentation and examples demonstrating wp.copy(), wp.clone(), and array.assign() differentiability
- Add __new__() methods for all class __del__() methods to handle when a class instance is created but not instantiated before garbage collection
- Implement the assignment operator for wp.quat
- Make the geometry-related built-ins available only from within kernels
- Rename the API-facing query types to remove their _t suffix: wp.BVHQuery, wp.HashGridQuery, wp.MeshQueryAABB, wp.MeshQueryPoint, and wp.MeshQueryRay
- Add wp.array(ptr=...) to allow initializing arrays from pointer addresses inside of kernels (GH-206)
warp.autograd improvements:
- New warp.autograd module with utility functions gradcheck(), jacobian(), and jacobian_fd() for debugging kernel Jacobians (docs)
- Add array overwrite detection: if wp.config.verify_autograd_array_access is true, in-place operations on arrays on the Tape that could break gradient computation will be detected (docs)
- Fix bug where modification of @wp.func_replay functions and native snippets would not trigger module recompilation
- Add documentation for dynamic loop autograd limitations
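
Checking a Jacobian against finite differences, the idea behind utilities such as gradcheck() and jacobian_fd(), can be sketched in plain Python. This is a generic central-difference check under the assumption of a small dense function; fd_jacobian and check_jacobian are hypothetical names and do not reproduce the warp.autograd signatures:

```python
import math


def fd_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f: R^n -> R^m, returned as m rows of n entries."""
    columns = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        columns.append([(a - b) / (2.0 * eps) for a, b in zip(f(x_plus), f(x_minus))])
    return [list(row) for row in zip(*columns)]  # transpose columns into rows


def check_jacobian(f, jac, x, atol=1e-5):
    """Compare an analytic Jacobian with the finite-difference estimate."""
    numeric, analytic = fd_jacobian(f, x), jac(x)
    return all(
        abs(a - n) <= atol
        for row_a, row_n in zip(analytic, numeric)
        for a, n in zip(row_a, row_n)
    )


def f(x):
    return [x[0] * x[1], x[0] ** 2 + math.sin(x[1])]


def jac(x):
    return [[x[1], x[0]], [2.0 * x[0], math.cos(x[1])]]


print(check_jacobian(f, jac, [1.5, 0.3]))  # True
```

A mismatch between the two Jacobians usually points at an incorrect adjoint or at an array that was overwritten before the backward pass.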
warp.sim improvements:
- Improve memory usage and performance for rigid body contact handling when self.rigid_mesh_contact_max is zero (default behavior).
- The mask argument to wp.sim.eval_fk() now accepts both integer and boolean arrays to mask articulations.
- Fix handling of ModelBuilder.joint_act in ModelBuilder.collapse_fixed_joints() (affected floating-base systems)
- Fix and improve implementation of ModelBuilder.plot_articulation() to visualize the articulation tree of a rigid-body mechanism
- Fix ShapeInstancer __new__() method (missing instance return and *args parameter)
- Fix handling of upaxis variable in ModelBuilder and the rendering thereof in OpenGLRenderer
warp.sparse improvements:
- Sparse matrix allocations (from bsr_from_triplets(), bsr_axpy(), etc.) can now be captured in CUDA graphs; exact number of non-zeros can be optionally requested asynchronously.
- bsr_assign() now supports changing block shape (including CSR/BSR conversions)
- Add Python operator overloads for common sparse matrix operations, e.g. A += 0.5 * B, y = x @ C
warp.fem new features and fixes:
- Support for variable number of nodes per element
- Global wp.fem.lookup() operator now supports wp.fem.Tetmesh and wp.fem.Trimesh2D geometries
- Simplified defining custom subdomains (wp.fem.Subdomain), free-slip boundary conditions
- New field types: wp.fem.UniformField, wp.fem.ImplicitField and wp.fem.NonconformingField
- New streamlines, magnetostatics and nonconforming_contact examples, updated mixed_elasticity to use a nonlinear model
- Function spaces can now export VTK-compatible cells for visualization
- Fixed edge cases with NanoVDB function spaces
- Fixed differentiability of wp.fem.PicQuadrature w.r.t. positions and measures

04 Jul 19:07

c0d1f1ed

v1.2.2

[1.2.2] - 2024-07-04

Support for NumPy >= 2.0

14 Jun 21:16

c0d1f1ed

v1.2.1

[1.2.1] - 2024-06-14

Fix generic function caching
Fix Warp not being initialized when constructing arrays with wp.array()
Fix wp.is_mempool_access_supported() not resolving the provided device arguments to wp.context.Device

07 Jun 03:53

c0d1f1ed

v1.2.0

[1.2.0] - 2024-06-06

Add a not-a-number floating-point constant that can be used as wp.NAN or wp.nan.
Add wp.isnan(), wp.isinf(), and wp.isfinite() for scalars, vectors, matrices, etc.
Improve kernel cache reuse by hashing just the local module constants. Previously, a module's hash was affected by all wp.constant() variables declared in a Warp program.
Revised module compilation process to allow multiple processes to use the same kernel cache directory. Cached kernels will now be stored in a hash-specific subdirectory.
Add runtime checks for wp.MarchingCubes on field dimensions and size
Fix memory leak in wp.Mesh BVH (GH-225)
Use C++17 when building the Warp library and user kernels
Increase PTX target architecture up to sm_75 (from sm_70), enabling Turing ISA features
Extended NanoVDB support (see warp.Volume):
- Add support for data-agnostic index grids, allocation at voxel granularity
- New wp.volume_lookup_index(), wp.volume_sample_index() and generic wp.volume_sample()/wp.volume_lookup()/wp.volume_store() kernel-level functions
- Zero-copy aliasing of in-memory grids, support for multi-grid buffers
- Grid introspection and blind data access capabilities
- warp.fem can now work directly on NanoVDB grids using warp.fem.Nanogrid
- Fixed wp.volume_sample_v() and wp.volume_store_*() adjoints
- Prevent wp.volume_store() from overwriting grid background values
Improve validation of user-provided fields and values in warp.fem
Support headless rendering of wp.render.OpenGLRenderer via pyglet.options["headless"] = True
wp.render.RegisteredGLBuffer can fall back to CPU-bound copying if CUDA/OpenGL interop is not available
Clarify terms for external contributions; please see CONTRIBUTING.md for details
Improve performance of wp.sparse.bsr_mm() by ~5x on benchmark problems
Fix for XPBD incorrectly indexing into the joint actuation (joint_act) arrays
Fix for mass matrix gradients computation in wp.sim.FeatherstoneIntegrator()
Fix for handling of --msvc_path in build scripts
Fix for wp.copy() params to record dest and src offset parameters on wp.Tape()
Fix for wp.randn() to ensure return values are finite
Fix for slicing of arrays with gradients in kernels
Fix for function overload caching, ensure module is rebuilt if any function overloads are modified
Fix for handling of bool types in generic kernels
Publish CUDA 12.5 binaries for Hopper support; see https://github.com/nvidia/warp?tab=readme-ov-file#installing for details
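
The two kernel-cache changes in 1.2.0 (hashing only local constants, and hash-specific subdirectories) can be pictured with a small standard-library sketch. Everything here is an assumption made for illustration: module_hash, cache_dir, the name-matching rule, and the directory naming are hypothetical and are not Warp's actual hashing scheme or cache layout.

```python
import hashlib
import re
from pathlib import Path


def module_hash(source, constants):
    """Hash the module source plus only the constants that the source mentions."""
    h = hashlib.sha256(source.encode())
    for name in sorted(constants):
        if re.search(rf"\b{re.escape(name)}\b", source):
            h.update(f"{name}={constants[name]!r}".encode())
    return h.hexdigest()


def cache_dir(root, module_name, digest):
    """One subdirectory per content hash, so different builds never share files."""
    return Path(root) / f"{module_name}_{digest[:8]}"


source = "def step(v):\n    return v - GRAVITY * DT\n"
base = module_hash(source, {"GRAVITY": 9.81, "DT": 0.01, "UNUSED": 1})
same = module_hash(source, {"GRAVITY": 9.81, "DT": 0.01, "UNUSED": 2})
changed = module_hash(source, {"GRAVITY": 1.62, "DT": 0.01, "UNUSED": 1})

print(base == same)     # True: an unrelated constant does not invalidate the cache
print(base == changed)  # False: a referenced constant does
print(cache_dir("kernel_cache", "sim", base).name)
```

Because the directory name is derived from the hash, two processes that build different variants of a module write to different locations, and processes that build the same variant can reuse one another's output.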

[1.1.1] - 2024-05-24

wp.init() is no longer required to be called explicitly and will be performed on first call to the API
Speed up omni.warp.core's startup time

08 May 15:54

c0d1f1ed

v1.1.0

[1.1.0] - 2024-05-09

Support returning a value from @wp.func_native CUDA functions using type hints
Improved differentiability of the wp.sim.FeatherstoneIntegrator
Fix gradient propagation for rigid body contacts in wp.sim.collide()
Added support for event-based timing, see wp.ScopedTimer()
Added Tape visualization and debugging functions, see wp.Tape.visualize()
Support constructing Warp arrays from objects that define the __cuda_array_interface__ attribute
Support copying a struct to another device; use struct.to(device) to migrate struct arrays
Allow rigid shapes to not have any collisions with other shapes in wp.sim.Model
Change default test behavior to test redundant GPUs (up to 2x)
Test each example in an individual subprocess
Polish and optimize various examples and tests
Allow non-contiguous point arrays to be passed to wp.HashGrid.build()
Upgrade LLVM to 18.1.3 for from-source builds and Linux x86-64 builds
Build DLL source code as C++17 and require GCC 9.4 as a minimum
Array clone, assign, and copy are now differentiable
Use Ruff for formatting and linting
Various documentation improvements (infinity, math constants, etc.)
Improve URDF importer, handle joint armature
Allow builtins.bool to be used in Warp data structures
Use external gradient arrays in backward passes when passed to wp.launch()
Add Conjugate Residual linear solver, see wp.optim.linear.cr()
Fix propagation of gradients on aliased copy of variables in kernels
Facilitate debugging and speed up import warp by eliminating raising any exceptions
Improve support for nested vec/mat assignments in structs
Recommend Python 3.9 or higher, which is required for JAX and soon PyTorch.
Support gradient propagation for indexing sliced multi-dimensional arrays, i.e. a[i][j] vs. a[i, j]
Provide an informative message if setting DLL C-types failed, instructing to try rebuilding the library
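
For readers unfamiliar with the Conjugate Residual method named above, here is a minimal dense, plain-Python sketch of the textbook iteration for a symmetric system A x = b. It is not wp.optim.linear.cr() and does not mirror its signature; conjugate_residual, matvec, and dot are hypothetical helpers.

```python
def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_residual(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # search direction
    Ar = matvec(A, r)
    Ap = list(Ar)
    rAr = dot(r, Ar)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        alpha = rAr / dot(Ap, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        Ar = matvec(A, r)
        rAr_new = dot(r, Ar)
        beta = rAr_new / rAr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        Ap = [ari + beta * api for ari, api in zip(Ar, Ap)]  # A p without an extra product
        rAr = rAr_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_residual(A, b)
print(x)  # close to [1/11, 7/11]
```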

[1.0.3] - 2024-04-17

Add a support_level entry to the configuration file of the extensions
